Flink cluster on EKS

1/8/2020

I am new to Flink and Kubernetes. I am planning to create a Flink streaming job that streams data from a filesystem to Kafka.

I have the Flink job jar, which is working fine (tested locally). Now I am trying to host this job on Kubernetes, and would like to use EKS on AWS.

I have read through the official Flink documentation on how to set up a Flink cluster: https://ci.apache.org/projects/flink/flink-docs-release-1.5/ops/deployment/kubernetes.html

I tried it out locally using minikube: I brought up a session cluster and submitted the job, which works fine.

My questions:

1) Out of the two options, job cluster and session cluster: since the job is a streaming job that should keep monitoring the filesystem and stream any new files to the destination, can I use a job cluster in this case? As per the documentation, a job cluster executes the job and terminates once it is completed; if the job monitors a folder, does it ever complete?

2) I have a Maven project that builds the Flink jar. What is the ideal way to spin up a session/job cluster using this jar in production, and what is the normal CI/CD process? Should I build a session cluster initially and submit jobs whenever needed, or spin up a job cluster with the built jar?

-- VSK
amazon-web-services
apache-flink
eks
java
kubernetes

1 Answer

1/9/2020

First off, the link that you provided is for Flink 1.5. If you are starting fresh, I'd recommend using Flink 1.9 or the upcoming 1.10.

For your questions:

1) A job with a file monitor never terminates. It cannot know that no more files will arrive, so you have to cancel it manually. A job cluster is fine for that.
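For illustration, here is a minimal sketch of such a continuously monitoring job, assuming the Flink 1.9 DataStream API and the universal Kafka connector; the directory, topic, and broker address are placeholders:

    import java.util.Properties;

    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.api.java.io.TextInputFormat;
    import org.apache.flink.core.fs.Path;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.source.FileProcessingMode;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;

    public class FileToKafkaJob {

        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();

            // Hypothetical input directory; adjust to your filesystem
            // (file://, hdfs://, s3://, ...).
            String inputDir = "s3://my-bucket/incoming";

            // PROCESS_CONTINUOUSLY re-scans the directory every 10s for new
            // files, which is exactly why this job never reaches a
            // "finished" state on its own.
            DataStream<String> lines = env.readFile(
                    new TextInputFormat(new Path(inputDir)),
                    inputDir,
                    FileProcessingMode.PROCESS_CONTINUOUSLY,
                    10_000L);

            // Placeholder Kafka settings.
            Properties kafkaProps = new Properties();
            kafkaProps.setProperty("bootstrap.servers", "kafka:9092");

            lines.addSink(new FlinkKafkaProducer<>(
                    "my-topic", new SimpleStringSchema(), kafkaProps));

            env.execute("file-to-kafka");
        }
    }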

2) There is no clear answer to that, and it's also not Flink-specific. Everyone has a different solution with different drawbacks.

I'd aim for a semi-automatic approach, where everything is automated but you need to explicitly press a deploy button (and not just git push). Oftentimes, these CI/CD pipelines deploy to a test cluster first and run a smoke test before allowing a deploy to production.

If you are completely fresh, you could check out AWS CodeDeploy. However, I have had good experiences with GitLab CI and runners on AWS.

The normal process would be something like:

  • build
  • integration/e2e tests on the build machine (dockerized; see the test sketch after this list)
  • deploy on test cluster/preprod cluster
  • run smoke tests
  • deploy on prod
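For the integration-test step, Flink ships test utilities that run a pipeline against an embedded mini cluster. Below is a minimal sketch, assuming JUnit 4 and the flink-test-utils dependency; the increment pipeline is a stand-in for your actual job logic:

    import static org.junit.Assert.assertTrue;

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.Collections;
    import java.util.List;

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.sink.SinkFunction;
    import org.apache.flink.test.util.MiniClusterWithClientResource;
    import org.junit.ClassRule;
    import org.junit.Test;

    public class PipelineIntegrationTest {

        // Embedded Flink cluster shared by all tests in this class.
        @ClassRule
        public static MiniClusterWithClientResource flinkCluster =
                new MiniClusterWithClientResource(
                        new MiniClusterResourceConfiguration.Builder()
                                .setNumberSlotsPerTaskManager(2)
                                .setNumberTaskManagers(1)
                                .build());

        @Test
        public void pipelineIncrementsValues() throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();
            env.setParallelism(2);

            CollectSink.values.clear();

            // Stand-in for the real job logic; a real test would wire up
            // the same transformations your production job uses.
            env.fromElements(1L, 21L, 22L)
               .map(new MapFunction<Long, Long>() {
                   @Override
                   public Long map(Long value) {
                       return value + 1;
                   }
               })
               .addSink(new CollectSink());

            env.execute();

            assertTrue(CollectSink.values.containsAll(Arrays.asList(2L, 22L, 23L)));
        }

        // Test sink that collects results into a static list so the
        // assertion can inspect them after the job finishes.
        private static class CollectSink implements SinkFunction<Long> {
            static final List<Long> values =
                    Collections.synchronizedList(new ArrayList<>());

            @Override
            public void invoke(Long value, Context context) {
                values.add(value);
            }
        }
    }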

I have also seen processes that go to prod quickly and invest the time in better monitoring and fast rollback instead of a preprod cluster and smoke tests. Whether that's viable usually depends on how business-critical the process is and how expensive reprocessing would be.

-- Arvid Heise
Source: StackOverflow