Apache Camel with Kubernetes

9/21/2018

We are migrating our application to Kubernetes. We use Apache Camel to process very large files. The problem is that after the migration there will be more than one pod, and there is a high chance that pods will restart often. If a restart happens in the middle of processing a huge file:

  1. Is there any chance that the Camel lock will not be released from the file?
  2. Will the restart cause the file to be processed from the beginning, rather than resuming from where processing left off?
-- karthik kannan B
apache-camel
kubernetes

1 Answer

9/21/2018

I acknowledge that many situations can lead to a pod restart in a Kubernetes cluster, but the problems you describe apply to Camel applications running outside Kubernetes as well.

To answer your questions: if the application restarts in the middle of processing a large file, then according to the file component documentation https://github.com/apache/camel/blob/master/camel-core/src/main/docs/file-component.adoc:

  1. The Camel lock will eventually be released; the timing depends on how you provide the file to the pod.
  2. If the processing did not finish correctly and the file was not modified during processing, then processing will restart from the beginning of the file.

That said, there are strategies and EIPs that can be applied, such as the splitter http://camel.apache.org/splitter.html and the idempotent consumer http://camel.apache.org/idempotent-consumer.html, to split the file into chunks and avoid reprocessing chunks that were already handled.
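
To make the idea concrete, here is a minimal plain-Java sketch of the split-plus-idempotency pattern. It is not Camel's API: the class name ResumableFileProcessor, the ledger file, and the chunk key format are all hypothetical, chosen only for illustration. In a real route the Splitter and Idempotent Consumer EIPs, backed by a persistent idempotent repository, would play these roles. The sketch assumes the input file does not change between runs and that the ledger lives on storage that survives a pod restart.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical helper, not part of Camel: illustrates chunk-level idempotency.
final class ResumableFileProcessor {

    /**
     * Splits the input into chunks of chunkSize lines and hands each chunk to the
     * handler, skipping chunks whose key is already recorded in the ledger.
     * Returns the number of chunks handled during this call.
     */
    static int process(Path input, Path ledger, int chunkSize,
                       Consumer<List<String>> handler) throws IOException {
        // Load the keys of chunks completed by earlier runs, if any.
        Set<String> done = new HashSet<>();
        if (Files.exists(ledger)) {
            done.addAll(Files.readAllLines(ledger, StandardCharsets.UTF_8));
        }

        int handled = 0;
        int chunkIndex = 0;
        List<String> chunk = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                chunk.add(line);
                if (chunk.size() == chunkSize) {
                    handled += handleOnce(input, ledger, done, chunkIndex++, chunk, handler);
                    chunk = new ArrayList<>();
                }
            }
        }
        if (!chunk.isEmpty()) {
            // Trailing chunk that is shorter than chunkSize.
            handled += handleOnce(input, ledger, done, chunkIndex, chunk, handler);
        }
        return handled;
    }

    // Handles one chunk unless its key is already known; returns 1 if handled, else 0.
    private static int handleOnce(Path input, Path ledger, Set<String> done, int index,
                                  List<String> chunk, Consumer<List<String>> handler)
            throws IOException {
        String key = input.getFileName() + "#" + index; // assumed key format
        if (done.contains(key)) {
            return 0;
        }
        handler.accept(chunk);
        // Record the key only after the handler succeeded. A crash between the two
        // steps means the chunk is handled again on the next run (at-least-once).
        Files.write(ledger, (key + System.lineSeparator()).getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        done.add(key);
        return 1;
    }
}
```

A call such as ResumableFileProcessor.process(input, ledger, 1000, chunk -> handle(chunk)), where handle is your own method, could be repeated after a restart: completed chunks are skipped and only the remainder is handled. The file is still read from the start, but the expensive per-chunk work is not repeated. Because the key is written after the handler runs, the handler itself should tolerate seeing the same chunk twice.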

-- Valdar
Source: StackOverflow