What is the current state of the art for building a pipeline composed of Python code and TensorFlow/Keras models?
We are aiming for scalability and a reactive design using Dask and Streamz (running on servers managed by Kubernetes). However, we are not sure how to design such an infrastructure so that our TensorFlow models stay loaded and persistent, rather than being repeatedly created and deleted for every batch of work. A sketch of what we mean is shown below.
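To make the question concrete, here is a minimal sketch of the pattern we are considering on the Dask side: each worker loads the Keras model once and reuses it across tasks. The model path, scheduler address, and input shapes below are placeholders, not our real setup.

    # Sketch: keep a Keras model cached per Dask worker instead of re-creating it per task.
    import numpy as np
    import tensorflow as tf
    from dask.distributed import Client, get_worker

    def predict_batch(batch):
        """Run inference with a model that is loaded once per worker."""
        worker = get_worker()
        model = getattr(worker, "_keras_model", None)
        if model is None:
            # Executed only by the first task that lands on this worker.
            model = tf.keras.models.load_model("model.h5")  # placeholder path
            worker._keras_model = model
        return model.predict(batch)

    client = Client("tcp://scheduler-address:8786")  # placeholder scheduler address
    batches = [np.random.rand(32, 10) for _ in range(4)]  # dummy input batches
    results = client.gather(client.map(predict_batch, batches))

Is this kind of per-worker caching a reasonable way to keep models persistent, or is it considered an anti-pattern?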
Is TensorFlow Serving the right technology for this task?
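The alternative we imagine with TensorFlow Serving would look roughly like the sketch below: the model lives in a long-running TF Serving container (e.g. a Kubernetes Deployment), and the Streamz pipeline only issues REST requests to it. The host, port, and model name ("tf-serving", 8501, "my_model") are assumptions for illustration, not real endpoints.

    # Sketch: call TensorFlow Serving's REST predict endpoint from a Streamz pipeline.
    import requests
    from streamz import Stream

    SERVING_URL = "http://tf-serving:8501/v1/models/my_model:predict"  # placeholder

    def predict_via_serving(batch):
        """Send one batch of inputs to TensorFlow Serving and return its predictions."""
        response = requests.post(SERVING_URL, json={"instances": batch})
        response.raise_for_status()
        return response.json()["predictions"]

    source = Stream()
    source.map(predict_via_serving).sink(print)

    # Each emitted item is a batch of inputs shaped like the model's input.
    source.emit([[0.1, 0.2, 0.3]])

Is this the intended way to combine TF Serving with a reactive Dask/Streamz pipeline, or is there a more idiomatic architecture?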
(I was only able to find basic examples, such as "Persistent dataflows with dask" and http://matthewrocklin.com/blog/work/2017/02/11/dask-tensorflow.)