I'd like to serve a TensorFlow model using OpenFaaS. Basically, I'd like to invoke the "serve" function so that TensorFlow Serving exposes my model.
OpenFaaS is running correctly on Kubernetes, and I am able to invoke functions via curl or from the UI.
I used incubator-flask as an example, but I keep receiving 502 Bad Gateway.
The OpenFaaS project looks like the following:
serve/
- Dockerfile
stack.yaml
The inner Dockerfile is the following:
FROM tensorflow/serving
RUN mkdir -p /home/app
RUN apt-get update \
    && apt-get install curl -yy
RUN echo "Pulling watchdog binary from Github." \
    && curl -sSLf https://github.com/openfaas-incubator/of-watchdog/releases/download/0.4.6/of-watchdog > /usr/bin/fwatchdog \
    && chmod +x /usr/bin/fwatchdog
WORKDIR /root/
# remove unnecessary logs from S3
ENV TF_CPP_MIN_LOG_LEVEL=3
ENV AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
ENV AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
ENV AWS_REGION=${AWS_REGION}
ENV S3_ENDPOINT=${S3_ENDPOINT}
ENV fprocess="tensorflow_model_server --rest_api_port=8501 \
--model_name=${MODEL_NAME} \
--model_base_path=${MODEL_BASE_PATH}"
# Set to true to see request in function logs
ENV write_debug="true"
ENV cgi_headers="true"
ENV mode="http"
ENV upstream_url="http://127.0.0.1:8501"
# gRPC tensorflow serving
# EXPOSE 8500
# REST tensorflow serving
# EXPOSE 8501
RUN touch /tmp/.lock
HEALTHCHECK --interval=5s CMD [ -e /tmp/.lock ] || exit 1
CMD [ "fwatchdog" ]
The stack.yaml file looks like the following:
provider:
  name: faas
  gateway: https://gateway-url:8080
functions:
  serve:
    lang: dockerfile
    handler: ./serve
    image: repo/serve-model:latest
    imagePullPolicy: always
I build the image with faas-cli build -f stack.yaml and then push it to my Docker registry with faas-cli push -f stack.yaml.
When I execute faas-cli deploy -f stack.yaml -e AWS_ACCESS_KEY_ID=... I get Accepted 202, and the function appears correctly among my functions. Now I want to invoke TensorFlow Serving on the model I specified in my ENV.
I try to invoke it with curl like this:
curl -d '{"inputs": ["1.0", "2.0", "5.0"]}' -X POST https://gateway-url:8080/function/deploy-model/v1/models/mnist:predict
but I always get 502 Bad Gateway.
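For reference, the gateway routes a request by the function name that follows /function/, and of-watchdog in http mode forwards the rest of the path to upstream_url. The sketch below composes such a URL. function_url is a hypothetical helper (not part of faas-cli), and it assumes the function is deployed under the name given in stack.yaml (serve) rather than the deploy-model used in the curl call above; whether that mismatch is just a transcription slip is not clear from the post.

```shell
# Hypothetical helper: build the invocation URL for an OpenFaaS function.
# $1 = gateway URL, $2 = function name, $3 = path forwarded to the upstream server
function_url() {
  gateway="${1%/}"   # drop a trailing slash, if any
  path="${3#/}"      # drop a leading slash, if any
  printf '%s/function/%s/%s\n' "$gateway" "$2" "$path"
}

# Function name taken from stack.yaml; model name taken from the curl call above.
function_url "https://gateway-url:8080" "serve" "v1/models/mnist:predict"
```

The model segment (mnist here) also has to match whatever --model_name the server was actually started with.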
Does anybody have experience with OpenFaaS and TensorFlow Serving? Thanks in advance.
P.S. If I run TensorFlow Serving without of-watchdog (basically without the OpenFaaS stuff), the model is served correctly.
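One thing that may be worth checking here, offered as an assumption rather than a confirmed diagnosis: if the tensorflow/serving base image defines its own ENTRYPOINT, Docker passes CMD [ "fwatchdog" ] to that entrypoint as arguments instead of running it, so the watchdog would never be listening for the gateway. The sketch below wraps docker inspect in a hypothetical helper, image_entrypoint, to print what an image will actually start.

```shell
# Hypothetical helper: print the ENTRYPOINT and CMD recorded in a local image.
# Requires the Docker CLI and the image to be present locally.
image_entrypoint() {
  docker inspect \
    --format 'ENTRYPOINT={{json .Config.Entrypoint}} CMD={{json .Config.Cmd}}' \
    "$1"
}
```

Running image_entrypoint repo/serve-model:latest after the build would show whether fwatchdog is the process being launched. If an inherited entrypoint shows up, resetting it in the Dockerfile with ENTRYPOINT [] is one option to try.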
Elaborating on the link mentioned by @viveksyngh.
tensorflow-serving-openfaas:
Example of packaging TensorFlow Serving with OpenFaaS to be deployed and managed through OpenFaaS with auto-scaling, scale-from-zero and a sane configuration for Kubernetes.
This example was adapted from: https://www.tensorflow.org/serving
Pre-reqs:
OpenFaaS
OpenFaaS CLI
Docker
Instructions:
Clone the repo
$ mkdir -p ~/dev/
$ cd ~/dev/
$ git clone https://github.com/alexellis/tensorflow-serving-openfaas
Clone the sample model and copy it to the function's build context
$ cd ~/dev/tensorflow-serving-openfaas
$ git clone https://github.com/tensorflow/serving
$ cp -r serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_cpu ./ts-serve/saved_model_half_plus_two_cpu
Edit the Docker Hub username
You need to edit the stack.yml file and replace alexellis2 with your Docker Hub account.
Build the function image
$ faas-cli build
You should now have a Docker image in your local library which you can deploy to a cluster with faas-cli up
Test the function locally
All OpenFaaS images can be run stand-alone without OpenFaaS installed. Let's do a quick test, but replace alexellis2 with your own name.
$ docker run -p 8081:8080 -ti alexellis2/ts-serve:latest
Now in another terminal:
$ curl -d '{"instances": [1.0, 2.0, 5.0]}' \
    -X POST http://127.0.0.1:8081/v1/models/half_plus_two:predict
{
    "predictions": [2.5, 3.0, 4.5
    ]
}
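The predictions follow from what the sample model computes: as its name suggests, half_plus_two returns x / 2 + 2 for each input. A quick check of the three values with awk (no TensorFlow involved, only the arithmetic; the half_plus_two shell function is a name chosen here for illustration):

```shell
# Reproduce the expected predictions for the sample model: y = x / 2 + 2
half_plus_two() {
  for x in "$@"; do
    awk -v x="$x" 'BEGIN { printf "%.1f\n", x / 2 + 2 }'
  done
}

half_plus_two 1.0 2.0 5.0
```

This prints 2.5, 3.0 and 4.5 on separate lines, matching the response above.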
From here you can run faas-cli up and then invoke your function from the OpenFaaS UI, CLI or REST API.
$ export OPENFAAS_URL=http://127.0.0.1:8080
$ curl -d '{"instances": [1.0, 2.0, 5.0]}' $OPENFAAS_URL/function/ts-serve/v1/models/half_plus_two:predict
{
    "predictions": [2.5, 3.0, 4.5
    ]
}
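Note that the request body here uses the "instances" key with numeric values, whereas the curl call in the question sent "inputs" with quoted strings. A small sketch for building the body from shell arguments follows; predict_payload is a hypothetical helper, and it does no validation of its inputs.

```shell
# Hypothetical helper: build a predict request body from numeric arguments.
predict_payload() {
  list=""
  for v in "$@"; do
    # Append ", " only when the list already has an element.
    list="${list:+$list, }$v"
  done
  printf '{"instances": [%s]}\n' "$list"
}

predict_payload 1.0 2.0 5.0
```

The result can be passed straight to curl, for example with -d "$(predict_payload 1.0 2.0 5.0)" against the same $OPENFAAS_URL/function/ts-serve/v1/models/half_plus_two:predict endpoint.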