I have a Function-as-a-Service (FaaS) application running on an AWS cluster managed by Kubernetes. It can run simple functions such as square(x) = x*x. The function service has a front-end interface, and the FaaS stores intermediate values in a key-value store (KVS) at the backend. The function and memory pods, and the ELB URL of the service used to reach them, are shown in the kubectl get all output below.
However, when I update the image of the DaemonSet pod, my client application hits the error shown in the traceback below. It looks like the client cannot retrieve the value from the KVS: the call to .get() never returns, and the session stays stuck on that line of client code until I interrupt it.
Can anyone point out why this might be happening? If I don't update the image, the function returns the expected value, 2 (the incr function increments its argument by 1). How can I update a container image without breaking the running application?
$ kubectl get all
NAME                        READY   STATUS    RESTARTS   AGE
pod/function-nodes-bbvnn    4/4     Running   0          17m
pod/management-pod          1/1     Running   0          87m
pod/memory-nodes-cbmjw      1/1     Running   0          80m
pod/monitoring-pod          1/1     Running   0          86m
pod/routing-nodes-vlm8w     1/1     Running   0          84m
pod/scheduler-nodes-8n8qw   1/1     Running   0          77m

NAME                       TYPE           CLUSTER-IP      EXTERNAL-IP                                                               PORT(S)                                                                                                    AGE
service/function-service   LoadBalancer   100.70.84.205   a3017369b648a4f2fbe0febfc0ad54c9-2011295479.us-east-1.elb.amazonaws.com   5000:31561/TCP,5001:30768/TCP,5002:31216/TCP,5003:32210/TCP,5004:30144/TCP,5005:30786/TCP,5006:32347/TCP   72m
service/kubernetes         ClusterIP      100.64.0.1      <none>                                                                    443/TCP                                                                                                    91m
service/routing-service    LoadBalancer   100.71.3.82     a39920633763c4715b2e206f79a1c12f-785670853.us-east-1.elb.amazonaws.com    6450:31504/TCP,6451:31163/TCP,6452:30488/TCP,6453:31662/TCP                                                78m

NAME                             DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR    AGE
daemonset.apps/function-nodes    1         1         1       1            1           role=function    73m
daemonset.apps/memory-nodes      1         1         1       1            1           role=memory      80m
daemonset.apps/routing-nodes     1         1         1       1            1           role=routing     84m
daemonset.apps/scheduler-nodes   1         1         1       1            1           role=scheduler   77m
>>> from cloudburst.client.client import CloudburstConnection
>>> cloudburst = CloudburstConnection('a3017369b648a4f2fbe0febfc0ad54c9-2011295479.us-east-1.elb.amazonaws.com', '172.20.43.111', False)
>>> incr = lambda _, a: a + 1
>>> cloud_incr = cloudburst.register(incr, 'incr')
>>> cloud_incr(1).get()
^CTraceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/hydro/cloudburst/cloudburst/shared/future.py", line 26, in get
    obj = self.kvs_client.get(self.obj_id)[self.obj_id]
  File "/usr/local/lib/python3.6/dist-packages/anna/client.py", line 106, in get
    KeyResponse)
  File "/usr/local/lib/python3.6/dist-packages/anna/zmq_util.py", line 27, in recv_response
    resp = rcv_sock.recv()
  File "zmq/backend/cython/socket.pyx", line 788, in zmq.backend.cython.socket.Socket.recv
  File "zmq/backend/cython/socket.pyx", line 824, in zmq.backend.cython.socket.Socket.recv
  File "zmq/backend/cython/socket.pyx", line 186, in zmq.backend.cython.socket._recv_copy
  File "zmq/backend/cython/checkrc.pxd", line 12, in zmq.backend.cython.checkrc._check_rc
KeyboardInterrupt
>>>
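
While debugging, it may help to bound the blocking call so the session does not hang indefinitely, since the traceback shows the client waiting in rcv_sock.recv() with no reply. The following is only a sketch using the Python standard library: call_with_timeout is a hypothetical helper, not part of Cloudburst or Anna, and slow_get is a stand-in for a blocking call such as cloud_incr(1).get. It does not fix the underlying problem; it only reports that no answer arrived within the deadline.

```python
import threading
import time


def call_with_timeout(fn, timeout_s, *args, **kwargs):
    """Run fn in a daemon thread and wait at most timeout_s seconds.

    Returns (True, result) if fn finished in time, (False, None) otherwise.
    Hypothetical helper for debugging; not part of Cloudburst or Anna.
    """
    box = {}

    def runner():
        try:
            box["result"] = fn(*args, **kwargs)
        except Exception as exc:  # re-raised in the caller's thread below
            box["error"] = exc

    # A daemon thread does not keep the interpreter alive if fn never returns.
    worker = threading.Thread(target=runner, daemon=True)
    worker.start()
    worker.join(timeout_s)

    if worker.is_alive():
        return False, None
    if "error" in box:
        raise box["error"]
    return True, box["result"]


# Stand-in for a blocking call (assumption: any callable works the same way).
def slow_get():
    time.sleep(2)
    return 2


finished, value = call_with_timeout(slow_get, 0.2)
print(finished, value)
```

In the session above, the stand-in would be replaced by something like call_with_timeout(cloud_incr(1).get, 10). Note that the worker thread is abandoned rather than cancelled on timeout, so this is suitable for interactive debugging only.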