I have a Python controller that uses the scrapy-splash library to send a SplashRequest to a Splash service. Locally, I run both the controller and the Splash service in two separate Docker containers.
yield SplashRequest(url=response.url, callback=parse, splash_url=<URL>, endpoint='execute', args=<SPLASH_ARGS>)
When I send the request locally with splash_url="http://127.0.0.1:8050", everything works fine.
Now I want to run Splash in a Kubernetes deployment and process the Splash requests in the cloud. I created a Splash Deployment and a Service with type=LoadBalancer on Google Cloud Kubernetes, and I am sending the Splash request to the external IP of the Splash service.
But Splash doesn't receive any request, and in the Python script I get:
twisted.python.failure.Failure twisted.internet.error.TCPTimedOutError: TCP connection timed out: 60: Operation timed out.
It worked in the past when I used the internal endpoint of the pod, but then I started getting a Missing schema exception because I didn't include http:// in the URL.
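As an aside on the Missing schema error: a minimal sketch of guarding against it, assuming the Splash address may arrive without a scheme. The helper name ensure_http_scheme and the address 10.0.0.5:8050 are hypothetical and not part of scrapy-splash.

```python
def ensure_http_scheme(url: str, default_scheme: str = "http") -> str:
    """Return url unchanged if it already carries a scheme,
    otherwise prepend default_scheme:// to it."""
    url = url.strip()
    if "://" in url:
        return url
    return f"{default_scheme}://{url}"


# Hypothetical internal endpoint written without a scheme.
splash_url = ensure_http_scheme("10.0.0.5:8050")
```

The result can then be passed as splash_url to SplashRequest; an address that already starts with http:// or https:// is left untouched.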
splash-deployment.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: my-app
  name: splash
  namespace: ns-app
spec:
  replicas: 1
  strategy: {}
  template:
    metadata:
      labels:
        app: splash
    spec:
      containers:
      - image: scrapinghub/splash:3.2
        name: splash
        ports:
        - containerPort: 8050
        resources: {}
      restartPolicy: Always
status: {}
splash-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: app
  name: splash
  namespace: ns-app
spec:
  type: LoadBalancer
  ports:
  - name: "8050"
    port: 8050
    targetPort: 8050
    protocol: TCP
  selector:
    app: app
status:
  loadBalancer: {}
UPDATE: I noticed that locally, when I open http://localhost:8050/, I see the Splash UI, while opening it via the Kubernetes IP I get "refused to connect".
How can I solve this? Thank you.
The problem is that the selector in splash-service.yaml is wrong. It should match the label on the pods created by the Deployment (app: splash):
apiVersion: v1
kind: Service
metadata:
  labels:
    app: app
  name: splash
  namespace: ns-app
spec:
  type: LoadBalancer
  ports:
  - name: "8050"
    port: 8050
    targetPort: 8050
    protocol: TCP
  selector:
    app: splash
status:
  loadBalancer: {}
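To illustrate why the original selector matched nothing, here is a minimal sketch of equality-based label matching, assuming the labels from the manifests in the question. The function selector_matches is a simplified model written for illustration, not a Kubernetes API.

```python
def selector_matches(selector: dict, pod_labels: dict) -> bool:
    """Simplified model of an equality-based selector: every key/value
    pair in the selector must be present in the pod's labels."""
    return all(pod_labels.get(key) == value for key, value in selector.items())


# Labels from the Deployment's pod template (spec.template.metadata.labels).
pod_labels = {"app": "splash"}

original_selector = {"app": "app"}    # selector in the question's Service
fixed_selector = {"app": "splash"}    # selector in the corrected Service

print(selector_matches(original_selector, pod_labels))  # False
print(selector_matches(fixed_selector, pod_labels))     # True
```

Note that the labels under the Deployment's own metadata (app: my-app) play no role here; only the pod template labels are compared with the Service selector.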
UPDATE: I see now that you already found the issue on your own, my bad.
I believe Ami Hollander is right that it is an issue with the label selector, but I would like to explain why.
Each time you create a Service with a selector, an Endpoints resource is created as well. It is populated with the IP addresses of all pods matching the selector; you can also populate it manually to point at external resources.
Kubernetes services can be exposed on externalIPs that route to one or more cluster nodes. Traffic that ingresses into the cluster with the external IP (as destination IP), on the service port, will be routed to one of the service endpoints.
Therefore, as already pointed out, your selector did not match any pod, so the Endpoints resource most likely contains no backend and there is nowhere to route the request. You can double-check this by running:
$ kubectl get endpoints -n ns-app
$ kubectl describe endpoints splash -n ns-app
This can be misleading because, on the other hand, if you run
$ kubectl get services -n ns-app
you will see that the service has been created correctly and shows both a private and a public IP, even though that IP is simply a dead end.
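The same check can be scripted. The following is a minimal sketch, assuming kubectl is installed and configured for the cluster, and that the Endpoints object shares the Service name splash in namespace ns-app as in the manifests above. The function names are hypothetical, and the JSON handling relies on the subsets/addresses layout of the v1 Endpoints resource.

```python
import json
import subprocess


def ready_addresses(endpoints: dict) -> list:
    """Collect the backend IPs from a v1 Endpoints object given as a dict.
    When no pod matches the selector, "subsets" is absent and the list is empty."""
    ips = []
    for subset in endpoints.get("subsets") or []:
        for address in subset.get("addresses") or []:
            ips.append(address["ip"])
    return ips


def fetch_endpoints(name: str = "splash", namespace: str = "ns-app") -> dict:
    """Return the Endpoints object as reported by kubectl (requires cluster access)."""
    result = subprocess.run(
        ["kubectl", "get", "endpoints", name, "-n", namespace, "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

Calling ready_addresses(fetch_endpoints()) against the cluster should return an empty list while the selector is wrong, and at least one pod IP once it is fixed.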