I have two separate Docker containers: one runs the REST application, and the other runs the process for downloading satellite images. My aim is that when I click the download button after submitting a query with the defined parameters in my main application, it should start the download container and, once the download finishes, stop it. Currently I can run the download container independently by providing all the necessary environment variables for its Docker image in its deployment file, but in the long run these variables should come from the query parameters. How can I make this happen? Here is the current deployment file and the parameters required for running the image:
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: back
spec:
  replicas: 1
  selector:
    matchLabels:
      app: back
  template:
    metadata:
      creationTimestamp:
      labels:
        app: back
    spec:
      containers:
      - name: back
        image: back:latest
        imagePullPolicy: Never
        env:
        - name: scihub_username
          value: test
        - name: scihub_password
          value: test
        - name: CDINRW_BASE_URL
          value: 10.1.40.11:8081/swagger-ui.html
        - name: CDINRW_JOB_ID
          value: 3fa85f64-5717-4562-b3fc-2c963f66afa6
        ports:
        - containerPort: 8081
          protocol: TCP
        volumeMounts:
        - mountPath: /data
          name: test-volume
      volumes:
      - name: test-volume
        hostPath:
          # directory location on host
          path: /back
          # this field is optional
          type: Directory
docker run --rm -v $(pwd):/out_data \
-e scihub_username=test \
-e scihub_password=test \
-e producttype=S2MSI2A \
-e platformname=Sentinel-2 \
-e start_date=2019-06-09T00:00:00.000Z \
-e end_date=2019-06-12T00:00:00.000Z \
-e days_back=7 \
-e footprint="POLYGON((5.8664000 50.3276000,9.4623000 50.3276000,9.4623000 52.5325000,5.8664000 52.5325000,5.8664000 50.3276000))" \
-e max_cloud_cover_percentage=10 \
-e CDINRW_BASE_URL=10.1.40.11:8081/swagger-ui.html \
-e CDINRW_JOB_ID=3fa85f64-5717-4562-b3fc-2c963f66afa6 \
ingestion
For a workload like this, a better design is to deploy a job queue system like RabbitMQ and have two long-running containers (Deployments, since you're using Kubernetes). One of them runs the REST server, and when it receives a request, writes the details of the request into a queue. The second listens to the queue, pulls off messages one at a time, and does the network fetch.
Especially in Kubernetes, this approach has several advantages. You can easily build and test it without requiring Docker or Kubernetes. If you get swamped with requests, they back up in the job queue instead of launching dozens or hundreds of containers. And if you see you have a long queue and want to do fetches faster, you can kubectl scale deployment very easily and run more workers.
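A minimal sketch of that queue plumbing, assuming Python with the pika library, a RabbitMQ broker reachable at the in-cluster hostname rabbitmq, and a hypothetical download_product() function wrapping your existing ingestion logic; the queue name and message fields are placeholders you would map to your actual query parameters:

import json
import pika

QUEUE = "ingestion-jobs"  # hypothetical queue name

# --- REST side: publish the query parameters instead of starting a container ---
def enqueue_download(params: dict) -> None:
    """Publish one download request (the query parameters) to RabbitMQ."""
    connection = pika.BlockingConnection(pika.ConnectionParameters(host="rabbitmq"))
    channel = connection.channel()
    channel.queue_declare(queue=QUEUE, durable=True)
    channel.basic_publish(
        exchange="",
        routing_key=QUEUE,
        body=json.dumps(params),
        properties=pika.BasicProperties(delivery_mode=2),  # persist the message
    )
    connection.close()

# --- Worker side: a long-running Deployment that replaces the one-shot container ---
def run_worker() -> None:
    """Consume requests one at a time and run the existing download logic."""
    connection = pika.BlockingConnection(pika.ConnectionParameters(host="rabbitmq"))
    channel = connection.channel()
    channel.queue_declare(queue=QUEUE, durable=True)
    channel.basic_qos(prefetch_count=1)  # hand each worker one job at a time

    def on_message(ch, method, properties, body):
        params = json.loads(body)
        download_product(params)  # hypothetical wrapper around your ingestion code
        ch.basic_ack(delivery_tag=method.delivery_tag)

    channel.basic_consume(queue=QUEUE, on_message_callback=on_message)
    channel.start_consuming()

The REST handler would call enqueue_download() with the parameters from the query (the same values you currently pass as -e flags), and the worker container's entrypoint would call run_worker().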
If you don't want to take that approach, you should use the Kubernetes API to create a Job, which can restart if it fails, isn't tightly bound to the same node, and doesn't require root-level permission on the host to launch. You should not run docker commands from inside a Kubernetes pod, basically ever. (And since running docker commands raises the potential of rooting the whole host, you want to be very, very careful about doing it from a web server in any case.)
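If the one-shot behaviour (start on request, exit when done) is what you want, here is a rough sketch of the REST server creating a Job with the official kubernetes Python client; the ingestion image name, the default namespace, the /back host path, and the params dict mirroring the environment variables from your docker run example are assumptions taken from the question, and the pod's ServiceAccount would need RBAC permission to create Jobs.

from kubernetes import client, config

def launch_ingestion_job(params: dict) -> None:
    """Create a one-shot Job whose env vars come from the query parameters."""
    config.load_incluster_config()  # running inside the cluster

    # Turn the query parameters into container environment variables
    env = [client.V1EnvVar(name=k, value=str(v)) for k, v in params.items()]

    container = client.V1Container(
        name="ingestion",
        image="ingestion:latest",  # assumed image name from the docker run example
        image_pull_policy="Never",
        env=env,
        volume_mounts=[client.V1VolumeMount(name="out-data", mount_path="/out_data")],
    )

    pod_spec = client.V1PodSpec(
        restart_policy="OnFailure",
        containers=[container],
        volumes=[client.V1Volume(
            name="out-data",
            host_path=client.V1HostPathVolumeSource(path="/back"),
        )],
    )

    job = client.V1Job(
        api_version="batch/v1",
        kind="Job",
        metadata=client.V1ObjectMeta(generate_name="ingestion-"),
        spec=client.V1JobSpec(
            template=client.V1PodTemplateSpec(spec=pod_spec),
            backoff_limit=3,
            ttl_seconds_after_finished=600,  # let Kubernetes clean up finished Jobs
        ),
    )

    client.BatchV1Api().create_namespaced_job(namespace="default", body=job)

The Job's pod starts when the request comes in, runs the download with the parameters baked in as env vars, and terminates on its own when the process exits, which gives you the start/stop behaviour you described without the REST container ever touching the Docker socket.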