I'm building a platform to run ML jobs. Jobs will be started from an interface. I'm making a service for each type of jobs. Some times, a service S1 might require to first make a request to another service S2 and get its output before running its own job.
Each service is split into 2 Kubernetes deployment:
Here is the flow:
To answer to the client, D2 has 2 options:
What do you think abouut this architecture ? Does the usage of PubSub makes sense ? Also does it make sense to split each service into 2 deployment (1 that deals with request, 1 that runs the actual job ) ?
- interface generates a PubSub message to a topic T1 D1 pulls message
- from T1 and persist a request to a database
If there's only one database, I'm not sure I see much advantage in using a topic (implying pub/sub). Another approach would be to use a queue: the interface creates jobs into the queue, then you can have any number of workers processing it. Depending on the situation you may not even need the database at all - if all the data needed can be in the message in the queue.
use a callback provided by the client to make a POST request
That's better if you can do it, on the assumption that there's only one consumer for the event; pub/sub is more for broadcasting out to multiple consumers. Polling works but is really inefficient and has limits on how much it can scale.
Also does it make sense to split each service into 2 deployment (1 that deals with request, 1 that runs the actual job ) ?
Having separate deployables make sense if they are built by different teams and have a different release cadence or if you need to scale them out independently, otherwise it may not be necessary.