I have created a REST API - in a few words, my client hits a particular URL and she gets back a JSON response.
Internally, quite a complicated process starts when the URL is hit, and there are various services involved as a microservice architecture is being used.
I was observing some performance bottlenecks and decided to switch to a message queue system. The idea is that now, once the user hits the URL, a request is published on an internal message queue, where it waits to be consumed. A consumer processes it and publishes the result back on a queue, and this happens quite a few times until, finally, the same node servicing the user receives the processed response to deliver to the user.
An asynchronous "fire-and-forget" pattern is now being used. But my question is, how can the node servicing a particular person remember who it was servicing once the processed result arrives back, and do so without blocking (i.e. it can handle several requests until the response is received)? If it makes any difference, my stack looks a little like this: Tomcat, Spring, Kubernetes and RabbitMQ.
In summary, how can the request node (whose job is to push items on the queue) maintain an open connection with the client who requested a JSON response (i.e. client is waiting for JSON response) and receive back the data of the correct client?
You have a few different options, depending on how much control you have over the client.
If the client behaviour cannot be changed, you will have to keep the session open until the request has been fully processed. This can be achieved by employing a pool of workers (futures/coroutines, threads or processes) where each worker keeps the session open for a given request.
This method has a few drawbacks and I would keep it as a last resort. Firstly, you will only be able to serve a limited number of concurrent requests, proportional to your pool size. Secondly, as your processing is behind a queue, your front-end won't be able to estimate how long it will take for a task to complete. This means you will have to deal with long-lasting sessions, which are prone to fail (what if the user gives up?).
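Whichever mechanism holds the connection, the front-end node needs a way to match a reply coming off the queue to the client that is waiting for it. Here is a minimal sketch of that bookkeeping, assuming every message carries a correlation ID that each service copies onto its reply. The class name PendingReplies and its methods are hypothetical, and it uses only the JDK's CompletableFuture, not any Spring or RabbitMQ API.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: maps a correlation ID to the future of the waiting client.
class PendingReplies {
    private final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    // Call when the HTTP request arrives, before publishing to the queue.
    // The returned ID must travel with the message (for example as a header).
    String register(CompletableFuture<String> reply, long timeoutSeconds) {
        String correlationId = UUID.randomUUID().toString();
        pending.put(correlationId, reply);
        // Fail the future if nothing comes back in time, and drop the entry
        // however the future ends (result, timeout or cancellation).
        reply.orTimeout(timeoutSeconds, TimeUnit.SECONDS)
             .whenComplete((result, error) -> pending.remove(correlationId));
        return correlationId;
    }

    // Call from the reply-queue consumer. Returns false for unknown or expired IDs.
    boolean complete(String correlationId, String jsonBody) {
        CompletableFuture<String> reply = pending.remove(correlationId);
        return reply != null && reply.complete(jsonBody);
    }

    // Number of clients still waiting for a reply.
    int size() {
        return pending.size();
    }
}
```

The HTTP handler would call register before publishing and the reply consumer would call complete. The timeout covers the case where a reply never arrives, so abandoned entries do not pile up.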
If the client behaviour can be changed, the most common approach is to use a fully asynchronous flow. When the client initiates a request, it is placed in the queue and a Task Identifier is returned. The client can use the given TaskId to poll for status updates. Each time the client requests updates about a task, you simply check whether it has completed and respond accordingly. A common pattern when a task is still in progress is to have the front-end tell the client how long to wait before trying again. This allows your server to control how frequently clients poll. If your architecture supports it, you can go the extra mile and provide information about the progress as well.
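A minimal sketch of the server side of that polling flow, assuming an in-memory map is enough for illustration. The class TaskStore, its method names, and the "done" and "unknown" status values are hypothetical; with several front-end nodes you would likely need a shared store instead, so that any node can answer a poll.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory task registry used by the front-end.
class TaskStore {
    // Empty Optional = still in progress; present = finished JSON result.
    private final Map<String, Optional<String>> tasks = new ConcurrentHashMap<>();

    // Called when the client initiates a request; the ID is returned to the client.
    String submit() {
        String taskId = UUID.randomUUID().toString();
        tasks.put(taskId, Optional.empty());
        return taskId;
    }

    // Called by the consumer of the final reply queue; ignores unknown IDs.
    void finish(String taskId, String resultJson) {
        tasks.replace(taskId, Optional.of(resultJson));
    }

    // Called each time the client polls; builds the JSON body to send back.
    String poll(String taskId, int retryAfterSeconds) {
        Optional<String> state = tasks.get(taskId);
        if (state == null) {
            return "{\"status\": \"unknown\"}";
        }
        if (!state.isPresent()) {
            return "{\"status\": \"in_progress\", \"retry_after_seconds\": "
                    + retryAfterSeconds + "}";
        }
        return "{\"status\": \"done\", \"result\": " + state.get() + "}";
    }
}
```

The front-end never blocks here: submit and poll both return immediately, and the only state kept per client is one map entry.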
Example response when task is in progress:
{"status": "in_progress",
"retry_after_seconds": 30,
"progress": "30%"}
A more complex yet elegant solution would be to use HTTP callbacks. In short, when the client makes a request for a new task, it provides a tuple (URL, Method) the server can use to signal that the processing is done. The client then waits for the server to send the signal to the given URL. You can see a better explanation here. In most cases this solution is overkill, but I think it is worth mentioning.
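To make the callback idea concrete, here is a rough sketch of the server-side notification step using the JDK's java.net.http client (Java 11 or later). The Callback and CallbackNotifier classes are hypothetical names, and sending the result as a JSON body is an assumption; retries, authentication and validation of the client-supplied URL are left out.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.CompletableFuture;

// Hypothetical (URL, Method) tuple supplied by the client with its original request.
class Callback {
    final URI url;
    final String method;

    Callback(String url, String method) {
        this.url = URI.create(url);
        this.method = method;
    }
}

// Hypothetical helper that signals the client once processing is done.
class CallbackNotifier {
    private final HttpClient http = HttpClient.newHttpClient();

    // Build the request that delivers the finished result to the client's URL.
    static HttpRequest buildRequest(Callback callback, String resultJson) {
        return HttpRequest.newBuilder(callback.url)
                .header("Content-Type", "application/json")
                .method(callback.method, HttpRequest.BodyPublishers.ofString(resultJson))
                .build();
    }

    // Send the signal without blocking; the future yields the HTTP status
    // code returned by the client's endpoint.
    CompletableFuture<Integer> notifyClient(Callback callback, String resultJson) {
        return http.sendAsync(buildRequest(callback, resultJson),
                        HttpResponse.BodyHandlers.discarding())
                .thenApply(HttpResponse::statusCode);
    }
}
```

The server would store the Callback alongside the task when the request comes in and call notifyClient from the consumer of the final reply queue.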
One option would be to use DeferredResult, provided by Spring, which releases the servlet thread while the request waits for its result. You still need to maintain some pool of threads in the request-serving node to complete those results, and the maximum number of active threads will bound the throughput of your system. For more details on how to implement DeferredResult, refer to this link: https://www.baeldung.com/spring-deferred-result