I am working on a project that sends notifications to the users of our system. Each user has to define a webhook URL, which is called by our services. There are multiple notifications, and the time at which each of them occurs is not known in advance; it depends on the user activity in the system.
Our webhook caller service has to call the user's webhook URL. We have some requirements for a successful call, such as the HTTP response status, timeout, etc. If any of the requirements is not met, we have to retry the call to the webhook URL. The first retry is after 30 seconds, the next after 5 minutes, the next after 15 minutes, and from there on we double the previous delay. We haven't decided yet when to stop retrying an event (most probably after 24 hours of failures).
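To make the schedule concrete, here is a minimal sketch of how I read it. The names `retry_delay`, `retry_schedule` and `MAX_RETRY_WINDOW` are made up for illustration, and the 24-hour cutoff is only the tentative value mentioned above, not a decided limit.

```python
from datetime import timedelta

# Fixed delays for the first three retries, taken from the description above.
INITIAL_DELAYS = [
    timedelta(seconds=30),
    timedelta(minutes=5),
    timedelta(minutes=15),
]

# Assumption: retries stop once 24 hours of waiting would be exceeded.
MAX_RETRY_WINDOW = timedelta(hours=24)


def retry_delay(retry_number: int) -> timedelta:
    """Return the delay to wait before retry `retry_number` (1-based)."""
    if retry_number < 1:
        raise ValueError("retry_number must be >= 1")
    if retry_number <= len(INITIAL_DELAYS):
        return INITIAL_DELAYS[retry_number - 1]
    # From the fourth retry on, each delay doubles the previous one.
    doublings = retry_number - len(INITIAL_DELAYS)
    return INITIAL_DELAYS[-1] * (2 ** doublings)


def retry_schedule(window: timedelta = MAX_RETRY_WINDOW) -> list:
    """List the delays whose cumulative sum still fits inside `window`."""
    schedule = []
    elapsed = timedelta(0)
    retry_number = 1
    while elapsed + retry_delay(retry_number) <= window:
        delay = retry_delay(retry_number)
        elapsed += delay
        schedule.append(delay)
        retry_number += 1
    return schedule
```

Under these assumptions the fourth retry waits 30 minutes, the fifth 60 minutes, and so on, so only a handful of retries fit into a 24-hour window. What varies per event is just the retry number and the time of the next attempt.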
What is the best way to do this? Our services run in a cluster environment that autoscales. I thought of polling a database, but I am not sure how to do it. All the resources I have found cover recurring tasks with a predefined schedule. In our case, however, we need some kind of dynamic recurring tasks.
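To illustrate the polling idea, here is a rough sketch of what I imagine, not something we have built. Everything in it is an assumption: the `webhook_delivery` table, its columns, and the helper names are hypothetical, and SQLite only stands in for whatever shared database the cluster would really use. Each worker claims due rows by writing a short lease, so that several replicas can poll the same table without calling the same webhook twice.

```python
import sqlite3


def open_db(path=":memory:"):
    # isolation_level=None means autocommit; transactions are explicit below.
    conn = sqlite3.connect(path, isolation_level=None)
    # next_attempt_at and claimed_until are Unix timestamps;
    # claimed_until = 0 means no worker currently holds the row.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS webhook_delivery ("
        " id INTEGER PRIMARY KEY,"
        " url TEXT NOT NULL,"
        " retry_number INTEGER NOT NULL DEFAULT 0,"
        " next_attempt_at REAL NOT NULL,"
        " claimed_until REAL NOT NULL DEFAULT 0)"
    )
    return conn


def claim_due(conn, now, lease_seconds=60, limit=100):
    """Claim up to `limit` due deliveries by giving them a lease."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        rows = conn.execute(
            "SELECT id, url, retry_number FROM webhook_delivery"
            " WHERE next_attempt_at <= ? AND claimed_until <= ?"
            " ORDER BY next_attempt_at LIMIT ?",
            (now, now, limit),
        ).fetchall()
        conn.executemany(
            "UPDATE webhook_delivery SET claimed_until = ? WHERE id = ?",
            [(now + lease_seconds, row[0]) for row in rows],
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return rows


def reschedule(conn, delivery_id, next_attempt_at):
    """After a failed call: bump the retry count and release the lease."""
    conn.execute(
        "UPDATE webhook_delivery"
        " SET retry_number = retry_number + 1,"
        "     next_attempt_at = ?, claimed_until = 0"
        " WHERE id = ?",
        (next_attempt_at, delivery_id),
    )
```

If a worker dies in the middle of a call, its lease simply expires and another replica picks the row up again. On PostgreSQL, the claim step could probably be done with `SELECT ... FOR UPDATE SKIP LOCKED` instead of a lease column, but I have not tried either variant under load.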
The approach has to be highly available and able to handle really high throughput. A plus would be if it also benefits from autoscaling (we run on Kubernetes).