Scaling cron architecture with Kubernetes

2/11/2019

I have a cloud-hosted database (AWS RDS, PostgreSQL) with a table of sources. A source can be a web page or a social media account.

I have a cron job on my service that will go through each source and get updated information like comments or stats.

Sometimes, if specific conditions are met, another action is triggered, e.g. if an Instagram post hits 1000 likes, comment with a string, or if a blog publishes a new post, send an email out to subscribers.

I would like to scale my service horizontally with Docker and k8s. If I scale to two instances, there will be two cron jobs, and any given action could be sent twice. I do not want n emails to be sent just because I have scaled to n instances.

What is the correct architecture to handle this?

-- PGT
architecture
database-design
docker
kubernetes
scalability

1 Answer

2/11/2019

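If you want to horizontally scale the whole stack, split your domain by some reasonable key (say creation date) into N partitions, and run a full stack for each partition. Each stack's cron job then touches only the sources in its own partition, so no source is processed twice.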
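A minimal sketch of the partitioning idea, assuming each instance knows its own index and the total number of partitions (for example from an environment variable; that wiring is not shown). The Source class, partition_of and sources_for_instance are hypothetical names for illustration, and hashing the creation date with CRC32 is just one possible way to get a stable assignment:

```python
import zlib
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Source:
    # Hypothetical shape of a row in the sources table.
    id: int
    url: str
    created: date


def partition_of(source: Source, num_partitions: int) -> int:
    """Map a source to a partition via a stable hash of its creation date.

    zlib.crc32 is used instead of the built-in hash() because hash() of
    strings and bytes is randomized per process, so different instances
    would disagree on the assignment.
    """
    key = source.created.isoformat().encode("utf-8")
    return zlib.crc32(key) % num_partitions


def sources_for_instance(sources, instance_index: int, num_partitions: int):
    """Return only the sources that this instance's cron job should process."""
    return [s for s in sources if partition_of(s, num_partitions) == instance_index]


if __name__ == "__main__":
    all_sources = [
        Source(1, "https://example.com/blog", date(2019, 1, 5)),
        Source(2, "https://example.com/news", date(2019, 1, 20)),
        Source(3, "https://example.com/shop", date(2019, 2, 1)),
    ]
    for index in range(2):
        mine = sources_for_instance(all_sources, index, 2)
        print(index, [s.id for s in mine])
```

Because every source maps to exactly one partition, an action triggered by a source is emitted by exactly one instance. The trade-off is that changing N reshuffles the assignment, so it is best done while the cron jobs are paused.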

If you are concerned with scalability, then you probably want to separate your stack into multiple layers (source refresher workers, action handlers, etc.), connected by work queues, so that each layer can be scaled independently. But I'd start with a straight domain partition.
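The layered version can be sketched as follows. This is only an in-process illustration: Python's queue.Queue and threads stand in for a real message broker and separate worker deployments, and refresh_sources, action_worker, run_pipeline and the dictionary shape of a source are all hypothetical. The point it shows is that the refresher enqueues each action once, and however many handlers are running, each queued item is taken by only one of them:

```python
import queue
import threading


def refresh_sources(sources, actions: queue.Queue) -> None:
    """Refresher layer: check each source and enqueue an action when a rule fires."""
    for source in sources:
        if source["likes"] >= 1000:  # example rule taken from the question
            actions.put(("comment", source["id"]))


def action_worker(actions: queue.Queue, handled: list, lock: threading.Lock) -> None:
    """Handler layer: take actions off the queue until a None sentinel arrives."""
    while True:
        item = actions.get()
        if item is None:  # sentinel: no more work for this worker
            actions.task_done()
            break
        with lock:
            handled.append(item)  # stand-in for posting a comment or sending an email
        actions.task_done()


def run_pipeline(sources, num_workers: int = 3) -> list:
    """Run one refresher against num_workers handlers and return what was handled."""
    actions: queue.Queue = queue.Queue()
    handled: list = []
    lock = threading.Lock()
    workers = [
        threading.Thread(target=action_worker, args=(actions, handled, lock))
        for _ in range(num_workers)
    ]
    for worker in workers:
        worker.start()
    refresh_sources(sources, actions)
    for _ in workers:
        actions.put(None)  # one sentinel per worker
    for worker in workers:
        worker.join()
    return handled


if __name__ == "__main__":
    demo = [{"id": 1, "likes": 1500}, {"id": 2, "likes": 10}, {"id": 3, "likes": 1000}]
    print(sorted(run_pipeline(demo)))
```

With a real broker the same shape holds: the number of handler replicas can change without changing how many times an action is performed, as long as only one refresher enqueues work for a given source.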

-- Rob Conklin
Source: StackOverflow