McRouter on Google Cloud cluster. How to deal with node upgrade?

1/17/2019

We are running a 3-node mcrouter/memcached Kubernetes deployment (via Helm) on a Google Cloud cluster. We are using a cluster instead of a single VM to make our web app (which uses memcached for sessions) resilient to node failure.

When upgrading nodes, Kubernetes evicts the pods one by one and creates replacements. Since memcached is an in-memory store, these new pods start empty of data. The mcrouter routes we currently use do not handle this well, as evidenced by intermittent session failures during an upgrade.

As far as I understand, there are two ways to deal with this:

  1. Use WarmUpRoute
  2. Use MissFailoverRoute

If I want to use WarmUpRoute then I need to do this:

  1. Prior to the node upgrade, replace the current config with one that designates one of the nodes as the "warm" server and the other two as "cold" servers.
  2. Perform the node upgrade on the two cold servers.
  3. Allow clients to query the cache for a few days; misses on the cold servers are backfilled from the warm server, so the cold servers gradually come to mirror it.
  4. Replace this WarmUpRoute config with another one in which the server previously designated "warm" is designated "cold" and vice-versa.
  5. Repeat step 3.
  6. Finally, when all servers are synchronized, revert to my original config.
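
If I've understood the mcrouter docs correctly, the config for step 1 would look roughly like this. The pool names, the pod hostnames, and the `exptime` value are my own placeholders for our setup, so treat this as a sketch rather than a working config:

```
{
  "pools": {
    "warm": {
      "servers": [ "memcached-0.memcached:11211" ]
    },
    "cold": {
      "servers": [
        "memcached-1.memcached:11211",
        "memcached-2.memcached:11211"
      ]
    }
  },
  "route": {
    "type": "WarmUpRoute",
    "warm": "PoolRoute|warm",
    "cold": "PoolRoute|cold",
    "exptime": 3600
  }
}
```

For step 4 I would then swap which pods appear in the "warm" and "cold" pools.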

If I want to use MissFailoverRoute then I need to do this:

  1. Use a config which designates one of the memcached nodes as a failover node.
  2. Upgrade the two non-failover nodes.
  3. Upgrade the failover node.
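
For this option, my understanding is that the config would be something like the following. Again, the pool names and hostnames are placeholders, and I'm assuming MissFailoverRoute's standard `children` syntax, where a miss on the first child falls through to the next:

```
{
  "pools": {
    "main": {
      "servers": [
        "memcached-0.memcached:11211",
        "memcached-1.memcached:11211"
      ]
    },
    "failover": {
      "servers": [ "memcached-2.memcached:11211" ]
    }
  },
  "route": {
    "type": "MissFailoverRoute",
    "children": [
      "PoolRoute|main",
      "PoolRoute|failover"
    ]
  }
}
```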

Do I understand this correctly? It seems that the second option is much simpler. Are there any advantages to the WarmUpRoute method? Is there a third option that would work better than these two?

-- Mike Furlender
google-cloud-platform
kubernetes
memcached
