Consul keeps crashing when deployed on Google Kubernetes Engine in asia-south1 (Mumbai) region, but works in other regions

12/12/2019

I have a Terraform script that creates a Google Kubernetes Engine cluster, deploys Consul (Bitnami) on it, and then inserts some key-value pairs into the Consul KV store.
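For reference, the last step is roughly equivalent to running the following against a Consul server pod (a sketch only; the pod name, key, and value are placeholders based on the Bitnami chart's default StatefulSet naming):

    # Write a key-value pair into the Consul KV store from a server pod
    kubectl exec consul-0 -- consul kv put app/config/endpoint "https://example.internal"
    # Read it back to confirm the write succeeded
    kubectl exec consul-0 -- consul kv get app/config/endpoint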

This Terraform script works fine in the various regions I tried, except asia-south1 (Mumbai, India). In asia-south1, Consul never initializes; it keeps crashing and the container restarts every few minutes. I can see the following errors in the Stackdriver logs.

[ERR] agent: failed to sync remote state: No cluster leader
[ERR] agent: Coordinate update error: No cluster leader
[ERR] http: Request GET /v1/operator/raft/configuration, error: No cluster leader from=127.0.0.1:39314
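For completeness, this is how I inspect cluster membership and Raft state from inside a server pod (the pod name is again a placeholder for the Bitnami StatefulSet pod):

    # List known cluster members and their health as seen by this agent
    kubectl exec consul-0 -- consul members
    # Show the Raft peer set; while there is no leader, this call itself fails
    kubectl exec consul-0 -- consul operator raft list-peers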

I suspect there are a few differences in the underlying infrastructure in the asia-south1 data center. Has anyone faced this issue?

-- Satish Nikam
consul
google-kubernetes-engine

1 Answer

12/12/2019

I checked the Google Cloud Status Dashboard to see if it reported any outages or other issues in the asia-south1 region, but I didn't find any abnormal behavior that could cause this problem.

Since these operations span all three scopes (global resources, regional operations, and zonal operations), if your Terraform script works fine in various regions it should work fine in the asia-south1 region too. That said, sometimes a region does not have enough available resources, so you may run into capacity or quota limitations there; see the check below.
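As a first sanity check on regional capacity, quota limits and current usage for the region can be listed with gcloud (the project ID is illustrative):

    # Show quota limits and current usage for the asia-south1 region
    gcloud compute regions describe asia-south1 --project my-gcp-project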

If you are able to replicate the behavior, investigating this further requires access to your project, so please contact GCP support by creating a technical case (there is a dedicated link for free-tier users) or report it as a defect using Google Cloud Platform's issue tracker.

The error log in Stackdriver may also be related to a pod-level error, and the official Kubernetes documentation on debugging pods could be helpful; the commands below are a good starting point.
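A sketch of the usual pod-level triage (the label selector and pod name assume the Bitnami chart's defaults):

    # Find the Consul pods and their restart counts
    kubectl get pods -l app.kubernetes.io/name=consul
    # Inspect events (scheduling, probes, OOM kills) for a crashing pod
    kubectl describe pod consul-0
    # Read the logs of the previous, crashed container instance
    kubectl logs consul-0 --previous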

-- Ahmad P
Source: StackOverflow