I have implemented leader election using kubernetes/client-go leader election. I have 2 replicas. For the first time both pod is selecting as leader, but same pod is not elected as leader after this. And the leader election get stopped after some time. I tried to delete one pod, then the new pod that is created is selected as leader. Again once the pod stopped leading, no pod is acting as leader. I am using configmap for resource lock. Please help me to solve the issue.
func NewElectorWithCallbacks(namespace, configMapName, identity string, ttl time.Duration, client cli.CoreV1Interface, callbacks *leaderelection.LeaderCallbacks) (*leaderelection.LeaderElector, error) {
hostname, err := os.Hostname()
if err != nil {
return nil, err
}
broadcaster := record.NewBroadcaster()
broadcaster.StartLogging(log.Printf)
broadcaster.StartRecordingToSink(&cli.EventSinkImpl{Interface: client.Events(namespace)})
recorder := broadcaster.NewRecorder(scheme.Scheme, api.EventSource{Component: identity, Host: hostname})
cmLock := &resourcelock.ConfigMapLock{
Client: client,
ConfigMapMeta: meta.ObjectMeta{
Namespace: namespace,
Name: configMapName,
},
LockConfig: resourcelock.ResourceLockConfig{
Identity: identity,
EventRecorder: recorder,
},
}
if callbacks == nil {
callbacks = NewDefaultCallbacks()
}
config := leaderelection.LeaderElectionConfig{
Lock: cmLock,
LeaseDuration: ttl,
RenewDeadline: ttl / 2,
RetryPeriod: ttl / 4,
Callbacks: *callbacks,
}
return leaderelection.NewLeaderElector(config)
}
config, err = rest.InClusterConfig()
v1Client, err := v1.NewForConfig(config)
callbacks := &leaderelection.LeaderCallbacks{
OnStartedLeading: func(context.Context) {
// do the work
fmt.Println("selected as leader")
// Wait forever
select {}
},
OnStoppedLeading: func() {
fmt.Println("Pod stopped leading")
},
}
elector, err := election.NewElectorWithCallbacks(namespace, electionName, hostname, ttl, v1Client, callbacks)
elector.Run(context.TODO())
You can deploy the pods as statefullsets
& headless service
. Please refer the docs
Why?
Master
and rest are slaves. kubectl get pods -l app=nginx
NAME READY STATUS RESTARTS AGE
web-0 1/1 Running 0 1m
web-1 1/1 Running 0 1m
Even if the pod web-0
restarts, the name or FQDN for pod never change.
web-0.nginx.default.svc.cluster.local
<pod_name>.<service_name>.<namespace>.svc.cluster.local
I have only highlighted few points, please go though the docs completly.