We are running multiple Kubernetes pods concurrently on a cluster, and we have been facing all sorts of connectivity issues with Oracle, Informatica, and other services.
Of the pods we ran, a few just sit on the cluster after completing their task without writing logs to the DB. When we went through the Splunk logs of the hanging pods (not really hanging, because we were able to exec into them and run other things) and of pods with connectivity issues, we consistently saw this error, followed by ORA-03113/ORA-03114 errors.
Can anyone help me understand this error?
INFO process_step:>1<; message:>Execute: Run successful<
2021-09-30 18:50:37.898 [ERROR][23511] customresource.go 136: Error updating resource Key=IPAMBlock(10-0-5-64-26) Name="10-0-5-64-26" Resource="IPAMBlocks" Value=&v3.IPAMBlock{TypeMeta:v1.TypeMeta{Kind:"IPAMBlock", APIVersion:"crd.projectcalico.org/v1"}, ObjectMeta:v1.ObjectMeta{Name:"10-0-5-64-26", GenerateName:"", Namespace:"", SelfLink:"", UID:"", ResourceVersion:"425239779", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, Spec:v3.IPAMBlockSpec{CIDR:"10.0.5.64/26", Affinity:(*string)(0xc000476000), StrictAffinity:false, Allocations:[]*int{(*int)(0xc00068d388), (*int)(0xc00068d450), (*int)(nil), (*int)(0xc00068d490), (*int)(0xc00068d398), (*int)(0xc00068d3f0), (*int)(0xc00068d498), (*int)(0xc00068d390), (*int)(0xc00068d420), (*int)(0xc00068d4a0), (*int)(0xc00068d3d0), (*int)(nil), (*int)(0xc00068d308), (*int)(0xc00068d3b0), (*int)(0xc00068d310), (*int)(0xc00068d320), (*int)(nil), (*int)(0xc00068d4b8), (*int)(nil), (*int)(nil), (*int)(0xc00068d460), (*int)(0xc00068d4a8), (*int)(0xc00068d458), (*int)(0xc00068d3c8), (*int)(0xc00068d440), (*int)(nil), (*int)(0xc00068d428), (*int)(0xc00068d3b8), (*int)(0xc00068d470), (*int)(0xc00068d408), (*int)(0xc00068d418), (*int)(0xc00068d448), (*int)(0xc00068d438), (*int)(0xc00068d4b0), (*int)(0xc00068d3a8), (*int)(0xc00068d318), (*int)(0xc00068d430), (*int)(0xc00068d3d8), (*int)(0xc00068d410), (*int)(0xc00068d478), (*int)(0xc00068d3e0), (*int)(0xc00068d3c0), (*int)(0xc00068d358), (*int)(0xc00068d330), (*int)(0xc00068d340), (*int)(0xc00068d3f8), (*int)(0xc00068d328), (*int)(0xc00068d400), (*int)(0xc00068d338), (*int)(0xc00068d480), (*int)(0xc00068d350), (*int)(0xc00068d488), (*int)(0xc00068d468), 
(*int)(0xc00068d348), (*int)(0xc00068d360), (*int)(nil), (*int)(0xc00068d368), (*int)(0xc00068d3e8), (*int)(0xc00068d370), (*int)(0xc00068d378), (*int)(0xc00068d380), (*int)(0xc00068d3a0), (*int)
(nil), (*int)(nil)}, Unallocated:[]int{2, 25, 11, 18, 19, 55, 62, 16, 63
[... a bunch of node IPs omitted ...] then
error=Operation cannot be fulfilled on ipamblocks.crd.projectcalico.org "10-0-5-64-26": the object has been modified; please apply your changes to the latest version and try again
DBD::Oracle::db do failed: ORA-03113: end-of-file on communication channel
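In short, a request to Calico to update the IP address allocation for pods failed. The message "the object has been modified; please apply your changes to the latest version and try again" is the Kubernetes API server's standard response to an update made against a stale resourceVersion, which means another writer changed the same IPAMBlock first. You can learn more about Calico IPAM here.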
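To make the conflict mechanism concrete, here is a minimal in-memory sketch of a versioned read-modify-write with a retry. It is only an illustration and not Calico's or Kubernetes' actual code: `Block`, `FakeStore`, `Conflict`, and `allocate` are hypothetical names, and the ordinals 2 and 25 are simply borrowed from the `Unallocated` list in the log.

```python
from dataclasses import dataclass, field


class Conflict(Exception):
    """Stands in for the API server's 'object has been modified' error."""


@dataclass
class Block:
    # Hypothetical, heavily simplified stand-in for an IPAMBlock object.
    resource_version: int = 1
    allocated: set = field(default_factory=set)


class FakeStore:
    """In-memory store that rejects writes made from a stale version."""

    def __init__(self):
        self._block = Block()

    def get(self):
        # Return a copy so each caller works on its own snapshot.
        return Block(self._block.resource_version, set(self._block.allocated))

    def update(self, block):
        if block.resource_version != self._block.resource_version:
            raise Conflict("the object has been modified")
        self._block = Block(block.resource_version + 1, set(block.allocated))


def allocate(store, ordinal, max_attempts=5, before_write=None):
    """Read, modify, write; on a conflict, re-read and try again."""
    for attempt in range(1, max_attempts + 1):
        snapshot = store.get()
        snapshot.allocated.add(ordinal)
        if before_write:
            before_write(attempt)  # test hook used to simulate a race
        try:
            store.update(snapshot)
            return attempt
        except Conflict:
            continue
    raise Conflict("gave up after %d attempts" % max_attempts)


store = FakeStore()


def racing_writer(attempt):
    # On the first attempt only, another writer updates the block first.
    if attempt == 1:
        other = store.get()
        other.allocated.add(2)
        store.update(other)


attempts = allocate(store, 25, before_write=racing_writer)
print(attempts, sorted(store.get().allocated))  # 2 [2, 25]
```

The first write is rejected because the snapshot it was based on is stale; the second succeeds after a fresh read. A single logged conflict of this kind is therefore not necessarily fatal on its own, as long as the writer re-reads and retries.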
The last line, ORA-03113, means the client lost its connection to the Oracle server, which in practice looks much like a connection timeout.
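If the dropped connection is transient, one common mitigation is to reconnect and retry the final write. The following is a rough sketch under that assumption, using a fake connection so it runs anywhere; `ConnectionLost`, `FakeConn`, `run_with_reconnect`, and the `job_log` table are all hypothetical and do not come from any Oracle driver or from the original job.

```python
import time


class ConnectionLost(Exception):
    """Stands in for a driver error such as ORA-03113 or ORA-03114."""


def run_with_reconnect(connect, work, max_attempts=3, base_delay=0.0):
    """Run work(conn); on a lost connection, reconnect and try again."""
    last_error = None
    for attempt in range(max_attempts):
        try:
            conn = connect()  # open a fresh connection for every attempt
            return work(conn)
        except ConnectionLost as exc:
            last_error = exc
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff
    raise last_error


class FakeConn:
    """Fake connection: fails like a dropped session unless healthy."""

    def __init__(self, healthy):
        self.healthy = healthy

    def execute(self, sql):
        if not self.healthy:
            raise ConnectionLost("ORA-03113: end-of-file on communication channel")
        return "ok: " + sql


calls = []


def connect():
    # The first connection is broken; later ones are healthy.
    calls.append(1)
    return FakeConn(healthy=len(calls) > 1)


result = run_with_reconnect(connect, lambda c: c.execute("INSERT INTO job_log ..."))
print(result, len(calls))
```

Retrying is only safe when the statement can be repeated without side effects (for example, a log insert keyed on a unique run ID), and it does not fix the underlying network problem between the pods and the database.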