My Main Question: Why is the schema-registry crashing?
Peripheral Question: Why are two pods launching for each of zookeeper/kafka/schema-registry if I've configured one server for each? Does everything else look basically right?
➜ helm repo update
<snip>
➜ helm install --values values.yaml --name my-confluent-oss confluentinc/cp-helm-charts
<snip>
➜ helm list
NAME              REVISION  UPDATED                   STATUS    CHART                 APP VERSION  NAMESPACE
my-confluent-oss  1         Sat Oct 20 19:09:08 2018  DEPLOYED  cp-helm-charts-0.1.0  1.0          default
➜ kubectl get pods
NAME READY STATUS RESTARTS AGE
my-confluent-oss-cp-kafka-0 2/2 Running 0 20m
my-confluent-oss-cp-schema-registry-59d8877584-c2jc7 1/2 CrashLoopBackOff 7 20m
my-confluent-oss-cp-zookeeper-0 2/2 Running 0 20m
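To dig into the peripheral question, here's a quick look at what's inside one of those pods (a sketch; the pod name is copied from the listing above, and the command just prints the container names in that one pod):
➜ kubectl get pod my-confluent-oss-cp-kafka-0 -o jsonpath='{.spec.containers[*].name}'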
My values.yaml is as follows. I've tested this out with helm install --debug --dry-run. I'm just disabling persistence, setting a single server (this is a development setup for running in a VM), and disabling the extra services for the moment until I get the basics working:
cp-kafka:
  brokers: 1
  persistence:
    enabled: false
cp-zookeeper:
  persistence:
    enabled: false
  servers: 1
cp-kafka-connect:
  enabled: false
cp-kafka-rest:
  enabled: false
cp-ksql-server:
  enabled: false
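For reference, this is roughly how I checked the rendered output with the dry run mentioned above (a sketch; I left off --name so it doesn't clash with the real release, and the grep just pulls the replica counts out of the rendered manifests):
➜ helm install --values values.yaml --dry-run --debug confluentinc/cp-helm-charts | grep -i replicas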
Here are the logs for the failing schema-registry:
➜ kubectl logs my-confluent-oss-cp-schema-registry-59d8877584-c2jc7 cp-schema-registry-server
<snip>
[2018-10-21 00:28:14,738] INFO Kafka version : 2.0.0-cp1 (org.apache.kafka.common.utils.AppInfoParser)
[2018-10-21 00:28:14,738] INFO Kafka commitId : 4b1dd33f255ddd2f (org.apache.kafka.common.utils.AppInfoParser)
[2018-10-21 00:28:14,751] INFO Cluster ID: ofJRwpXzRn-ltDn8b_6h3A (org.apache.kafka.clients.Metadata)
[2018-10-21 00:28:14,753] INFO Initialized last consumed offset to -1 (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2018-10-21 00:28:14,756] INFO [kafka-store-reader-thread-_schemas]: Starting (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2018-10-21 00:28:14,800] INFO [Consumer clientId=KafkaStore-reader-_schemas, groupId=my-confluent-oss] Resetting offset for partition _schemas-0 to offset 0. (org.apache.kafka.clients.consumer.internals.Fetcher)
[2018-10-21 00:28:14,821] INFO Cluster ID: ofJRwpXzRn-ltDn8b_6h3A (org.apache.kafka.clients.Metadata)
[2018-10-21 00:28:14,857] INFO Wait to catch up until the offset of the last message at 7 (io.confluent.kafka.schemaregistry.storage.KafkaStore)
[2018-10-21 00:28:14,930] INFO Joining schema registry with Kafka-based coordination (io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry)
[2018-10-21 00:28:14,939] INFO Kafka version : 2.0.0-cp1 (org.apache.kafka.common.utils.AppInfoParser)
[2018-10-21 00:28:14,940] INFO Kafka commitId : 4b1dd33f255ddd2f (org.apache.kafka.common.utils.AppInfoParser)
[2018-10-21 00:28:14,953] INFO Cluster ID: ofJRwpXzRn-ltDn8b_6h3A (org.apache.kafka.clients.Metadata)
[2018-10-21 00:29:14,945] ERROR Error starting the schema registry (io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication)
io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryInitializationException: io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryTimeoutException: Timed out waiting for join group to complete
at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:220)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:63)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:41)
at io.confluent.rest.Application.createServer(Application.java:169)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain.main(SchemaRegistryMain.java:43)
Caused by: io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryTimeoutException: Timed out waiting for join group to complete
at io.confluent.kafka.schemaregistry.masterelector.kafka.KafkaGroupMasterElector.init(KafkaGroupMasterElector.java:202)
at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:215)
... 4 more
[2018-10-21 00:29:14,948] INFO Shutting down schema registry (io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry)
[2018-10-21 00:29:14,949] INFO [kafka-store-reader-thread-_schemas]: Shutting down (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2018-10-21 00:29:14,950] INFO [kafka-store-reader-thread-_schemas]: Stopped (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2018-10-21 00:29:14,951] INFO [kafka-store-reader-thread-_schemas]: Shutdown completed (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2018-10-21 00:29:14,953] INFO KafkaStoreReaderThread shutdown complete. (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2018-10-21 00:29:14,953] INFO [Producer clientId=producer-1] Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms. (org.apache.kafka.clients.producer.KafkaProducer)
[2018-10-21 00:29:14,959] ERROR Unexpected exception in schema registry group processing thread (io.confluent.kafka.schemaregistry.masterelector.kafka.KafkaGroupMasterElector)
org.apache.kafka.common.errors.WakeupException
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.maybeTriggerWakeup(ConsumerNetworkClient.java:498)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:284)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:242)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:233)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitMetadataUpdate(ConsumerNetworkClient.java:161)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:243)
at io.confluent.kafka.schemaregistry.masterelector.kafka.SchemaRegistryCoordinator.ensureCoordinatorReady(SchemaRegistryCoordinator.java:207)
at io.confluent.kafka.schemaregistry.masterelector.kafka.SchemaRegistryCoordinator.poll(SchemaRegistryCoordinator.java:97)
at io.confluent.kafka.schemaregistry.masterelector.kafka.KafkaGroupMasterElector$1.run(KafkaGroupMasterElector.java:192)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
I'm using minikube 0.30.0 and a fresh, clean minikube vm:
➜ kubectl version
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.5", GitCommit:"32ac1c9073b132b8ba18aa830f46b77dcceb0723", GitTreeState:"clean", BuildDate:"2018-06-22T05:40:33Z", GoVersion:"go1.9.7", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.0", GitCommit:"fc32d2f3698e36b93322a3465f63a14e9f0eaead", GitTreeState:"clean", BuildDate:"2018-03-26T16:44:10Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Your Schema Registry can't join its Kafka group. You'll have to check the configs: the Schema Registry needs to perform a leader election when it starts, and that election can go through either ZooKeeper or Kafka.
It looks like the Helm chart installs the Schema Registry with Kafka-based leader election. You can pass the Kafka broker parameter manually, or the chart picks it up from .Values.kafka.bootstrapServers, and the default value for .bootstrapServers appears to be empty. You can see which value ended up in your deployment by running something like:
$ kubectl get deployment my-confluent-oss-cp-schema-registry -o=yaml
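To zero in on just the broker setting instead of reading the whole manifest, something like this should do it (a sketch; it assumes the chart exposes the broker list as an env var with "bootstrap" in its name):
$ kubectl get deployment my-confluent-oss-cp-schema-registry -o=yaml | grep -i -A1 bootstrap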
Then you can change it to point to the internal Kubernetes my-confluent-oss-cp-kafka service endpoint with:
$ kubectl edit deployment my-confluent-oss-cp-schema-registry
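If you prefer a non-interactive change, something along these lines should also work (a sketch; SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS follows the Confluent image convention for kafkastore.bootstrap.servers, and PLAINTEXT://my-confluent-oss-cp-kafka:9092 is my assumption about the internal service endpoint):
$ kubectl set env deployment/my-confluent-oss-cp-schema-registry \
    SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS=PLAINTEXT://my-confluent-oss-cp-kafka:9092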
Also, note that as of this writing the cp-helm-charts are in developer preview, so use them at your own risk.
The other parameter you can configure is SCHEMA_REGISTRY_KAFKASTORE_INIT_TIMEOUT_CONFIG, since that is exactly where you are seeing the error: the Schema Registry may be timing out while trying to connect to the Kafka store (maybe related to minikube). What's kind of odd is that it should retry.
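If you want to try raising that timeout, here is one way to do it (a sketch; it assumes the image maps SCHEMA_REGISTRY_KAFKASTORE_INIT_TIMEOUT_MS to kafkastore.init.timeout.ms, and the 60-second default matches the one-minute gap between startup and the error in your log):
$ kubectl set env deployment/my-confluent-oss-cp-schema-registry \
    SCHEMA_REGISTRY_KAFKASTORE_INIT_TIMEOUT_MS=120000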