Kafka with Confluent Kubernetes Helm Charts = Schema Registry WakeupException

Posted: 2019-03-25 11:06:32

Question:

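My main question: why is the Schema Registry crashing?

Side question: if I configured a single server for each of zookeeper/kafka/schema-registry, why are two pods launched for each? Does everything else look basically right?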

➜  helm repo update
<snip>

➜  helm install --values values.yaml --name my-confluent-oss confluentinc/cp-helm-charts
<snip>

➜  helm list
NAME                REVISION    UPDATED                     STATUS      CHART                   APP VERSION NAMESPACE
my-confluent-oss    1           Sat Oct 20 19:09:08 2018    DEPLOYED    cp-helm-charts-0.1.0    1.0         default  

➜  kubectl get pods
NAME                                                   READY     STATUS             RESTARTS   AGE
my-confluent-oss-cp-kafka-0                            2/2       Running            0          20m
my-confluent-oss-cp-schema-registry-59d8877584-c2jc7   1/2       CrashLoopBackOff   7          20m
my-confluent-oss-cp-zookeeper-0                        2/2       Running            0          20m

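My values.yaml is below. I tested it with helm install --debug --dry-run. I am only disabling persistence, setting a single server (this is a development setup running in a VM), and disabling the extra services for now until I get the basics working: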

cp-kafka:
  brokers: 1
  persistence:
    enabled: false

  cp-zookeeper:
    persistence:
      enabled: false
    servers: 1

cp-zookeeper:
  persistence:
    enabled: false
  servers: 1

cp-kafka-connect:
  enabled: false

cp-kafka-rest:
  enabled: false

cp-ksql-server:
  enabled: false

Here are the logs from the failing Schema Registry:

➜  kubectl logs my-confluent-oss-cp-schema-registry-59d8877584-c2jc7 cp-schema-registry-server

<snip>
[2018-10-21 00:28:14,738] INFO Kafka version : 2.0.0-cp1 (org.apache.kafka.common.utils.AppInfoParser)
[2018-10-21 00:28:14,738] INFO Kafka commitId : 4b1dd33f255ddd2f (org.apache.kafka.common.utils.AppInfoParser)
[2018-10-21 00:28:14,751] INFO Cluster ID: ofJRwpXzRn-ltDn8b_6h3A (org.apache.kafka.clients.Metadata)
[2018-10-21 00:28:14,753] INFO Initialized last consumed offset to -1 (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2018-10-21 00:28:14,756] INFO [kafka-store-reader-thread-_schemas]: Starting (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2018-10-21 00:28:14,800] INFO [Consumer clientId=KafkaStore-reader-_schemas, groupId=my-confluent-oss] Resetting offset for partition _schemas-0 to offset 0. (org.apache.kafka.clients.consumer.internals.Fetcher)
[2018-10-21 00:28:14,821] INFO Cluster ID: ofJRwpXzRn-ltDn8b_6h3A (org.apache.kafka.clients.Metadata)
[2018-10-21 00:28:14,857] INFO Wait to catch up until the offset of the last message at 7 (io.confluent.kafka.schemaregistry.storage.KafkaStore)
[2018-10-21 00:28:14,930] INFO Joining schema registry with Kafka-based coordination (io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry)
[2018-10-21 00:28:14,939] INFO Kafka version : 2.0.0-cp1 (org.apache.kafka.common.utils.AppInfoParser)
[2018-10-21 00:28:14,940] INFO Kafka commitId : 4b1dd33f255ddd2f (org.apache.kafka.common.utils.AppInfoParser)
[2018-10-21 00:28:14,953] INFO Cluster ID: ofJRwpXzRn-ltDn8b_6h3A (org.apache.kafka.clients.Metadata)
[2018-10-21 00:29:14,945] ERROR Error starting the schema registry (io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication)
io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryInitializationException: io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryTimeoutException: Timed out waiting for join group to complete
    at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:220)
    at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:63)
    at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:41)
    at io.confluent.rest.Application.createServer(Application.java:169)
    at io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain.main(SchemaRegistryMain.java:43)
Caused by: io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryTimeoutException: Timed out waiting for join group to complete
    at io.confluent.kafka.schemaregistry.masterelector.kafka.KafkaGroupMasterElector.init(KafkaGroupMasterElector.java:202)
    at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:215)
    ... 4 more
[2018-10-21 00:29:14,948] INFO Shutting down schema registry (io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry)
[2018-10-21 00:29:14,949] INFO [kafka-store-reader-thread-_schemas]: Shutting down (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2018-10-21 00:29:14,950] INFO [kafka-store-reader-thread-_schemas]: Stopped (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2018-10-21 00:29:14,951] INFO [kafka-store-reader-thread-_schemas]: Shutdown completed (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2018-10-21 00:29:14,953] INFO KafkaStoreReaderThread shutdown complete. (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2018-10-21 00:29:14,953] INFO [Producer clientId=producer-1] Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms. (org.apache.kafka.clients.producer.KafkaProducer)
[2018-10-21 00:29:14,959] ERROR Unexpected exception in schema registry group processing thread (io.confluent.kafka.schemaregistry.masterelector.kafka.KafkaGroupMasterElector)
org.apache.kafka.common.errors.WakeupException
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.maybeTriggerWakeup(ConsumerNetworkClient.java:498)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:284)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:242)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:233)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitMetadataUpdate(ConsumerNetworkClient.java:161)
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:243)
    at io.confluent.kafka.schemaregistry.masterelector.kafka.SchemaRegistryCoordinator.ensureCoordinatorReady(SchemaRegistryCoordinator.java:207)
    at io.confluent.kafka.schemaregistry.masterelector.kafka.SchemaRegistryCoordinator.poll(SchemaRegistryCoordinator.java:97)
    at io.confluent.kafka.schemaregistry.masterelector.kafka.KafkaGroupMasterElector$1.run(KafkaGroupMasterElector.java:192)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

I am using minikube 0.30.0 with a fresh, clean minikube VM:

➜  kubectl version

Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.5", GitCommit:"32ac1c9073b132b8ba18aa830f46b77dcceb0723", GitTreeState:"clean", BuildDate:"2018-06-22T05:40:33Z", GoVersion:"go1.9.7", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.0", GitCommit:"fc32d2f3698e36b93322a3465f63a14e9f0eaead", GitTreeState:"clean", BuildDate:"2018-03-26T16:44:10Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

Answer 1:

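Your Schema Registry cannot join its Kafka group. Check your configuration: at startup the Schema Registry has to perform a leader election, and that election can go through either ZooKeeper or Kafka.

It looks like the Helm chart installs the Schema Registry with Kafka leader election. You can also see that the Kafka broker parameter can either be passed manually or picked up from .Values.kafka.bootstrapServers, but the value of .bootstrapServers also appears to be empty. You can see the configured values in your deployment by running: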

$ kubectl get deployment my-confluent-oss-cp-schema-registry -o=yaml

You can then change it to point to the internal Kubernetes my-confluent-oss-cp-kafka service endpoint:

$ kubectl edit deployment my-confluent-oss-cp-schema-registry

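Also, note that as of this writing cp-helm-charts is in developer preview, so use it at your own risk.

Another parameter you can configure is SCHEMA_REGISTRY_KAFKASTORE_INIT_TIMEOUT_CONFIG, since this is exactly where you are seeing the error. So the Schema Registry may be timing out while trying to connect to the Kafka store (possibly related to minikube). The odd thing is that it should retry.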
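As a rough sketch of what raising that timeout could look like in the Deployment opened by kubectl edit: the fragment below assumes the Confluent Docker image maps the Schema Registry property kafkastore.init.timeout.ms to the environment variable SCHEMA_REGISTRY_KAFKASTORE_INIT_TIMEOUT_MS, which is an assumption about the image's naming convention and not something shown in the logs. The container name comes from the kubectl logs command in the question; the value 120000 is only an illustration (the one-minute gap in the logs suggests the current timeout is 60000 ms).

```yaml
# Hypothetical fragment of spec.template.spec in the schema registry Deployment.
containers:
  - name: cp-schema-registry-server
    env:
      # Assumed env var name for kafkastore.init.timeout.ms; verify against
      # the image documentation before relying on it.
      - name: SCHEMA_REGISTRY_KAFKASTORE_INIT_TIMEOUT_MS
        value: "120000"
```

A longer timeout only helps if the broker is slow to become ready. It does not help if the broker can never form the group coordinator at all.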

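Comments:

kubectl get deployment cp-schema-registry doesn't work. Maybe you meant kubectl describe deployment my-confluent-oss-cp-schema-registry? That mentions the environment variable SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: PLAINTEXT://my-confluent-oss-cp-kafka-headless:9092. Also, I understand that this is a developer preview and use is at my own risk, but I would expect the most basic startup not to crash.

Either works: get -o=yaml or just describe.

My last comment was wrong and I am deleting it. The mistake was in my configuration: I set the cp-kafka brokers to 1, but cp-kafka.configurationOverrides.offsets.topic.replication.factor defaults to 3, which caused the Kafka broker to fail, which in turn caused the Schema Registry to fail. I fixed that and it works fine. The problem was my configuration.

Ah, cool. Yes, if your Kafka broker is down, the Schema Registry won't be able to talk to it :-)

Thanks @clay - that solved my problem too!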
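The fix described in the comments can be sketched as a change to the cp-kafka section of values.yaml. The key path follows the one named in the comment (cp-kafka.configurationOverrides.offsets.topic.replication.factor); the exact quoting and placement are assumptions, so compare against the chart's own values.yaml for your chart version.

```yaml
cp-kafka:
  brokers: 1
  persistence:
    enabled: false
  configurationOverrides:
    # The chart default is 3 according to the comment above; a single broker
    # cannot satisfy that, so match the replication factor to the broker count.
    "offsets.topic.replication.factor": "1"
```

The rest of the values.yaml from the question stays unchanged. After editing, the release would need to be reinstalled or upgraded for the broker to pick up the new setting.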
