CDC with Debezium in Docker

【Title】: CDC with debezium in docker 【Posted】: 2021-07-23 14:00:36

【Question】:

I am currently trying to set up CDC in Docker using Debezium, Kafka Connect, and Kafka. I have been following this guide: https://debezium.io/documentation/reference/tutorial.html I skipped the part about starting the MsSQL database because I have a local SQL Server database, which is configured as described here: https://debezium.io/documentation/reference/1.2/connectors/sqlserver.html#setting-up-sqlserver

My current docker compose file looks like this:

    version: '2'
    services:
      zookeeper:
        image: "debezium/zookeeper:$DEBEZIUM_VERSION"
        ports:
          - 2181:2181
          - 2888:2888
          - 3888:3888

      kafka:
        image: "debezium/kafka:$DEBEZIUM_VERSION"
        ports:
          - 9092:9092
        links:
          - zookeeper
        environment:
          - ZOOKEEPER_CONNECT=zookeeper:2181

      connect:
        image: "debezium/connect:$DEBEZIUM_VERSION"
        ports:
          - 8083:8083
        links:
          - kafka
          - zookeeper
        environment:
          - GROUP_ID=1
          - CONFIG_STORAGE_TOPIC=my_connect_configs
          - OFFSET_STORAGE_TOPIC=my_connect_offsets
          - STATUS_STORAGE_TOPIC=my_source_connect_statuses

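The version comes from an environment variable that I keep in a separate file:

    DEBEZIUM_VERSION=1.5

When I try to run this compose file, the connect container logs the following and exits: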

    connect_1    | 2021-04-30 07:41:41,568 WARN   ||  [AdminClient clientId=adminclient-1] Connection to node -1 (/0.0.0.0:9092) could not be established. Broker may not be available.   [org.apache.kafka.clients.NetworkClient]
    connect_1    | 2021-04-30 07:41:42,673 WARN   ||  [AdminClient clientId=adminclient-1] Connection to node -1 (/0.0.0.0:9092) could not be established. Broker may not be available.   [org.apache.kafka.clients.NetworkClient]
    connect_1    | 2021-04-30 07:41:43,879 WARN   ||  [AdminClient clientId=adminclient-1] Connection to node -1 (/0.0.0.0:9092) could not be established. Broker may not be available.   [org.apache.kafka.clients.NetworkClient]
    connect_1    | 2021-04-30 07:41:44,844 INFO   ||  [AdminClient clientId=adminclient-1] Metadata update failed   [org.apache.kafka.clients.admin.internals.AdminMetadataManager]
    connect_1    | org.apache.kafka.common.errors.TimeoutException: Call(callName=fetchMetadata, deadlineMs=1619768504843, tries=1, nextAllowedTryMs=1619768504944) timed out at 1619768504844 after 1 attempt(s)
    connect_1    | Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment. Call: fetchMetadata
    connect_1    | 2021-04-30 07:41:44,845 INFO   ||  App info kafka.admin.client for adminclient-1 unregistered   [org.apache.kafka.common.utils.AppInfoParser]
    connect_1    | 2021-04-30 07:41:44,846 INFO   ||  [AdminClient clientId=adminclient-1] Metadata update failed   [org.apache.kafka.clients.admin.internals.AdminMetadataManager]
    connect_1    | org.apache.kafka.common.errors.TimeoutException: Call(callName=fetchMetadata, deadlineMs=1619768534844, tries=1, nextAllowedTryMs=-9223372036854775709) timed out at 9223372036854775807 after 1 attempt(s)
    connect_1    | Caused by: org.apache.kafka.common.errors.TimeoutException: The AdminClient thread has exited. Call: fetchMetadata
    connect_1    | 2021-04-30 07:41:44,853 INFO   ||  Metrics scheduler closed   [org.apache.kafka.common.metrics.Metrics]
    connect_1    | 2021-04-30 07:41:44,853 INFO   ||  Closing reporter org.apache.kafka.common.metrics.JmxReporter   [org.apache.kafka.common.metrics.Metrics]
    connect_1    | 2021-04-30 07:41:44,853 INFO   ||  Metrics reporters closed   [org.apache.kafka.common.metrics.Metrics]
    connect_1    | 2021-04-30 07:41:44,853 ERROR  ||  Stopping due to error   [org.apache.kafka.connect.cli.ConnectDistributed]
    connect_1    | org.apache.kafka.connect.errors.ConnectException: Failed to connect to and describe Kafka cluster. Check worker's broker connection and security properties.
    connect_1    |  at org.apache.kafka.connect.util.ConnectUtils.lookupKafkaClusterId(ConnectUtils.java:70)
    connect_1    |  at org.apache.kafka.connect.util.ConnectUtils.lookupKafkaClusterId(ConnectUtils.java:51)
    connect_1    |  at org.apache.kafka.connect.cli.ConnectDistributed.startConnect(ConnectDistributed.java:95)
    connect_1    |  at org.apache.kafka.connect.cli.ConnectDistributed.main(ConnectDistributed.java:78)
    connect_1    | Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Call(callName=listNodes, deadlineMs=1619768504842, tries=1, nextAllowedTryMs=1619768504943) timed out at 1619768504843 after 1 attempt(s)
    connect_1    |  at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
    connect_1    |  at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
    connect_1    |  at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
    connect_1    |  at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
    connect_1    |  at org.apache.kafka.connect.util.ConnectUtils.lookupKafkaClusterId(ConnectUtils.java:64)
    connect_1    |  ... 3 more
    connect_1    | Caused by: org.apache.kafka.common.errors.TimeoutException: Call(callName=listNodes, deadlineMs=1619768504842, tries=1, nextAllowedTryMs=1619768504943) timed out at 1619768504843 after 1 attempt(s)
    connect_1    | Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment. Call: listNodes
    docker_connect_1 exited with code 2

Edit: I ran into a new problem while trying to register the SQL Server database. The request I use to register the connector:

    curl -i -X POST -H "Accept:application/json" -H "Content-Type:application/json" localhost:8083/connectors/ -d "{ \"name\": \"store-connector\", \"config\": { \"connector.class\": \"io.debezium.connector.sqlserver.SqlServerConnector\", \"database.hostname\": \"localhost\", \"database.port\": \"1433\", \"database.user\": \"sa\", \"database.password\": \"*********\", \"database.dbname\": \"DebeziumKafkaDataBase\", \"database.server.name\": \"MSSQLSERVER\", \"table.whitelist\": \"dbo.Customers\", \"database.history.kafka.bootstrap.servers\": \"kafka:9092\", \"database.history.kafka.topic\": \"dbhistory.MSSQLSERVER\" } }"

The Debezium connect container then reports this error:

Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: The TCP/IP connection to the host localhost, port 1433 has failed. Error: "Connection refused (Connection refused). Verify the connection properties. Make sure that an instance of SQL Server is running on the host and accepting TCP/IP connections at the port. Make sure that TCP connections to the port are not blocked by a firewall.".

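I tried adding a firewall rule to allow port 1433 and enabling the protocols for my database in SQL Server Configuration Manager, as described in: JDBC connection failed, error: TCP/IP connection to host failed

I was not sure whether I should open a new question for this, so I am posting it here.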

【Comments】:

You haven't told the Debezium container how to connect to Kafka.

【Answer 1】:

You should add ADVERTISED_LISTENERS=kafka:9092 to the Kafka service and BOOTSTRAP_SERVERS=kafka:9092 to the Debezium service.
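
Applied to the compose file from the question, the suggestion might look like the sketch below. It has not been verified against the image documentation: the two added variables and their values are copied exactly as written in the answer, and everything else is the asker's original file. The log in the question shows the Connect worker trying /0.0.0.0:9092, which is consistent with the worker never having been given a broker address.

```yaml
version: '2'
services:
  zookeeper:
    image: "debezium/zookeeper:$DEBEZIUM_VERSION"
    ports:
      - 2181:2181
      - 2888:2888
      - 3888:3888

  kafka:
    image: "debezium/kafka:$DEBEZIUM_VERSION"
    ports:
      - 9092:9092
    links:
      - zookeeper
    environment:
      - ZOOKEEPER_CONNECT=zookeeper:2181
      # Added per the answer (value copied as written there)
      - ADVERTISED_LISTENERS=kafka:9092

  connect:
    image: "debezium/connect:$DEBEZIUM_VERSION"
    ports:
      - 8083:8083
    links:
      - kafka
      - zookeeper
    environment:
      # Added per the answer: broker address for the Connect worker,
      # using the compose service name instead of 0.0.0.0
      - BOOTSTRAP_SERVERS=kafka:9092
      - GROUP_ID=1
      - CONFIG_STORAGE_TOPIC=my_connect_configs
      - OFFSET_STORAGE_TOPIC=my_connect_offsets
      - STATUS_STORAGE_TOPIC=my_source_connect_statuses
```

Inside the compose network the containers reach each other by service name, which is why both values use kafka rather than localhost.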

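【Comments】:

Thanks for the answer! Works like a charm. Sorry to bother you, but could you take a look at the edited part of the question?

Since you have already accepted the answer and this is a new problem, please create a brand-new post. In any case, you seem to have missed that localhost inside Docker refers to the container itself. You want host.docker.internal:1433.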
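
On Docker Desktop (Windows and macOS) the name host.docker.internal resolves to the host machine out of the box. On a Linux host it usually does not, and one common way to define it is an extra_hosts entry on the service that needs it. A minimal sketch, assuming a Linux host with Docker Engine 20.10 or newer (where the special host-gateway value is available) and the connect service from the question:

```yaml
services:
  connect:
    image: "debezium/connect:$DEBEZIUM_VERSION"
    extra_hosts:
      # Assumption: Linux host, Docker Engine 20.10+.
      # Maps host.docker.internal to the host's gateway IP.
      - "host.docker.internal:host-gateway"
```

Following the comment, the connector registration would then use host.docker.internal for database.hostname and keep 1433 for database.port.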
