Change-data-capture from PostgreSQL to Kafka topics using Kafka Connect in standalone mode
Posted: 2021-07-03 05:29:07
Question: I have been trying to capture data changes from PostgreSQL into Kafka topics using the command bin/connect-standalone.sh config/connect-standalone.properties config/postgres.properties, but I am facing several issues. Here is the content of my postgres.properties file:
name=cdc_demo
connector.class=io.debezium.connector.postgresql.PostgresConnector
tasks.max=1
plugin.name=decoderbufs
slot.name=debezium
slot.drop_on_stop=false
database.hostname=localhost
database.port=5432
database.user=postgres
database.password=XXXXX
database.dbname=snehildb
time.precision.mode=adaptive
database.sslmode=disable
database.server.name=localhost:5432/snehildb
table.whitelist=public.students
decimal.handling.mode=precise
topic.creation.enable=true
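If the connector starts cleanly, Debezium should create the logical replication slot named above. A quick way to confirm it (a sketch, assuming local psql access with the credentials from this file):

# the slot "debezium" should appear here once the connector is running
psql -h localhost -p 5432 -U postgres -d snehildb -c "SELECT slot_name, plugin, active FROM pg_replication_slots;"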
Here is the content of connect-standalone.properties:
# These are defaults. This file just demonstrates how to override some settings.
bootstrap.servers=localhost:9092
# The converters specify the format of data in Kafka and how to translate it into Connect data.
# Every Connect user will need to configure these based on the format they want their data in
# when loaded from or stored into Kafka
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# Converter-specific settings can be passed in by prefixing the Converter's setting with the
# converter we want to apply it to
key.converter.schemas.enable=true
value.converter.schemas.enable=true
offset.storage.file.filename=/tmp/connect.offsets
# Flush much faster than normal, which is useful for testing/debugging
offset.flush.interval.ms=10000
# Set to a list of filesystem paths separated by commas (,) to enable class loading isolation for
# plugins (connectors, converters, transformations). The list should consist of top level
# directories that include any combination of:
# a) directories immediately containing jars with plugins and their dependencies
# b) uber-jars with plugins and their dependencies
# c) directories immediately containing the package directory structure of classes of plugins and
#    their dependencies
# Note: symlinks will be followed to discover dependencies or plugins.
# Examples:
# plugin.path=/usr/local/share/java,/usr/local/share/kafka/plugins,/opt/connectors,
plugin.path=/home/azureuser/plugins
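Before digging into the errors below, it is worth checking that the Debezium jars are actually visible under the plugin path, and, once the worker does start, that the change topics were created. A minimal sketch using the paths and broker address from these files:

# the debezium-connector-postgres jar and its dependencies should be listed here
ls /home/azureuser/plugins
# after a clean start, Debezium's change topics should show up in this list
bin/kafka-topics.sh --bootstrap-server localhost:9092 --list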
I am getting several warnings, but I am unable to resolve the following three main errors:
ERROR Postgres server wal_level property must be "logical" but is: replica
(io.debezium.connector.postgresql.PostgresConnector:101)
(org.apache.kafka.common.config.AbstractConfig:361)
ERROR Failed to create job for config/postgres.properties
(org.apache.kafka.connect.cli.ConnectStandalone:110)
ERROR Stopping after connector error (org.apache.kafka.connect.cli.ConnectStandalone:121)
I am new to Kafka, so it would be very helpful if someone could point out my mistakes.
Answer 1: Debezium requires wal_level to be set to logical:
https://www.postgresql.org/docs/9.6/runtime-config-wal.html
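A minimal way to apply the change (a sketch, assuming a local server, superuser access, and a systemd-managed PostgreSQL; the service name may differ on your system). Equivalently, you can set wal_level = logical directly in postgresql.conf and restart:

# writes the override to postgresql.auto.conf; it only takes effect after a restart
psql -U postgres -c "ALTER SYSTEM SET wal_level = logical;"
sudo systemctl restart postgresql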
For the internals of the Postgres connector, take a look at the class
io.debezium.connector.postgresql.PostgresConnector.java in the debezium repo:
https://github.com/debezium/debezium/blob/master/debezium-connector-postgres/src/main/java/io/debezium/connector/postgresql/PostgresConnector.java
Comments:
Hi senjin.hajrulahovic, thanks for pointing this out. I changed this property in postgresql.conf and the first two errors are gone now, but I am still getting the connector error, which I will try to resolve. Thank you very much for your help!
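For reference, the new value can be confirmed after the restart (again assuming local psql access):

# should print: logical
psql -U postgres -c "SHOW wal_level;"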