Problem deserializing an Avro-serialized Kafka stream
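I get an exception when I try to implement a state store. I am running Kafka 1.0, Confluent's Schema Registry 4.0, and Avro 1.8.2. I generated the POJO with Avro's Maven plugin and deployed the schema to the Confluent server with Confluent's Maven plugin. I am able to produce a message to the STREAM1 topic. Here is the code that sets up the stream: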
Properties properties = new Properties();
properties.put(StreamsConfig.APPLICATION_ID_CONFIG, "streams-pipe");
properties.put(StreamsConfig.CLIENT_ID_CONFIG, "cleant-id");
properties.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "http://localhost:9092");
properties.put(AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG, "http://localhost:8081");
properties.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
properties.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, SpecificAvroSerde.class);
StreamsBuilder builder = new StreamsBuilder();
Serde<Pojo> pojoSerde = new SpecificAvroSerde<>();
final Map<String, String> serdeConfig = Collections.singletonMap(
AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG, "http://localhost:8081");
pojoSerde.configure(serdeConfig, false);
Consumed<String, Pojo> consumed = Consumed.with(Serdes.String(), pojoSerde);
KStream<String, Pojo> source = builder.stream(TopicName.STREAM1.toString(), consumed);
KTable<String, Long> storePojoCount = source
.groupBy((key, value) -> key)
.count(Materialized.as(StoreName.STORE_WORD_COUNT.toString()));
Produced<String, Long> produced = Produced.with(Serdes.String(), Serdes.Long());
storePojoCount.toStream().to(TopicName.STREAM2.toString(), produced);
KafkaStreams streams = new KafkaStreams(builder.build(), properties);
Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
streams.start();
This produces the following exception:
Exception in thread "cleant-id-StreamThread-2" org.apache.kafka.streams.errors.StreamsException: Deserialization exception handler is set to fail upon a deserialization error. If you would rather have the streaming pipeline continue after a deserialization error, please set the default.deserialization.exception.handler appropriately.
at org.apache.kafka.streams.processor.internals.RecordDeserializer.deserialize(RecordDeserializer.java:74)
at org.apache.kafka.streams.processor.internals.RecordQueue.addRawRecords(RecordQueue.java:91)
at org.apache.kafka.streams.processor.internals.PartitionGroup.addRawRecords(PartitionGroup.java:117)
at org.apache.kafka.streams.processor.internals.StreamTask.addRecords(StreamTask.java:546)
at org.apache.kafka.streams.processor.internals.StreamThread.addRecordsToTasks(StreamThread.java:920)
at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:821)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:774)
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:744)
Caused by: org.apache.kafka.common.errors.SerializationException: Error deserializing Avro message for id -1
Caused by: org.apache.kafka.common.errors.SerializationException: Unknown magic byte!
How do I configure this SpecificAvroSerde so that the stream deserializes successfully?
Answer
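The problem is that the Materialized object does not have the appropriate serdes. Avro is trying to deserialize the KTable values because Avro is the default value deserializer. It cannot do so, because the KTable values are actually Longs.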
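For context on the "Unknown magic byte!" message in the stack trace: Confluent's Avro deserializer expects every payload to start with a magic byte of 0 followed by a 4-byte schema ID, and bytes written by a different serializer generally do not carry that framing. The following is a minimal sketch of such a check using only the JDK. The class name `WireFormatCheck` and the method `readSchemaId` are hypothetical stand-ins for illustration, not Confluent's actual code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class WireFormatCheck {
    // Confluent wire format: 1 magic byte (0), a 4-byte schema ID, then the Avro payload.
    public static final byte MAGIC_BYTE = 0x0;

    /** Hypothetical helper: returns the schema ID, or throws if the framing is missing. */
    public static int readSchemaId(byte[] payload) {
        if (payload == null || payload.length < 5) {
            throw new IllegalArgumentException("Payload too short for magic byte + schema ID");
        }
        ByteBuffer buffer = ByteBuffer.wrap(payload);
        if (buffer.get() != MAGIC_BYTE) {
            throw new IllegalArgumentException("Unknown magic byte!");
        }
        return buffer.getInt(); // ByteBuffer reads big-endian by default
    }

    public static void main(String[] args) {
        // A payload framed the way the Avro deserializer expects (schema ID 7).
        byte[] framed = ByteBuffer.allocate(6).put(MAGIC_BYTE).putInt(7).put((byte) 2).array();
        System.out.println("schema id: " + readSchemaId(framed));

        // A payload written by a plain String serializer has no such framing.
        byte[] plain = "hello".getBytes(StandardCharsets.UTF_8);
        try {
            readSchemaId(plain);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

This is why pointing an Avro serde at data that was not produced by the Confluent Avro serializer tends to fail immediately instead of returning garbage.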
Creating the Materialized object with the correct serdes resolves the problem:
protected <K, V> Materialized<K, V, KeyValueStore<Bytes, byte[]>> persistentStore(StoreName storeName, Serde<K> keyType, Serde<V> valueType) {
KeyValueBytesStoreSupplier storeSupplier = Stores.persistentKeyValueStore(storeName.toString());
return Materialized.<K, V>as(storeSupplier).withKeySerde(keyType).withValueSerde(valueType);
}
Any store supplier can be used here; this is just the one that fit my needs.
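As a rough sketch of how explicit serdes might be wired into the count from the question, the snippet below names the store directly instead of going through a store supplier. It assumes the Kafka Streams 1.0 DSL on the classpath. The class `CountWithExplicitSerdes` and the method `countByKey` are hypothetical names introduced here, not part of the original code.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;

public class CountWithExplicitSerdes {

    /** Hypothetical helper: counts records per key into a named store with String/Long serdes. */
    public static <V> KTable<String, Long> countByKey(KStream<String, V> source, String storeName) {
        // Set the store's serdes explicitly so the default (Avro) value serde
        // is never applied to the Long counts.
        Materialized<String, Long, KeyValueStore<Bytes, byte[]>> materialized =
            Materialized.<String, Long, KeyValueStore<Bytes, byte[]>>as(storeName)
                .withKeySerde(Serdes.String())
                .withValueSerde(Serdes.Long());

        return source
            .groupBy((key, value) -> key) // same grouping as in the question
            .count(materialized);
    }
}
```

With the names from the question, the call would look like `KTable<String, Long> storePojoCount = CountWithExplicitSerdes.countByKey(source, StoreName.STORE_WORD_COUNT.toString());`, and the rest of the topology stays unchanged.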