Apache Flume: cannot commit transaction. Heap space limit reached

Posted: 2014-11-21 17:06:21

Question:

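I am trying to stream some data to HDFS using Flume, with a single agent configured with a netcat source, a memory channel, and an HDFS sink.

The configuration is as follows: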

a1.sources = src1
a1.channels = ch1
a1.sinks = snk1

# SOURCES CONFIGURATION
a1.sources.src1.type = netcat
a1.sources.src1.bind = 0.0.0.0
a1.sources.src1.port = 99999
a1.sources.src1.ack-every-event = false

# SOURCE -> CHANNEL
a1.sources.src1.channels = ch1

# SINKS' CONFIGURATION
a1.sinks.snk1.type = hdfs
a1.sinks.snk1.hdfs.path = /somepath
a1.sinks.snk1.hdfs.writeFormat = Text
a1.sinks.snk1.hdfs.fileType = DataStream
a1.sinks.snk1.hdfs.inUseSuffix = .tmp
a1.sinks.snk1.hdfs.filePrefix = prefix_file
a1.sinks.snk1.hdfs.batchSize = 75000
a1.sinks.snk1.hdfs.rollInterval = 120
a1.sinks.snk1.hdfs.rollCount = 0
a1.sinks.snk1.hdfs.idleTimeout = 0
#128MB for each file maximum = 128 * 1024 (MB) * 1024 (KB) = ...
a1.sinks.snk1.hdfs.rollSize = 134217728

a1.sinks.snk1.hdfs.threadsPoolSize = 25

# SINK <- CHANNEL
a1.sinks.snk1.channel = ch1

# CHANNELS' CONFIGURATION
a1.channels.ch1.type = memory
a1.channels.ch1.capacity = 5000000
a1.channels.ch1.transactionCapacity = 100000
#412MB of byte capacity = 412 * 1024 * 1024 byte
#a1.channels.ch1.byteCapacity = 432013312

However, if I send messages above a certain bandwidth, I get the following exception:

2014-11-21 05:48:07,035 (netcat-handler-0) [WARN - org.apache.flume.source.NetcatSource$NetcatSocketHandler.processEvents(NetcatSource.java:407)] Error processing event. Exception follows.
org.apache.flume.ChannelException: Unable to put event on required channel: org.apache.flume.channel.MemoryChannel{name: ch1}
        at org.apache.flume.channel.ChannelProcessor.processEvent(ChannelProcessor.java:275)
        at org.apache.flume.source.NetcatSource$NetcatSocketHandler.processEvents(NetcatSource.java:394)
        at org.apache.flume.source.NetcatSource$NetcatSocketHandler.run(NetcatSource.java:321)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.flume.ChannelException: Cannot commit transaction. Heap space limit of 3456106reached. Please increase heap space allocated to the channel as the sinks may not be keeping up with the sources
        at org.apache.flume.channel.MemoryChannel$MemoryTransaction.doCommit(MemoryChannel.java:123)
        at org.apache.flume.channel.BasicTransactionSemantics.commit(BasicTransactionSemantics.java:151)
        at org.apache.flume.channel.ChannelProcessor.processEvent(ChannelProcessor.java:267)
        ... 7 more

In my conf/flume-env.sh, I am unable to change the heap space value:

JAVA_OPTS="-Xms256m -Xmx512m -Dcom.sun.management.jmxremote"

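The heap space size in the exception should be expressed in bytes, which means I have about 3.3 MB of heap space. That is very low, and I don't understand where the value comes from. How can I solve this? Thank you very much!

Comments:

Try changing it to JAVA_OPTS="-Xms512m -Xmx1024m -Dcom.sun.management.jmxremote"

Answer 1: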

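You have a few knobs you can turn to get this working:

    1. Increase byteCapacity: a1.channels.ch1.byteCapacity = 6912212.
    2. Increase the memory as suggested in the comment above (JAVA_OPTS="-Xms512m -Xmx1024m -Dcom.sun.management.jmxremote"). This is probably the best option, because the default byteCapacity is 80% of the process's maximum memory, which already consumes a large share of the process memory.
    3. Lower byteCapacityBufferPercentage, thereby reducing the reserved headroom.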
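As for where the figure 3456106 in the exception might come from, here is a minimal Python sketch of the arithmetic. It assumes the behaviour of the Flume 1.x MemoryChannel, which is not stated in this thread: the channel subtracts byteCapacityBufferPercentage (assumed default 20) from byteCapacity and then counts the remainder in slots of 100 bytes, so the reported limit is a slot count rather than a byte count. The function names are hypothetical and exist only for illustration.

```python
# Sketch only: mirrors how MemoryChannel is assumed to derive its limit.
# SLOT_SIZE and DEFAULT_BUFFER_PCT are assumptions, not values from the post.

SLOT_SIZE = 100          # assumed size of one accounting slot, in bytes
DEFAULT_BUFFER_PCT = 20  # assumed default byteCapacityBufferPercentage


def default_byte_capacity(max_heap_bytes):
    """byteCapacity used when the property is unset: 80% of the JVM max memory."""
    return max_heap_bytes * 80 // 100


def effective_slots(byte_capacity, buffer_pct=DEFAULT_BUFFER_PCT):
    """Limit shown in the exception: usable bytes divided into 100-byte slots."""
    usable_bytes = byte_capacity * (100 - buffer_pct) // 100
    return usable_bytes // SLOT_SIZE


if __name__ == "__main__":
    # 412 MB, the value from the (commented-out) byteCapacity line in the config
    print(effective_slots(412 * 1024 * 1024))  # 3456106
```

Under these assumptions, a byteCapacity of 432013312 bytes reproduces the 3456106 from the stack trace exactly, which suggests the limit corresponds to roughly 330 MB of event bodies rather than 3.3 MB. Note that the JVM's reported maximum memory is usually somewhat below the -Xmx value, so the default computed from a heap size is only an estimate.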

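Discussion:

Thanks for your answer. I think byteCapacity is already high, since it is set to 412 MB. I will try increasing the Java memory and let you know what happens.

OK, I increased the Java memory with JAVA_OPTS="-Xms512m -Xmx1024m -Dcom.sun.management.jmxremote", and it definitely works better now. I tried a higher bandwidth than before and it seems to work. Thanks!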
