Netty Memory Management: PooledByteBufAllocator & PoolArena Code Exploration [1]

Posted 2021-04-30 代码之思

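Our current production system is a typical microservice architecture. Its key component, the API gateway xharbor, has been under development since early 2014 and is open source on github. xharbor's network layer is built on netty, and its architecture makes heavy use of rxjava to define reactive interfaces between modules. xharbor has to forward each client request to a specific backend service according to business rules and, once the backend has finished processing, send the response back to the client; the request and response may also need to be rewritten before or after forwarding. Efficient memory use is therefore critical to xharbor's performance, stability, and scalability. When developing xharbor, the first things we focused on were netty's memory management and leak detection.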

netty architecture diagram, from http://netty.io

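In the netty architecture diagram above, the "Zero-Copy-Capable Rich Byte Buffer" is the solid foundation of the core, and memory management is the heart of that technology. The high performance of netty's memory management rests on two key points:

Pooled memory management
Use of off-heap direct memory (Direct Memory)

Advantage of off-heap direct memory: when a Java network program sends content (Socket read/write operations) from off-heap direct memory, it avoids a second copy of the byte buffer. If it instead uses traditional heap memory (Heap Memory, which is really just a byte[]) for Socket reads and writes, the JVM first copies the heap buffer into off-heap direct memory and only then writes it to the Socket. Compared with direct memory, the message therefore incurs one extra buffer copy on its way out.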
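To make the heap/direct distinction concrete, here is a minimal sketch that uses only the JDK's java.nio.ByteBuffer rather than netty's ByteBuf. The class name BufferKinds and the helper kind are made up for illustration; allocate, allocateDirect, isDirect and hasArray are standard JDK calls.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration class, not part of netty or xharbor.
class BufferKinds {

    /** Returns "direct" for off-heap buffers and "heap" for byte[]-backed ones. */
    static String kind(ByteBuffer buf) {
        return buf.isDirect() ? "direct" : "heap";
    }

    public static void main(String[] args) {
        // Heap buffer: the bytes live in an ordinary byte[] on the Java heap.
        ByteBuffer heap = ByteBuffer.allocate(1024);
        // Direct buffer: the bytes live outside the Java heap, where
        // native I/O can use them without an intermediate copy.
        ByteBuffer direct = ByteBuffer.allocateDirect(1024);

        System.out.println(kind(heap) + ", accessible array: " + heap.hasArray());
        System.out.println(kind(direct) + ", accessible array: " + direct.hasArray());
    }
}
```

The amount of direct memory available to allocateDirect is capped by the JVM's direct memory limit, which is why the -XX:MaxDirectMemorySize setting matters later in this post.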

For the performance gain that pooling brings, see the figure below, quoted from Why Netty (by Norman Maurer at Netflix)

https://blog.twitter.com/2013/netty-4-at-twitter-reduced-gc-overhead

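As the legend of the figure above shows, netty picks its memory management strategy from the combination of two dimensions: pooled/unpooled and Heap Memory/Direct Memory. For a netty application, you first have to determine which strategy netty is actually using before you can form expectations about performance in different situations. According to the ByteBufUtil code: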

// Default is "pooled", except on Android where it is "unpooled".
String allocType = SystemPropertyUtil.get(
    "io.netty.allocator.type", PlatformDependent.isAndroid() ? "unpooled" : "pooled");
allocType = allocType.toLowerCase(Locale.US).trim();

ByteBufAllocator alloc;
if ("unpooled".equals(allocType)) {
    alloc = UnpooledByteBufAllocator.DEFAULT;
    logger.debug("-Dio.netty.allocator.type: {}", allocType);
} else if ("pooled".equals(allocType)) {
    alloc = PooledByteBufAllocator.DEFAULT;
    logger.debug("-Dio.netty.allocator.type: {}", allocType);
} else {
    // Unknown values fall back to the pooled allocator.
    alloc = PooledByteBufAllocator.DEFAULT;
    logger.debug("-Dio.netty.allocator.type: pooled (unknown: {})", allocType);
}
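The decision table buried in that excerpt is easier to see as a pure function. The sketch below mirrors only the excerpt's branching using plain JDK calls; AllocatorTypeSketch and resolve are hypothetical names, not netty APIs, and the android flag stands in for PlatformDependent.isAndroid().

```java
import java.util.Locale;

// Hypothetical helper that mirrors the branching of the excerpt above.
class AllocatorTypeSketch {

    /**
     * @param configured value of -Dio.netty.allocator.type, or null if unset
     * @param android    stand-in for PlatformDependent.isAndroid()
     * @return "pooled" or "unpooled"
     */
    static String resolve(String configured, boolean android) {
        String fallback = android ? "unpooled" : "pooled";
        String type = (configured == null ? fallback : configured)
                .toLowerCase(Locale.US).trim();
        // Only an explicit "unpooled" selects the unpooled allocator;
        // "pooled" and any unknown value both end up pooled.
        return "unpooled".equals(type) ? "unpooled" : "pooled";
    }

    public static void main(String[] args) {
        String configured = System.getProperty("io.netty.allocator.type");
        System.out.println("allocator type: " + resolve(configured, false));
    }
}
```

Note that this only tells us pooled versus unpooled allocator; as the rest of this post shows, a pooled allocator can still hand out unpooled direct buffers.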

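So the pooled strategy is selected by setting the JVM parameter -Dio.netty.allocator.type=pooled when the netty application starts (per the code above, pooled is also the default on non-Android platforms). And according to this code fragment in PlatformDependent: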

private static final boolean DIRECT_BUFFER_PREFERRED =
    HAS_UNSAFE && !SystemPropertyUtil.getBoolean("io.netty.noPreferDirect", false);

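As long as -Dio.netty.noPreferDirect=true is not set and the application runs on a standard Oracle JVM (where sun.misc.Unsafe exists), Direct Memory is preferred. A further precondition, of course, is that a certain amount of Direct Memory has been allocated. In the spirit of living frugally, xharbor initially set the Direct Memory size to 64M with -XX:MaxDirectMemorySize=64M, so the netty-related JVM startup parameters were: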

-XX:MaxDirectMemorySize=64M -Dio.netty.allocator.type=pooled

We ran xharbor and used specific log output to observe the ByteBuf instances used for Socket reads and writes, as shown in the screenshot below:

xharbor runtime log screenshot

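WTF! You don't know until you look, and what we saw was a shock: why were the ByteBuf instances of the Unpooled type? We rechecked the startup parameters several times and confirmed they were correct. Fine: talk is cheap, show me the code; the code is the final arbiter. Here is PooledByteBufAllocator.newDirectBuffer, excerpted: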

protected ByteBuf newDirectBuffer(int initialCapacity, int maxCapacity) {
    PoolThreadCache cache = threadCache.get();
    PoolArena<ByteBuffer> directArena = cache.directArena;

    ByteBuf buf;
    if (directArena != null) {
        // Pooled path: allocate from this thread's direct arena.
        buf = directArena.allocate(cache, initialCapacity, maxCapacity);
    } else {
        // No direct arena available: fall back to an unpooled direct buffer.
        if (PlatformDependent.hasUnsafe()) {
            buf = UnsafeByteBufUtil.newUnsafeDirectByteBuf(this, initialCapacity, maxCapacity);
        } else {
            buf = new UnpooledDirectByteBuf(this, initialCapacity, maxCapacity);
        }
    }
    return toLeakAwareBuffer(buf);
}

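In the logic above, when directArena is null, an Unpooled ByteBuf is created directly. Could a null directArena be the cause? netty logs its configuration in detail at startup, and digging through that output did turn something up: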

netty startup log output

The number of DirectMemory Arenas is 0, so no wonder directArena is null. We kept looking through the PooledByteBufAllocator code for the reason and found this fragment:

final int defaultMinNumArena = runtime.availableProcessors() * 2;
final int defaultChunkSize = DEFAULT_PAGE_SIZE << DEFAULT_MAX_ORDER;
......
DEFAULT_NUM_DIRECT_ARENA = Math.max(0,
    SystemPropertyUtil.getInt(
        "io.netty.allocator.numDirectArenas",
        (int) Math.min(
            defaultMinNumArena,
            PlatformDependent.maxDirectMemory() / defaultChunkSize / 2 / 3)));

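Unless specifically overridden, the constants DEFAULT_PAGE_SIZE and DEFAULT_MAX_ORDER default to 8192 and 11 respectively, so defaultChunkSize defaults to 8192 << 11 = 16M. According to the code above, PlatformDependent.maxDirectMemory() has to be at least 16M * 2 * 3 = 96M for DEFAULT_NUM_DIRECT_ARENA to reach 1; with 64M, the integer division 64M / 16M / 2 / 3 yields 0. We therefore changed xharbor's JVM startup parameters to: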

-XX:MaxDirectMemorySize=96M -Dio.netty.allocator.type=pooled
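The arithmetic behind the 96M threshold can be checked with a small stand-alone sketch. It reproduces only the default-value formula from the PooledByteBufAllocator fragment, assuming the default page size (8192) and max order (11) and ignoring any io.netty.allocator.numDirectArenas override. ArenaSizingSketch and defaultNumDirectArenas are hypothetical names, and the two parameters stand in for PlatformDependent.maxDirectMemory() and runtime.availableProcessors().

```java
// Hypothetical stand-alone sketch of the default direct-arena count.
class ArenaSizingSketch {
    static final int PAGE_SIZE = 8192; // assumed DEFAULT_PAGE_SIZE
    static final int MAX_ORDER = 11;   // assumed DEFAULT_MAX_ORDER

    /**
     * @param maxDirectMemory direct memory limit in bytes
     * @param processors      number of available processors
     */
    static int defaultNumDirectArenas(long maxDirectMemory, int processors) {
        int minNumArena = processors * 2;
        int chunkSize = PAGE_SIZE << MAX_ORDER; // 16 MiB per chunk
        // Integer division: anything below 96 MiB truncates to zero arenas.
        return Math.max(0,
                (int) Math.min(minNumArena, maxDirectMemory / chunkSize / 2 / 3));
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("64M -> " + defaultNumDirectArenas(64 * mib, cores)); // 64M -> 0
        System.out.println("96M -> " + defaultNumDirectArenas(96 * mib, cores)); // 96M -> 1
    }
}
```

With larger limits the count grows by one arena per additional 96M until it is capped at twice the number of processors.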

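netty startup log output (2)

The netty initialization info in xharbor's startup log shows that there is finally one DirectMemory Arena. Observing the ByteBuf instances used for Socket reads and writes through xharbor's log output once more, this time they are finally Pooled DirectByteBuf instances.

xharbor runtime log screenshot (2)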

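With the code exploration above, xharbor is off to a decent start: by setting a suitable DirectMemory size (>=96M) and memory management strategy (io.netty.allocator.type=pooled), we got xharbor onto pooled off-heap direct memory. But in a highly concurrent, heavily loaded system, a memory leak often means a fatal problem such as a system crash. For netty's ByteBuf in particular, lifecycle management by reference counting makes such problems harder to track down. So:

How can we check promptly and reliably whether xharbor has a leak?
Does netty offer any convenient facilities for this?

Let's leave those questions for the next post in this series!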
