HBase: There is insufficient memory

Posted by Hbase工作笔记 (HBase Work Notes)


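1. Symptoms


A small offline (non-production) HBase cluster with 5 HRegionServer nodes, all on low-memory machines, went down completely yesterday. Restarting it did not work: the RegionServers would not start.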

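2. Cause


The error logs show that the trigger was a GC pause that ran too long.
First, one node went into a long GC pause. Its connection to ZooKeeper timed out and was closed, and the node went down.
Once it was down, its regions were moved to other nodes, as the log shows: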

Adding moved region record: 41810a8da6ae1fa59a732d24a521d55c to yq-hadoop***,60020,1520389720296 as of 405597
Adding moved region record: 9cfe6192b4c7074699177ff2a5b72e70 to yq-hadoop***,60020,1520389720296 as of 1605


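After the regions moved, the load on the remaining nodes rose, their GC pauses got worse, and they went down one after another.

HBase is memory-hungry, and running it on small machines is its weak spot. Apart from somewhat high concurrency, the main cause here was that the test cluster machines have too little memory (only 7 GB each). With other applications sharing the same machines, the HBase heap could not be given much memory, which led to heavy GC.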

3. Restart failure


The restart failed with the following error:


There is insufficient memory for the Java Runtime Environment to continue. Native memory allocation (malloc) failed to allocate 715784192 bytes for committing reserved memory.
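To get a feel for how much memory the JVM was asking for, the byte count in the message can be converted to MiB. The following is a minimal Python sketch; the function name failed_allocation_mib and the regular expression are illustrative assumptions, not part of HBase or the JVM.

```python
import re
from typing import Optional

# Matches the byte count in the JVM message quoted above.
MALLOC_FAILED = re.compile(r"failed to allocate (\d+) bytes")


def failed_allocation_mib(message: str) -> Optional[float]:
    """Return the size of the failed allocation in MiB, or None if not found."""
    match = MALLOC_FAILED.search(message)
    if match is None:
        return None
    return int(match.group(1)) / (1024 * 1024)


MESSAGE = (
    "There is insufficient memory for the Java Runtime Environment to continue. "
    "Native memory allocation (malloc) failed to allocate 715784192 bytes "
    "for committing reserved memory."
)

print(f"{failed_allocation_mib(MESSAGE):.1f} MiB")  # prints "682.6 MiB"
```

So the JVM failed while trying to commit roughly 683 MiB in a single request, which is a lot to ask of a 7 GB machine that is already shared with other applications.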

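This error usually has one of three causes:

1. The machine really does not have enough memory to satisfy the allocation
2. Some process has started too many threads, possibly because of a bug in its code
3. The ulimit -n value is too small


In this case it was the first cause.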
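Causes 1 and 3 can be checked quickly on a Linux node. The sketch below is one possible way to do it in Python, under stated assumptions: the helper names (parse_meminfo_kib, heap_fits, open_file_limit) are hypothetical, and the sample meminfo text is made up for illustration rather than taken from the affected cluster. On a real node the text would come from reading /proc/meminfo.

```python
import resource  # Unix-only standard library module


def parse_meminfo_kib(text):
    """Parse /proc/meminfo-style text into a {field name: KiB} dict."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def heap_fits(meminfo_text, heap_bytes):
    """Return True if heap_bytes fits in the memory currently available."""
    info = parse_meminfo_kib(meminfo_text)
    # MemAvailable is missing on old kernels; fall back to MemFree.
    available_kib = info.get("MemAvailable", info.get("MemFree", 0))
    return available_kib * 1024 >= heap_bytes


def open_file_limit():
    """Return the soft open-file limit, the value `ulimit -n` reports."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


# Hypothetical snapshot of a nearly full 7 GB node (values are made up).
SAMPLE_MEMINFO = """MemTotal:        7340032 kB
MemFree:          204800 kB
MemAvailable:     512000 kB
"""

# 715784192 is the allocation size from the error message above.
print(heap_fits(SAMPLE_MEMINFO, 715784192))  # False: only ~500 MiB available
print(open_file_limit())
```

If heap_fits returns False for the allocation size in the error, the node is in the situation described as cause 1, and either memory has to be freed or the heap has to shrink.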

Solution:

    a. Stop unneeded applications on the node to free up memory.

    b. Reduce the HBase RegionServer heap size.

[Screenshot of the heap setting in the original post; image not available]


    c. Restart.


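4. Warning signs


If messages like the following (delayed flushes) appear in the log, the memory allocation is usually too small and should be dealt with early.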

2018-03-07 10:20:38,091 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: yq-hadoop****,60020,1520385422866-MemstoreFlusherChore requesting flush of TraceV2,6\x00\x00\x00\x00\x00\x00\x00,1499652593538.3e9884dcb01d4a98408919fbc3433f3f. because S has an old edit so flush to free WALs after random delay 33552ms
2018-03-07 10:20:38,091 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: yq-hadoop****,60020,1520385422866-MemstoreFlusherChore requesting flush of TraceV2,\xB4\x00\x00\x00\x00\x00\x00\x00,1499652593538.f7538a5b9b88fa60857a9f38f272a436. because S has an old edit so flush to free WALs after random delay 242347ms
2018-03-07 10:20:38,091 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: yq-hadoop****,60020,1520385422866-MemstoreFlusherChore requesting flush of TraceV2,{\x00\x00\x00\x00\x00\x00\x00,1499652593538.fb9d550b69cc3409d5256718c47d871d. because S has an old edit so flush to free WALs after random delay 118059ms
2018-03-07 10:20:38,091 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: yq-hadoop****,60020,1520385422866-MemstoreFlusherChore requesting flush of TraceV2,;\x00\x00\x00\x00\x00\x00\x00,1499652593538.3c226c0465be2a64d7c0fa9e25e41a7e. because S has an old edit so flush to free WALs after random delay 135722ms
2018-03-07 10:20:38,092 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: yq-hadoop****,60020,1520385422866-MemstoreFlusherChore requesting flush of TraceV2,\xA8\x00\x00\x00\x00\x00\x00\x00,1499652593538.90c4ebec39bb67680dbe4f17c662d229. because S has an old edit so flush to free WALs after random delay 287717ms
2018-03-07 10:20:38,092 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: yq-hadoop****,60020,1520385422866-MemstoreFlusherChore requesting flush of TraceV2,<\x00\x00\x00\x00\x00\x00\x00,1499652593538.2bd2b43e949edd374878fe6b31f79c15. because S has an old edit so flush to free WALs after random delay 280345ms
2018-03-07 10:20:38,092 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: yq-hadoop****,60020,1520385422866-MemstoreFlusherChore requesting flush of TraceV2,\x88\x00\x00\x00\x00\x00\x00\x00,1499652593538.e00b1e90a58c0f8d78fe4727d87c8949. because S has an old edit so flush to free WALs after random delay 150252ms
2018-03-07 10:20:38,092 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: yq-hadoop****,60020,1520385422866-MemstoreFlusherChore requesting flush of TraceV2,\xF2\x00\x00\x00\x00\x00\x00\x00,1499652593538.d834224abc9d6b88b93911ea1fa17a0e. because S has an old edit so flush to free WALs after random delay 230071ms
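To watch for these messages without reading the log by eye, a small script can count them and pull out the delays. This is a rough Python sketch, not an HBase tool: the function name flush_delays_ms and the regular expression are assumptions based only on the log lines above, and the sample lines are abridged from them (the region start key is replaced by a placeholder).

```python
import re

# Matches the MemstoreFlusherChore messages shown above and captures the delay.
FLUSH_REQUEST = re.compile(
    r"MemstoreFlusherChore requesting flush of .* after random delay (\d+)ms"
)


def flush_delays_ms(lines):
    """Return the random flush delays (in ms) found in the given log lines."""
    delays = []
    for line in lines:
        match = FLUSH_REQUEST.search(line)
        if match:
            delays.append(int(match.group(1)))
    return delays


# Two of the log lines above, abridged: "<start key>" stands in for the
# binary region start key.
SAMPLE_LINES = [
    "2018-03-07 10:20:38,091 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: "
    "yq-hadoop****,60020,1520385422866-MemstoreFlusherChore requesting flush of "
    "TraceV2,<start key>,1499652593538.3e9884dcb01d4a98408919fbc3433f3f. "
    "because S has an old edit so flush to free WALs after random delay 33552ms",
    "2018-03-07 10:20:38,091 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: "
    "yq-hadoop****,60020,1520385422866-MemstoreFlusherChore requesting flush of "
    "TraceV2,<start key>,1499652593538.f7538a5b9b88fa60857a9f38f272a436. "
    "because S has an old edit so flush to free WALs after random delay 242347ms",
]

delays = flush_delays_ms(SAMPLE_LINES)
print(len(delays), max(delays))  # prints "2 242347"
```

On a real node the lines would come from the RegionServer log file. A count that keeps climbing is the cue to review the memory settings before the node reaches the state described in section 2.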



















