1. Quick Start

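DataNode (DN) disk usage across the current storage cluster is badly unbalanced: the fullest DN is close to 100% used, while the emptiest is below 35%.

To even out disk usage, we started the "Rebalance" action in CDH.

The script it actually invokes is: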

hdfs/hdfs.sh ["balancer","-threshold","10.0","-policy","DataNode"]

Checking the current progress shows:

Successfully moved blk_1255414776_181709174 with size=134217728 from 172.16.16.66:50010:DISK to 172.16.16.39:50010:DISK through 172.16.16.219:50010

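The rebalance did not give priority to the DNs with the highest disk usage.

2. Command-Line Optimization

The usage of the hdfs balancer command is as follows: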

hdfs balancer -help

Usage: java Balancer

[-policy <policy>] the balancing policy: datanode or blockpool

[-threshold <threshold>] Percentage of disk capacity

[-exclude [-f <hosts-file> | comma-sperated list of hosts]] Excludes the specified datanodes.

[-include [-f <hosts-file> | comma-sperated list of hosts]] Includes only the specified datanodes.

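To run the balancer more efficiently, we suggest the following:

-threshold 30: a larger value lets the balancer finish sooner and makes it deal first with the DNs whose usage is highest

Meaning: the target used to decide whether the cluster is balanced. The difference between each DataNode's storage utilization and the cluster's overall storage utilization should be below this threshold. In theory, the smaller the value, the more balanced the cluster ends up. In production, however, data is still being written and deleted while the balancer runs, so the configured target may never be reached.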
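The threshold rule described above can be sketched in a few lines of Python. This is only an illustration of the comparison, not the Balancer's own code: the node names, the used/capacity figures, and the function name classify_datanodes are all made up for the example.

```python
def classify_datanodes(nodes, threshold):
    """Group DataNodes by distance from the cluster-wide utilization.

    nodes: dict mapping a (hypothetical) node name to (used, capacity)
    threshold: percentage points, as passed to -threshold
    """
    total_used = sum(used for used, _ in nodes.values())
    total_capacity = sum(cap for _, cap in nodes.values())
    cluster_pct = 100.0 * total_used / total_capacity

    groups = {"over": [], "within": [], "under": []}
    for name, (used, cap) in sorted(nodes.items()):
        node_pct = 100.0 * used / cap
        if node_pct > cluster_pct + threshold:
            groups["over"].append(name)      # must give data away
        elif node_pct < cluster_pct - threshold:
            groups["under"].append(name)     # must receive data
        else:
            groups["within"].append(name)    # already counts as balanced
    return cluster_pct, groups


# Hypothetical cluster: four DNs with 100 units of capacity each.
nodes = {"dn1": (99, 100), "dn2": (80, 100), "dn3": (60, 100), "dn4": (34, 100)}

for threshold in (10, 30):
    cluster_pct, groups = classify_datanodes(nodes, threshold)
    print(threshold, cluster_pct, groups)
```

With these made-up numbers, a threshold of 10 marks both dn1 and dn2 as over-utilized, while a threshold of 30 marks only dn1. That is why a larger threshold means less data to move and concentrates the work on the fullest DNs.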

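-include: balance only the DNs in the given list

dfs.balance.bandwidthPerSec 300MB (the setting on our compute cluster; newer Hadoop releases name this property dfs.datanode.balance.bandwidthPerSec)

Meaning: limits the bandwidth that balancing traffic may consume on each DataNode while the balancer runs. Setting it too high can slow down MapReduce jobs.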
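To get a feel for what a bandwidth limit means in practice, the following Python sketch converts the 300MB figure to bytes per second and estimates a lower bound on transfer time. It assumes that "300MB" means 300 × 1024 × 1024 bytes and that a single DN is the bottleneck; both helper names are hypothetical, and real runs will be slower because of concurrent load and scheduling overhead.

```python
def bandwidth_bytes(megabytes_per_sec):
    """Convert MB/s (assumed to be binary megabytes) to bytes per second."""
    return megabytes_per_sec * 1024 * 1024


def min_transfer_seconds(bytes_to_move, bytes_per_sec):
    """Best-case time to push bytes_to_move through one bandwidth-limited DN."""
    return bytes_to_move / bytes_per_sec


limit = bandwidth_bytes(300)      # value in bytes per second
block = 134217728                 # block size taken from the balancer log line
one_tib = 1024 ** 4               # hypothetical amount to drain from one DN

print(limit)
print(min_transfer_seconds(block, limit))           # seconds per block
print(min_transfer_seconds(one_tib, limit) / 60)    # minutes per TiB
```

Under these assumptions one 128 MiB block takes well under a second, and draining 1 TiB from a single DN takes roughly an hour at best.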

The command to run is:

hdfs balancer -policy datanode -threshold 30 -include -f /tmp/hdfs-blancer.txt
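A small Python sketch of preparing that invocation is shown here. It assumes the hosts file holds one host per line, and the two addresses are simply borrowed from the log line earlier in this post for illustration; the helper names write_include_file and balancer_command are hypothetical. Note that the include list presumably needs both the full DNs and some emptier DNs, since only listed nodes take part in the balancing.

```python
import os
import tempfile


def write_include_file(path, hosts):
    """Write the DN list for -include -f, one host per line (assumed format)."""
    with open(path, "w") as fh:
        for host in hosts:
            fh.write(host + "\n")


def balancer_command(threshold, include_file, policy="datanode"):
    """Build the argument list for the hdfs balancer invocation shown above."""
    return ["hdfs", "balancer", "-policy", policy,
            "-threshold", str(threshold), "-include", "-f", include_file]


# Illustrative only: a temporary directory stands in for /tmp.
include_path = os.path.join(tempfile.mkdtemp(), "hdfs-blancer.txt")
write_include_file(include_path, ["172.16.16.66", "172.16.16.39"])
print(" ".join(balancer_command(30, include_path)))
```

The argument list can be handed to subprocess.run on a host where the hdfs client is installed; the sketch only prints it so that it can be reviewed first.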

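In CDH, the balancer runs as a Balancer role instance, which can be tuned as follows.

Optimization 1: The higher the Balancer threshold, the less data has to be moved, but DN usage ends up less even; the lower the threshold, the more data has to be moved, and the more even DN usage becomes.

Optimization 2: Increase the Balancer's Java heap size.

Optimization 3: Advanced configuration, via the hdfs-site.xml Advanced Configuration Snippet (Safety Valve)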

# Must be set on both the DataNode and the Balancer instances

<property>

<name>dfs.datanode.balance.max.concurrent.moves</name>

</property>

# Set on the Balancer instance

<property>

<name>dfs.balancer.moverThreads</name>

</property>

<property>

<name>dfs.balancer.dispatcherThreads</name>

</property>

<property>

<name>dfs.balancer.max-size-to-move</name>

</property>