Recommended size for yarn.nodemanager.local-dirs?
Posted by felixzh
Folks,
What is the recommended size for the directories listed in "yarn.nodemanager.local-dirs"?
We only have one value (directory) configured for the above property, which has a size of 200GB.
Our Hive jobs' map/reduce tasks fill this folder up, and YARN places this node in the blocklist. Moving to the Tez engine and/or increasing the quota size may fix this, but we'd like to know the recommended value.
If you use the same partitions for YARN intermediate data as for the HDFS blocks, then you might also consider setting the dfs.datanode.du.reserved property, which reserves some space on each of those partitions for non-HDFS use (such as intermediate YARN data).
One baseline recommendation I saw in my first Hadoop training a long time ago was to dedicate 25% of the "data disks" to that kind of intermediate data. I guess the optimal answer should consider the maximum amount of intermediate data you can have at the same time (when launching a job, do you use all the data in HDFS as input?) and dedicate the space for yarn.nodemanager.local-dirs accordingly.
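To make the 25% rule of thumb concrete, here is a rough sizing sketch. The function name, the disk sizes, and the default fraction are illustrative assumptions, not values from the thread; the only fact it relies on is that dfs.datanode.du.reserved is expressed in bytes per volume.

```python
GIB = 1024 ** 3  # bytes in one GiB


def plan_intermediate_space(disk_sizes_gib, fraction=0.25):
    """Estimate space to leave for YARN intermediate data on each data disk.

    disk_sizes_gib: capacity of each data disk in GiB (hypothetical input).
    fraction: share of each disk kept out of HDFS (0.25 is the rule of thumb).
    Returns (per-disk reservation in bytes, total reservation in GiB).
    """
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    per_disk_bytes = [int(size * GIB * fraction) for size in disk_sizes_gib]
    total_gib = sum(per_disk_bytes) / GIB
    return per_disk_bytes, total_gib


# Hypothetical node: 12 data disks of 4000 GiB each.
per_disk, total = plan_intermediate_space([4000] * 12)
print("candidate dfs.datanode.du.reserved (bytes per volume):", per_disk[0])
print("total space left for intermediate data (GiB):", total)
```

The per-disk figure is only a starting point; the right fraction depends on how much shuffle and spill data your largest concurrent jobs produce.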
I would also recommend turning on the property mapreduce.map.output.compress in order to reduce the size of the intermediate data.
You would assign one folder to each of the DataNode disks, closely mirroring dfs.datanode.data.dir. On a 12-disk system you would have 12 YARN local-dir locations.
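The one-folder-per-disk layout can be sketched as below. The mount points (/grid/0 to /grid/11) and the subdirectory names are assumptions chosen for illustration; as far as I know, the NodeManager property is spelled yarn.nodemanager.local-dirs in yarn-site.xml, and both properties take a comma-separated list of paths.

```python
def build_dir_properties(mount_points,
                         hdfs_subdir="hadoop/hdfs/data",
                         yarn_subdir="hadoop/yarn/local"):
    """Build comma-separated directory lists, one entry per data disk.

    mount_points: mount point of each data disk (hypothetical paths).
    hdfs_subdir / yarn_subdir: illustrative subdirectory names, not defaults.
    """
    roots = [m.rstrip("/") for m in mount_points]
    data_dirs = [f"{root}/{hdfs_subdir}" for root in roots]
    local_dirs = [f"{root}/{yarn_subdir}" for root in roots]
    return {
        "dfs.datanode.data.dir": ",".join(data_dirs),
        "yarn.nodemanager.local-dirs": ",".join(local_dirs),
    }


# Hypothetical 12-disk node mounted at /grid/0 ... /grid/11.
mounts = [f"/grid/{i}" for i in range(12)]
props = build_dir_properties(mounts)
for name, value in props.items():
    print(name, "=", value)
```

Spreading the local dirs across all disks gives intermediate data the combined capacity and I/O of every spindle instead of a single 200GB directory.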