Common YARN (including MR2) configuration
Posted by 小基基o_O
YARN architecture diagram
How YARN works
Scheduler
Resource scheduler class
yarn.resourcemanager.scheduler.class
- Original:
  - The class to use as the resource scheduler.
- Notes:
  - The Capacity Scheduler is org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
  - The Fair Scheduler is org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
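Maximum priority
yarn.cluster.max-application-priority
- Original:
  - Defines maximum application priority in a cluster.
  - If an application is submitted with a priority higher than this value, it will be reset to this maximum value.
- Notes:
  - Sets the highest priority an application in the cluster can have; a higher submitted priority is reset to this value.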
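Number of threads handling scheduler requests
yarn.resourcemanager.scheduler.client.thread-count
- Original:
  - Number of threads to handle scheduler interface.
- Notes:
  - The number of threads that serve requests to the scheduler interface.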
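As a rough illustration of how the three scheduler properties above would appear in yarn-site.xml, the sketch below renders them as Hadoop-style property entries. The helper build_configuration and all three values are assumptions made for this example, not recommendations.

```python
import xml.etree.ElementTree as ET


def build_configuration(properties):
    """Build a Hadoop-style <configuration> element from a name -> value mapping."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return root


# Illustrative values only (assumptions); tune them for the actual cluster.
scheduler_settings = {
    "yarn.resourcemanager.scheduler.class":
        "org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler",
    "yarn.cluster.max-application-priority": 5,
    "yarn.resourcemanager.scheduler.client.thread-count": 50,
}

xml_text = ET.tostring(build_configuration(scheduler_settings), encoding="unicode")
print(xml_text)
```

The output is a single unindented line; it is meant to show the name/value structure rather than to be pasted into a configuration file as-is.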
NodeManager
Physical memory a single NM can allocate
yarn.nodemanager.resource.memory-mb
- Original:
  - Amount of physical memory, in MB, that can be allocated for containers.
  - If set to -1 and yarn.nodemanager.resource.detect-hardware-capabilities is true, it is automatically calculated (in case of Windows and Linux).
  - In other cases, the default is 8192MB.
- Notes:
  - The amount of physical memory, in MB, that this node's NodeManager can hand out to containers.
  - If set to -1 while yarn.nodemanager.resource.detect-hardware-capabilities is true, the value is calculated automatically.
  - In all other cases the default is 8192 MB.
  - For example, with 10 NMs each configured with 50 GB of memory, the total is 500 GB.
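The 10-node example can be checked with a few lines of arithmetic. This is only a sketch: cluster_memory_mb is a hypothetical helper, and it assumes the cluster total is simply the sum of each NodeManager's yarn.nodemanager.resource.memory-mb value.

```python
def cluster_memory_mb(per_node_memory_mb):
    """Sum the container memory (in MB) offered by every NodeManager."""
    return sum(per_node_memory_mb)


# 10 NodeManagers, each configured with 50 GB (expressed in MB).
nodes = [50 * 1024] * 10
total_mb = cluster_memory_mb(nodes)
print(total_mb // 1024, "GB")
```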
Virtual cores a single NM can allocate
yarn.nodemanager.resource.cpu-vcores
- Original:
  - Number of vcores that can be allocated for containers.
  - This is used by the RM scheduler when allocating resources for containers.
  - This is not used to limit the number of CPUs used by YARN containers.
  - If it is set to -1 and yarn.nodemanager.resource.detect-hardware-capabilities is true, it is automatically determined from the hardware in case of Windows and Linux.
  - In other cases, number of vcores is 8 by default.
- Notes:
  - The number of virtual cores that this node's NodeManager can hand out to containers.
  - If set to -1 while yarn.nodemanager.resource.detect-hardware-capabilities is true, the number of vcores is determined automatically from the hardware.
  - In all other cases the default is 8.
Physical memory reserved on a node for non-YARN processes
yarn.nodemanager.resource.system-reserved-memory-mb
- Original:
  - Amount of physical memory, in MB, that is reserved for non-YARN processes.
  - This configuration is only used if yarn.nodemanager.resource.detect-hardware-capabilities is set to true and yarn.nodemanager.resource.memory-mb is -1.
  - If set to -1, this amount is calculated as 20% of (system memory - 2*HADOOP_HEAPSIZE)
- Notes:
  - The total physical memory, in MB, reserved for non-YARN processes.
  - It takes effect only when yarn.nodemanager.resource.detect-hardware-capabilities is true and yarn.nodemanager.resource.memory-mb is -1.
  - If set to -1: $\text{reserved memory} = (\text{system memory} - 2 \times \text{HADOOP\_HEAPSIZE}) \times 20\%$
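The reserved-memory formula can be worked through with concrete numbers. The sketch below is not YARN's implementation: system_reserved_memory_mb is a hypothetical helper, the 64 GB machine and 1024 MB HADOOP_HEAPSIZE are made-up inputs, and rounding down to a whole MB is an assumption.

```python
def system_reserved_memory_mb(system_memory_mb, hadoop_heapsize_mb):
    """Return 20% of (system memory - 2 * HADOOP_HEAPSIZE), in whole MB.

    Integer arithmetic is used so the result is rounded down (an assumption).
    """
    return (system_memory_mb - 2 * hadoop_heapsize_mb) * 20 // 100


# Hypothetical node: 64 GB of RAM, HADOOP_HEAPSIZE of 1024 MB.
reserved_mb = system_reserved_memory_mb(64 * 1024, 1024)
print(reserved_mb)  # (65536 - 2048) * 20% = 12697.6, rounded down to 12697
```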
Minimum memory per container
yarn.scheduler.minimum-allocation-mb
- Original:
  - The minimum allocation for every container request at the RM in MBs.
  - Memory requests lower than this will be set to the value of this property.
  - Additionally, a node manager that is configured to have less memory than this value will be shut down by the resource manager.
- Notes:
  - The smallest amount of memory, in MB, that the RM allocates for a container request.
  - A memory request below this value is raised to this value.
  - An NM configured with less memory than this value is shut down by the RM.
Maximum memory per container
yarn.scheduler.maximum-allocation-mb
- Original:
  - The maximum allocation for every container request at the RM in MBs.
  - Memory requests higher than this will throw an InvalidResourceRequestException.
- Notes:
  - The largest amount of memory, in MB, that the RM allocates for a container request.
  - A memory request above this value throws an InvalidResourceRequestException.
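The minimum and maximum memory rules from the two sections above can be modeled in a few lines. This sketch covers only those two quoted rules and is not the ResourceManager's actual code: normalize_memory_request and InvalidResourceRequestError are hypothetical names, and the 1024 MB / 8192 MB bounds are illustrative values.

```python
class InvalidResourceRequestError(Exception):
    """Hypothetical stand-in for YARN's InvalidResourceRequestException."""


def normalize_memory_request(requested_mb, minimum_mb=1024, maximum_mb=8192):
    """Apply the minimum/maximum allocation rules to one container request."""
    if requested_mb > maximum_mb:
        # Requests above the maximum are rejected outright.
        raise InvalidResourceRequestError(
            f"requested {requested_mb} MB exceeds maximum {maximum_mb} MB"
        )
    # Requests below the minimum are raised to the minimum.
    return max(requested_mb, minimum_mb)


print(normalize_memory_request(512))   # raised to the minimum
print(normalize_memory_request(2048))  # within bounds, unchanged
```

The same shape applies to the vcore limits (yarn.scheduler.minimum-allocation-vcores and yarn.scheduler.maximum-allocation-vcores), with cores in place of MB.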
Minimum virtual cores per container
yarn.scheduler.minimum-allocation-vcores
- Original:
  - The minimum allocation for every container request at the RM in terms of virtual CPU cores.
  - Requests lower than this will be set to the value of this property.
  - Additionally, a node manager that is configured to have fewer virtual cores than this value will be shut down by the resource manager.
- Notes:
  - The smallest number of virtual cores that the RM allocates for a container request.
  - A request below this value is raised to this value.
  - An NM configured with fewer virtual cores than this value is shut down by the RM.
Maximum virtual cores per container
yarn.scheduler.maximum-allocation-vcores
- Original:
  - The maximum allocation for every container request at the RM in terms of virtual CPU cores.
  - Requests higher than this will throw an InvalidResourceRequestException.
- Notes:
  - The largest number of virtual cores that the RM allocates for a container request.
  - A request above this value throws an InvalidResourceRequestException.
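Automatic detection of node resources
yarn.nodemanager.resource.detect-hardware-capabilities
- Original:
  - Enable auto-detection of node capabilities such as memory and CPU.
- Notes:
  - Whether to detect node resources (memory, CPU, and so on) automatically.
  - false disables detection, true enables it.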
Multiplier for converting physical cores to virtual cores
yarn.nodemanager.resource.pcores-vcores-multiplier
- Original:
  - Multiplier to determine how to convert physical cores to vcores.
  - This value is used if yarn.nodemanager.resource.cpu-vcores is set to -1 (which implies auto-calculate vcores) and yarn.nodemanager.resource.detect-hardware-capabilities is set to true.
  - The number of vcores will be calculated as number of CPUs * multiplier.
- Notes:
  - The multiplier used to convert physical cores to virtual cores.
  - It takes effect when yarn.nodemanager.resource.cpu-vcores is -1 and yarn.nodemanager.resource.detect-hardware-capabilities is true.
  - $\text{total vcores} = \text{total physical cores} \times \text{multiplier}$
  - For example, on a CPU with 4 cores and 8 threads, set this parameter to 2.
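The vcore formula and the 4-core, 8-thread example can be checked numerically. In this sketch auto_vcores is a hypothetical helper, and truncating a fractional result to an integer is an assumption rather than confirmed YARN behavior.

```python
def auto_vcores(physical_cores, multiplier):
    """Number of vcores = number of physical cores * multiplier."""
    # Truncation of fractional results is an assumption made for this sketch.
    return int(physical_cores * multiplier)


# A 4-core, 8-thread CPU with the multiplier set to 2 exposes 8 vcores.
vcores = auto_vcores(physical_cores=4, multiplier=2)
print(vcores)
```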
MapReduce
Virtual cores per map task
mapreduce.map.cpu.vcores
- Original:
  - The number of virtual cores to request from the scheduler for each map task.
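Memory per map task
mapreduce.map.memory.mb
- Original:
  - The amount of memory to request from the scheduler for each map task.
  - If this is not specified or is non-positive, it is inferred from mapreduce.map.java.opts and mapreduce.job.heap.memory-mb.ratio.
  - If java-opts are also not specified, we set it to 1024.
- Notes:
  - The amount of memory, in MB, requested from the scheduler for each map task.
  - If it is not specified, it is inferred from mapreduce.map.java.opts and mapreduce.job.heap.memory-mb.ratio.
  - If mapreduce.map.java.opts is not specified either, it is 1024 MB.
  - The memory of a single map task should not exceed the maximum memory per container.
  - The memory of a single map task should not exceed the physical memory a single NM can allocate.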
Virtual cores per reduce task
mapreduce.reduce.cpu.vcores
- Original:
  - The number of virtual cores to request from the scheduler for each reduce task.
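Memory per reduce task
mapreduce.reduce.memory.mb
- Original:
  - The amount of memory to request from the scheduler for each reduce task.
  - If this is not specified or is non-positive, it is inferred from mapreduce.reduce.java.opts and mapreduce.job.heap.memory-mb.ratio.
  - If java-opts are also not specified, we set it to 1024.
- Notes:
  - The amount of memory, in MB, requested from the scheduler for each reduce task.
  - If it is not specified, it is inferred from mapreduce.reduce.java.opts and mapreduce.job.heap.memory-mb.ratio.
  - If mapreduce.reduce.java.opts is not specified either, it is 1024 MB.
  - The memory of a single reduce task should not exceed the maximum memory per container.
  - The memory of a single reduce task should not exceed the physical memory a single NM can allocate.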
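The two sizing constraints listed for map and reduce tasks can be turned into a simple sanity check to run before changing a configuration. The function check_task_memory and the sample numbers are hypothetical; this is a planning aid under those assumptions, not something YARN or MapReduce provides.

```python
def check_task_memory(task_mb, max_allocation_mb, nodemanager_mb):
    """Return a list of violated constraints for one task memory setting."""
    problems = []
    if task_mb > max_allocation_mb:
        problems.append("task memory exceeds yarn.scheduler.maximum-allocation-mb")
    if task_mb > nodemanager_mb:
        problems.append("task memory exceeds yarn.nodemanager.resource.memory-mb")
    return problems


# Sample values (assumptions): 8 GB container ceiling, 50 GB per NodeManager.
print(check_task_memory(2048, max_allocation_mb=8192, nodemanager_mb=51200))
print(check_task_memory(16384, max_allocation_mb=8192, nodemanager_mb=51200))
```

An empty list means the setting fits both limits; the same check applies to mapreduce.map.memory.mb and mapreduce.reduce.memory.mb.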
Ratio of heap size to container size
mapreduce.job.heap.memory-mb.ratio
- Original:
  - The ratio of heap-size to container-size.
  - If no -Xmx is specified, it is calculated as (mapreduce.map|reduce.memory.mb * mapreduce.heap.memory-mb.ratio).
  - If -Xmx is specified but not mapreduce.map|reduce.memory.mb, it is calculated as (heapSize / mapreduce.heap.memory-mb.ratio).
- Notes:
  - The ratio of heap size to container size (the quoted text writes the property as mapreduce.heap.memory-mb.ratio).
  - When -Xmx is not specified: heap size = mapreduce.map|reduce.memory.mb × mapreduce.job.heap.memory-mb.ratio
  - When -Xmx is specified but mapreduce.map|reduce.memory.mb is not: mapreduce.map|reduce.memory.mb = heap size / mapreduce.job.heap.memory-mb.ratio
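Both directions of the heap/container calculation can be tried with concrete numbers. The sketch below is not MapReduce's implementation: heap_from_container and container_from_heap are hypothetical helpers, the ratio of 0.8 is an assumed value, and rounding to the nearest MB is also an assumption.

```python
def heap_from_container(container_mb, ratio=0.8):
    """No -Xmx given: heap size = container memory * ratio."""
    return round(container_mb * ratio)


def container_from_heap(heap_mb, ratio=0.8):
    """-Xmx given but no memory.mb: container memory = heap size / ratio."""
    return round(heap_mb / ratio)


heap_mb = heap_from_container(2048)       # 2048 * 0.8 = 1638.4
container_mb = container_from_heap(1600)  # 1600 / 0.8 = 2000
print(heap_mb, container_mb)
```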
Memory required by the MR ApplicationMaster
yarn.app.mapreduce.am.resource.mb
- Original:
  - The amount of memory the MR AppMaster needs.
Virtual cores required by the MR ApplicationMaster
yarn.app.mapreduce.am.resource.cpu-vcores
- Original:
  - The number of virtual CPU cores the MR AppMaster needs.
Appendix
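Term | Pronunciation | Meaning |
---|---|---|
invalid | ɪnˈvælɪd | void; not recognized |
infer | ɪnˈfɜːr | v. to deduce |
specify | ˈspesɪfaɪ | v. to state explicitly; to describe in detail |
convert | kənˈvɜːrt | v. to change from one form into another |
multiplier | ˈmʌltɪplaɪər | n. (math) the number by which another is multiplied |
imply | ɪmˈplaɪ | v. to suggest; to mean; to necessarily involve |
-Xms | | initial Java heap size |
-Xmx | | maximum Java heap size |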