Flink: Configuring YARN Log Aggregation and History Logs
Posted by 大宁哥
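Once a YARN application has finished, the Flink process has exited and can no longer serve its web UI, so the logs retained on YARN have to be viewed through the JobHistoryServer.
Here is what I went through while configuring it.
1. Configuring log aggregation in YARN
Edit: yarn-site.xml
Note: with aggregation enabled, logs are uploaded to HDFS only after the job has finished.
Query: yarn logs -applicationId application_1546250639760_0055
Configuration: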
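<!--
Configuration 2022-04-02: enable log aggregation (start)
Note: with aggregation enabled, logs are uploaded to HDFS only after the job has finished.
Query command: yarn logs -applicationId application_1546250639760_0055
-->
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>10080</value>
<description>How long aggregated logs are kept, in seconds. Note that 10080 s is only 2.8 hours; 7 days would be 604800.</description>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
<description>Whether to enable log aggregation.</description>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/yarn</value>
<description>HDFS directory that logs are moved to once the application finishes (effective only when log aggregation is enabled), so that the job's run logs can be viewed through the appmaster UI.</description>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir-suffix</name>
<value>logs</value>
<description>Name of the subdirectory under the remote log directory (effective only when log aggregation is enabled).</description>
</property>
<!-- Configuration 2022-04-02: enable log aggregation (end) -->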
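Because the retention period is expressed in seconds, it can help to check what a given number actually means before rolling it out. The following is a minimal, unofficial sketch that parses a Hadoop-style configuration fragment with Python's standard library and prints the retention period in readable form. The `SAMPLE` string and the helper names are assumptions made for illustration; only the two property names are taken from the configuration above.

```python
import xml.etree.ElementTree as ET
from datetime import timedelta

# Illustrative fragment; a real yarn-site.xml would be read from disk instead.
SAMPLE = """<configuration>
  <property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>10080</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
</configuration>"""


def load_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def retention_as_text(props):
    """Render the aggregated-log retention period as a readable duration."""
    seconds = int(props["yarn.log-aggregation.retain-seconds"])
    return str(timedelta(seconds=seconds))


props = load_properties(SAMPLE)
print(props["yarn.log-aggregation-enable"])     # true
print(retention_as_text(props))                 # 2:48:00
print(int(timedelta(days=7).total_seconds()))   # 604800
```

Running the same check against the real file only requires replacing `SAMPLE` with the contents of yarn-site.xml.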
Experiment 1: the wordcount example bundled with Hadoop.
# Word count
# 1. Create a file with vim and put some arbitrary text in it
# 2. Put it on HDFS
# 3. Run the command
hadoop jar \
/usr/local/BigData/hadoop-3.0.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0.jar wordcount \
/test1 \
/test2/o2
Result: the job runs normally.
1. Before the configuration: yarn logs -applicationId xxxxxxxx shows none of the logs produced during the run.
1.1 Run:
[hdfs@bigdata1 hadoop]$ hadoop jar /usr/local/BigData/hadoop-3.0.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0.jar wordcount /test1 /test2/o3
2022-04-02 01:33:47,691 INFO client.RMProxy: Connecting to ResourceManager at bigdata1/192.168.72.31:8032
2022-04-02 01:33:48,229 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/hdfs/.staging/job_1648877577075_0001
2022-04-02 01:33:48,445 INFO input.FileInputFormat: Total input files to process : 1
2022-04-02 01:33:48,519 INFO mapreduce.JobSubmitter: number of splits:1
2022-04-02 01:33:48,556 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2022-04-02 01:33:48,659 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1648877577075_0001
2022-04-02 01:33:48,661 INFO mapreduce.JobSubmitter: Executing with tokens: []
2022-04-02 01:33:48,843 INFO conf.Configuration: resource-types.xml not found
2022-04-02 01:33:48,844 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2022-04-02 01:33:49,261 INFO impl.YarnClientImpl: Submitted application application_1648877577075_0001
2022-04-02 01:33:49,300 INFO mapreduce.Job: The url to track the job: http://bigdata1:8088/proxy/application_1648877577075_0001/
2022-04-02 01:33:49,300 INFO mapreduce.Job: Running job: job_1648877577075_0001
2022-04-02 01:33:56,416 INFO mapreduce.Job: Job job_1648877577075_0001 running in uber mode : false
2022-04-02 01:33:56,417 INFO mapreduce.Job: map 0% reduce 0%
2022-04-02 01:34:02,490 INFO mapreduce.Job: map 100% reduce 0%
2022-04-02 01:34:08,529 INFO mapreduce.Job: map 100% reduce 100%
2022-04-02 01:34:08,540 INFO mapreduce.Job: Job job_1648877577075_0001 completed successfully
2022-04-02 01:34:08,633 INFO mapreduce.Job: Counters: 53
File System Counters
FILE: Number of bytes read=1843
FILE: Number of bytes written=417739
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=2071
HDFS: Number of bytes written=1386
HDFS: Number of read operations=8
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Rack-local map tasks=1
Total time spent by all maps in occupied slots (ms)=13196
Total time spent by all reduces in occupied slots (ms)=28888
Total time spent by all map tasks (ms)=3299
Total time spent by all reduce tasks (ms)=3611
Total vcore-milliseconds taken by all map tasks=3299
Total vcore-milliseconds taken by all reduce tasks=3611
Total megabyte-milliseconds taken by all map tasks=13512704
Total megabyte-milliseconds taken by all reduce tasks=29581312
Map-Reduce Framework
Map input records=50
Map output records=167
Map output bytes=2346
Map output materialized bytes=1843
Input split bytes=97
Combine input records=167
Combine output records=113
Reduce input groups=113
Reduce shuffle bytes=1843
Reduce input records=113
Reduce output records=113
Spilled Records=226
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=100
CPU time spent (ms)=1440
Physical memory (bytes) snapshot=557318144
Virtual memory (bytes) snapshot=13850431488
Total committed heap usage (bytes)=390070272
Peak Map Physical memory (bytes)=331268096
Peak Map Virtual memory (bytes)=5254959104
Peak Reduce Physical memory (bytes)=226050048
Peak Reduce Virtual memory (bytes)=8595472384
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=1974
File Output Format Counters
Bytes Written=1386
1.2 Run the query command:
[hdfs@bigdata1 hadoop]$ yarn logs -applicationId application_1648877577075_0001
2022-04-02 01:40:46,605 INFO client.RMProxy: Connecting to ResourceManager at bigdata1/192.168.72.31:8032
File /yarn/hdfs/logs/application_1648877577075_0001 does not exist.
Can not find any log file matching the pattern: [ALL] for the application: application_1648877577075_0001
Can not find the logs for the application: application_1648877577075_0001 with the appOwner: hdfs
[hdfs@bigdata1 hadoop]$ yarn logs -applicationId application_1648877577075_0002
2022-04-02 01:40:57,983 INFO client.RMProxy: Connecting to ResourceManager at bigdata1/192.168.72.31:8032
Unable to get ApplicationState. Attempting to fetch logs directly from the filesystem.
Can not find the appOwner. Please specify the correct appOwner
Could not locate application logs for application_1648877577075_0002
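The first error message in this transcript names the HDFS directory the CLI searched. It appears to be assembled from the remote log directory, the application owner, the directory suffix and the application ID. The sketch below reproduces that path. The function name is hypothetical, and the layout is inferred only from the error message seen on Hadoop 3.0.0, so other releases may arrange the directory differently.

```python
import posixpath


def aggregated_log_dir(remote_app_log_dir, app_owner, suffix, app_id):
    """Join the pieces in the order seen in the 'does not exist' message."""
    return posixpath.join(remote_app_log_dir, app_owner, suffix, app_id)


# Values taken from the configuration and the transcript above.
path = aggregated_log_dir("/yarn", "hdfs", "logs", "application_1648877577075_0001")
print(path)  # /yarn/hdfs/logs/application_1648877577075_0001
```

Knowing the expected directory makes it easy to check on HDFS whether the upload has happened at all.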
2. After the configuration: the complete run logs are visible.
2.1 Run:
[hdfs@bigdata1 hadoop]$ hadoop jar /usr/local/BigData/hadoop-3.0.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0.jar wordcount /test1 /test2/a1
2022-04-02 02:25:09,179 INFO client.RMProxy: Connecting to ResourceManager at bigdata1/192.168.72.31:8032
2022-04-02 02:25:09,718 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/hdfs/.staging/job_1648879195625_0002
2022-04-02 02:25:09,936 INFO input.FileInputFormat: Total input files to process : 1
2022-04-02 02:25:10,009 INFO mapreduce.JobSubmitter: number of splits:1
2022-04-02 02:25:10,043 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2022-04-02 02:25:10,144 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1648879195625_0002
2022-04-02 02:25:10,145 INFO mapreduce.JobSubmitter: Executing with tokens: []
2022-04-02 02:25:10,325 INFO conf.Configuration: resource-types.xml not found
2022-04-02 02:25:10,325 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2022-04-02 02:25:10,380 INFO impl.YarnClientImpl: Submitted application application_1648879195625_0002
2022-04-02 02:25:10,417 INFO mapreduce.Job: The url to track the job: http://bigdata1:8088/proxy/application_1648879195625_0002/
2022-04-02 02:25:10,417 INFO mapreduce.Job: Running job: job_1648879195625_0002
2022-04-02 02:25:17,508 INFO mapreduce.Job: Job job_1648879195625_0002 running in uber mode : false
2022-04-02 02:25:17,509 INFO mapreduce.Job: map 0% reduce 0%
2022-04-02 02:25:23,575 INFO mapreduce.Job: map 100% reduce 0%
2022-04-02 02:25:28,607 INFO mapreduce.Job: map 100% reduce 100%
2022-04-02 02:25:28,616 INFO mapreduce.Job: Job job_1648879195625_0002 completed successfully
2022-04-02 02:25:28,707 INFO mapreduce.Job: Counters: 53
File System Counters
FILE: Number of bytes read=1843
FILE: Number of bytes written=417711
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=2071
HDFS: Number of bytes written=1386
HDFS: Number of read operations=8
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=12876
Total time spent by all reduces in occupied slots (ms)=21016
Total time spent by all map tasks (ms)=3219
Total time spent by all reduce tasks (ms)=2627
Total vcore-milliseconds taken by all map tasks=3219
Total vcore-milliseconds taken by all reduce tasks=2627
Total megabyte-milliseconds taken by all map tasks=13185024
Total megabyte-milliseconds taken by all reduce tasks=21520384
Map-Reduce Framework
Map input records=50
Map output records=167
Map output bytes=2346
Map output materialized bytes=1843
Input split bytes=97
Combine input records=167
Combine output records=113
Reduce input groups=113
Reduce shuffle bytes=1843
Reduce input records=113
Reduce output records=113
Spilled Records=226
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=97
CPU time spent (ms)=1360
Physical memory (bytes) snapshot=552706048
Virtual memory (bytes) snapshot=13832036352
Total committed heap usage (bytes)=391643136
Peak Map Physical memory (bytes)=329871360
Peak Map Virtual memory (bytes)=5243228160
Peak Reduce Physical memory (bytes)=222834688
Peak Reduce Virtual memory (bytes)=8588808192
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=1974
File Output Format Counters
Bytes Written=1386
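2.2 Run the query command:
..................
A very large amount of complete log output.
Experiment summary:
1) Once configured, yarn logs -applicationId application_xxxxxxx shows detailed log content.
2) However, the logs links in the web UI on port 8088 can no longer be opened. Why?
Because the history service has not been configured yet; see the next section.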
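Since the aggregated logs can be very long, it is often convenient to save them to a file rather than read them in the terminal. The following is a rough sketch of a wrapper around the `yarn logs` command used in this experiment. The function names and the ID check are assumptions made for illustration; `-applicationId` and `-appOwner` are the options that appear in the transcripts above, and the wrapper needs the yarn CLI on the PATH to actually run.

```python
import re
import subprocess

# Shape of the application IDs seen in the transcripts, e.g. application_1648879195625_0002.
APP_ID_PATTERN = re.compile(r"^application_\d+_\d+$")


def build_yarn_logs_command(app_id, app_owner=None):
    """Return the argv list for `yarn logs`, rejecting malformed IDs early."""
    if not APP_ID_PATTERN.match(app_id):
        raise ValueError(f"not a YARN application ID: {app_id!r}")
    command = ["yarn", "logs", "-applicationId", app_id]
    if app_owner:
        command += ["-appOwner", app_owner]
    return command


def save_application_logs(app_id, out_path, app_owner=None):
    """Run `yarn logs` and write its stdout to out_path (requires the yarn CLI)."""
    command = build_yarn_logs_command(app_id, app_owner)
    with open(out_path, "w", encoding="utf-8") as out:
        subprocess.run(command, stdout=out, check=True)


print(build_yarn_logs_command("application_1648879195625_0002", "hdfs"))
```

Passing the owner explicitly avoids the "Can not find the appOwner" message when the ResourceManager no longer knows the application.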
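Experiment 2: logs when Flink runs in on-YARN mode.
Experiment setup:
A Flink job consumes from Kafka and writes the records out with print().
Job configuration:
Flink parallelism: 1
First, start the job. Its container turns out to be running on cm2.
Next we can see some of the job's logs, both the JobManager (jm) and the TaskManager (tm) ones. The tm output is what appears under Stdout, and refreshing shows the file size growing.
The container log location configured in YARN tells us where these logs are stored: under /yarn/container-logs on each node.
Let's look under /yarn/container-logs on cm2.
The content is identical to what the Flink console shows.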
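To see which log files a running job has produced on a node, the local log directory can be walked directly. The sketch below assumes the usual NodeManager layout of one directory per application containing one directory per container. The function name is hypothetical, and the demo builds a throwaway directory tree with made-up IDs instead of touching /yarn/container-logs.

```python
import tempfile
from pathlib import Path


def list_container_logs(log_root, app_id):
    """Map each container directory of app_id to its sorted log file names."""
    app_dir = Path(log_root) / app_id
    result = {}
    if not app_dir.is_dir():
        return result  # logs already removed, or the app never ran on this node
    for container_dir in sorted(app_dir.iterdir()):
        if container_dir.is_dir():
            result[container_dir.name] = sorted(
                f.name for f in container_dir.iterdir() if f.is_file()
            )
    return result


# Demo on a temporary tree; the application and container IDs are made up.
with tempfile.TemporaryDirectory() as root:
    container = Path(root) / "application_0000000000000_0001" / "container_demo_000002"
    container.mkdir(parents=True)
    (container / "taskmanager.log").write_text("demo\n")
    (container / "taskmanager.out").write_text("demo\n")
    print(list_container_logs(root, "application_0000000000000_0001"))
    # {'container_demo_000002': ['taskmanager.log', 'taskmanager.out']}
```

On a real node the call would look like `list_container_logs("/yarn/container-logs", "<application id>")`; an empty result after the job ends is consistent with the local logs having been cleaned up.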
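After the program finishes or is cancelled, the container logs are deleted automatically, which raises Question 1 below. (The aggregated logs are still present on HDFS, as expected.)
Question 1: isn't the following setting supposed to control how long logs are kept on the node that hosts the container? (If the setting in question is yarn.nodemanager.log.retain-seconds, note that it applies only when log aggregation is disabled; with aggregation enabled, the NodeManager removes the local logs after uploading them.)
Question 2: what if the job runs with a parallelism greater than one?
With several parallel instances, the 'complete log' is the logs of all the containers combined, which is exactly what log aggregation produces.
2.2 Configuring log aggregation on CDH (already set up by default)
# yarn.nodemanager.log-dirs is the directory where each NodeManager stores the logs produced by its containers.
Log aggregation, however, collects the scattered logs of a single job and stores them together on HDFS.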
2. Configuring history logs in YARN
Further reading:
https://blog.csdn.net/qq_38038143/article/details/88641288
https://blog.csdn.net/qq_35440040/article/details/84233655
https://blog.csdn.net/duyenson/article/details/118994693
https://www.cnblogs.com/zwgblog/p/6079361.html
Configuration 1: mapred-site.xml
<property>
<name>mapreduce.jobhistory.address</name>
<value>master:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>master:19888</value>
</property>
Configuration 2: yarn-site.xml
<!--Spark Yarn-->
<!-- Whether to enable log aggregation -->
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<!-- Address of the log server, used by the worker nodes -->
<property>
<name>yarn.log.server.url</name>
<value>http://master:19888/jobhistory/logs/</value>
</property>
<!-- Log retention time, in seconds -->
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>86400</value>
</property>
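Distribute the updated configuration files to every node.
Start: mr-jobhistory-daemon.sh start historyserver (on Hadoop 3.x this script is deprecated in favor of mapred --daemon start historyserver)
Check with jps that the JobHistoryServer process is running.
View logs: open the web UI on port 8088, click the application ID, then click logs.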
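Besides jps, one way to confirm that the history server is actually answering is to query its REST info endpoint on the web port. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, `master:19888` is the value of mapreduce.jobhistory.webapp.address from the configuration above, and the fetch function only works from a machine that can reach that host.

```python
import json
from urllib.request import urlopen


def history_info_url(webapp_address):
    """Build the info endpoint from a host:port value such as master:19888."""
    host, _, port = webapp_address.partition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {webapp_address!r}")
    return f"http://{host}:{port}/ws/v1/history/info"


def fetch_history_info(webapp_address, timeout=5.0):
    """Query the running history server; raises URLError if it is unreachable."""
    with urlopen(history_info_url(webapp_address), timeout=timeout) as response:
        return json.load(response)


print(history_info_url("master:19888"))  # http://master:19888/ws/v1/history/info
```

If `fetch_history_info("master:19888")` returns a JSON document, the server is up; a connection error usually points at the daemon not running or at the address in mapred-site.xml.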
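3. History logs, extended: the Timeline Service
This configuration came from the expert in the chat screenshot above. He runs Hadoop 3.13 and I was on 3.0.0 at the time, where I could not get it to work. I will look into it further later.
yarn-site.xml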
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>bigdata1</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>bigdata1:8088</value>
</property>
<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>512</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>4096</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>30720</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-vcores</name>
<value>1</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>4</value>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>12</value>
</property>
<property>
<name>yarn.nodemanager.pmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>86400</value>
</property>
<property>
<name>yarn.timeline-service.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.timeline-service.hostname</name>
<value>${yarn.resourcemanager.hostname}</value>
</property>
<property>
<name>yarn.timeline-service.address</name>
<value>${yarn.timeline-service.hostname}:10020</value>
</property>
<property>
<name>yarn.timeline-service.webapp.address</name>