Configuring YARN Log Aggregation and History Logs for Flink

Posted by 大宁哥


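Once a YARN application has finished, the Flink process has exited and can no longer serve its web UI, so the logs retained on YARN have to be viewed through the JobHistoryServer.
Below I share my experience configuring this.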

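1. Configuring log aggregation in YARN

Edit: yarn-site.xml

Note: once enabled, logs are uploaded to HDFS only after the application has finished.

Query: yarn logs -applicationId application_1546250639760_0055

Configuration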

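<!--
        Config 2022-04-02: enable log aggregation (begin)
        Note: once enabled, logs are uploaded to HDFS only after the application has finished
        Query command: yarn logs -applicationId application_1546250639760_0055
    -->
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>10080</value>
        <description>How long aggregated logs are kept, in seconds (10080 s is about 2.8 hours)</description>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
        <description>Whether log aggregation is enabled</description>
    </property>
    <property>
        <name>yarn.nodemanager.remote-app-log-dir</name>
        <value>/yarn</value>
        <description>HDFS directory to which logs are moved after the application finishes (only effective when log aggregation is enabled), so that the job's logs can be viewed through the appmaster UI.</description>
    </property>
    <property>
        <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
        <value>logs</value>
        <description>Name of the subdirectory inside the remote log directory (only effective when log aggregation is enabled)</description>
    </property>
    <!-- Config 2022-04-02: enable log aggregation (end) -->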
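A quick way to double-check these four properties is to read them back from the file. The following is only a minimal sketch: the helper names `load_properties` and `summarize_log_aggregation` are made up for illustration, and the XML is an inline copy of the listing above rather than the real yarn-site.xml on the cluster. It also makes it obvious how short a retention of 10080 seconds actually is.

```python
import xml.etree.ElementTree as ET

# Inline copy of the properties from the listing above (descriptions omitted).
YARN_SITE = """<configuration>
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>10080</value>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.nodemanager.remote-app-log-dir</name>
        <value>/yarn</value>
    </property>
    <property>
        <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
        <value>logs</value>
    </property>
</configuration>"""


def load_properties(xml_text):
    """Return a {name: value} dict from Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def summarize_log_aggregation(props):
    """Summarize the log-aggregation settings (hypothetical helper)."""
    # Fallbacks mirror the yarn-default.xml defaults: disabled, retention -1.
    enabled = props.get("yarn.log-aggregation-enable", "false").lower() == "true"
    retain = int(props.get("yarn.log-aggregation.retain-seconds", "-1"))
    return {
        "enabled": enabled,
        "retain_seconds": retain,
        "retain_hours": round(retain / 3600, 2) if retain > 0 else None,
        "remote_dir": props.get("yarn.nodemanager.remote-app-log-dir"),
        "suffix": props.get("yarn.nodemanager.remote-app-log-dir-suffix"),
    }


summary = summarize_log_aggregation(load_properties(YARN_SITE))
print(summary)
```

To run it against a real cluster file, replace `YARN_SITE` with the contents of your own yarn-site.xml.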
Experiment 1: the wordcount example that ships with Hadoop.
# Word count

# 1. Create a file with vim and write anything in it
# 2. Put it onto HDFS
# 3. Run the command
hadoop jar  \
/usr/local/BigData/hadoop-3.0.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0.jar wordcount \
/test1 \
/test2/o2

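Result: the job runs normally.

1. Before the configuration: running yarn logs -applicationId xxxxxxxx shows none of the logs produced while the job ran.

1.1 Run: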

[hdfs@bigdata1 hadoop]$ hadoop jar  /usr/local/BigData/hadoop-3.0.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0.jar wordcount /test1 /test2/o3
2022-04-02 01:33:47,691 INFO client.RMProxy: Connecting to ResourceManager at bigdata1/192.168.72.31:8032
2022-04-02 01:33:48,229 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/hdfs/.staging/job_1648877577075_0001
2022-04-02 01:33:48,445 INFO input.FileInputFormat: Total input files to process : 1
2022-04-02 01:33:48,519 INFO mapreduce.JobSubmitter: number of splits:1
2022-04-02 01:33:48,556 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2022-04-02 01:33:48,659 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1648877577075_0001
2022-04-02 01:33:48,661 INFO mapreduce.JobSubmitter: Executing with tokens: []
2022-04-02 01:33:48,843 INFO conf.Configuration: resource-types.xml not found
2022-04-02 01:33:48,844 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2022-04-02 01:33:49,261 INFO impl.YarnClientImpl: Submitted application application_1648877577075_0001
2022-04-02 01:33:49,300 INFO mapreduce.Job: The url to track the job: http://bigdata1:8088/proxy/application_1648877577075_0001/
2022-04-02 01:33:49,300 INFO mapreduce.Job: Running job: job_1648877577075_0001
2022-04-02 01:33:56,416 INFO mapreduce.Job: Job job_1648877577075_0001 running in uber mode : false
2022-04-02 01:33:56,417 INFO mapreduce.Job:  map 0% reduce 0%
2022-04-02 01:34:02,490 INFO mapreduce.Job:  map 100% reduce 0%
2022-04-02 01:34:08,529 INFO mapreduce.Job:  map 100% reduce 100%
2022-04-02 01:34:08,540 INFO mapreduce.Job: Job job_1648877577075_0001 completed successfully
2022-04-02 01:34:08,633 INFO mapreduce.Job: Counters: 53
        File System Counters
                FILE: Number of bytes read=1843
                FILE: Number of bytes written=417739
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=2071
                HDFS: Number of bytes written=1386
                HDFS: Number of read operations=8
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=2
        Job Counters
                Launched map tasks=1
                Launched reduce tasks=1
                Rack-local map tasks=1
                Total time spent by all maps in occupied slots (ms)=13196
                Total time spent by all reduces in occupied slots (ms)=28888
                Total time spent by all map tasks (ms)=3299
                Total time spent by all reduce tasks (ms)=3611
                Total vcore-milliseconds taken by all map tasks=3299
                Total vcore-milliseconds taken by all reduce tasks=3611
                Total megabyte-milliseconds taken by all map tasks=13512704
                Total megabyte-milliseconds taken by all reduce tasks=29581312
        Map-Reduce Framework
                Map input records=50
                Map output records=167
                Map output bytes=2346
                Map output materialized bytes=1843
                Input split bytes=97
                Combine input records=167
                Combine output records=113
                Reduce input groups=113
                Reduce shuffle bytes=1843
                Reduce input records=113
                Reduce output records=113
                Spilled Records=226
                Shuffled Maps =1
                Failed Shuffles=0
                Merged Map outputs=1
                GC time elapsed (ms)=100
                CPU time spent (ms)=1440
                Physical memory (bytes) snapshot=557318144
                Virtual memory (bytes) snapshot=13850431488
                Total committed heap usage (bytes)=390070272
                Peak Map Physical memory (bytes)=331268096
                Peak Map Virtual memory (bytes)=5254959104
                Peak Reduce Physical memory (bytes)=226050048
                Peak Reduce Virtual memory (bytes)=8595472384
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=1974
        File Output Format Counters
                Bytes Written=1386

1.2 Run the query command:

[hdfs@bigdata1 hadoop]$ yarn logs -applicationId application_1648877577075_0001
2022-04-02 01:40:46,605 INFO client.RMProxy: Connecting to ResourceManager at bigdata1/192.168.72.31:8032
File /yarn/hdfs/logs/application_1648877577075_0001 does not exist.

Can not find any log file matching the pattern: [ALL] for the application: application_1648877577075_0001
Can not find the logs for the application: application_1648877577075_0001 with the appOwner: hdfs
[hdfs@bigdata1 hadoop]$ yarn logs -applicationId application_1648877577075_0002
2022-04-02 01:40:57,983 INFO client.RMProxy: Connecting to ResourceManager at bigdata1/192.168.72.31:8032
Unable to get ApplicationState. Attempting to fetch logs directly from the filesystem.
Can not find the appOwner. Please specify the correct appOwner
Could not locate application logs for application_1648877577075_0002
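The "does not exist" message above shows where the client looks for aggregated logs: the remote log directory, then the user, then the suffix, then the application ID. Here is a minimal sketch of that path rule. The function name `aggregated_log_dir` is made up for illustration, and the layout is inferred only from the message above on Hadoop 3.0.0 (other Hadoop releases may lay out the directory differently).

```python
import posixpath


def aggregated_log_dir(remote_app_log_dir, user, suffix, application_id):
    """Build <remote-app-log-dir>/<user>/<suffix>/<application-id>.

    Layout inferred from the 'File ... does not exist' message in the
    session above; treat it as an assumption for other Hadoop versions.
    """
    return posixpath.join(remote_app_log_dir, user, suffix, application_id)


path = aggregated_log_dir("/yarn", "hdfs", "logs", "application_1648877577075_0001")
print(path)
```

Listing that directory with hdfs dfs -ls is a quick way to check whether aggregation has actually happened for a given application.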

2. After the configuration: the complete run logs are visible

2.1 Run:

[hdfs@bigdata1 hadoop]$ hadoop jar  /usr/local/BigData/hadoop-3.0.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0.jar wordcount /test1 /test2/a1
2022-04-02 02:25:09,179 INFO client.RMProxy: Connecting to ResourceManager at bigdata1/192.168.72.31:8032
2022-04-02 02:25:09,718 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/hdfs/.staging/job_1648879195625_0002
2022-04-02 02:25:09,936 INFO input.FileInputFormat: Total input files to process : 1
2022-04-02 02:25:10,009 INFO mapreduce.JobSubmitter: number of splits:1
2022-04-02 02:25:10,043 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2022-04-02 02:25:10,144 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1648879195625_0002
2022-04-02 02:25:10,145 INFO mapreduce.JobSubmitter: Executing with tokens: []
2022-04-02 02:25:10,325 INFO conf.Configuration: resource-types.xml not found
2022-04-02 02:25:10,325 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2022-04-02 02:25:10,380 INFO impl.YarnClientImpl: Submitted application application_1648879195625_0002
2022-04-02 02:25:10,417 INFO mapreduce.Job: The url to track the job: http://bigdata1:8088/proxy/application_1648879195625_0002/
2022-04-02 02:25:10,417 INFO mapreduce.Job: Running job: job_1648879195625_0002
2022-04-02 02:25:17,508 INFO mapreduce.Job: Job job_1648879195625_0002 running in uber mode : false
2022-04-02 02:25:17,509 INFO mapreduce.Job:  map 0% reduce 0%
2022-04-02 02:25:23,575 INFO mapreduce.Job:  map 100% reduce 0%
2022-04-02 02:25:28,607 INFO mapreduce.Job:  map 100% reduce 100%
2022-04-02 02:25:28,616 INFO mapreduce.Job: Job job_1648879195625_0002 completed successfully
2022-04-02 02:25:28,707 INFO mapreduce.Job: Counters: 53
        File System Counters
                FILE: Number of bytes read=1843
                FILE: Number of bytes written=417711
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=2071
                HDFS: Number of bytes written=1386
                HDFS: Number of read operations=8
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=2
        Job Counters
                Launched map tasks=1
                Launched reduce tasks=1
                Data-local map tasks=1
                Total time spent by all maps in occupied slots (ms)=12876
                Total time spent by all reduces in occupied slots (ms)=21016
                Total time spent by all map tasks (ms)=3219
                Total time spent by all reduce tasks (ms)=2627
                Total vcore-milliseconds taken by all map tasks=3219
                Total vcore-milliseconds taken by all reduce tasks=2627
                Total megabyte-milliseconds taken by all map tasks=13185024
                Total megabyte-milliseconds taken by all reduce tasks=21520384
        Map-Reduce Framework
                Map input records=50
                Map output records=167
                Map output bytes=2346
                Map output materialized bytes=1843
                Input split bytes=97
                Combine input records=167
                Combine output records=113
                Reduce input groups=113
                Reduce shuffle bytes=1843
                Reduce input records=113
                Reduce output records=113
                Spilled Records=226
                Shuffled Maps =1
                Failed Shuffles=0
                Merged Map outputs=1
                GC time elapsed (ms)=97
                CPU time spent (ms)=1360
                Physical memory (bytes) snapshot=552706048
                Virtual memory (bytes) snapshot=13832036352
                Total committed heap usage (bytes)=391643136
                Peak Map Physical memory (bytes)=329871360
                Peak Map Virtual memory (bytes)=5243228160
                Peak Reduce Physical memory (bytes)=222834688
                Peak Reduce Virtual memory (bytes)=8588808192
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=1974
        File Output Format Counters
                Bytes Written=1386

2.2 Run the query command

..................
(A very large amount of complete log output.)

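Experiment summary:

1) Once configured, yarn logs -applicationId application_xxxxxxx shows rich log content.
2) However, the logs link in the 8088 web UI no longer works. Why?
	Because the history server has not been configured yet; see the next section.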
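Experiment 2: logs when running in Flink on YARN mode.
Experiment setup:
	A Flink job consumes from Kafka and writes the records out with print().
Job configuration:
	Flink parallelism: 1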

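First, start the job. Its container turns out to be running on cm2.

Next you can see some of the job's logs, both the JobManager's (jm) and the TaskManager's (tm). The tm log is what appears under Stdout, and refreshing the page shows the file size growing.

The container log location configured in YARN tells us where the logs are stored. As the figure below shows, they are kept under /yarn/container-logs on each node.

Let's look under /yarn/container-logs on cm2.

The content is exactly the same as what the Flink console shows.

After the program finishes or is cancelled, the container logs are deleted automatically, which raises Question 1 below. (The aggregated logs on HDFS are present as expected.)

Question 1: doesn't the following setting control how long logs are kept on the node where the container ran?
(The setting itself was shown in a screenshot. If it is yarn.nodemanager.log.retain-seconds, note that according to yarn-default.xml it only applies when log aggregation is disabled, which would explain the deletion.)

Question 2: what if the job has a parallelism greater than one?
With multiple parallel instances, the "total log" is the merge of the logs of multiple containers, which is exactly what log aggregation produces.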
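2.2 Configuring log aggregation on CDH (already configured by default)

# yarn.nodemanager.log-dirs is the directory where each NodeManager stores the logs produced by its containers.
	Log aggregation, however, collects a job's scattered logs into HDFS.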
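To make the idea of "scattered logs gathered into one place" concrete, here is a purely conceptual sketch. It does not call YARN at all: the function `aggregate_by_node`, the `=== ... ===` header format, the container IDs, and the node name cm1 are all made up for illustration (only cm2 appears in the text). As far as I know, YARN writes one aggregated file per NodeManager under the application's directory, which is the grouping imitated here.

```python
from collections import defaultdict


def aggregate_by_node(container_logs):
    """Group container logs by node into one combined text per node.

    container_logs: iterable of (node, container_id, text) tuples.
    The header line is an illustrative format, not YARN's real one.
    """
    per_node = defaultdict(list)
    for node, container_id, text in container_logs:
        per_node[node].append(f"=== {container_id} ===\n{text}")
    return {node: "\n".join(parts) for node, parts in per_node.items()}


# Hypothetical logs from a job whose containers run on two nodes.
logs = [
    ("cm1", "container_01_000001", "jobmanager started"),
    ("cm2", "container_01_000002", "taskmanager 1 started"),
    ("cm2", "container_01_000003", "taskmanager 2 started"),
]

aggregated = aggregate_by_node(logs)
for node in sorted(aggregated):
    print(f"[{node}]")
    print(aggregated[node])
```

With a parallelism above one, the per-node groups simply contain more container sections; yarn logs -applicationId then reads them all back.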

2. Configuring history logs in YARN

Further reading
https://blog.csdn.net/qq_38038143/article/details/88641288

https://blog.csdn.net/qq_35440040/article/details/84233655?utm_medium=distribute.pc_aggpage_search_result.none-task-blog-2~aggregatepage~first_rank_ecpm_v1~rank_v31_ecpm-1-84233655.pc_agg_new_rank&utm_term=yarn%E5%8E%86%E5%8F%B2%E6%97%A5%E5%BF%97&spm=1000.2123.3001.4430

https://blog.csdn.net/duyenson/article/details/118994693

https://www.cnblogs.com/zwgblog/p/6079361.html

Configuration 1: mapred-site.xml

<property>
	<name>mapreduce.jobhistory.address</name>
	<value>master:10020</value>
</property>
<property>
	<name>mapreduce.jobhistory.webapp.address</name>
	<value>master:19888</value>
</property>

Configuration 2: yarn-site.xml

	<!--Spark Yarn-->
    <!-- Whether log aggregation is enabled -->
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <!-- URL of the log server, used by the worker nodes -->
    <property>
        <name>yarn.log.server.url</name>
        <value>http://master:19888/jobhistory/logs/</value>
    </property>
    <!-- Log retention time, in seconds -->
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>86400</value>
    </property>
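The two files have to agree: yarn.log.server.url should point at the same host and port as mapreduce.jobhistory.webapp.address, otherwise the log links lead nowhere. Below is a minimal sketch of such a consistency check. The helper names are made up for illustration, and the XML is an inline copy of the relevant properties above rather than the real files.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Inline copies of the relevant properties from the two listings above.
MAPRED_SITE = """<configuration>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
    </property>
</configuration>"""

YARN_SITE = """<configuration>
    <property>
        <name>yarn.log.server.url</name>
        <value>http://master:19888/jobhistory/logs/</value>
    </property>
</configuration>"""


def load_properties(xml_text):
    """Return a {name: value} dict from Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name").strip(): prop.findtext("value").strip()
        for prop in root.iter("property")
        if prop.findtext("name") is not None and prop.findtext("value") is not None
    }


def log_server_matches_history_server(mapred_props, yarn_props):
    """True if yarn.log.server.url targets the JobHistory web address."""
    webapp = mapred_props.get("mapreduce.jobhistory.webapp.address", "")
    url = urlparse(yarn_props.get("yarn.log.server.url", ""))
    # netloc is "host:port", the same shape as the webapp address.
    return bool(webapp) and url.netloc == webapp


consistent = log_server_matches_history_server(
    load_properties(MAPRED_SITE), load_properties(YARN_SITE)
)
print("log server URL matches JobHistory web address:", consistent)
```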

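Distribute the configuration files to all nodes.

Start: mr-jobhistory-daemon.sh start historyserver
(On Hadoop 3 this script is deprecated in favor of mapred --daemon start historyserver.)

Run jps to confirm that the JobHistoryServer process is running.

View logs: in the web UI on port 8088, click the application ID, then click logs.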

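3. YARN history logs, plus: the timeline service

The configuration below came from an expert in the chat screenshot above. He was on Hadoop 3.13 (presumably 3.1.3), while I was on 3.0.0 at the time and could not get it to work. I will look into it further later.

yarn-site.xml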
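One detail worth knowing when reading this kind of yarn-site.xml: Hadoop resolves `${name}` references inside property values against other properties. The sketch below imitates that behavior in a simplified way. The function `expand` and its depth limit are my own illustration, not Hadoop's implementation (which also consults system properties, among other things). The property names and the port are taken from the configuration in this section.

```python
import re

# Matches ${some.property.name}
_VAR = re.compile(r"\$\{([^}$\s]+)\}")


def expand(value, props, max_depth=20):
    """Repeatedly replace ${name} with props[name] (simplified sketch).

    Unknown names are left untouched; max_depth guards against cycles.
    """
    for _ in range(max_depth):
        new_value = _VAR.sub(lambda m: props.get(m.group(1), m.group(0)), value)
        if new_value == value:
            return value
        value = new_value
    return value


props = {
    "yarn.resourcemanager.hostname": "bigdata1",
    "yarn.timeline-service.hostname": "${yarn.resourcemanager.hostname}",
    "yarn.timeline-service.address": "${yarn.timeline-service.hostname}:10020",
}

resolved = expand(props["yarn.timeline-service.address"], props)
print(resolved)
```

Note that the reference needs the braces: a bare `$name` is not expanded by this sketch, and as far as I know not by Hadoop either.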

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>bigdata1</value>
    </property>

    <property>
	    <name>yarn.resourcemanager.webapp.address</name>
	    <value>bigdata1:8088</value>
    </property>

    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>
    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>512</value>
   </property>
    <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>4096</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>30720</value>
    </property>
	<property>
        <name>yarn.scheduler.minimum-allocation-vcores</name>
        <value>1</value>
    </property>
    <property>
        <name>yarn.scheduler.maximum-allocation-vcores</name>
        <value>4</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>12</value>
    </property> 
    <property>
        <name>yarn.nodemanager.pmem-check-enabled</name>
        <value>false</value>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>86400</value>
    </property>

    <property>
        <name>yarn.timeline-service.enabled</name>
        <value>true</value>
    </property>

    <property>
        <name>yarn.timeline-service.hostname</name>
        <value>${yarn.resourcemanager.hostname}</value>
    </property>

   <property>
        <name>yarn.timeline-service.address</name>
        <value>${yarn.timeline-service.hostname}:10020</value>
    </property>

    <property>
        <name>yarn.timeline-service.webapp.address</name>