Hadoop's NullWritable

Posted by akia开凯


1. NullWritable in Hadoop

NullWritable is a special Writable class with an empty implementation: it reads nothing from the data stream and writes nothing to it, serving purely as a placeholder. In MapReduce, if you do not need a key or a value, you can declare that key or value as NullWritable. NullWritable is an immutable singleton.

For example, when the map output needs a key but no value, I declare the map output value type as NullWritable and write the map method like this:

@Override
protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
    // k and pages are fields of the enclosing Mapper (k is presumably a reusable Text key)
    String line = value.toString();
    WebLogBean webLogBean = WebLogParser.parser(line);
    WebLogParser.filtStaticResource(webLogBean, pages); // filter out static resources such as js/images/css
    k.set(webLogBean.toString());
    context.write(k, NullWritable.get());
}

You cannot create an instance with new NullWritable(); the only way to obtain the empty value is NullWritable.get().

Source: https://www.cnblogs.com/Skyar/p/5815486.html
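
For completeness, here is a minimal driver sketch showing where NullWritable is declared on the job itself. It assumes the map() above lives in a hypothetical WebLogMapper class and that this is a map-only job; adjust the class and path names to your own project.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WebLogDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "weblog-clean");

        job.setJarByClass(WebLogDriver.class);
        job.setMapperClass(WebLogMapper.class);   // hypothetical class wrapping the map() shown above
        job.setNumReduceTasks(0);                 // map-only job, no reduce phase

        // Keys are the cleaned log lines, values are empty placeholders.
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}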

2. The various IDs in YARN

  • jobId

Description: comes from MapReduce; the unique identifier of a job.
Format: job_${clusterStartTime}_${jobId}
Example: job_1498552288473_2742

  • applicationId

Description: the unique identifier of an application in YARN.
Format: application_${clusterStartTime}_${applicationId}
Example: application_1498552288473_2742

  • taskId

Description: the unique identifier of a task within a job.
Format: task_${clusterStartTime}_${applicationId}_[m|r]_${taskId}
Example: task_1498552288473_2742_m_000000, task_1498552288473_2742_r_000000

  • attemptId

Description: the identifier of one execution attempt of a task.
Format: attempt_${clusterStartTime}_${applicationId}_[m|r]_${taskId}_${attemptId}
Example: attempt_1498552288473_2742_m_000000_0

  • appAttemptId

Description: the identifier of one execution attempt of the ApplicationMaster.
Format: appattempt_${clusterStartTime}_${applicationId}_${appAttemptId}
Example: appattempt_1498552288473_2742_000001

  • containerId

Description: the identifier of a container.
Format: container_e${epoch}_${clusterStartTime}_${applicationId}_${appAttemptId}_${containerId} (the e${epoch} prefix appears only when the ResourceManager epoch is greater than 0, i.e. after an RM restart)
Example: container_e20_1498552288473_2742_01_000032, container_1498552288473_2742_01_000032

Reference: Yarn之日志分析 (an article on YARN log analysis)
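
These strings correspond directly to Hadoop's ID classes, which can build and parse them for you. A small sketch, assuming the hadoop-yarn-api and hadoop-mapreduce-client-core jars are on the classpath (the numbers are taken from the examples above):

import org.apache.hadoop.mapreduce.JobID;
import org.apache.hadoop.mapreduce.TaskAttemptID;
import org.apache.hadoop.yarn.api.records.ApplicationId;

public class YarnIdDemo {
    public static void main(String[] args) {
        // Build an applicationId from the cluster start time and a sequence number.
        ApplicationId appId = ApplicationId.newInstance(1498552288473L, 2742);
        System.out.println(appId);                       // application_1498552288473_2742

        // Parse a jobId string back into its parts.
        JobID jobId = JobID.forName("job_1498552288473_2742");
        System.out.println(jobId.getJtIdentifier());     // 1498552288473 (cluster start time)
        System.out.println(jobId.getId());               // 2742

        // Parse an attemptId; its taskId is embedded in it.
        TaskAttemptID attemptId = TaskAttemptID.forName("attempt_1498552288473_2742_m_000000_0");
        System.out.println(attemptId.getTaskID());       // task_1498552288473_2742_m_000000
        System.out.println(attemptId.getId());           // 0 (the attempt number)
    }
}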

3. Where YARN stores runtime logs, and the backup directory

Where are the runtime logs of a job on YARN actually stored? Have you ever wondered? I finally figured it out today. Taking a Flink on YARN job as an example (other job types are similar, though not necessarily identical), look at this excerpt from jobmanager.log:

 1 2019-08-05 09:51:55,826 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - --------------------------------------------------------------------------------
 2 2019-08-05 09:51:55,827 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  Starting YarnSessionClusterEntrypoint (Version: 1.7.1, Rev:<unknown>, Date:<unknown>)
 3 2019-08-05 09:51:55,827 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  OS current user: yarn
 4 2019-08-05 09:51:56,255 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  Current Hadoop/Kerberos user: worker
 5 2019-08-05 09:51:56,255 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  JVM: Java HotSpot(TM) 64-Bit Server VM - Oracle Corporation - 1.8/25.65-b01
 6 2019-08-05 09:51:56,255 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  Maximum heap size: 1388 MiBytes
 7 2019-08-05 09:51:56,255 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  JAVA_HOME: /usr/local/jdk/
 8 2019-08-05 09:51:56,258 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  Hadoop version: 2.6.0-cdh5.5.0
 9 2019-08-05 09:51:56,258 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  JVM Options:
10 2019-08-05 09:51:56,258 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Xmx1448m
11 2019-08-05 09:51:56,258 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Dlog.file=/var/log/hadoop/container/application_1564969910131_0252/container_e05_1564969910131_0252_01_000001/jobmanager.log
12 2019-08-05 09:51:56,258 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Dlogback.configurationFile=file:logback.xml
13 2019-08-05 09:51:56,258 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Dlog4j.configuration=file:log4j.properties
14 2019-08-05 09:51:56,258 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  Program Arguments: (none)
15 2019-08-05 09:51:56,259 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  Classpath: lib/flink-python_2.11-1.7.1.jar:lib/flink-shaded-hadoop2-uber-1.7.1.jar:lib/log4j-1.2.17.jar:lib/slf4j-log4j12-1.7.15.jar:log4j.properties:logback.xml:flink.jar:flink-conf.yaml::/etc/hadoop/conf.cloudera.yarn4:/run/cloudera-scm-agent/process/47188-yarn-NODEMANAGER:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/hadoop-annotations.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/hadoop-auth.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/hadoop-aws.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/hadoop-common-tests.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/hadoop-common.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/hadoop-nfs.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/hadoop-nfs-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/hadoop-common-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/hadoop-common-2.6.0-cdh5.5.0-tests.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/hadoop-aws-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/hadoop-auth-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/hadoop-annotations-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-format.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-format-sources.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-format-javadoc.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-tools.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-thrift.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-test-hadoop2.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-scrooge_2.10.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-scala_2.10.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-protobuf.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-pig.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-pig-bundle.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-jackson.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-hadoop.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-hadoop-bundle.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-generator.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-encoding.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-common.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-column.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-cascading.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-avro.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/hue-plugins-3.9.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-httpclient-3.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-math3-3.1.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-digester-1.8.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/curator-recipes-2.7.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-net-3.1.jar:/opt/cloudera/parcels/
CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-configuration-1.6.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/httpclient-4.2.5.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/gson-2.2.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-el-1.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jackson-xc-1.8.8.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/httpcore-4.2.5.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/curator-framework-2.7.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jersey-core-1.9.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jaxb-impl-2.2.3-1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jasper-compiler-5.5.23.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/htrace-core4-4.0.1-incubating.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-lang-2.6.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jsp-api-2.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jetty-util-6.1.26.cloudera.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jets3t-0.9.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jaxb-api-2.2.2.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jackson-mapper-asl-1.8.8.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/hamcrest-core-1.3.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-logging-1.1.3.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/apacheds-i18n-2.0.0-M15.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/zookeeper.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/avro.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/stax-api-1.0-2.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/protobuf-java-2.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/mockito-all-1.8.5.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/log4j-1.2.17.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jsr305-3.0.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jetty-6.1.26.cloudera.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jersey-server-1.9.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/java-xmlbuilder-0.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jackson-jaxrs-1.8.8.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/guava-11.0.2.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-io-2.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/xmlenc-0.52.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/snappy-java-1.0.4.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/servlet-api-2.5.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/paranamer-2.3.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-compress-1.4.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-collections-3.2.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-codec-1.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-cli-1.2.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-beanutils-core-1.8.0.ja
r:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-beanutils-1.7.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/aws-java-sdk-1.7.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/asm-3.2.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/api-util-1.0.0-M20.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/api-asn1-api-1.0.0-M20.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/apacheds-kerberos-codec-2.0.0-M15.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/activation-1.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/slf4j-log4j12.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/xz-1.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/slf4j-api-1.7.5.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/netty-3.6.2.Final.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/logredactor-1.0.3.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/junit-4.11.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jsch-0.1.42.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jettison-1.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jersey-json-1.9.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jasper-runtime-5.5.23.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jackson-core-asl-1.8.8.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/curator-client-2.7.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/hadoop-hdfs-nfs.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/hadoop-hdfs-tests.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/hadoop-hdfs.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/hadoop-hdfs-nfs-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/hadoop-hdfs-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/hadoop-hdfs-2.6.0-cdh5.5.0-tests.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/xmlenc-0.52.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/xml-apis-1.3.04.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/xercesImpl-2.9.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/servlet-api-2.5.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/protobuf-java-2.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/netty-3.6.2.Final.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/log4j-1.2.17.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/leveldbjni-all-1.8.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/jsr305-3.0.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/jsp-api-2.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/jetty-util-6.1.26.cloudera.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/jetty-6.1.26.cloudera.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/jersey-server-1.9.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/jersey-core-1.9.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/jasper-runtime-5.5.23.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoo
p-hdfs/lib/jackson-mapper-asl-1.8.8.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/jackson-core-asl-1.8.8.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/htrace-core4-4.0.1-incubating.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/guava-11.0.2.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/commons-logging-1.1.3.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/commons-lang-2.6.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/commons-io-2.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/commons-el-1.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/commons-daemon-1.0.13.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/commons-codec-1.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/commons-cli-1.2.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/asm-3.2.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-api.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-applications-unmanaged-am-launcher.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-client.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-common.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-registry.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-server-applicationhistoryservice.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-server-common.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-server-nodemanager.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-server-resourcemanager.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-server-tests.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-server-web-proxy.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-server-web-proxy-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-server-tests-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-server-resourcemanager-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-server-nodemanager-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-server-common-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-server-applicationhistoryservice-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-registry-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-common-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-client-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-api-2.6.0-cdh5.5.0.jar:/o
pt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/spark-yarn-shuffle.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/xz-1.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/servlet-api-2.5.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/protobuf-java-2.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/log4j-1.2.17.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/leveldbjni-all-1.8.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jsr305-3.0.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jline-2.11.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jetty-util-6.1.26.cloudera.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jetty-6.1.26.cloudera.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jettison-1.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jersey-server-1.9.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jersey-json-1.9.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jersey-guice-1.9.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jersey-core-1.9.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jersey-client-1.9.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jaxb-impl-2.2.3-1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jaxb-api-2.2.2.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/javax.inject-1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jackson-xc-1.8.8.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jackson-mapper-asl-1.8.8.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jackson-jaxrs-1.8.8.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jackson-core-asl-1.8.8.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/guice-servlet-3.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/guice-3.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/guava-11.0.2.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/commons-logging-1.1.3.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/commons-lang-2.6.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/commons-io-2.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/commons-httpclient-3.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/commons-compress-1.4.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/commons-collections-3.2.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/commons-codec-1.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/commons-cli-1.2.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/asm-3.2.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/aopalliance-1.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/activation-1.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/zookeeper.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/spark-1.5.0-cdh5.5.0-yarn-shuffle.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn
/lib/stax-api-1.0-2.jar
16 2019-08-05 09:51:56,260 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - --------------------------------------------------------------------------------

See it? The -Dlog.file path tells you where the log lives. Go to the corresponding directory on the machine running the JobManager and you can see the folders that hold the logs:

total 16K
drwx--x---  3 yarn yarn   63 Jul 25 17:58 application_1564048714696_0300
drwx--x---  3 yarn yarn   63 Jul 28 13:39 application_1564110901389_0235
drwx--x---  4 yarn yarn  116 Aug  5 09:52 application_1564969910131_0252
drwx--x---  3 yarn yarn   63 Aug  5 20:35 application_1564998210570_0394

Those are the runtime logs. But what if the job has already stopped? YARN usually keeps a backup of them (log aggregation). For example, on the Cloudera cluster I use, there is a configuration option for this.

Then run

hadoop fs -ls /tmp/logs/<username>/logs

and you will find that all the logs are there.
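
The same check can be done from Java through the FileSystem API. A small sketch, assuming the default log-aggregation layout of /tmp/logs/<user>/logs (controlled by yarn.nodemanager.remote-app-log-dir and its suffix) and a user name passed as the first argument:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListAggregatedLogs {
    public static void main(String[] args) throws Exception {
        String user = args[0];                      // e.g. "worker"
        Configuration conf = new Configuration();   // picks up core-site.xml from the classpath
        FileSystem fs = FileSystem.get(conf);

        // One sub-directory per applicationId under the aggregated-log directory.
        Path aggregated = new Path("/tmp/logs/" + user + "/logs");
        for (FileStatus status : fs.listStatus(aggregated)) {
            System.out.println(status.getPath());
        }
    }
}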

4. The main Hadoop configuration files

core-site.xml: Hadoop-wide configuration: the HDFS URI, the common directory for temporary data, and the ZooKeeper quorum addresses and ports.

<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/data/hadoop/tmp</value>
        <!-- Parent directory for other temporary directories -->
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop-alone:9000</value>
        <!--
            hdfs://host:port/
            The name of the default file system. Usually the URI of the NameNode,
            including host and port.
        -->
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>4096</value>
        <!--
            The buffer size used in sequence files. It should be a multiple of the
            hardware page size (4096 on Intel x86) and determines how much data is
            buffered during read and write operations (in bytes).
        -->
    </property>
    <!-- Addresses and ports of the ZooKeeper quorum. Note: use an odd number of nodes, and no fewer than three. -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hadoop1:2181,hadoop2:2181,hadoop3:2181,hadoop4:2181,hadoop5:2181</value>
    </property>
</configuration>
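
As a quick sanity check, here is a client-side sketch that reads these values back. new Configuration() loads core-default.xml plus any core-site.xml found on the classpath, so the printed values should match the file above (assuming it is on the classpath):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class CoreSiteCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();                    // loads core-site.xml
        System.out.println(conf.get("fs.defaultFS"));                // hdfs://hadoop-alone:9000
        System.out.println(conf.get("hadoop.tmp.dir"));              // /usr/local/data/hadoop/tmp
        System.out.println(conf.getInt("io.file.buffer.size", -1));  // 4096

        FileSystem fs = FileSystem.get(conf);                        // resolved from fs.defaultFS
        System.out.println(fs.getUri());
    }
}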

hdfs-site.xml: HDFS file system properties: dfs.replication (the number of replicas per block), dfs.datanode.data.dir (the directories where DataNodes store blocks), and dfs.namenode.name.dir (the directories where the NameNode stores its metadata).

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
        <!-- Number of replicas for each block stored by the DataNodes; the default is 3. It should not exceed the number of DataNodes. -->
    </property>
    <property>
        <name>dfs.blocksize</name>
        <value>268435456</value>
        <!-- HDFS block size: 256 MB here for a large file system; the default is 128 MB. -->
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file://${hadoop.tmp.dir}/dfs/name</value>
        <!--
            Directory where the NameNode stores the name table (fsimage). If this is a
            comma-separated list of directories, the name table is replicated in all of
            them for redundancy.
         -->
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file://${hadoop.tmp.dir}/dfs/data</value>
        <!--
            Directories where the DataNode stores its blocks. If this is a comma-separated
            list of directories, data is stored in all of the named directories, typically
            on different devices.
        -->
    </property>
    <property>
        <name>dfs.namenode.http-address</name>
        <value>0.0.0.0:50070</value>
        <!-- Address and port of the NameNode web UI. -->
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>0.0.0.0:50090</value>
        <!-- Address and port of the secondary NameNode HTTP server. -->
    </property>
    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
        <!-- When true, permission checking in HDFS is enabled; when false, permission checking is turned off, without affecting other HDFS functionality. -->
    </property>
    <property>
        <name>dfs.nameservices</name>
        <value>hadoop-cluster1</value>
        <!-- Logical name (nameservice ID) of the HDFS cluster. -->
    </property>
    <property>
        <name>dfs.namenode.handler.count</name>
        <value>100</value>
        <!-- Number of NameNode RPC server threads that handle client requests; if dfs.namenode.servicerpc-address is not configured, these threads also handle requests from all other nodes. -->
    </property>
</configuration>
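
A similar sketch for hdfs-site.xml, showing the block size and replication factor as a client actually sees them (assuming the configuration files are on the classpath and the cluster is reachable):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsSiteCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();        // loads hdfs-site.xml as well
        FileSystem fs = FileSystem.get(conf);

        Path root = new Path("/");
        System.out.println("default block size : " + fs.getDefaultBlockSize(root));   // e.g. 268435456
        System.out.println("default replication: " + fs.getDefaultReplication(root)); // e.g. 3
    }
}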

yarn-site.xml: YARN settings for the ResourceManager, NodeManagers, and so on.

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>hadoop-alone:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>hadoop-alone:8025</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>hadoop-alone:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>hadoop-alone:8050</value>
    </property>
</configuration>
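
To verify that the ResourceManager addresses above are picked up, here is a small sketch using the YarnClient API to list the applications currently known to the ResourceManager (assuming the YARN client jars and yarn-site.xml are on the classpath):

import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class YarnSiteCheck {
    public static void main(String[] args) throws Exception {
        YarnConfiguration conf = new YarnConfiguration();       // loads yarn-site.xml
        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(conf);
        yarnClient.start();
        try {
            for (ApplicationReport report : yarnClient.getApplications()) {
                System.out.println(report.getApplicationId() + "\t" + report.getName());
            }
        } finally {
            yarnClient.stop();
        }
    }
}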

mapred-site.xml: specifies YARN as the MapReduce execution framework.

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
        <!-- The execution framework is set to Hadoop YARN. -->
    </property>
</configuration>

hadoop-env.sh: sets the JDK installation path (JAVA_HOME).

5. Copying HDFS files in parallel with distcp

hadoop distcp -bandwidth 50 hdfs://namenode:xxxx/source_path hdfs://namenode:xxxx/target_path

distcp runs as an ordinary MapReduce job, so the copy is carried out by multiple map tasks in parallel; -bandwidth caps each map task at roughly 50 MB/s.
