Big Data: YARN Pseudo-Distributed Deployment and a MapReduce Example

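1. Software environment

OS: RHEL6
Software: jdk-8u45, hadoop-2.8.1.tar.gz, ssh

IP address     Role   Hostname
xx.xx.xx.xx    NN     hadoop01
xx.xx.xx.xx    DN     hadoop02
xx.xx.xx.xx    DN     hadoop03
xx.xx.xx.xx    DN     hadoop04
xx.xx.xx.xx    DN     hadoop05

(NN = NameNode, DN = DataNode)

This pseudo-distributed deployment uses only the host hadoop01. For software installation, see the earlier article "Pseudo-Distributed Deployment: The Ultimate Guide".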

2. Configure YARN and MapReduce

[[email protected] hadoop]$ cp mapred-site.xml.template mapred-site.xml

Configure MapReduce to run on YARN (mapred-site.xml)
[[email protected] hadoop]$ vi mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

Configure YARN (yarn-site.xml)
[[email protected] hadoop]$ vi yarn-site.xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
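Before starting YARN it can help to confirm that both files contain the intended properties. The following is a minimal sketch that parses Hadoop-style configuration XML with Python's standard library. The XML is embedded as strings so the example runs anywhere; the helper name read_properties is hypothetical and is not part of Hadoop.

```python
import xml.etree.ElementTree as ET

# Copies of the two configuration files from this section, embedded as
# strings so that the sketch does not depend on a Hadoop installation.
MAPRED_SITE = """<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>"""

YARN_SITE = """<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>"""


def read_properties(xml_text: str) -> dict:
    """Return a {name: value} dict for every <property> in the XML text."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.iter("property")
    }


if __name__ == "__main__":
    print(read_properties(MAPRED_SITE))
    print(read_properties(YARN_SITE))
```

To check the real files instead, read their contents from the Hadoop configuration directory and pass the text to read_properties.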

3. Submit the test jar to compute pi

Job ID naming format: job_<Unix time in milliseconds>_<number>, for example job_1524804813835_0001.
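The job ID format can be taken apart with a few lines of Python. This is a minimal sketch; the function names parse_job_id and to_application_id are hypothetical. It assumes the middle field is a Unix timestamp in milliseconds (in YARN this is the time the ResourceManager was started) and that the matching application ID only differs in its prefix, as the job log in this article shows.

```python
from datetime import datetime, timezone


def parse_job_id(job_id: str):
    """Split 'job_<timestamp ms>_<number>' into (UTC datetime, number)."""
    prefix, timestamp_ms, sequence = job_id.split("_")
    if prefix != "job":
        raise ValueError(f"not a job ID: {job_id!r}")
    # Integer division drops the milliseconds and avoids float rounding.
    started = datetime.fromtimestamp(int(timestamp_ms) // 1000, tz=timezone.utc)
    return started, int(sequence)


def to_application_id(job_id: str) -> str:
    """Return the YARN application ID that corresponds to a job ID."""
    if not job_id.startswith("job_"):
        raise ValueError(f"not a job ID: {job_id!r}")
    return "application_" + job_id[len("job_"):]


if __name__ == "__main__":
    started, sequence = parse_job_id("job_1524804813835_0001")
    print(started.isoformat(), sequence)
    print(to_application_id("job_1524804813835_0001"))
```

For job_1524804813835_0001 the timestamp decodes to 2018-04-27 04:53:33 UTC and the number is 1, meaning this is the first job submitted since that timestamp was taken.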

[[email protected] sbin]$ ./start-yarn.sh

[[email protected] hadoop]$ find ./* -name "*examples*"
./lib/native/examples
./share/hadoop/mapreduce/sources/hadoop-mapreduce-examples-2.8.1-sources.jar
./share/hadoop/mapreduce/sources/hadoop-mapreduce-examples-2.8.1-test-sources.jar
./share/hadoop/mapreduce/lib-examples
./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.1.jar
./share/doc/hadoop/hadoop-auth-examples
./share/doc/hadoop/hadoop-mapreduce-examples
./share/doc/hadoop/api/org/apache/hadoop/examples
./share/doc/hadoop/api/org/apache/hadoop/security/authentication/examples
[[email protected] hadoop]$ hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.1.jar pi 5 10
Number of Maps  = 5
Samples per Map = 10
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Starting Job
18/04/27 12:58:49 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
18/04/27 12:58:50 INFO input.FileInputFormat: Total input files to process : 5
18/04/27 12:58:50 INFO mapreduce.JobSubmitter: number of splits:5
18/04/27 12:58:50 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1524804813835_0001
18/04/27 12:58:51 INFO impl.YarnClientImpl: Submitted application application_1524804813835_0001
18/04/27 12:58:51 INFO mapreduce.Job: The url to track the job: http://hadoop01:8088/proxy/application_1524804813835_0001/
18/04/27 12:58:51 INFO mapreduce.Job: Running job: job_1524804813835_0001
18/04/27 12:59:03 INFO mapreduce.Job: Job job_1524804813835_0001 running in uber mode : false
18/04/27 12:59:03 INFO mapreduce.Job:  map 0% reduce 0%
18/04/27 12:59:18 INFO mapreduce.Job:  map 100% reduce 0%
18/04/27 12:59:25 INFO mapreduce.Job:  map 100% reduce 100%
18/04/27 12:59:26 INFO mapreduce.Job: Job job_1524804813835_0001 completed successfully
18/04/27 12:59:27 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=116
        FILE: Number of bytes written=819783
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=1350
        HDFS: Number of bytes written=215
        HDFS: Number of read operations=23
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=3
    Job Counters
        Launched map tasks=5
        Launched reduce tasks=1
        Data-local map tasks=5
        Total time spent by all maps in occupied slots (ms)=64938
        Total time spent by all reduces in occupied slots (ms)=4704
        Total time spent by all map tasks (ms)=64938
        Total time spent by all reduce tasks (ms)=4704
        Total vcore-milliseconds taken by all map tasks=64938
        Total vcore-milliseconds taken by all reduce tasks=4704
        Total megabyte-milliseconds taken by all map tasks=66496512
        Total megabyte-milliseconds taken by all reduce tasks=4816896
    Map-Reduce Framework
        Map input records=5
        Map output records=10
        Map output bytes=90
        Map output materialized bytes=140
        Input split bytes=760
        Combine input records=0
        Combine output records=0
        Reduce input groups=2
        Reduce shuffle bytes=140
        Reduce input records=10
        Reduce output records=0
        Spilled Records=20
        Shuffled Maps =5
        Failed Shuffles=0
        Merged Map outputs=5
        GC time elapsed (ms)=1428
        CPU time spent (ms)=5740
        Physical memory (bytes) snapshot=1536856064
        Virtual memory (bytes) snapshot=12578734080
        Total committed heap usage (bytes)=1152385024
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=590
    File Output Format Counters
        Bytes Written=97
Job Finished in 37.717 seconds
Estimated value of Pi is 3.28000000000000000000
[[email protected] hadoop]$
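The pi example that ships with Hadoop uses a quasi-Monte Carlo method: sample points are placed in a unit square with a Halton sequence, and pi is estimated as 4 times the fraction of points that fall inside the inscribed circle. With 5 maps of 10 samples each there are only 50 points, so an estimate of 3.28 corresponds to 41 points inside the circle (4 × 41 / 50). The sketch below reproduces the idea in plain Python on one machine. The function names halton and estimate_pi are illustrative, and the point ordering is not guaranteed to match Hadoop's, so the result for a given sample count may differ from the job output.

```python
import math


def halton(index: int, base: int) -> float:
    """Return element number `index` (1-based) of the Halton sequence."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def estimate_pi(num_maps: int, samples_per_map: int) -> float:
    """Estimate pi from num_maps * samples_per_map quasi-random points."""
    total = num_maps * samples_per_map
    inside = 0
    for i in range(1, total + 1):
        # Shift the unit square so that the circle of radius 0.5 is
        # centered on the origin.
        x = halton(i, 2) - 0.5
        y = halton(i, 3) - 0.5
        if x * x + y * y <= 0.25:
            inside += 1
    return 4.0 * inside / total


if __name__ == "__main__":
    for maps, samples in [(5, 10), (100, 100)]:
        estimate = estimate_pi(maps, samples)
        print(maps, samples, estimate, abs(estimate - math.pi))
```

Raising either argument of the pi job increases the total number of samples, which is what brings the estimate closer to the true value.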
