Running WordCount on Hadoop 2.6
1. Start Hadoop
[hadoop-2.6.0]$ ./sbin/start-all.sh
[hadoop-2.6.0]$ jps
21444 ResourceManager
21301 SecondaryNameNode
22072 Jps
21117 NameNode
On a worker node, jps should show the DataNode and NodeManager processes:
[current]$ jps
5505 NodeManager
5397 DataNode
6102 Jps
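The check above can be automated. The following is a minimal sketch, not part of the original walkthrough: it parses jps output and reports which expected daemons are missing. The role names "master" and "worker" and the helper names are assumptions made for illustration; the expected daemon sets are taken from the two jps listings above.

```python
# Daemons expected per node type, taken from the jps listings in this post.
# The role names "master" and "worker" are labels chosen for this sketch.
EXPECTED = {
    "master": {"NameNode", "SecondaryNameNode", "ResourceManager"},
    "worker": {"DataNode", "NodeManager"},
}


def running_daemons(jps_output):
    """Return the set of process names in jps output, ignoring jps itself."""
    names = set()
    for line in jps_output.splitlines():
        parts = line.split()
        # Each jps line has the form "<pid> <name>".
        if len(parts) == 2 and parts[0].isdigit() and parts[1] != "Jps":
            names.add(parts[1])
    return names


def missing_daemons(jps_output, role):
    """Return the expected daemons for `role` that jps did not list."""
    return EXPECTED[role] - running_daemons(jps_output)


master_out = "21444 ResourceManager\n21301 SecondaryNameNode\n22072 Jps\n21117 NameNode"
worker_out = "5505 NodeManager\n5397 DataNode\n6102 Jps"

print(missing_daemons(master_out, "master"))
print(missing_daemons(worker_out, "worker"))
```

An empty set for both nodes means every daemon listed in this post is running. A non-empty set names the daemons to look for in the Hadoop logs.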
2. Create a local directory named file (here in the home directory; the location does not matter, as long as the files are later uploaded to /input on HDFS)
[~]$ mkdir file
[~]$ cd file
Create two files in the file directory and write some content into them:
[file]$ echo "Hello World" > file1.txt
[file]$ echo "Hello World" > file2.txt
[file]$ ls
file1.txt file2.txt
[file]$ cat file1.txt
Hello World
[file]$ cat file2.txt
Hello World
3. Create the input directory /input on HDFS
[hadoop-2.6.0]$ bin/hadoop fs -mkdir /input
[hadoop-2.6.0]$ hadoop fs -ls /
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2016-02-28 15:51 /input
4. Upload the local files to /input on HDFS
[hadoop-2.6.0]$ bin/hadoop fs -put ~/file/file* /input
[hadoop-2.6.0]$ bin/hadoop fs -ls /input
Found 2 items
-rw-r--r-- 2 hadoop supergroup 12 2016-02-28 15:55 /input/file1.txt
-rw-r--r-- 2 hadoop supergroup 12 2016-02-28 15:55 /input/file2.txt
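In this listing the column after the permissions is the replication factor (2) and the number before the date is the file size in bytes. As a small sanity check, the sketch below shows why each file is 12 bytes: echo writes the 11 characters of "Hello World" plus a trailing newline. The helper name echo_file_size is made up for this example.

```python
def echo_file_size(text):
    """Bytes written by `echo "<text>" > file`: the text plus one newline."""
    return len((text + "\n").encode("utf-8"))


size = echo_file_size("Hello World")
print(size)      # 12, the size shown for each file in the listing
print(2 * size)  # 24, the two input files together
```

The combined size of 24 bytes is also what the job later reports as Bytes Read under File Input Format Counters.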
5. Run the WordCount program (using the examples jar that ships with Hadoop; the output directory /output/wordcount1 must not exist yet)
[hadoop-2.6.0]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /input/ /output/wordcount1
16/02/28 15:58:15 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/02/28 15:58:16 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.101.230:8032
16/02/28 15:58:17 INFO input.FileInputFormat: Total input paths to process : 2
16/02/28 15:58:17 INFO mapreduce.JobSubmitter: number of splits:2
16/02/28 15:58:18 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1456645810248_0001
16/02/28 15:58:19 INFO impl.YarnClientImpl: Submitted application application_1456645810248_0001
16/02/28 15:58:19 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1456645810248_0001/
16/02/28 15:58:19 INFO mapreduce.Job: Running job: job_1456645810248_0001
16/02/28 15:58:32 INFO mapreduce.Job: Job job_1456645810248_0001 running in uber mode : false
16/02/28 15:58:32 INFO mapreduce.Job: map 0% reduce 0%
16/02/28 15:58:43 INFO mapreduce.Job: map 100% reduce 0%
16/02/28 15:58:56 INFO mapreduce.Job: map 100% reduce 100%
16/02/28 15:58:56 INFO mapreduce.Job: Job job_1456645810248_0001 completed successfully
16/02/28 15:58:56 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=54
FILE: Number of bytes written=317807
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=222
HDFS: Number of bytes written=16
HDFS: Number of read operations=9
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=2
Launched reduce tasks=1
Data-local map tasks=2
Total time spent by all maps in occupied slots (ms)=19118
Total time spent by all reduces in occupied slots (ms)=8889
Total time spent by all map tasks (ms)=19118
Total time spent by all reduce tasks (ms)=8889
Total vcore-seconds taken by all map tasks=19118
Total vcore-seconds taken by all reduce tasks=8889
Total megabyte-seconds taken by all map tasks=19576832
Total megabyte-seconds taken by all reduce tasks=9102336
Map-Reduce Framework
Map input records=2
Map output records=4
Map output bytes=40
Map output materialized bytes=60
Input split bytes=198
Combine input records=4
Combine output records=4
Reduce input groups=2
Reduce shuffle bytes=60
Reduce input records=4
Reduce output records=2
Spilled Records=8
Shuffled Maps =2
Failed Shuffles=0
Merged Map outputs=2
GC time elapsed (ms)=394
CPU time spent (ms)=3450
Physical memory (bytes) snapshot=368005120
Virtual memory (bytes) snapshot=959819776
Total committed heap usage (bytes)=247578624
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=24
File Output Format Counters
Bytes Written=16
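Several of the counters above can be reproduced by hand. The following is a local, pure-Python sketch of the map, combine, and reduce steps for the two input files. It only imitates the behaviour of the bundled WordCount example (split on whitespace, sum the counts) and does not use any Hadoop API; all function names are made up for this illustration.

```python
from collections import Counter


def map_phase(text):
    """Emit one (word, 1) pair per whitespace-separated token."""
    return [(word, 1) for word in text.split()]


def sum_by_word(pairs):
    """Sum the counts per word; used for both the combine and reduce steps."""
    totals = Counter()
    for word, n in pairs:
        totals[word] += n
    return sorted(totals.items())


# One map task per input file, as in the job above (number of splits: 2).
files = {"file1.txt": "Hello World\n", "file2.txt": "Hello World\n"}

map_outputs = [map_phase(text) for text in files.values()]
combined = [sum_by_word(pairs) for pairs in map_outputs]
reduce_input = [pair for pairs in combined for pair in pairs]
result = sum_by_word(reduce_input)

# Output file content: one "word<TAB>count" line per word.
output_text = "".join(f"{word}\t{n}\n" for word, n in result)

print("Map output records     =", sum(len(p) for p in map_outputs))
print("Combine output records =", sum(len(p) for p in combined))
print("Reduce input records   =", len(reduce_input))
print("Reduce output records  =", len(result))
print("Bytes Written          =", len(output_text.encode("utf-8")))
```

Each file contains each word only once, so the combiner has nothing to merge and Combine output records stays at 4. The reducer then collapses the four records into two groups, and the two output lines "Hello<TAB>2" and "World<TAB>2" account for the 16 bytes written.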
6. Check the output: the count succeeded
[hadoop-2.6.0]$ bin/hdfs dfs -cat /output/wordcount1/*
16/02/28 16:00:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Hello 2
World 2
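To use the result in a script, the text printed by the cat command can be turned into a dictionary. This is a minimal sketch with a hypothetical helper name. It assumes each result line ends in a whitespace-separated integer count (WordCount's default text output separates word and count with a tab) and it skips lines such as the NativeCodeLoader warning, whose last token is not a number. A log line that happens to end in a number would be misread, so filter those out first if they can occur.

```python
def parse_wordcount_output(text):
    """Parse "word<whitespace>count" lines into a dict, skipping other lines."""
    counts = {}
    for line in text.splitlines():
        parts = line.rsplit(None, 1)  # split off the last whitespace-separated token
        if len(parts) == 2 and parts[1].isdecimal():
            counts[parts[0]] = int(parts[1])
    return counts


sample = (
    "16/02/28 16:00:07 WARN util.NativeCodeLoader: Unable to load "
    "native-hadoop library for your platform... using builtin-java "
    "classes where applicable\n"
    "Hello\t2\n"
    "World\t2\n"
)

print(parse_wordcount_output(sample))
```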
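The WordCount job and its result can also be viewed in the web UI; the job log above prints the tracking URL under http://master:8088/.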
This article comes from the blog "梅花香自苦寒来!"; please keep this source link when reposting: http://daixuan.blog.51cto.com/5426657/1745781