Getting Started with MapReduce Programming and the HDFS Java API

Posted by xingweikun


Creating a MapReduce Project with Eclipse

Setting up the environment

For the setup I recommend this blog post:
Eclipse连接Hadoop集群(详细版) (Connecting Eclipse to a Hadoop Cluster, detailed version)
Once everything is configured, mine looks like this (screenshot not reproduced here).

Keep the virtual machine running, and make sure the target HDFS directory already exists (an HDFS path, not a local directory inside the VM).
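
If the directory does not exist yet, it can be created from the command line (the path below is the one used later in this post):

hdfs dfs -mkdir -p /user/dfstest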

Creating a new MapReduce project

File -> New -> Project
Select Map/Reduce Project

Click Finish directly.
Create a Hello class to check that the environment works (see the sketch below).
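
Something like this is enough (a minimal sketch; the class name, package, and printed message are arbitrary). If it compiles and runs, the Hadoop libraries are on the project's build path:

package FileTest;

import org.apache.hadoop.conf.Configuration;

public class Hello {

	public static void main(String[] args) {
		// Instantiating a Hadoop class confirms the Hadoop JARs are on the classpath
		Configuration conf = new Configuration();
		System.out.println("Hello Hadoop: " + conf);
	}
}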

The HDFS Distributed File System

HDFS Java API: Reading a File

The target file is /user/dfstest/hello_hadoop.txt; its content is hello hadoop.
package FileTest;

import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class FileSystemCat {

	public static void main(String[] args) throws IOException {
		Configuration config = new Configuration();
		URI uri = URI.create("hdfs://master:8020/user/dfstest/hello_hadoop.txt");
		FileSystem fs = FileSystem.get(uri, config);
		InputStream in = null;
		try {
			in = fs.open(new Path(uri));
			// Copy the file to stdout in 2048-byte chunks; false = leave the streams open
			IOUtils.copyBytes(in, System.out, 2048, false);
		} finally {
			// Always close the input stream, even if the copy fails
			IOUtils.closeStream(in);
		}
	}
}

If you see an error like this when you run the program (the original post shows a screenshot; on newer JDKs it is typically an illegal reflective access error), add the JVM arguments below.

If you are using Eclipse:
Right-click the program -> Run As -> Run Configurations -> Arguments -> VM arguments. Paste the line below into the box, then Apply -> Run.

--illegal-access=deny --add-opens java.base/java.lang=ALL-UNNAMED


If you are using IDEA:
Run -> Edit Configurations -> JVM options. If there is no JVM options field (I just checked; mine did not have one), it can be added from the Modify options menu in the same dialog.

--illegal-access=deny --add-opens java.base/java.lang=ALL-UNNAMED

Run the program again. Much easier on the eyes: the red error output is gone.

HDFS Java API: Uploading a File

package FileTest;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FileSystemUpload {

	public static void main(String[] args) throws IOException {
		Configuration conf = new Configuration();
		conf.set("fs.defaultFS", "hdfs://master:8020/");
		FileSystem fs = FileSystem.get(conf);
		// Local path of the file to upload; here I upload from Windows 10 to a CentOS 7 VM.
		// Adjust the path format for your operating system.
		Path localPath = new Path("C:\\Users\\dell\\Desktop\\email_log.txt");
		// Destination path in HDFS
		Path hdfsPath = new Path("/user/dfstest/ysc.txt");
		System.out.println("Upload started");
		fs.copyFromLocalFile(localPath, hdfsPath);
		System.out.println("Upload finished");
	}
}

If this fails, the cause may be insufficient permissions. Loosen the permissions on the target directory:

hdfs dfs -chmod 777 /user/dfstest
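
Alternatively (a sketch, not from the original post): instead of opening the directory to everyone, you can connect as the user that owns it. FileSystem.get has an overload that takes a user name; the class name and the "root" user below are assumptions, so substitute the actual owner of /user/dfstest:

package FileTest;

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class FileSystemAsUser {

	public static void main(String[] args) throws Exception {
		// Connect as an explicit HDFS user rather than the local OS user.
		// This overload also throws InterruptedException, hence `throws Exception`.
		FileSystem fs = FileSystem.get(
				URI.create("hdfs://master:8020/"), new Configuration(), "root");
		// The home directory reflects the user we connected as
		System.out.println("Connected, home directory: " + fs.getHomeDirectory());
		fs.close();
	}
}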

HDFS Java API: Deleting a File

package FileTest;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FileSystemDelete {

	public static void main(String[] args) throws IOException {
		Configuration conf = new Configuration();
		conf.set("fs.defaultFS", "hdfs://master:8020/");
		FileSystem fs = FileSystem.get(conf);
		// Path to delete; the second argument of delete() enables recursion,
		// which only matters when the path is a directory
		Path path = new Path("/user/dfstest/ysc.txt");
		System.out.println("Delete started");
		fs.delete(path, true);
		System.out.println("Delete finished");
		fs.close();
	}
}

HDFS Java API: Listing Directories and Files

Listing directories

package FileTest;

import java.io.IOException;
import org.apache.hadoop.fs.*;
import org.apache.hadoop.conf.*;

public class FileSystemListwjj {

	public static void main(String[] args) throws IOException {
		Configuration conf = new Configuration();
		conf.set("fs.defaultFS", "hdfs://master:8020/");
		FileSystem fs = FileSystem.get(conf);
		Path path = new Path("/movie");
		FileStatus[] fileStatuses = fs.listStatus(path);
		for (FileStatus file : fileStatuses) {
			// Keep only the entries that are directories
			if (file.isDirectory()) {
				System.out.println(file.getPath().toString());
			}
		}
		fs.close();
	}
}


Listing files

package FileTest;

import java.io.IOException;
import org.apache.hadoop.fs.*;
import org.apache.hadoop.conf.*;

public class FileSystemListwj {

	public static void main(String[] args) throws IOException {
		Configuration conf = new Configuration();
		conf.set("fs.defaultFS", "hdfs://master:8020/");
		FileSystem fs = FileSystem.get(conf);
		Path path = new Path("/movie");
		FileStatus[] fileStatuses = fs.listStatus(path);
		for (FileStatus file : fileStatuses) {
			// Keep only the entries that are regular files
			if (file.isFile()) {
				System.out.println(file.getPath().toString());
			}
		}
		fs.close();
	}
}

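Note that listStatus only inspects a single directory level. To walk a directory tree recursively, FileSystem also provides listFiles; a short sketch (the class name is mine, and /movie is reused from above):

package FileTest;

import java.io.IOException;
import org.apache.hadoop.fs.*;
import org.apache.hadoop.conf.*;

public class FileSystemListRecursive {

	public static void main(String[] args) throws IOException {
		Configuration conf = new Configuration();
		conf.set("fs.defaultFS", "hdfs://master:8020/");
		FileSystem fs = FileSystem.get(conf);
		// listFiles(path, true) walks the tree and returns every file under it
		RemoteIterator<LocatedFileStatus> it = fs.listFiles(new Path("/movie"), true);
		while (it.hasNext()) {
			System.out.println(it.next().getPath().toString());
		}
		fs.close();
	}
}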

HDFS Java API: Creating a Directory

package FileTest;

import java.io.IOException;
import org.apache.hadoop.fs.*;
import org.apache.hadoop.conf.*;

public class FileSystemMakedir {

	public static void main(String[] args) throws IOException {
		Configuration conf = new Configuration();
		conf.set("fs.defaultFS", "hdfs://master:8020");
		FileSystem fs = FileSystem.get(conf);
		// mkdirs creates any missing parent directories too, like `mkdir -p`
		Path path = new Path("/user/root/loginmessage");
		fs.mkdirs(path);
		fs.close();
	}
}

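mkdirs returns a boolean indicating success, and there is also an overload that sets the new directory's permission explicitly, which can save a chmod afterwards. A sketch (the class name and the 0755 mode are illustrative, not from the original post):

package FileTest;

import java.io.IOException;
import org.apache.hadoop.fs.*;
import org.apache.hadoop.fs.permission.FsPermission;
import org.apache.hadoop.conf.*;

public class FileSystemMakedirPerm {

	public static void main(String[] args) throws IOException {
		Configuration conf = new Configuration();
		conf.set("fs.defaultFS", "hdfs://master:8020");
		FileSystem fs = FileSystem.get(conf);
		// Create the directory with an explicit rwxr-xr-x permission
		// (the effective permission is still subject to the client-side umask)
		boolean created = fs.mkdirs(new Path("/user/root/loginmessage"),
				new FsPermission((short) 0755));
		System.out.println("Created: " + created);
		fs.close();
	}
}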


HDFS Java API: Downloading a File

package FileTest;

import java.io.IOException;
import org.apache.hadoop.fs.*;
import org.apache.hadoop.conf.*;

public class FileSystemGet {

	public static void main(String[] args) throws IOException {
		Configuration conf = new Configuration();
		conf.set("fs.defaultFS", "hdfs://master:8020");
		FileSystem fs = FileSystem.get(conf);
		Path fromPath = new Path("/user/dfstest/hello_hadoop.txt");
		// Download to the desktop
		Path toPath = new Path("C:\\Users\\dell\\Desktop");
		System.out.println("Download started: " + fromPath);
		// Arguments: false = do not delete the source; true = use the raw local
		// file system, which skips writing a .crc checksum file next to the copy
		fs.copyToLocalFile(false, fromPath, toPath, true);
		System.out.println(fromPath + " download finished");
	}
}


HDFS Java API: Writing a File

package FileTest;

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;

import org.apache.hadoop.fs.*;
import org.apache.hadoop.conf.*;

public class FileSystemWrite {

	public static void main(String[] args) throws IOException {
		Configuration conf = new Configuration();
		conf.set("fs.defaultFS", "hdfs://master:8020");
		FileSystem fs = FileSystem.get(conf);
		// Copy the contents of path into newPath, line by line
		Path path = new Path("/user/dfstest/hello_hadoop.txt");
		Path newPath = new Path("/user/dfstest/new_hello_hadoop.txt");
		// Remove any previous output so we start from a clean slate
		fs.delete(newPath, true);
		FSDataOutputStream os = fs.create(newPath);
		FSDataInputStream is = fs.open(path);
		BufferedReader br = new BufferedReader(new InputStreamReader(is, "utf-8"));
		BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(os, "utf-8"));
		String line = "";
		while ((line = br.readLine()) != null) {
			bw.write(line);
			bw.newLine();
		}
		bw.close();
		os.close();
		br.close();
		is.close();
		fs.close();
	}
}

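create() always starts a new file; to add to an existing one there is also FileSystem.append. A sketch (the class name is mine; appending must be supported by the cluster, which is the default on modern HDFS):

package FileTest;

import java.io.IOException;
import org.apache.hadoop.fs.*;
import org.apache.hadoop.conf.*;

public class FileSystemAppend {

	public static void main(String[] args) throws IOException {
		Configuration conf = new Configuration();
		conf.set("fs.defaultFS", "hdfs://master:8020");
		FileSystem fs = FileSystem.get(conf);
		// Open the existing file for appending and add one more line
		FSDataOutputStream os = fs.append(new Path("/user/dfstest/new_hello_hadoop.txt"));
		os.write("one more line\n".getBytes("utf-8"));
		os.close();
		fs.close();
	}
}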
