Remotely Managing a Hadoop Cluster's Distributed File System from an IDEA Project
Posted -starrysky-
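1. On Windows, configure the Hadoop environment variable (typically HADOOP_HOME, with winutils.exe and hadoop.dll placed in its bin directory). Download link for winutils.exe and hadoop.dll: https://share.weiyun.com/5eebZFr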
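Before running the demo, it can help to confirm that the Windows setup from step 1 is in place. The sketch below is not part of the original post: the class name `HadoopWinEnvCheck` and the helper `missingFiles` are hypothetical, and it assumes that both downloaded files were copied into the `bin` folder under the directory that `HADOOP_HOME` points to. It uses only the JDK, so it runs without any Hadoop dependency.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class HadoopWinEnvCheck {

    // Assumption: both files from the download link were placed in %HADOOP_HOME%\bin.
    static final String[] REQUIRED = {"winutils.exe", "hadoop.dll"};

    /** Returns the names of the required files that are missing from hadoopHome/bin. */
    static List<String> missingFiles(Path hadoopHome) {
        List<String> missing = new ArrayList<>();
        Path bin = hadoopHome.resolve("bin");
        for (String name : REQUIRED) {
            if (!Files.isRegularFile(bin.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Read the environment variable configured in step 1.
        String home = System.getenv("HADOOP_HOME");
        if (home == null || home.isEmpty()) {
            System.out.println("HADOOP_HOME is not set");
            return;
        }
        List<String> missing = missingFiles(Paths.get(home));
        if (missing.isEmpty()) {
            System.out.println("Found winutils.exe and hadoop.dll under " + home);
        } else {
            System.out.println("Missing from the bin folder of " + home + ": " + missing);
        }
    }
}
```

If the check reports a missing file, copy it into the `bin` folder and restart IDEA so that the updated environment is picked up.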
2. Demo:
(1) pom.xml:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>wang.june</groupId>
    <artifactId>HadoopDemo</artifactId>
    <version>1.0-SNAPSHOT</version>

    <dependencies>
        <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-common -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.9.2</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-hdfs -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>2.9.2</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>2.9.2</version>
        </dependency>
        <!-- JUnit 4 (provides org.junit.Before and org.junit.Test) -->
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.12</version>
        </dependency>
    </dependencies>
</project>
(2) HDFS_CRUD.java:
package wang.june;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;

import java.io.IOException;

public class HDFS_CRUD {

    FileSystem fileSystem = null;

    // Initialize the client object before each test
    @Before
    public void init() throws IOException {
        // Build a configuration object and set one parameter: the URI of the HDFS to access
        Configuration configuration = new Configuration();
        // Specify that HDFS is the file system in use
        configuration.set("fs.defaultFS", "hdfs://hadoop01:9000");
        // Set the client identity (the user that operations run as)
        System.setProperty("HADOOP_USER_NAME", "root");
        // Obtain the file system client object through FileSystem's static method
        fileSystem = FileSystem.get(configuration);
    }

    // Release the client after each test
    @After
    public void close() throws IOException {
        if (fileSystem != null) {
            fileSystem.close();
        }
    }

    // Upload a file to HDFS
    @Test
    public void testAddFileToHdfs() throws IOException {
        // Local path of the file to upload
        Path src = new Path("E:/user/picture/钟文/钟美女.jpg");
        // Target path on HDFS
        Path dst = new Path("/testJune");
        // Upload the file
        fileSystem.copyFromLocalFile(src, dst);
    }

    // Download a file from HDFS to the local machine
    @Test
    public void testDownloadFileToLocal() throws IOException {
        fileSystem.copyToLocalFile(new Path("/testJune"), new Path("D:/"));
    }

    // Directory operations
    @Test
    public void testMKdirAndDeleteAndRename() throws IOException {
        // Create directories
        fileSystem.mkdirs(new Path("/a/b/c"));
        fileSystem.mkdirs(new Path("/a2/b2/c2"));
        // Rename a file or directory
        fileSystem.rename(new Path("/a"), new Path("/a3"));
        // Delete a directory; for a non-empty directory the second argument must be true
        fileSystem.delete(new Path("/a2"), true);
    }

    // List information about the files in a directory
    @Test
    public void testListFiles() throws IOException {
        // Get an iterator over all files under "/", recursively
        RemoteIterator<LocatedFileStatus> listFiles = fileSystem.listFiles(new Path("/"), true);
        while (listFiles.hasNext()) {
            LocatedFileStatus fileStatus = listFiles.next();
            // Print the file name
            System.out.println(fileStatus.getPath().getName());
            // Print the file's block size
            System.out.println(fileStatus.getBlockSize());
            // Print the file's permissions
            System.out.println(fileStatus.getPermission());
            // Print the file's content length
            System.out.println(fileStatus.getLen());
            // Get the file's block information (length, offset, and DataNode hosts)
            BlockLocation[] blockLocations = fileStatus.getBlockLocations();
            for (BlockLocation blockLocation : blockLocations) {
                System.out.println("block-length: " + blockLocation.getLength()
                        + " -- " + "block-offset: " + blockLocation.getOffset());
                String[] hosts = blockLocation.getHosts();
                for (String host : hosts) {
                    System.out.println(host);
                }
            }
            System.out.println("--------------------divider--------------------");
        }
    }
}
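To make sense of the block-length and block-offset values that testListFiles prints, the following sketch computes the layout one would expect for a file written in a single pass: HDFS cuts the file into blocks of the configured block size, and only the last block may be shorter. This is an illustration rather than part of the original demo. The class `BlockLayout`, the 300 MiB file size, and the 128 MiB block size (the usual Hadoop 2.x default) are assumptions; the real values come from `getLen()` and `getBlockSize()`.

```java
import java.util.ArrayList;
import java.util.List;

public class BlockLayout {

    /** Returns {offset, length} pairs for a file of fileLen bytes cut into blockSize-byte blocks. */
    static List<long[]> blocks(long fileLen, long blockSize) {
        if (fileLen < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("fileLen must be >= 0 and blockSize must be > 0");
        }
        List<long[]> result = new ArrayList<>();
        for (long offset = 0; offset < fileLen; offset += blockSize) {
            // Every block is full-sized except possibly the last one.
            long length = Math.min(blockSize, fileLen - offset);
            result.add(new long[] {offset, length});
        }
        return result;
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024; // assumed block size: 128 MiB
        long fileLen = 300L * 1024 * 1024;   // hypothetical file length: 300 MiB
        for (long[] block : blocks(fileLen, blockSize)) {
            // Same output format as the loop in testListFiles
            System.out.println("block-length: " + block[1] + " -- " + "block-offset: " + block[0]);
        }
    }
}
```

With these assumed sizes the file occupies three blocks: two full 128 MiB blocks at offsets 0 and 134217728, and a final 44 MiB block (46137344 bytes) at offset 268435456. An empty file yields no blocks at all.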