Integrating Hadoop with Spring Boot
Posted by 程序员超时空
The configuration steps for integrating Hadoop with Spring Boot are described below. It is assumed that a Hadoop cluster (Hadoop 2.8.5) is already installed on Linux.
I. Integrating with HDFS
1. Key application.properties settings
#hdfs
hdfs.url=hdfs://192.168.2.5:9000
hdfs.username=root
hdfs.replication=2
hdfs.blocksize=67108864
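These hdfs.* keys are custom properties, so the application has to bind them itself. A minimal sketch of a binding class is shown below; the class name HdfsProperties and its field names are illustrative assumptions, and only the four property keys come from the configuration above.

import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.stereotype.Component;

// Binds the custom hdfs.* keys from application.properties.
// Class and field names are illustrative; only the property keys come from this article.
@Component
@ConfigurationProperties(prefix = "hdfs")
public class HdfsProperties {

    private String url;        // hdfs.url, e.g. hdfs://192.168.2.5:9000
    private String username;   // hdfs.username, the user the client acts as
    private short replication; // hdfs.replication, number of block replicas
    private long blocksize;    // hdfs.blocksize, in bytes (67108864 = 64 MB)

    public String getUrl() { return url; }
    public void setUrl(String url) { this.url = url; }

    public String getUsername() { return username; }
    public void setUsername(String username) { this.username = username; }

    public short getReplication() { return replication; }
    public void setReplication(short replication) { this.replication = replication; }

    public long getBlocksize() { return blocksize; }
    public void setBlocksize(long blocksize) { this.blocksize = blocksize; }
}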
2. Key pom.xml dependencies
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
        <exclusions>
            <exclusion>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-starter-logging</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-common -->
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.8.5</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-hdfs -->
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>2.8.5</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client -->
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>2.8.5</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/com.alibaba/fastjson -->
    <dependency>
        <groupId>com.alibaba</groupId>
        <artifactId>fastjson</artifactId>
        <version>1.2.44</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/io.springfox/springfox-swagger2 -->
    <dependency>
        <groupId>io.springfox</groupId>
        <artifactId>springfox-swagger2</artifactId>
        <version>2.9.2</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/io.springfox/springfox-swagger-ui -->
    <dependency>
        <groupId>io.springfox</groupId>
        <artifactId>springfox-swagger-ui</artifactId>
        <version>2.9.2</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/com.google.guava/guava -->
    <dependency>
        <groupId>com.google.guava</groupId>
        <artifactId>guava</artifactId>
        <version>27.0.1-jre</version>
    </dependency>
</dependencies>
With the dependencies in place, complete the remaining configuration and code; one possible sketch of an HDFS configuration class follows.
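This sketch is not the original article's code: the class name HdfsConfig and the use of @Value injection are assumptions, while the property keys and user come from application.properties above.

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;

// A possible HDFS configuration class (name and structure are illustrative).
// The Spring @Configuration annotation is fully qualified to avoid clashing
// with org.apache.hadoop.conf.Configuration.
@org.springframework.context.annotation.Configuration
public class HdfsConfig {

    @Value("${hdfs.url}")
    private String url;

    @Value("${hdfs.username}")
    private String username;

    @Value("${hdfs.replication}")
    private String replication;

    @Value("${hdfs.blocksize}")
    private String blocksize;

    // Creates a FileSystem client pointed at the NameNode from application.properties.
    @Bean
    public FileSystem fileSystem() throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", url);
        conf.set("dfs.replication", replication);
        conf.set("dfs.blocksize", blocksize);
        // The third argument is the HDFS user the client operates as (root in this setup).
        return FileSystem.get(new URI(url), conf, username);
    }
}

Services can then inject the FileSystem bean and call methods such as mkdirs, copyFromLocalFile, and open to work with the cluster.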
3. Setting environment variables on Windows
When the application is started on Windows, requests to HDFS fail with the following error:
java.io.FileNotFoundException: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset.
Java programs that use Hadoop on Windows need winutils.exe, and the error above indicates that neither HADOOP_HOME nor hadoop.home.dir is configured. Download the binaries from https://github.com/steveloughran/winutils; the closest match to the cluster's hadoop-2.8.5 is hadoop-2.8.3, so download hadoop-2.8.3 to a local disk. The problem can be solved either by setting a Windows environment variable or by setting hadoop.home.dir in the program.
(1) Set a Windows environment variable
Set the Windows environment variable HADOOP_HOME=D:\soft\hadoop\winutils\hadoop-2.8.3
(2) Set hadoop.home.dir in the program
Add the following line before the Hadoop Configuration object is initialized:
// Set hadoop.home.dir on Windows
System.setProperty("hadoop.home.dir","D:\\soft\\hadoop\\winutils\\hadoop-2.8.3");
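One common place for this call is the Spring Boot main class, before the application context starts. A sketch, with the class name HadoopDemoApplication assumed:

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

// The class name is illustrative; the key point is calling System.setProperty
// before any Hadoop classes are initialized.
@SpringBootApplication
public class HadoopDemoApplication {

    public static void main(String[] args) {
        // Only needed on Windows when HADOOP_HOME is not set as an environment variable.
        if (System.getProperty("os.name").toLowerCase().contains("windows")) {
            System.setProperty("hadoop.home.dir", "D:\\soft\\hadoop\\winutils\\hadoop-2.8.3");
        }
        SpringApplication.run(HadoopDemoApplication.class, args);
    }
}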
4. Windows host mapping
The local hosts file also needs the Hadoop cluster's host mappings, matching what is configured on the cluster itself. Edit C:\Windows\System32\drivers\etc\hosts and add:
# Copyright (c) 1993-2009 Microsoft Corp.
#
# This is a sample HOSTS file used by Microsoft TCP/IP for Windows.
#
# This file contains the mappings of IP addresses to host names. Each
# entry should be kept on an individual line. The IP address should
# be placed in the first column followed by the corresponding host name.
# The IP address and the host name should be separated by at least one
# space.
#
# Additionally, comments (such as these) may be inserted on individual
# lines or following the machine name denoted by a '#' symbol.
#
# For example:
#
# 102.54.94.97 rhino.acme.com # source server
# 38.25.63.10 x.acme.com # x client host
# localhost name resolution is handled within DNS itself.
# 127.0.0.1 localhost
# ::1 localhost
192.168.2.5 hadoop.master
That covers the main steps for integrating Hadoop with Spring Boot.