Compiling Hadoop

Posted One-Way

Download location for Apache Hadoop ecosystem software: http://archive.apache.org/dist/hadoop/
Hadoop download location: http://archive.apache.org/dist/hadoop/common

 

Environment: 64-bit CentOS running in a virtual machine

 

Software to install:

jdk: jdk-7u45-linux-x64.rpm
hadoop: hadoop-2.7.2-src.tar.gz
maven: apache-maven-3.0.5-bin.tar.gz
protobuf: protobuf-2.5.0.tar.gz
 
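The Hadoop source directory contains a BUILDING.txt file that describes the software needed to compile the source and the build procedure; use it as a reference.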
 
Requirements:

* Unix System
* JDK 1.7+
* Maven 3.0 or later
* Findbugs 1.3.9 (if running findbugs)
* ProtocolBuffer 2.5.0
* CMake 2.6 or newer (if compiling native code), must be 3.0 or newer on Mac
* Zlib devel (if compiling native code)
* openssl devel ( if compiling native hadoop-pipes and to get the best HDFS encryption performance )
* Jansson C XML parsing library ( if compiling libwebhdfs )
* Linux FUSE (Filesystem in Userspace) version 2.6 or above ( if compiling fuse_dfs )
* Internet connection for first build (to fetch all Maven and Hadoop dependencies)
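
A quick way to see which of the tools listed above are already installed is to probe the PATH. The sketch below is not taken from BUILDING.txt; the helper name `have_tool` and the list of command names are assumptions made for illustration.

```shell
#!/bin/sh
# Report whether a command is available on PATH (hypothetical helper).
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Command names assumed to correspond to the requirements listed above.
for tool in java mvn protoc cmake; do
  if have_tool "$tool"; then
    echo "found:   $tool"
  else
    echo "missing: $tool"
  fi
done
```

Anything reported as missing is installed in the numbered steps that follow.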

 

1. Download Hadoop

wget  http://apache.opencas.org/hadoop/common/hadoop-2.7.2/hadoop-2.7.2-src.tar.gz
tar -zxvf  hadoop-2.7.2-src.tar.gz
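
If the mirror used above is unavailable, the same file can be fetched from the Apache archive listed at the top of this post. The sketch assumes the archive keeps each release under a directory named hadoop-<version>; `hadoop_src_url` is a hypothetical helper.

```shell
# Build the archive URL for a Hadoop source tarball.
# Assumption: releases live under .../common/hadoop-<version>/ on the archive.
hadoop_src_url() {
  version=$1
  echo "http://archive.apache.org/dist/hadoop/common/hadoop-${version}/hadoop-${version}-src.tar.gz"
}

# Usage (commented out so that running this file does not hit the network):
# wget "$(hadoop_src_url 2.7.2)"
# tar -zxvf hadoop-2.7.2-src.tar.gz
```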
 

2. Install the JDK

sudo yum install jdk-7u45-linux-x64.rpm 
Find where the JDK was installed:
 
which java
/usr/java/jdk1.7.0_45/bin/java 
Add the JDK to the environment variables (~/.bash_profile):
 
export JAVA_HOME=/usr/java/jdk1.7.0_45
export PATH=.:$JAVA_HOME/bin:$PATH 
Verify:
 
java -version
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode) 
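
For scripted checks, note that `java -version` prints to standard error, so the output has to be redirected before it can be parsed. A minimal sketch, assuming the version line has the form shown above; `java_version_string` is a hypothetical helper name.

```shell
# Read `java -version` style output on stdin and print the quoted version string.
java_version_string() {
  sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

# Usage: java -version 2>&1 | java_version_string
```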

 

3. Install Maven

wget http://apache.fayea.com/maven/maven-3/3.0.5/binaries/apache-maven-3.0.5-bin.tar.gz
tar -xzvf apache-maven-3.0.5-bin.tar.gz 
 
Add Maven to the environment variables (~/.bash_profile):
export MAVEN_HOME=/home/hadoop/app/apache-maven-3.0.5
export PATH=.:$MAVEN_HOME/bin:$PATH 
 
Verify:
mvn -version
Apache Maven 3.0.5 (r01de14724cdef164cd33c7c8c2fe155faf9602da; 2013-02-19 05:51:28-0800)
Maven home: /home/hadoop/app/apache-maven-3.0.5
Java version: 1.7.0_45, vendor: Oracle Corporation
Java home: /usr/java/jdk1.7.0_45/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "2.6.32-358.el6.x86_64", arch: "amd64", family: "unix" 
 
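If you need to go through a proxy, edit Maven's configuration file (settings.xml) and add a proxy entry inside <proxies>: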
<proxy>
      <id>optional</id>
      <active>true</active>
      <protocol>http</protocol>
      <host>x.x.x.x</host>
      <port>8080</port>
</proxy>
 
If downloads fail, switch to a mirror by adding an entry inside <mirrors> in settings.xml:
<mirror> 
  <id>CN</id> 
  <name>OSChina Central</name>
  <url>http://maven.oschina.net/content/groups/public/</url> 
  <mirrorOf>central</mirrorOf> 
</mirror>
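
Both fragments belong in Maven's settings.xml, typically ~/.m2/settings.xml or conf/settings.xml under the Maven installation: `<proxy>` goes inside `<proxies>` and `<mirror>` inside `<mirrors>`. The sketch below writes a minimal file with both wrappers. The function name `write_maven_settings` and the choice to write to a caller-supplied path are assumptions; the host and URLs are simply the placeholders from the text above.

```shell
# Write a minimal settings.xml containing the proxy and mirror shown above.
# $1 is the target path; nothing is written to ~/.m2 unless you pass that path.
write_maven_settings() {
  cat > "$1" <<'EOF'
<settings>
  <proxies>
    <proxy>
      <id>optional</id>
      <active>true</active>
      <protocol>http</protocol>
      <host>x.x.x.x</host>
      <port>8080</port>
    </proxy>
  </proxies>
  <mirrors>
    <mirror>
      <id>CN</id>
      <name>OSChina Central</name>
      <url>http://maven.oschina.net/content/groups/public/</url>
      <mirrorOf>central</mirrorOf>
    </mirror>
  </mirrors>
</settings>
EOF
}

# Usage: write_maven_settings /tmp/settings.xml && mvn -s /tmp/settings.xml -version
```

Writing to a scratch path first and passing it with `-s` lets you try the settings without overwriting an existing configuration.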

 

4. Install protobuf

 
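The official protobuf site appears to be unreachable, so obtain the protobuf-2.5.0 package by other means. Compiling and installing protobuf requires gcc, gcc-c++ and make to be installed first: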
 
sudo yum install gcc
sudo yum install gcc-c++
sudo yum install make
 
tar -zvxf protobuf-2.5.0.tar.gz 
cd protobuf-2.5.0
./configure --prefix=/usr/local/protoc/ 
sudo make
sudo make install 
Add protobuf to the environment variables (~/.bash_profile):
 
export PATH=.:/usr/local/protoc/bin:$PATH 
Verify:
 
protoc --version
libprotoc 2.5.0
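
BUILDING.txt asks for ProtocolBuffer 2.5.0 specifically, so it can be worth checking the version before starting the long Maven build. A small sketch; `protoc_is_250` is a hypothetical helper that only compares the output string.

```shell
# Succeed only if the given `protoc --version` output reports 2.5.0.
protoc_is_250() {
  [ "$1" = "libprotoc 2.5.0" ]
}

# Usage: protoc_is_250 "$(protoc --version)" || echo "unexpected protoc version" >&2
```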
 

5. Install other dependencies

sudo yum install cmake
sudo yum install openssl-devel
sudo yum install ncurses-devel 

 

6. Compile the Hadoop source code

cd ~/app/hadoop-2.7.2-src 
mvn package -DskipTests -Pdist,native 
 
The build output is placed under hadoop-2.7.2-src/hadoop-dist/target/hadoop-2.7.2.
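
The build takes a while and produces a lot of output, so it can help to keep a log and confirm afterwards that the distribution directory exists. A sketch under the assumption that the source tree lives at the path used above; `hadoop_dist_dir` and `build.log` are names made up for this example.

```shell
# Print the directory where the dist profile places the built distribution.
hadoop_dist_dir() {
  src_root=$1
  version=$2
  echo "$src_root/hadoop-dist/target/hadoop-$version"
}

# Usage (commented out; the build itself needs Maven and network access):
# cd ~/app/hadoop-2.7.2-src
# mvn package -DskipTests -Pdist,native 2>&1 | tee build.log
# ls "$(hadoop_dist_dir ~/app/hadoop-2.7.2-src 2.7.2)"
```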
 

Problems:

Problem 1: A dependency download fails during the build.
java.net.UnknownHostException: archive.apache.org
Solution:
1. Download the file manually and place it in the following directories:
hadoop-common-project/hadoop-kms/downloads
hadoop-hdfs-project/hadoop-hdfs-httpfs/downloads
2. Delete the download step from the build file:
hadoop-common-project/hadoop-kms/target/antrun/build-main.xml
<mkdir dir="downloads"/>
  <get dest="downloads/apache-tomcat-6.0.41.tar.gz" skipexisting="true" verbose="true" src="http://archive.apache.org/dist/tomcat/tomcat-6/v6.0.41/bin/apache-tomcat-6.0.41.tar.gz"/>
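
Because the `<get>` task is declared with skipexisting="true", a tarball that is already present in the downloads directory should be used as-is. The manual-download option can be sketched as follows, assuming the tarball was fetched beforehand on a machine with network access; `place_tomcat` is a hypothetical helper.

```shell
TOMCAT_TGZ=apache-tomcat-6.0.41.tar.gz

# Copy an already-downloaded Tomcat tarball into both downloads directories.
# $1 = path to the tarball, $2 = root of the Hadoop source tree.
place_tomcat() {
  tarball=$1
  src_root=$2
  for d in hadoop-common-project/hadoop-kms hadoop-hdfs-project/hadoop-hdfs-httpfs; do
    mkdir -p "$src_root/$d/downloads" &&
      cp "$tarball" "$src_root/$d/downloads/$TOMCAT_TGZ" || return 1
  done
}

# Usage: place_tomcat ~/apache-tomcat-6.0.41.tar.gz ~/app/hadoop-2.7.2-src
```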
 
Problem 2: The Maven build fails with "java.lang.OutOfMemoryError: Java heap space"
Solution:
Add the following to .profile:
export MAVEN_OPTS='-Xms256m -Xmx1024m'
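
To avoid adding the same export several times when a setup script is re-run, the line can be appended only if it is not already present. A minimal sketch; `append_once` is a hypothetical helper. The straight single quotes matter, because the shell does not treat typographic quotes as quoting.

```shell
# Append a line to a file unless an identical line is already there.
append_once() {
  file=$1
  line=$2
  touch "$file"
  grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage: append_once ~/.profile "export MAVEN_OPTS='-Xms256m -Xmx1024m'"
```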
 
Problem 3: The build fails because memory cannot be allocated
Increase the memory assigned to the virtual machine.
 
Problem 4: CMake Error     Could NOT find OpenSSL, try to set the path to OpenSSL root folder in the
sudo yum install openssl-devel    # on Debian/Ubuntu: sudo apt-get install libssl-dev
