Ambari 2.6.2 + HDP 2.6.5 Big Data Cluster Setup
Component versions shipped with Ambari 2.6.2 / HDP 2.6.5:
HDFS-2.7.3 YARN-2.7.3 HIVE-1.2.1 HBASE-1.1.2 ZOOKEEPER-3.4.6 SPARK-2.3.0
Note: all commands in this article are run as the root user.
I. Environment Preparation
Operating system: CentOS 7.5
hdc-data1:192.168.163.51
hdc-data2:192.168.163.52
hdc-data3:192.168.163.53
[Perform the environment preparation steps below identically on every machine in the cluster, or use scp to copy the results to the other nodes.]
1. Hostname/IP mapping
FQDN (Fully Qualified Domain Name): a name that contains both the host name and the domain name, joined by a dot (".").
For example, if the host name is bigserver and the domain is mycompany.com, the FQDN is bigserver.mycompany.com.
vi /etc/hosts
#Add the following IP mappings and FQDNs (needed when Ambari registers the hosts)
192.168.163.51 hdc-data1 hdc-data1.hadoop
192.168.163.52 hdc-data2 hdc-data2.hadoop
192.168.163.53 hdc-data3 hdc-data3.hadoop
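The same mappings must exist on every node. One possible way to distribute and check them (a sketch; you will be prompted for the root password until the SSH keys from the next step are in place, and it assumes each node's hostname has already been set, e.g. with hostnamectl set-hostname hdc-data1):
scp /etc/hosts root@hdc-data2:/etc/hosts
scp /etc/hosts root@hdc-data3:/etc/hosts
# Every FQDN should resolve on every node
getent hosts hdc-data1.hadoop hdc-data2.hadoop hdc-data3.hadoop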
2. Passwordless SSH login
ssh-keygen
ssh-copy-id -i ~/.ssh/id_rsa.pub root@hdc-data1
ssh-copy-id -i ~/.ssh/id_rsa.pub root@hdc-data2
ssh-copy-id -i ~/.ssh/id_rsa.pub root@hdc-data3
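An optional quick check that passwordless login now works to every node (accept the host-key prompt on the first connection):
for h in hdc-data1 hdc-data2 hdc-data3; do
  ssh root@$h hostname    # each command should print the hostname without asking for a password
done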
3. Stop and disable the firewall
systemctl stop firewalld.service
systemctl disable firewalld.service
4. Disable SELinux
SELinux (Security-Enhanced Linux) is the NSA's implementation of mandatory access control and one of the most notable security subsystems in Linux history. It was developed with the help of the Linux community; under its access-control model, a process can only access the files its task actually requires.
vi /etc/sysconfig/selinux
#Change the following setting
SELINUX=disabled
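The change in /etc/sysconfig/selinux only takes effect after a reboot. To also turn enforcement off in the running session without rebooting:
setenforce 0      # immediate, until the next boot; the config file change covers subsequent boots
getenforce        # should now print Permissive (or Disabled after a reboot)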
5. Enable the NTP service
yum install -y ntp
systemctl enable ntpd
systemctl start ntpd
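To confirm time synchronization is actually working, check the peer list on each node. Note that CentOS 7 ships chronyd by default; it is common practice to stop and disable it so it does not interfere with ntpd after a reboot (skip these two commands if chronyd is not installed):
systemctl stop chronyd
systemctl disable chronyd
ntpq -p           # the peer marked with '*' is the currently selected time source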
6. Install the JDK
Download from: https://www.oracle.com/technetwork/cn/java/javase/downloads/jdk8-downloads-2133151-zhs.html
Upload the archive to the server with a file-transfer tool; it cannot be fetched directly with wget here.
mkdir -p /opt/java
tar -zxvf jdk-8u181-linux-x64.tar.gz -C /opt/java/
vi /etc/profile
export JAVA_HOME=/opt/java/jdk1.8.0_181
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=$PATH:$JAVA_HOME/bin
Distribute to the other servers:
scp -r /opt/java/jdk1.8.0_181/ root@hdc-data2:/opt/java/
scp -r /opt/java/jdk1.8.0_181/ root@hdc-data3:/opt/java/
scp /etc/profile root@hdc-data2:/etc/
scp /etc/profile root@hdc-data3:/etc/
source /etc/profile
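After sourcing the profile on each node, confirm the JDK is picked up:
java -version     # should report java version "1.8.0_181"
echo $JAVA_HOME   # should print /opt/java/jdk1.8.0_181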
II. Ambari Installation
1. Build a local Ambari yum repository
Only one machine is needed for this; hdc-data1 is used here.
The following repository tarballs need to be downloaded:
wget http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.6.2.0/ambari-2.6.2.0-centos7.tar.gz
wget http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.5.0/HDP-2.6.5.0-centos7-rpm.tar.gz
wget http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.22/repos/centos7/HDP-UTILS-1.1.0.22-centos7.tar.gz
wget http://public-repo-1.hortonworks.com/HDP-GPL/centos7/2.x/updates/2.6.5.0/HDP-GPL-2.6.5.0-centos7-gpl.tar.gz
1.1 Install the Apache HTTP server
yum install httpd -y
#Start the service and enable it at boot
systemctl start httpd.service
systemctl enable httpd.service
1.2 Install the repository-building tools
yum install yum-utils createrepo
1.3 Create the repository directory on the HTTP server
The default httpd document root is /var/www/html/.
mkdir -p /var/www/html/ambari
#cd /var/www/html/ambari
#Upload the downloaded tarballs and extract them
tar xvf HDP-2.6.5.0-centos7-rpm.tar.gz -C /var/www/html/ambari
tar xvf ambari-2.6.2.0-centos7.tar.gz -C /var/www/html/ambari
tar xvf HDP-UTILS-1.1.0.22-centos7.tar.gz -C /var/www/html/ambari
tar xvf HDP-GPL-2.6.5.0-centos7-gpl.tar.gz -C /var/www/html/ambari
# Remove the tarballs
rm -rf ambari-2.6.2.0-centos7.tar.gz
rm -rf HDP-2.6.5.0-centos7-rpm.tar.gz
rm -rf HDP-UTILS-1.1.0.22-centos7.tar.gz
rm -rf HDP-GPL-2.6.5.0-centos7-gpl.tar.gz
Verify
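The extracted repositories should now be reachable over HTTP. A simple check from any node (the directory names should match the baseurl paths used in the .repo files in the next step):
curl http://192.168.163.51/ambari/
# or open http://192.168.163.51/ambari/ in a browser and confirm the extracted
# ambari, HDP, HDP-UTILS and HDP-GPL directories are listed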
1.4 Configure the local repos for ambari, HDP, and HDP-UTILS
#yum install wget -y
cd /etc/yum.repos.d/
wget http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.6.2.0/ambari.repo
wget http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.5.0/hdp.repo
wget http://public-repo-1.hortonworks.com/HDP-GPL/centos7/2.x/updates/2.6.5.0/hdp.gpl.repo
Edit ambari.repo, changing baseurl and gpgkey:
[root@hdc-data1 yum.repos.d]# vi ambari.repo
#VERSION_NUMBER=2.6.2.0-155
[ambari-2.6.2.0]
name=ambari Version - ambari-2.6.2.0
#baseurl=http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.6.2.0
baseurl=http://192.168.163.51/ambari/ambari/centos7/2.6.2.0-155
gpgcheck=1
#gpgkey=http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.6.2.0/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
gpgkey=http://192.168.163.51/ambari/ambari/centos7/2.6.2.0-155/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1
Edit hdp.repo, changing baseurl and gpgkey:
[root@hdc-data1 yum.repos.d]# vi hdp.repo
#VERSION_NUMBER=2.6.5.0-292
[HDP-2.6.5.0]
name=HDP Version - HDP-2.6.5.0
#baseurl=http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.5.0
baseurl=http://192.168.163.51/ambari/HDP/centos7/2.6.5.0-292
gpgcheck=1
#gpgkey=http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.5.0/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
gpgkey=http://192.168.163.51/ambari/HDP/centos7/2.6.5.0-292/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1
[HDP-UTILS-1.1.0.22]
name=HDP-UTILS Version - HDP-UTILS-1.1.0.22
#baseurl=http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.22/repos/centos7
baseurl=http://192.168.163.51/ambari/HDP-UTILS/centos7/1.1.0.22
gpgcheck=1
#gpgkey=http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.5.0/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
gpgkey=http://192.168.163.51/ambari/HDP-UTILS/centos7/1.1.0.22/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1
Edit hdp.gpl.repo, changing baseurl and gpgkey:
[root@hdc-data1 yum.repos.d]# vi hdp.gpl.repo
#VERSION_NUMBER=2.6.5.0-292
[HDP-GPL-2.6.5.0]
name=HDP-GPL Version - HDP-GPL-2.6.5.0
#baseurl=http://public-repo-1.hortonworks.com/HDP-GPL/centos7/2.x/updates/2.6.5.0
baseurl=http://192.168.163.51/ambari/HDP-GPL/centos7/2.6.5.0-292
gpgcheck=1
#gpgkey=http://public-repo-1.hortonworks.com/HDP-GPL/centos7/2.x/updates/2.6.5.0/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
gpgkey=http://192.168.163.51/ambari/HDP-GPL/centos7/2.6.5.0-292/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1
Distribute the repo files to the other machines:
scp /etc/yum.repos.d/ambari.repo root@hdc-data2:/etc/yum.repos.d/
scp /etc/yum.repos.d/ambari.repo root@hdc-data3:/etc/yum.repos.d/
scp /etc/yum.repos.d/hdp.repo root@hdc-data2:/etc/yum.repos.d/
scp /etc/yum.repos.d/hdp.repo root@hdc-data3:/etc/yum.repos.d/
scp /etc/yum.repos.d/hdp.gpl.repo root@hdc-data2:/etc/yum.repos.d/
scp /etc/yum.repos.d/hdp.gpl.repo root@hdc-data3:/etc/yum.repos.d/
Refresh the yum cache on every machine:
yum clean all
yum makecache
yum list
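An optional check that the local repositories are the ones actually in use:
yum repolist enabled | grep -Ei 'ambari|HDP'
# ambari-2.6.2.0, HDP-2.6.5.0, HDP-UTILS-1.1.0.22 and HDP-GPL-2.6.5.0 should each
# appear with a non-zero package count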
2. Install the MySQL database
The Ambari installer writes its setup metadata to a database. MariaDB is recommended here; alternatively you can skip this step and use the default embedded PostgreSQL.
2.1 Installation and initial setup
[root@hdc-data1 ~]# yum install mariadb-server
[root@hdc-data1 ~]# systemctl start mariadb
[root@hdc-data1 ~]# systemctl enable mariadb
[root@hdc-data1 ~]# mysql_secure_installation
#First you are asked for the current password, then prompted to set a new one
Enter current password for root (enter for none): <-- on a fresh install just press Enter
#Set the password
Set root password? [Y/n] <-- whether to set the root password: type y and press Enter, or just press Enter
New password: <-- enter the new root password
Re-enter new password: <-- enter the same password again
#Remaining settings
Remove anonymous users? [Y/n] <-- remove anonymous users: press Enter
Disallow root login remotely? [Y/n] <-- disallow remote root login: press Enter
Remove test database and access to it? [Y/n] <-- remove the test database: press Enter
Reload privilege tables now? [Y/n] <-- reload the privilege tables: press Enter
[Optional] Change the MySQL port (a hardening measure for production environments)
Check the current port (see the sketch below):
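One possible way to do this, as a sketch (6033 is only an example port, matching the value referenced later during ambari-server setup; /etc/my.cnf is the default MariaDB config location on CentOS 7):
# Check which port the server is currently listening on
ss -tnlp | grep mysqld
# Add "port=6033" under the [mysqld] section, then restart
vi /etc/my.cnf
systemctl restart mariadb
ss -tnlp | grep mysqld    # should now show 6033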
2.2 After installation, create the ambari database and user
#Enter the mysql shell
mysql -uroot -p
create database ambari character set utf8;
CREATE USER 'ambari'@'%' IDENTIFIED BY 'ambari123';
GRANT ALL PRIVILEGES ON *.* TO 'ambari'@'%';
FLUSH PRIVILEGES;
If you plan to install Hive, create the Hive database and user:
create database hive character set utf8;
CREATE USER 'hive'@'%' IDENTIFIED BY 'hive123';
GRANT ALL PRIVILEGES ON *.* TO 'hive'@'%';
FLUSH PRIVILEGES;
If you plan to install Oozie, create the Oozie database and user:
create database oozie character set utf8;
CREATE USER 'oozie'@'%' IDENTIFIED BY 'oozie123';
GRANT ALL PRIVILEGES ON *.* TO 'oozie'@'%';
FLUSH PRIVILEGES;
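Optionally, verify from one of the other nodes that the databases exist and remote access works (hostnames and passwords as created above):
mysql -h hdc-data1 -u ambari -pambari123 -e "SHOW DATABASES;"
# Should list ambari (plus hive/oozie if created); a connection error here usually
# means the grant to 'ambari'@'%' or the listening port needs another look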
3. Install Ambari Server
yum install ambari-server
4. Download the MySQL JDBC driver
http://central.maven.org/maven2/mysql/mysql-connector-java/5.1.40/
Copy mysql-connector-java.jar to /usr/share/java:
mkdir /usr/share/java
cp mysql-connector-java-5.1.40.jar /usr/share/java/mysql-connector-java.jar
Copy mysql-connector-java.jar to /var/lib/ambari-server/resources as well:
cp mysql-connector-java-5.1.40.jar /var/lib/ambari-server/resources/mysql-jdbc-driver.jar
5. Edit /etc/ambari-server/conf/ambari.properties and add the following:
server.jdbc.driver.path=/usr/share/java/mysql-connector-java.jar
#[Optional] change the default 8080 web UI port
#client.api.port=18080
6. Initialize Ambari
#Register the MySQL JDBC driver
ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java.jar
ambari-server setup
The full interactive setup looks like this:
[root@hdc-data1 ~]# ambari-server setup
Using python /usr/bin/python
Setup ambari-server
Checking SELinux...
SELinux status is 'disabled'
Customize user account for ambari-server daemon [y/n] (n)? y
Enter user account for ambari-server daemon (root):
Adjusting ambari-server permissions and ownership...
Checking firewall status...
Checking JDK...
[1] Oracle JDK 1.8 + Java Cryptography Extension (JCE) Policy Files 8
[2] Oracle JDK 1.7 + Java Cryptography Extension (JCE) Policy Files 7
[3] Custom JDK
==============================================================================
Enter choice (1): 3
WARNING: JDK must be installed on all hosts and JAVA_HOME must be valid on all hosts.
WARNING: JCE Policy files are required for configuring Kerberos security. If you plan to use Kerberos,please make sure JCE Unlimited Strength Jurisdiction Policy Files are valid on all hosts.
Path to JAVA_HOME: /opt/java/jdk1.8.0_181
Validating JDK on Ambari Server...done.
Checking GPL software agreement...
GPL License for LZO: https://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html
Enable Ambari Server to download and install GPL Licensed LZO packages [y/n] (n)? y
Completing setup...
Configuring database...
Enter advanced database configuration [y/n] (n)? y
Configuring database...
==============================================================================
Choose one of the following options:
[1] - PostgreSQL (Embedded)
[2] - Oracle
[3] - MySQL / MariaDB
[4] - PostgreSQL
[5] - Microsoft SQL Server (Tech Preview)
[6] - SQL Anywhere
[7] - BDB
==============================================================================
Enter choice (1): 3
Hostname (localhost):
Port (3306):
Database name (ambari):
Username (ambari):
Enter Database Password (bigdata):
Re-enter password:
Configuring ambari database...
Configuring remote database connection properties...
WARNING: Before starting Ambari Server, you must run the following DDL against the database to create the schema: /var/lib/ambari-server/resources/Ambari-DDL-MySQL-CREATE.sql
Proceed with configuring remote database connection properties [y/n] (y)? y
Extracting system views...
.....ambari-admin-2.6.2.0.155.jar
......
Adjusting ambari-server permissions and ownership...
Ambari Server 'setup' completed successfully.
Notes:
1: Prompt asking whether to customize the daemon user account. Enter y, or press Enter to continue.
2: The ambari-server account. To run as root (recommended) just press Enter; to run as an ambari user, type ambari.
3: JDK selection. Enter 3 to use the JDK you installed on the hosts yourself. If the servers have internet access you can choose 1 to download JDK 1.8 automatically; its default install directory is /usr/java/default.
4: If you chose 3 (custom JDK) above, you must provide JAVA_HOME.
5: Enter y, or press Enter to continue.
6: Enter y to go into the advanced database configuration.
7: Enter 3 to select the MySQL/MariaDB database.
8: Database connection parameters: host, port, database, username, password. Enter values for your environment; if a value matches the default shown in parentheses, just press Enter. If you changed the port to 6033, enter 6033 for the port.
9: Enter y, or press Enter to continue.
(If you need to change any of these settings later, run ambari-server setup again.)
7. Import the Ambari DDL script into the database
#Log in to MySQL as the ambari user created above
mysql -u ambari -p
use ambari;
source /var/lib/ambari-server/resources/Ambari-DDL-MySQL-CREATE.sql
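Once the script finishes, a quick check that the schema was created (still inside the same mysql session; the table names are only examples of what the Ambari schema contains):
show tables;   -- should list the Ambari tables (e.g. clusters, hosts, users, ...)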
8. Start Ambari
ambari-server start
Open http://hdc-data1:8080/ in a browser. The default login is user admin, password admin.
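If the web UI does not come up, the server status and log are the first things to check (standard Ambari locations):
ambari-server status
tail -n 100 /var/log/ambari-server/ambari-server.log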
9. Web UI configuration
Create a cluster name.
Select the HDP version and choose the local repository.
Configure the HDP repository URLs.
Enter the cluster node hosts (FQDNs) and the SSH private key of the Ambari server node.
Wait for the ambari-agents to register.
If an error is reported:
[Problem hit at Confirm Hosts]
ambari-agent registration failed with:
NetUtil.py:96 - EOF occurred in violation of protocol (_ssl.c:579)
SSLError: Failed to connect. Please check openssl library versions.
Fix: option 3 below. Note that /etc/ambari-agent/conf/ambari-agent.ini is only generated once this registration step has been attempted. After making the change, click the "Retry Failed" button and wait for registration to succeed before moving on.
# 1. yum upgrade openssl -- openssl was already the latest version; did not solve it
# 2. vi /etc/python/cert-verification.cfg and set verify=disable; did not solve it
[https]
#verify=platform_default
verify=disable
# 3. Final fix: in the ambari-agent configuration file /etc/ambari-agent/conf/ambari-agent.ini,
#    add the following entry under the [security] section
[security]
force_https_protocol=PROTOCOL_TLSv1_2
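For the change to take effect, the agent has to be restarted on every node that failed to register. A minimal sketch (the sed line is just a convenience and assumes the [security] section header already exists in the file):
# Run on each failed node
sed -i '/^\[security\]/a force_https_protocol=PROTOCOL_TLSv1_2' /etc/ambari-agent/conf/ambari-agent.ini
ambari-agent restart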
Select the services to install (a minimal install is HDFS + YARN + MapReduce2 + Ambari Metrics + SmartSense + ZooKeeper + HBase). Services that are not needed right away can be skipped to save installation time and added later.
Assign the master roles according to your deployment plan (note: for HBase HA, click the small green + next to HBase Master to add a standby master):
Worker node role assignment (in production every role was checked on every node).
Component configuration changes
Data directories are normally moved under /data/. Components flagged in red require you to enter the corresponding account and password. The examples below can be adapted to your own environment.
HDFS
HIVE (if the database port was changed, enter the actual value)
Oozie
Zookeeper
The following properties were changed for now; they can be adjusted later to suit the actual environment.
Port property | New value |
---|---|
dfs.namenode.http-address | server1.hadoop:5070 (def:50070) |
yarn.resourcemanager.webapp.address | server2.hadoop:8888 (def:8088) |
yarn.resourcemanager.webapp.https.address | server2.hadoop:8890(def:8090) |
mapreduce.jobhistory.webapp.address | server2.hadoop:18888 (def:19888) |
Property | New value |
---|---|
HDFS | |
NameNode | /data/hadoop/hdfs/namenode |
DataNode | /data/hadoop/hdfs/data |
SecondaryNameNode Checkpoint directories | /data/hadoop/hdfs/namesecondary |
Hadoop PID Dir Prefix | /data/var/run/hadoop |
Hadoop Log Dir Prefix | /data/var/log/hadoop |
dfs.journalnode.edits.dir | /data/hadoop/hdfs/journalnode |
Yarn | |
yarn.nodemanager.local-dirs | /data/hadoop/yarn/local |
yarn.nodemanager.log-dirs | /data/hadoop/yarn/log |
yarn.timeline-service.leveldb-state-store.path | /data/hadoop/yarn/timeline |
yarn.timeline-service.leveldb-timeline-store.path | /data/hadoop/yarn/timeline |
YARN Log Dir Prefix | /data/var/log/hadoop-yarn |
YARN PID Dir Prefix | /data/var/run/hadoop-yarn |
Mapreduce | |
Mapreduce Log Dir Prefix | /data/var/log/hadoop-mapreduce |
Mapreduce PID Dir Prefix | /data/var/run/hadoop-mapreduce |
mapreduce.jobhistory.recovery.store.leveldb.path | /data/hadoop/mapreduce/jhs |
Hive | |
Hive Log Dir | /data/var/log/hive |
Hive PID Dir | /data/var/run/hive |
HBase | |
HBase Log Dir Prefix | /data/var/log/hbase |
HBase PID Dir | /data/var/run/hbase |
Oozie | |
Oozie Data Dir | /data/hadoop/oozie/data |
Oozie Log Dir | /data/var/log/oozie |
Oozie PID Dir | /data/var/run/oozie |
zookeeper | |
ZooKeeper directory | /data/hadoop/zookeeper |
ZooKeeper Log Dir | /data/var/log/zookeeper |
ZooKeeper PID Dir | /data/var/run/zookeeper |
Ambari Metrics | |
Metrics Collector log dir | /data/var/log/ambari-metrics-collector |
Metrics Collector pid dir | /data/var/run/ambari-metrics-collector |
Metrics Monitor log dir | /data/var/log/ambari-metrics-monitor |
Metrics Monitor pid dir | /data/var/run/ambari-metrics-monitor |
Aggregator checkpoint directory | /data/var/lib/ambari-metrics-collector/checkpoint |
Metrics Grafana data dir | /data/var/lib/ambari-metrics-grafana |
Metrics Grafana log dir | /data/var/log/ambari-metrics-grafana |
Metrics Grafana pid dir | /data/var/run/ambari-metrics-grafana |
hbase_log_dir | /data/var/log/ambari-metrics-collector |
hbase_pid_dir | /data/var/run/ambari-metrics-collector/ |
hbase.tmp.dir | /data/var/lib/ambari-metrics-collector/hbase-tmp |
ambari-infra | |
Infra Solr Client log dir | /data/var/log/ambari-infra-solr-client |
Infra Solr log dir | /data/var/log/ambari-infra-solr |
Infra Solr pid dir | /data/var/run/ambari-infra-solr |
spark | |
livy2_log_dir | /data/var/log/livy2 |
livy2_pid_dir | /data/var/run/livy2 |
spark_log_dir | /data/var/log/spark2 |
spark_pid_dir | /data/var/run/spark2 |
Wait for the installation to complete.
Do not move on to the next step until installation has completed successfully on all nodes.
The final result screen looks similar to the following:
Installation complete!
III. Follow-up Operations
It is best to take a snapshot/backup of the cluster before proceeding.
1. Enable HDFS HA (High Availability)
Reference: http://www.louisvv.com/archives/1490.html
Step 1: Stop the HBase and Hive related services.
Step 2: In the HDFS service, select Enable NameNode HA.
Step 3: Enter the NameNode HA nameservice ID.
Step 4: Accept the defaults and go to the next step.
Step 5: Review, then continue to the next step.
Step 6: Create a checkpoint by running the commands shown in the prompt.
Run the two commands shown in the prompt on server1 (the current NameNode).
Only after the commands have finished can you move to the next step.
Step 7: Wait for the configuration process to finish.
Step 8: Initialize the JournalNodes.
Run the following command on the original NameNode node:
sudo su hdfs -l -c 'hdfs namenode -initializeSharedEdits'
Step 9: Start the components and continue to the next step.
Step 10: Initialize the metadata.
sudo su hdfs -l -c 'hdfs zkfc -formatZK'
Note: run the following on the newly added NameNode node:
sudo su hdfs -l -c 'hdfs namenode -bootstrapStandby'
Step 11: Wait for all services to restart. Done.
Author: 粮忆雨
Link: https://www.jianshu.com/p/abcb22a47652
Source: 简书 (Jianshu)
Copyright belongs to the author; please contact the author for permission and credit the source before reprinting.