CDH6 Deployment and Setup Notes
Posted by 潇湘神剑
I. Environment configuration
1. Host configuration
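cdh-master  192.168.80.107  4 CPU cores  16 GB RAM  200 GB disk
cdh-node1   192.168.80.140  4 CPU cores  8 GB RAM   200 GB disk
cdh-node2   192.168.80.148  4 CPU cores  8 GB RAM   200 GB disk
database    192.168.90.100  4 CPU cores  8 GB RAM   200 GB disk
# The hosts are test machines virtualized on VMware ESXi. Hostnames must not contain uppercase letters, otherwise errors are reported.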
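Because uppercase letters in a hostname lead to errors, it can help to check the names before installing anything. The following is a minimal sketch of such a check; the function name is_lowercase_hostname is made up for illustration and is not part of any Cloudera tooling.

```shell
# Return 0 when the hostname in $1 is non-empty and has no uppercase letters.
is_lowercase_hostname() {
    # Lowercase the name with tr and compare it with the original.
    [ -n "$1" ] && [ "$1" = "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" ]
}

# Check the three cluster hostnames used in this post.
for h in cdh-master cdh-node1 cdh-node2; do
    if is_lowercase_hostname "$h"; then
        echo "$h: ok"
    else
        echo "$h: contains uppercase letters"
    fi
done
```

On a real host the argument could be the output of the hostname command instead of a literal name.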
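2. Download the installation packages
The installation packages can no longer be downloaded directly from the official CDH site; a subscription is required, and downloads need an account and password. The only option is to install from packages that earlier users kept. Cloudera Manager is version 6.3.1 and CDH is version 6.3.2. Up to 6.3.2 the basic edition can still be used for free; later releases have no basic edition, only a 60-day trial and the subscription edition.
3. Prepare the environment
For the basic configuration, see this blog post of mine: https://www.cnblogs.com/zhangzhide/p/11108472.html
For installing the database, see this blog post of mine: https://www.cnblogs.com/zhangzhide/p/11124064.html
4. Create the databases and users that CDH needs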
CREATE DATABASE scm DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE DATABASE amon DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE DATABASE rman DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE DATABASE hue DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE DATABASE metastore DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE DATABASE sentry DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE DATABASE nav DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE DATABASE navms DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE DATABASE oozie DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE DATABASE hive DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON scm.* TO 'scm'@'%' IDENTIFIED BY '123.com';
GRANT ALL ON amon.* TO 'amon'@'%' IDENTIFIED BY '123.com';
GRANT ALL ON rman.* TO 'rman'@'%' IDENTIFIED BY '123.com';
GRANT ALL ON hue.* TO 'hue'@'%' IDENTIFIED BY '123.com';
GRANT ALL ON metastore.* TO 'metastore'@'%' IDENTIFIED BY '123.com';
GRANT ALL ON sentry.* TO 'sentry'@'%' IDENTIFIED BY '123.com';
GRANT ALL ON nav.* TO 'nav'@'%' IDENTIFIED BY '123.com';
GRANT ALL ON navms.* TO 'navms'@'%' IDENTIFIED BY '123.com';
GRANT ALL ON oozie.* TO 'oozie'@'%' IDENTIFIED BY '123.com';
GRANT ALL ON hive.* TO 'hive'@'%' IDENTIFIED BY '123.com';
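The twenty statements above follow one pattern per service, so they can also be generated in a loop. The sketch below prints the same CREATE DATABASE and GRANT pairs; gen_cdh_sql is a hypothetical helper name, and the password is passed as an argument rather than hard-coded. Note that GRANT ... IDENTIFIED BY is MySQL 5.x syntax; MySQL 8.0 no longer accepts it, so on 8.0 the users would have to be created with CREATE USER first.

```shell
# Print CREATE DATABASE and GRANT statements for each service database.
# $1 is the password; the remaining arguments are database/user names.
gen_cdh_sql() {
    password="$1"
    shift
    for db in "$@"; do
        printf "CREATE DATABASE %s DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;\n" "$db"
        # %% in the format string prints a literal % (the wildcard host).
        printf "GRANT ALL ON %s.* TO '%s'@'%%' IDENTIFIED BY '%s';\n" "$db" "$db" "$password"
    done
}

# The ten databases used in this post, with the post's test password.
gen_cdh_sql '123.com' scm amon rman hue metastore sentry nav navms oozie hive
```

The output can be reviewed first and then piped into the mysql client on the database host.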
Copy the JDBC driver to the expected directory:
cp mysql-connector-java-8.0.16.jar /usr/share/java/mysql-connector-java.jar
II. Deployment
1. Install Cloudera Manager Server
# Install on cdh-master only
yum install cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm -y
2. Install Cloudera Manager Agent
# If you have hosts to spare, you can deploy the agent only on the cdh-node hosts (at least 3 of them). If not, deploy the agent on the master as well, because the later HDFS installation requires the number of HDFS replicas to be at least 3, otherwise HDFS fails to install.
yum install cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm -y
# Edit the agent configuration file
vim /etc/cloudera-scm-agent/config.ini
Change server_host=localhost to server_host=cdh-master
3. Initialize the Cloudera Manager database
# Run on cdh-master
cd /opt/cloudera/cm/schema/
bash scm_prepare_database.sh -h 192.168.90.100 mysql scm scm 123.com
# Output like the following means the initialization succeeded
JAVA_HOME=/usr/local/jdk1.8.0_181
Verifying that we can write to /etc/cloudera-scm-server
Creating SCM configuration file in /etc/cloudera-scm-server
Executing: /usr/local/jdk8/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/java/postgresql-connector-java.jar:/opt/cloudera/cm/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
[ main] DbCommandExecutor INFO Successfully connected to database.
All done, your SCM database is configured correctly!
4. Copy the parcel files to the target directory
# Run on cdh-master
cp CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel /opt/cloudera/parcel-repo/
cp CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel.sha1 /opt/cloudera/parcel-repo/CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel.sha
cp CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel.sha256 /opt/cloudera/parcel-repo/
cp manifest.json /opt/cloudera/parcel-repo/
chown cloudera-scm:cloudera-scm -R /opt/cloudera/parcel-repo # without changing the owner and group, CDH is not found at the repository selection stage
# Note: the file I have is named "CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel.sha1". It has to be renamed to "CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel.sha", otherwise the CDH package is not found when installing CDH.
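Since the parcel comes from a saved copy rather than the official download, it may be worth confirming that the renamed .sha file really matches the parcel before starting the server. The sketch below assumes the first field of the .sha file is the parcel's SHA-1 hash; verify_parcel_sha is an illustrative name, not a Cloudera command.

```shell
# Compare a parcel's SHA-1 with the hash stored in the matching .sha file.
# Returns 0 when they match, 1 otherwise.
verify_parcel_sha() {
    parcel="$1"
    [ -f "$parcel" ] && [ -f "$parcel.sha" ] || return 1
    # sha1sum prints "<hash>  <file>"; keep only the hash.
    actual=$(sha1sum "$parcel" | cut -d' ' -f1)
    # Take the first field of the first line of the .sha file.
    expected=$(awk '{ print $1; exit }' "$parcel.sha")
    [ "$actual" = "$expected" ]
}

# Example:
# verify_parcel_sha /opt/cloudera/parcel-repo/CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel && echo "hash matches"
```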
5. Start Cloudera Manager Server and Agent
# Before starting the Server and Agent, symlink your own Java installation to /usr/java/default, because Cloudera's scripts define JAVA_PATH as /usr/java/default. Without the link, Cloudera Manager Server fails to start.
# The same link is made on the Agent hosts because it is also used later when deploying HDFS, YARN, and ZooKeeper.
ln -s /usr/local/jdk1.8.0_181 /usr/java/default
Error message when the symlink is missing (screenshot omitted):
# Start the server
systemctl start cloudera-scm-server
systemctl enable cloudera-scm-server
# Start the agent
systemctl start cloudera-scm-agent
systemctl enable cloudera-scm-agent
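The ln -s command fails when /usr/java does not exist yet or when the link is already there. A slightly more forgiving variant is sketched below; link_java_default is a hypothetical helper, and the JDK path in the example is the one used in this post.

```shell
# Point the symlink $2 at the JDK directory $1, creating the parent directory if needed.
link_java_default() {
    jdk_dir="$1"
    link="$2"
    [ -d "$jdk_dir" ] || { echo "JDK directory not found: $jdk_dir" >&2; return 1; }
    mkdir -p "$(dirname "$link")"
    # -f replaces an existing link; -n keeps ln from descending into the old target.
    ln -sfn "$jdk_dir" "$link"
}

# Example (run as root on every host):
# link_java_default /usr/local/jdk1.8.0_181 /usr/java/default
```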
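III. Log in to Cloudera Manager and complete the cluster installation
1. Log in
Default username and password: admin / admin
2. Install the free edition
3. Repository selection stage
If the CDH-6.3.2-1.cdh6.3.2.p0.1605554 option does not appear here, check whether step 4 of part II (copying the parcel files to the target directory) was skipped.
4. Cluster status check
If the check flags anything, fix it as the prompt suggests.
For the remaining steps, choose according to your own needs.
5. Cluster installation and deployment stage
Getting to the stage shown in this screenshot means hitting quite a few pitfalls. Fortunately the error logs are clear enough that most of them can be solved. Here is a summary of the ones I ran into. When installing HDFS, if it reports too few replicas, go back one step and increase the number of HDFS replicas. When installing YARN and ZooKeeper, the install script fails because it cannot find JAVA_HOME. This is also easy to fix: the error says exactly which line of which script failed, so edit that script.
As the screenshot above shows, the script fails at the line "if [ -z $JAVA_HOME/bin/java]", reporting that /bin/java cannot be found. Clearly $JAVA_HOME had no value, so the fix is to give it one: set JAVA_HOME explicitly inside that function.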
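The post does not show the edited script, so the following is only a sketch of the idea of forcing JAVA_HOME to a value when it is empty or unusable. resolve_java_home is a hypothetical helper, not a function from Cloudera's scripts, and the fallback path in the example is the JDK directory used earlier in this post.

```shell
# Print a usable JAVA_HOME: the current value when it contains an executable
# bin/java, otherwise the fallback directory passed as $1.
resolve_java_home() {
    fallback="$1"
    if [ -n "${JAVA_HOME:-}" ] && [ -x "$JAVA_HOME/bin/java" ]; then
        printf '%s\n' "$JAVA_HOME"
    else
        printf '%s\n' "$fallback"
    fi
}

# Inside the failing function one could then write, for example:
# JAVA_HOME=$(resolve_java_home /usr/local/jdk1.8.0_181); export JAVA_HOME
```

Quoting "$JAVA_HOME" matters here: with an empty, unquoted variable the test collapses to a check on /bin/java, which matches the error described above.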
6. Done
Big Data: Setting Up CDH
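Preface
1. CDH overview
Cloudera's distribution (Cloudera's Distribution Including Apache Hadoop, "CDH" for short) provides a web-based user interface and supports most Hadoop components, including HDFS, MapReduce, Hive, Pig, HBase, ZooKeeper, and Sqoop, which makes a big data platform easier to install and use.
Because its component set is complete and it is easy to install and maintain, quite a few companies in China have deployed CDH big data platforms. CDH 6.3 is used here.
2. Preparation before installing CDH
Recommended hardware, per host: 4 CPU cores, 8 GB RAM, 500 GB disk
Host configuration: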
192.168.8.137 node1
192.168.8.138 node2
192.168.8.139 node3
Software versions:
- Operating system: CentOS release 7.8 (Final), 64-bit
- JDK: 1.8
- Database: MySQL 5.6.49
- JDBC: MySQL Connector Java 5.1.38
- Cloudera Manager: 6.3.1
- CDH: 6.3.1
3. Configuration
- Configuration
Set the hostname on each node
# node1
echo "HOSTNAME=node1" >> /etc/sysconfig/network
# node2
echo "HOSTNAME=node2" >> /etc/sysconfig/network
# node3
echo "HOSTNAME=node3" >> /etc/sysconfig/network
- Configuration
Disable the firewall on all nodes
systemctl disable firewalld;systemctl stop firewalld
- Configuration
Configure SELinux on all nodes
vim /etc/selinux/config
# Change to
SELINUX=permissive
Sync the configuration: xsync /etc/selinux/config
- Configuration
Configure the NTP time service on all nodes
yum install ntp
systemctl start ntpd
systemctl enable ntpd
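- Configuration
Install Python on all nodes
CDH requires Python 2.7. The system ships with it, so this step is skipped
- Configuration
Change the Linux swappiness parameter on all nodes
To keep the server from using swap and hurting performance, vm.swappiness is usually set to 0 (Cloudera recommends 10 or lower)
cd /usr/lib/tuned/ && grep "vm.swappiness" * -R
# Set vm.swappiness to 0 in all three of the following files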
vim /usr/lib/tuned/latency-performance/tuned.conf
vim /usr/lib/tuned/throughput-performance/tuned.conf
vim /usr/lib/tuned/virtual-guest/tuned.conf
Sync the configuration: xsync /usr/lib/tuned/latency-performance/tuned.conf /usr/lib/tuned/throughput-performance/tuned.conf /usr/lib/tuned/virtual-guest/tuned.conf
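Editing three files by hand in vim is easy to get wrong, so the change can also be scripted. The sketch below rewrites every existing vm.swappiness line in a given file and leaves all other lines alone; set_swappiness is a hypothetical helper, and it does not add the setting to a file that lacks it.

```shell
# Rewrite every vm.swappiness line in the file $1 to the value $2.
set_swappiness() {
    conf="$1"
    value="$2"
    tmp=$(mktemp) || return 1
    # Write the edited copy to a temp file, then back into the original
    # so the file keeps its owner and mode.
    sed "s/^[[:space:]]*vm\.swappiness[[:space:]]*=.*/vm.swappiness = $value/" "$conf" > "$tmp" &&
        cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example (run as root):
# for p in latency-performance throughput-performance virtual-guest; do
#     set_swappiness "/usr/lib/tuned/$p/tuned.conf" 0
# done
```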
- Configuration
Disable transparent huge pages on all nodes
vim /etc/rc.local
# Append the following lines
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled
This setting cannot be synced
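Because this setting has to be applied on each node separately, a quick way to confirm it took effect is useful. The kernel marks the active mode in these files with square brackets, for example "always madvise [never]". The sketch below extracts that word; thp_mode is an illustrative name. On CentOS 7, /etc/rc.d/rc.local also has to be executable for the appended lines to run at boot.

```shell
# Print the active mode from a transparent_hugepage file, i.e. the word in brackets.
thp_mode() {
    sed -n 's/.*\[\([a-z]*\)].*/\1/p' "$1"
}

# Example:
# thp_mode /sys/kernel/mm/transparent_hugepage/enabled
# thp_mode /sys/kernel/mm/transparent_hugepage/defrag
```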
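- Configuration
Install the JDK on all nodes. Skip this if it is already installed and its environment variables are configured
Fix for cloudera-scm-server repeatedly failing to start:
https://blog.csdn.net/a544258023/article/details/107856387
Change the JDK path to your own JDK installation directory
mkdir -p /usr/java
ln -s /opt/jdk/java8 /usr/java/default
- Configuration
Master node: install the MySQL database
MySQL 5.6 is installed here; the installation steps are omitted
- Configuration
Install MySQL on the master node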
docker run -d -p 3306:3306 --name mysql -e MYSQL_ROOT_PASSWORD=root mysql:5.7.41
docker exec -it mysql bash
# Create the CDH metadata database, the users, and the database for the amon service
mysql -uroot -proot
create database cmf DEFAULT CHARACTER SET utf8;
create database amon DEFAULT CHARACTER SET utf8;
grant all on cmf.* TO 'cmf'@'%' IDENTIFIED BY 'www.research.com';
grant all on amon.* TO 'amon'@'%' IDENTIFIED BY 'www.research.com';
flush privileges;
4. Download the installation packages
The software has been saved to a network drive; help yourself if you need it
Link: https://pan.baidu.com/s/1UH50Uweyi7yg6bV7dl02mQ
Extraction code: nx7p
Install the MySQL JDBC driver on the master node
mkdir -p /opt/cdh/soft # upload directory for the CDH resources
mkdir -p /usr/share/java && mv /opt/cdh/soft/mysql-connector-java-5.1.47.jar /usr/share/java/mysql-connector-java.jar && cd /usr/share/java && ll
Deploy CDH
mkdir -p /opt/cdh/cloudera-manager && cd /opt/cdh/cloudera-manager
tar -zvxf /opt/cdh/soft/cm6.3.1-redhat7.tar.gz -C /opt/cdh/cloudera-manager
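Sync the resources: xsync /opt/cdh
Packages to install on all nodes
# The order matters. With the wrong order, files lack the right permissions at startup, mainly because one of the packages creates the cloudera-scm user and the cloudera-scm group when it is installed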
rpm -ivh /opt/cdh/cloudera-manager/cm6.3.1/RPMS/x86_64/cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm --nodeps --force
rpm -ivh /opt/cdh/cloudera-manager/cm6.3.1/RPMS/x86_64/cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm --nodeps --force
Package to install only on the master node node1
rpm -ivh /opt/cdh/cloudera-manager/cm6.3.1/RPMS/x86_64/cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm --nodeps --force
On all nodes, edit the agent configuration to point at the server node node1
sed -i "s/server_host=localhost/server_host=<master node IP>/g" /etc/cloudera-scm-agent/config.ini
For example: sed -i "s/server_host=localhost/server_host=192.168.8.137/g" /etc/cloudera-scm-agent/config.ini
On the master node node1, edit the server configuration
chmod 777 /etc/cloudera-scm-server/db.properties && vim /etc/cloudera-scm-server/db.properties
# Copyright (c) 2012 Cloudera, Inc. All rights reserved.
#
# This file describes the database connection.
#
# The database type
# Currently 'mysql', 'postgresql' and 'oracle' are valid databases.
# com.cloudera.cmf.db.type=mysql
# The database host
# If a non standard port is needed, use 'hostname:port'
#com.cloudera.cmf.db.host=localhost
# The database name
#com.cloudera.cmf.db.name=cmf
# The database user
#com.cloudera.cmf.db.user=cmf
# The database user's password
#com.cloudera.cmf.db.password=
# The db setup type
# After fresh install it is set to INIT
# and will be changed post config.
# If scm-server uses Embedded DB then it is set to EMBEDDED
# If scm-server uses External DB then it is set to EXTERNAL
# com.cloudera.cmf.db.setupType=INIT
com.cloudera.cmf.db.type=mysql
com.cloudera.cmf.db.host=node1
com.cloudera.cmf.db.name=cmf
com.cloudera.cmf.db.user=cmf
com.cloudera.cmf.db.password=www.research.com
com.cloudera.cmf.db.setupType=EXTERNAL
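The file above carries both the commented-out defaults and the active settings, so it is easy to misread which value is in effect. The sketch below prints the value of an uncommented key; get_prop is a hypothetical helper written for this check, and it assumes plain key=value lines with no spaces around the equals sign.

```shell
# Print the value of an uncommented key ($2) in a properties file ($1).
# When the key appears more than once, the last occurrence wins.
get_prop() {
    awk -v key="$2" '
        /^[[:space:]]*#/ { next }    # skip comment lines
        {
            eq = index($0, "=")
            if (eq > 0 && substr($0, 1, eq - 1) == key) {
                value = substr($0, eq + 1)
                found = 1
            }
        }
        END { if (found) print value }
    ' "$1"
}

# Example:
# get_prop /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.host
```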
On the master node, set up the offline parcel repository
mkdir -p /var/www/html/cdh6_parcel
cp /opt/cdh/soft/CDH-6.3.1-1.cdh6.3.1.p0.1470567-el7.parcel /var/www/html/cdh6_parcel/ && cp /opt/cdh/soft/CDH-6.3.1-1.cdh6.3.1.p0.1470567-el7.parcel.sha1 /var/www/html/cdh6_parcel/CDH-6.3.1-1.cdh6.3.1.p0.1470567-el7.parcel.sha && cp /opt/cdh/soft/manifest.json /var/www/html/cdh6_parcel/ && ll /var/www/html/cdh6_parcel/ && systemctl start httpd
Open in a browser: http://node1/cdh6_parcel/
Start the master node
# Start
systemctl start cloudera-scm-server
systemctl stop cloudera-scm-server
systemctl status cloudera-scm-server
# List the log directory (if there are no log files, the installation order was wrong)
ll /var/log/cloudera-scm-server/
# View the startup log
tailf /var/log/cloudera-scm-server/cloudera-scm-server.log
journalctl -f -u cloudera-scm-server.service
Start the agent on all nodes
systemctl start cloudera-scm-agent
systemctl stop cloudera-scm-agent
systemctl status cloudera-scm-agent
Web UI steps
Log in on port 7180 of the master node: http://node1:7180/
Username: admin  Password: admin
TODO ....