Big Data Environment Setup (work in progress)
System: CentOS 7
Remote connection tool: MobaXterm
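I. Virtual Machine
Virtual machine configuration
Download and install VMware Workstation, and download the CentOS 7 ISO.
Create a new virtual machine.
Click Next.
Choose to install the operating system later, then click Next.
Select the operating system, then click Next.
Change the name and location, then click Next.
Click Next.
Click Finish.
Right-click the new virtual machine, open the virtual machine settings, and under CD/DVD select the ISO image file.
Power on the virtual machine.
Select the language.
Click Continue.
Click Installation Destination.
Click Done.
Under Software Selection, keep Minimal Install.
Begin the installation.
Set the root password.
zh**j**123
Reboot once the installation completes.
On the host machine, open Network Connections.
Open the properties of VMnet8 and view Internet Protocol Version 4.
Note the IP address and subnet mask.
In VMware, go to Edit, Virtual Network Editor, select VMnet8, and uncheck "Use local DHCP service to distribute IP addresses to virtual machines".
Click NAT Settings and note the gateway IP.
Go to VM ---> Settings ---> Network Adapter, set the network connection to Custom, and select VMnet8.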
Log in to the system.
Go to the /etc/sysconfig/network-scripts directory and edit ifcfg-ens33:
vi /etc/sysconfig/network-scripts/ifcfg-ens33
Change the configuration to:
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=static
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=ens33
UUID=aae5b9e2-96b2-416f-a009-f8e0c041edca
DEVICE=ens33
ONBOOT=yes
IPADDR=192.168.147.8
NETMASK=255.255.255.0
GATEWAY=192.168.147.2
DNS=192.168.147.2
DNS1=8.8.8.8
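BOOTPROTO=static sets the interface's boot protocol to static addressing.
ONBOOT=yes brings the interface up at boot and lets it be controlled through the system service manager, systemctl.
Restart the network service:
systemctl restart network
Test connectivity: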
[root@localhost network-scripts]# ping www.baidu.com
PING www.wshifen.com (104.193.88.77) 56(84) bytes of data.
64 bytes from 104.193.88.77 (104.193.88.77): icmp_seq=2 ttl=128 time=256 ms
64 bytes from 104.193.88.77 (104.193.88.77): icmp_seq=3 ttl=128 time=321 ms
Clone two more hosts named bigdata2 and bigdata3, with the IP addresses 192.168.147.9 and 192.168.147.10.
In the clone wizard, click Next on each of the three screens.
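Each clone needs its own IPADDR in ifcfg-ens33. As a minimal sketch, the helper below prints a trimmed-down static configuration for one host. The function name render_ifcfg is hypothetical, the /24 netmask is taken from the configuration above, and the UUID line is left out on the assumption that it is optional. It uses the numbered DNS1/DNS2 keys for the name servers.

```shell
# Print a static-IP ifcfg file for one host (illustrative helper).
# $1 = interface name, $2 = IP address, $3 = gateway
render_ifcfg() {
    cat <<EOF
TYPE=Ethernet
BOOTPROTO=static
DEFROUTE=yes
NAME=$1
DEVICE=$1
ONBOOT=yes
IPADDR=$2
NETMASK=255.255.255.0
GATEWAY=$3
DNS1=$3
DNS2=8.8.8.8
EOF
}

# Print the file for the second clone and inspect it before using it.
render_ifcfg ens33 192.168.147.10 192.168.147.2
```

If the output looks right, it could be redirected to /etc/sysconfig/network-scripts/ifcfg-ens33 on the clone, followed by systemctl restart network.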
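II. Alibaba Cloud
2.1 Alibaba Cloud preparation
1. Three ECS instances
2. If needed, purchase a public Elastic IP and bind it
3. If needed, purchase a cloud disk
Mounting the data disk
A second cloud disk purchased on Alibaba Cloud is not mounted automatically; it has to be mounted manually.
(1) List the disks to find the SSD cloud disk
sudo fdisk -l
The output shows that the system has recognized the SSD as /dev/vdb.
(2) Format the cloud disk
sudo mkfs.ext4 /dev/vdb
(3) Mount it
sudo mount /dev/vdb /opt
This mounts the cloud disk at /opt.
(4) Configure automatic mounting at boot
Edit /etc/fstab and append this line at the end of the file:
/dev/vdb /opt ext4 defaults 0 0
Then run df -hl to confirm that the second disk is mounted.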
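Appending to /etc/fstab by hand is easy to repeat by accident. The sketch below shows one way to make the step idempotent: it adds the entry only when the device is not already listed. The function name add_fstab_entry is hypothetical, and the demonstration runs against a scratch file rather than the real /etc/fstab.

```shell
# Append an fstab entry only if the device is not already listed.
# $1 = fstab file, $2 = device, $3 = mount point, $4 = filesystem type
add_fstab_entry() {
    if grep -q "^$2[[:space:]]" "$1"; then
        echo "entry for $2 already present"
    else
        printf '%s %s %s defaults 0 0\n' "$2" "$3" "$4" >> "$1"
        echo "entry for $2 added"
    fi
}

# Demonstration on a scratch file; the second call changes nothing.
scratch_fstab="$(mktemp)"
add_fstab_entry "$scratch_fstab" /dev/vdb /opt ext4
add_fstab_entry "$scratch_fstab" /dev/vdb /opt ext4
cat "$scratch_fstab"
rm -f "$scratch_fstab"
```

On a real host the first argument would be /etc/fstab and the function would need to run as root.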
If the system disk currently in use runs out of space, expand it:
yum install cloud-utils-growpart
growpart /dev/vda 1
resize2fs /dev/vda1
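III. Preparation
Disable the firewall
CentOS 7 uses firewalld by default rather than iptables.
systemctl stop firewalld.service
systemctl mask firewalld.service
Disable SELinux (all nodes)
vim /etc/selinux/config
Set SELINUX=disabled. The change takes effect after a reboot.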
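Editing the file by hand on every node gets tedious, so the change can also be scripted. This is a minimal sketch, with a hypothetical function name, that rewrites only the SELINUX= line and leaves SELINUXTYPE= untouched. It is demonstrated on a scratch copy rather than on /etc/selinux/config.

```shell
# Rewrite the SELINUX= line of a config file to "disabled".
# $1 = path to the config file (normally /etc/selinux/config)
set_selinux_disabled() {
    selinux_tmp="$(mktemp)"
    # Write through a temp file, then copy the content back so the
    # original file keeps its inode and permissions.
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$1" > "$selinux_tmp" &&
        cat "$selinux_tmp" > "$1"
    rm -f "$selinux_tmp"
}

# Demonstration on a scratch copy.
demo_cfg="$(mktemp)"
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$demo_cfg"
set_selinux_disabled "$demo_cfg"
cat "$demo_cfg"
rm -f "$demo_cfg"
```

Until the node is rebooted, running setenforce 0 switches SELinux to permissive mode for the current session.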
Set the hostnames
Name the nodes node01, node02, and node03.
Taking node01 as an example:
[root@node01 ~]# hostnamectl set-hostname node01
[root@node01 ~]# cat /etc/hostname
node01
The hostname is now changed; log in again to see it in the prompt.
Edit the /etc/hosts file:
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.147.8 node01
192.168.147.9 node02
192.168.147.10 node03
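For a larger cluster, the node entries can be generated instead of typed. The sketch below assumes consecutive IP addresses and the nodeNN naming used above; the function name gen_hosts is hypothetical.

```shell
# Print "<ip> <hostname>" lines for consecutive addresses.
# $1 = network prefix, $2 = first host octet, $3 = number of nodes
gen_hosts() {
    hosts_i=0
    while [ "$hosts_i" -lt "$3" ]; do
        printf '%s.%d node%02d\n' "$1" $(($2 + hosts_i)) $((hosts_i + 1))
        hosts_i=$((hosts_i + 1))
    done
}

gen_hosts 192.168.147 8 3
# 192.168.147.8 node01
# 192.168.147.9 node02
# 192.168.147.10 node03
```

The output can be appended to /etc/hosts on each node after checking it.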
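Configure passwordless SSH login
Generate the private and public keys:
ssh-keygen -t rsa
Copy the public key to every machine that should accept passwordless logins:
ssh-copy-id node01
ssh-copy-id node02
ssh-copy-id node03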
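Write a few useful scripts
xsync, built on rsync:
#!/bin/bash
# Get the number of arguments; exit immediately if there are none
pcount=$#
if ((pcount == 0)); then
    echo "no args..."
    exit
fi
# Get the file name
p1=$1
fname=$(basename "$p1")
echo "fname=$fname"
# Resolve the parent directory to an absolute path
pdir=$(cd -P "$(dirname "$p1")" && pwd)
echo "pdir=$pdir"
# Get the current user name
user=$(whoami)
# Loop over the hosts
for ((host = 1; host <= 3; host++)); do
    echo "$pdir/$fname $user@slave$host:$pdir"
    echo "==================slave$host=================="
    rsync -rvl "$pdir/$fname" "$user@slave$host:$pdir"
done
# Note: "slave" stands for your own hostname prefix and must be changed to match (node0 for the hosts in this article). The bounds of the loop variable host depend on how your hosts are numbered.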
xcall.sh
#!/bin/bash
# Run the same command on every node
for host in node01 node02 node03
do
    echo "------------ $host -------------------"
    ssh "$host" "$*"
done
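Before running the script above, append the environment variables from /etc/profile to ~/.bashrc on every node. A command run over ssh starts a non-login shell that does not read /etc/profile, so without this step the command fails.
[root@node01 bigdata]# cat /etc/profile >> ~/.bashrc
[root@node02 bigdata]# cat /etc/profile >> ~/.bashrc
[root@node03 bigdata]# cat /etc/profile >> ~/.bashrc
Create the /bigdata directory.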
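JDK configuration
Download the JDK. JDK 8 is used here: https://www.oracle.com/java/technologies/javase/javase-jdk8-downloads.html
The download requires an Oracle account.
Upload the JDK to the /bigdata directory on every node.
Extract it:
tar -zxvf jdk-8u241-linux-x64.tar.gz
If the owner and group of the extracted files are not root, change them with the commands described below.
Linux defines separate access permissions for a file's owner, for users in the owner's group, and for all other users.
1. chgrp: change a file's group
Syntax:
chgrp [-R] group file
2. chown: change a file's owner; it can also change the group at the same time
Syntax:
chown [-R] owner file
chown [-R] owner:group file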
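To see whether a chown is needed at all, the current owner and group can be read first. The sketch below uses GNU stat with the -c format option, as shipped with CentOS 7. The helper names owner_of and check_owner are hypothetical, and the script only prints the suggested chown command rather than running it.

```shell
# Print "owner:group" for a path (GNU stat).
owner_of() {
    stat -c '%U:%G' "$1"
}

# Report whether a path has the expected owner:group.
# $1 = path, $2 = expected "owner:group"
check_owner() {
    if [ "$(owner_of "$1")" = "$2" ]; then
        echo "$1 ok"
    else
        echo "$1 is $(owner_of "$1"), expected $2; fix with: chown -R $2 $1"
    fi
}

check_owner /tmp root:root
```

For the JDK directory in this article, the call would be check_owner /bigdata/jdk1.8.0_241 root:root.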
Create a symbolic link:
ln -s /bigdata/jdk1.8.0_241/ /usr/local/jdk
Configure the environment variables:
vi /etc/profile
Append at the end of the file:
export JAVA_HOME=/usr/local/jdk
export PATH=$PATH:${JAVA_HOME}/bin
Reload the configuration file:
source /etc/profile
Check the Java version:
[root@node03 bigdata]# java -version
java version "1.8.0_241"
Java(TM) SE Runtime Environment (build 1.8.0_241-b07)
Java HotSpot(TM) 64-Bit Server VM (build 25.241-b07, mixed mode)
The installation succeeded.
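Because the JDK has to be set up on every node, a small check is handy before moving on. The sketch below only tests that an executable bin/java exists under the given directory. The function name check_java_home is hypothetical, and /usr/local/jdk is the symlink path used in this article.

```shell
# Verify that a directory looks like a usable JAVA_HOME.
# $1 = candidate JAVA_HOME
check_java_home() {
    if [ -x "$1/bin/java" ]; then
        echo "JAVA_HOME ok: $1"
    else
        echo "no executable bin/java under $1" >&2
        return 1
    fi
}

check_java_home "${JAVA_HOME:-/usr/local/jdk}" || echo "JDK is not set up on this host yet"
```

Combined with the xcall.sh script above, the same check could be run on all three nodes in one go.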
Install MySQL
Install Maven
http://maven.apache.org/download.cgi
Download and extract:
tar -zxvf apache-maven-3.6.3-bin.tar.gz
Create a symbolic link:
ln -s /bigdata/apache-maven-3.6.3 /usr/local/maven
Add the following to /etc/profile:
export M2_HOME=/usr/local/maven
export PATH=$PATH:$M2_HOME/bin
Install Git
yum install git
IV. Installing Cloudera Manager 6.3.1
JDK location
JAVA_HOME must be /usr/java/java-version.
Install the third-party dependencies on all three nodes:
yum install bind-utils psmisc cyrus-sasl-plain cyrus-sasl-gssapi fuse portmap fuse-libs /lib/lsb/init-functions httpd mod_ssl openssl-devel python-psycopg2 MySQL-python libxslt
Configure the repository
Version 6.3.1
RHEL 7 compatible repository: https://archive.cloudera.com/cm6/6.3.1/redhat7/yum/ (repo file: cloudera-manager.repo)
Download the cloudera-manager.repo file and place it in the /etc/yum.repos.d/ directory on the Cloudera Manager Server node:
[root@node01 ~]# cat /etc/yum.repos.d/cloudera-manager.repo
[cloudera-manager]
name=Cloudera Manager 6.3.1
baseurl=https://archive.cloudera.com/cm6/6.3.1/redhat7/yum/
gpgkey=https://archive.cloudera.com/cm6/6.3.1/redhat7/yum/RPM-GPG-KEY-cloudera
gpgcheck=1
enabled=1
autorefresh=0
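If the repo file cannot be downloaded, it can be written out by hand. The sketch below prints the same content as shown above for a given version. The function name render_cm_repo is hypothetical, and the URL layout is assumed to follow the 6.3.1 pattern for other 6.x versions.

```shell
# Print a cloudera-manager.repo file for a given Cloudera Manager 6.x version.
# $1 = version string, e.g. 6.3.1
render_cm_repo() {
    cm_base="https://archive.cloudera.com/cm6/$1/redhat7/yum/"
    cat <<EOF
[cloudera-manager]
name=Cloudera Manager $1
baseurl=${cm_base}
gpgkey=${cm_base}RPM-GPG-KEY-cloudera
gpgcheck=1
enabled=1
autorefresh=0
EOF
}

render_cm_repo 6.3.1
```

Redirecting the output to /etc/yum.repos.d/cloudera-manager.repo as root would reproduce the file listed above.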
Install Cloudera Manager Server
yum install cloudera-manager-daemons cloudera-manager-agent cloudera-manager-server
If the download is too slow, fetch the RPM packages from https://archive.cloudera.com/cm6/6.3.1/redhat7/yum/RPMS/x86_64/, upload them to the server, and install them:
rpm -ivh cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm
After the installation:
[root@node01 cm]# ll /opt/cloudera/
total 16
drwxr-xr-x 27 cloudera-scm cloudera-scm 4096 Mar 3 19:36 cm
drwxr-xr-x 8 root root 4096 Mar 3 19:36 cm-agent
drwxr-xr-x 2 cloudera-scm cloudera-scm 4096 Sep 25 16:34 csd
drwxr-xr-x 2 cloudera-scm cloudera-scm 4096 Sep 25 16:34 parcel-repo
On all nodes, set the server host in the agent configuration file /etc/cloudera-scm-agent/config.ini:
server_host=node01
Configure the database
Install MySQL
Change the password and configure privileges
Move the InnoDB log files
Move the old InnoDB log files /var/lib/mysql/ib_logfile0 and /var/lib/mysql/ib_logfile1 out of /var/lib/mysql/ to a backup location of your choice:
[root@node01 ~]# mv /var/lib/mysql/ib_logfile0 /bigdata
[root@node01 ~]# mv /var/lib/mysql/ib_logfile1 /bigdata
Update the my.cnf file
By default the file is /etc/my.cnf.
[root@node01 etc]# mv my.cnf my.cnf.bak
[root@node01 etc]# vi my.cnf
Officially recommended configuration:
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
transaction-isolation = READ-COMMITTED
# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
symbolic-links = 0
key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1
max_connections = 550
#expire_logs_days = 10
#max_binlog_size = 100M
#log_bin should be on a disk with enough free space.
#Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your
#system and chown the specified folder to the mysql user.
log_bin=/var/lib/mysql/mysql_binary_log
#In later versions of MySQL, if you enable the binary log and do not set
#a server_id, MySQL will not start. The server_id must be unique within
#the replicating group.
server_id=1
binlog_format = mixed
read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M
# InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
sql_mode=STRICT_ALL_TABLES
Make sure MySQL starts at boot:
systemctl enable mysqld
Start MySQL:
systemctl start mysqld
Install the JDBC driver
Download:
wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.46.tar.gz
Extract:
tar zxvf mysql-connector-java-5.1.46.tar.gz
Copy the driver into /usr/share/java/ and rename it, creating the directory first if it does not exist:
[root@node01 etc]# mkdir -p /usr/share/java/
[root@node01 etc]# cd mysql-connector-java-5.1.46
[root@node01 mysql-connector-java-5.1.46]# cp mysql-connector-java-5.1.46-bin.jar /usr/share/java/mysql-connector-java.jar
Configure MySQL databases for the CM components
Each of the following components needs its own database: Cloudera Manager Server, Oozie Server, Sqoop Server, Activity Monitor, Reports Manager, Hive Metastore Server, Hue Server, Sentry Server, Cloudera Navigator Audit Server, and Cloudera Navigator Metadata Server.
Service | Database | User |
---|---|---|
Cloudera Manager Server | scm | scm |
Activity Monitor | amon | amon |
Reports Manager | rman | rman |
Hue | hue | hue |
Hive Metastore Server | metastore | hive |
Sentry Server | sentry | sentry |
Cloudera Navigator Audit Server | nav | nav |
Cloudera Navigator Metadata Server | navms | navms |
Oozie | oozie | oozie |
Log in to MySQL and enter the password:
mysql -u root -p
Create a database for each service deployed in the cluster using the following commands. You can use any value you want for the <database>, <user>, and <password> parameters. The table above lists the default names provided in the Cloudera Manager configuration settings, but you are not required to use them.
Configure all databases to use the utf8 character set, and include the character set in each CREATE DATABASE statement:
CREATE DATABASE <database> DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
Grant privileges:
GRANT ALL ON <database>.* TO '<user>'@'%' IDENTIFIED BY '<password>';
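Typing the two statements for every service is repetitive, so they can be generated from the database and user names in the table above. This is a minimal sketch with a hypothetical function name, gen_cm_sql, and a placeholder password. It keeps the GRANT ... IDENTIFIED BY form used in this article, which newer MySQL releases may no longer accept.

```shell
# Print CREATE DATABASE and GRANT statements for each "database:user" pair.
# $1 = password to embed in the GRANT statements
gen_cm_sql() {
    for cm_pair in scm:scm amon:amon rman:rman hue:hue metastore:hive \
                   sentry:sentry nav:nav navms:navms oozie:oozie; do
        cm_db="${cm_pair%%:*}"     # part before the colon
        cm_user="${cm_pair##*:}"   # part after the colon
        printf 'CREATE DATABASE %s DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;\n' "$cm_db"
        printf "GRANT ALL ON %s.* TO '%s'@'%%' IDENTIFIED BY '%s';\n" "$cm_db" "$cm_user" "$1"
    done
}

gen_cm_sql 'CHANGE_ME'
```

After reviewing the output, it could be piped into the client with gen_cm_sql 'your-password' | mysql -u root -p.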
Example
mysql> CREATE DATABASE amon DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
Query OK, 1 row affected (0.00 sec)
mysql> CREATE DATABASE scm DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
Query OK, 1 row affected (0.00 sec)
mysql> CREATE DATABASE hive DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
Query OK, 1 row affected (0.00 sec)
mysql> CREATE DATABASE oozie DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
Query OK, 1 row affected (0.00 sec)
mysql> CREATE DATABASE hue DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
Query OK, 1 row affected (0.01 sec)
mysql> CREATE DATABASE rman DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
Query OK, 1 row affected (0.00 sec)
mysql> CREATE DATABASE sentry DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
Query OK, 1 row affected (0.01 sec)
mysql> CREATE DATABASE nav DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
Query OK, 1 row affected (0.00 sec)
mysql> CREATE DATABASE metastore DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
Query OK, 1 row affected (0.00 sec)
mysql> GRANT ALL ON scm.* TO 'scm'@'%' IDENTIFIED BY '@Zhaojie123';
Query OK, 0 rows affected, 1 warning (0.01 sec)
mysql> GRANT ALL ON amon.* TO 'amon'@'%' IDENTIFIED BY '@Zhaojie123';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> GRANT ALL ON hive.* TO 'hive'@'%' IDENTIFIED BY