Installing HBase on CentOS 8
Posted PacosonSWJTU
【README】
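1. Parts of this article are adapted from:
https://computingforgeeks.com/how-to-install-apache-hadoop-hbase-on-centos-7/
2. This article installs HBase on a single machine (for learning purposes only).
【1】Update the system
Hadoop and HBase listen on many ports, some of them assigned dynamically. So that HBase can reach the system and network resources it needs, set SELinux to permissive mode and disable the firewall before installing: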
sudo systemctl disable --now firewalld
sudo setenforce 0
sudo sed -i 's/^SELINUX=.*/SELINUX=permissive/g' /etc/selinux/config
cat /etc/selinux/config | grep SELINUX= | grep -v '#'
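To see exactly what the sed expression above rewrites, the following sketch applies it to a scratch copy rather than to the real /etc/selinux/config. The sample file contents are an assumption modeled on the usual CentOS layout, and the result is written to a new file instead of being edited in place.

```shell
# Scratch file standing in for /etc/selinux/config (contents are illustrative).
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
# SELINUX= can take one of these three values:
SELINUX=enforcing
SELINUXTYPE=targeted
EOF

# Same expression as above, written to a new file rather than edited in place.
sed 's/^SELINUX=.*/SELINUX=permissive/g' "$cfg" > "$cfg.new"

# Only the line that starts with "SELINUX=" is rewritten. The comment line and
# SELINUXTYPE= are untouched because of the ^ anchor and the literal "=".
cat "$cfg.new"
```

This also explains the final grep pipeline in the step above: it filters out the comment lines so that only the active settings are shown.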
Update the system packages and reboot:
sudo yum -y install epel-release
sudo yum -y install vim wget curl bash-completion
sudo yum -y update
sudo reboot
【2】Install Java
sudo yum -y install java-1.8.0-openjdk java-1.8.0-openjdk-devel
Verify the Java version:
[root@centos202 ~]# java -version
java version "1.8.0_271"
Java(TM) SE Runtime Environment (build 1.8.0_271-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.271-b09, mixed mode)
Set the JAVA_HOME environment variable:
cat <<EOF | sudo tee /etc/profile.d/hadoop_java.sh
export JAVA_HOME=\$(dirname \$(dirname \$(readlink \$(readlink \$(which javac)))))
export PATH=\$PATH:\$JAVA_HOME/bin
EOF
Load the new variables into the current shell and check JAVA_HOME:
source /etc/profile.d/hadoop_java.sh
[root@centos202 profile.d]# echo $JAVA_HOME
/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.312.b07-2.el8_5.x86_64
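The nested readlink/dirname expression used for JAVA_HOME can be hard to read. The sketch below rebuilds the same two-level symlink chain in a throwaway directory and resolves it step by step. Every path under $root is a hypothetical stand-in for the real /usr/bin, /etc/alternatives and /usr/lib/jvm locations.

```shell
# Build a throwaway tree that mimics the alternatives symlink chain.
# All paths under $root are hypothetical stand-ins.
root=$(mktemp -d)
mkdir -p "$root/jvm/java-1.8.0-openjdk/bin" "$root/alternatives" "$root/bin"
touch "$root/jvm/java-1.8.0-openjdk/bin/javac"
ln -s "$root/jvm/java-1.8.0-openjdk/bin/javac" "$root/alternatives/javac"
ln -s "$root/alternatives/javac" "$root/bin/javac"

# First readlink:  bin/javac          -> alternatives/javac
# Second readlink: alternatives/javac -> jvm/<jdk>/bin/javac
# The two dirname calls then strip "/javac" and "/bin".
resolved=$(dirname "$(dirname "$(readlink "$(readlink "$root/bin/javac")")")")
echo "$resolved"
```

On a system where javac is not managed through two levels of symlinks, the expression would not land on the JDK directory, which is worth checking with echo $JAVA_HOME as done above.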
【3】Create a hadoop account
Create a dedicated hadoop account:
sudo adduser hadoop
passwd hadoop
sudo usermod -aG wheel hadoop
Generate an SSH key for passwordless login:
[root@centos202 ~]# sudo su - hadoop
[hadoop@centos202 ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:HuqG5V6O7Od64sQdmXUFedVNc/GEda4ujA+xuivxs2k hadoop@centos202
The key's randomart image is:
+---[RSA 3072]----+
| .o.B@|
| ..ooB|
| . .. o|
| + . . |
| S . . |
| .o+ o = . |
| ++o+ + o . |
| .+=+Eo o . |
| =OOO= . |
+----[SHA256]-----+
Append the hadoop user's public key to its own authorized_keys file so that it can log in without a password:
[hadoop@centos202 ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[hadoop@centos202 ~]$ chmod 0600 ~/.ssh/authorized_keys
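Running the cat >> command above more than once appends the same key again each time. As a small sketch of an idempotent variant, the function below appends the key only when it is not already present. It works on a scratch directory, and the key text is a placeholder, not a real key.

```shell
# Scratch directory standing in for ~/.ssh; the key text is a placeholder.
ssh_dir=$(mktemp -d)
echo "ssh-rsa AAAAEXAMPLEKEY hadoop@centos202" > "$ssh_dir/id_rsa.pub"
touch "$ssh_dir/authorized_keys"

add_key() {
  # $1: public key file, $2: authorized_keys file.
  # Append the key only if that exact line is not already present.
  if ! grep -qxF -f "$1" "$2"; then
    cat "$1" >> "$2"
  fi
  chmod 0600 "$2"
}

add_key "$ssh_dir/id_rsa.pub" "$ssh_dir/authorized_keys"
add_key "$ssh_dir/id_rsa.pub" "$ssh_dir/authorized_keys"   # second run is a no-op
grep -c '' "$ssh_dir/authorized_keys"                      # number of lines in the file
```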
Log in to the local machine with the generated key:
[hadoop@centos202 ~]$ ssh localhost
The authenticity of host 'localhost (::1)' can't be established.
ECDSA key fingerprint is SHA256:EdoFy44sWPaZHE6jgJCVGkbGKxK63ToPAP24sQ2Gj3Y.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
Last login: Sun Mar 5 03:56:07 2023
【4】Download and install Hadoop
Download Hadoop. The available packages are listed at https://hadoop.apache.org/releases.html
Option 1) Download directly with wget:
wget https://www-eu.apache.org/dist/hadoop/common/hadoop-$RELEASE/hadoop-$RELEASE.tar.gz
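Option 2) Download to a local Windows 10 machine through a proxy, then transfer the file to CentOS with rz (the approach used in this article).
Version used in this article: hadoop-3.2.4.tar.gz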
[root@centos202 hadoop]# ls -l
total 480832
-rwxrwxrwx. 1 root root 492368219 Jan 30 22:20 hadoop-3.2.4.tar.gz
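For option 1, the wget command relies on a $RELEASE variable that is not set anywhere in this article. A minimal sketch, assuming the 3.2.4 release used here and the mirror host from the command above (the variable names other than RELEASE are illustrative):

```shell
# RELEASE matches the version used in this article.
RELEASE="3.2.4"
HADOOP_TARBALL="hadoop-${RELEASE}.tar.gz"
HADOOP_URL="https://www-eu.apache.org/dist/hadoop/common/hadoop-${RELEASE}/${HADOOP_TARBALL}"

# Print the URL first so it can be checked before downloading.
echo "$HADOOP_URL"

# Uncomment to download:
# wget "$HADOOP_URL"
```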
Extract the archive:
tar -xzvf hadoop-3.2.4.tar.gz
# Result
[root@centos202 hadoop-3.2.4]# pwd
/usr/local/hadoop/hadoop-3.2.4
[root@centos202 hadoop-3.2.4]# ls
bin etc include lib libexec LICENSE.txt NOTICE.txt README.txt sbin share
Add the Hadoop home directory to $PATH:
cat <<EOF | sudo tee /etc/profile.d/hadoop_java.sh
export JAVA_HOME=\$(dirname \$(dirname \$(readlink \$(readlink \$(which javac)))))
export HADOOP_HOME=/usr/local/hadoop/hadoop-3.2.4
export HADOOP_HDFS_HOME=\$HADOOP_HOME
export HADOOP_MAPRED_HOME=\$HADOOP_HOME
export YARN_HOME=\$HADOOP_HOME
export HADOOP_COMMON_HOME=\$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=\$HADOOP_HOME/lib/native
export PATH=\$PATH:\$JAVA_HOME/bin:\$HADOOP_HOME/bin:\$HADOOP_HOME/sbin
EOF
Use source to reload the current shell environment and pick up the variables defined in hadoop_java.sh:
source /etc/profile.d/hadoop_java.sh
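The here-document above uses an unquoted EOF delimiter, so every $ has to be escaped with a backslash to reach the file literally. An alternative worth knowing is to quote the delimiter, which turns off expansion for the whole block. The sketch below writes to a scratch file instead of /etc/profile.d/hadoop_java.sh and shows only two of the variables.

```shell
# Write to a scratch file instead of /etc/profile.d/hadoop_java.sh.
out=$(mktemp)

# Quoting the delimiter ('EOF') disables expansion inside the here-document,
# so no backslashes are needed and the file receives the text literally.
cat > "$out" <<'EOF'
export HADOOP_HOME=/usr/local/hadoop/hadoop-3.2.4
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
EOF

cat "$out"
```

With this form, $PATH and $HADOOP_HOME are expanded when the file is sourced at login, not at the moment the file is written.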
Check the Hadoop version:
[root@centos202 hadoop-3.2.4]# hadoop version
Hadoop 3.2.4
Source code repository Unknown -r 7e5d9983b388e372fe640f21f048f2f2ae6e9eba
Compiled by ubuntu on 2022-07-12T11:58Z
Compiled with protoc 2.5.0
From source with checksum ee031c16fe785bbb35252c749418712
This command was run using /usr/local/hadoop/hadoop-3.2.4/share/hadoop/common/hadoop-common-3.2.4.jar
【5】Configure Hadoop
All Hadoop configuration files are in /usr/local/hadoop/hadoop-3.2.4/etc/hadoop:
[root@centos202 hadoop]# pwd
/usr/local/hadoop/hadoop-3.2.4/etc/hadoop
[root@centos202 hadoop]# ls
capacity-scheduler.xml hadoop-policy.xml kms-acls.xml mapred-queues.xml.template yarn-env.cmd
configuration.xsl hadoop-user-functions.sh.example kms-env.sh mapred-site.xml yarn-env.sh
container-executor.cfg hdfs-site.xml kms-log4j.properties shellprofile.d yarnservice-log4j.properties
core-site.xml httpfs-env.sh kms-site.xml ssl-client.xml.example yarn-site.xml
hadoop-env.cmd httpfs-log4j.properties log4j.properties ssl-server.xml.example
hadoop-env.sh httpfs-signature.secret mapred-env.cmd user_ec_policies.xml.template
hadoop-metrics2.properties httpfs-site.xml mapred-env.sh workers
Several of these configuration files must be edited to complete the Hadoop installation.
【5.1】hadoop-env.sh
Edit JAVA_HOME in hadoop-env.sh (line 54):
vim hadoop-env.sh
export JAVA_HOME=$(dirname $(dirname $(readlink $(readlink $(which javac)))))
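Edit core-site.xml
core-site.xml holds the information Hadoop needs at startup, including:
the port number of the Hadoop instance;
the memory allocated to the file system;
the memory limit for data storage;
the size of the read/write buffers.
Add the file system property inside the <configuration> element: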
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
<description>The default file system URI</description>
</property>
</configuration>
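【5.2】hdfs-site.xml
This file must be configured on every host in the cluster. It contains:
the paths of the NameNode and DataNode directories on the local file system;
the replication factor.
Create the namenode and datanode directories, then change the owner of /hadoop to hadoop:hadoop:
[hadoop@centos202 hadoop]$ sudo mkdir -p /hadoop/hdfs/{namenode,datanode}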
[sudo] password for hadoop:
[hadoop@centos202 hadoop]$
[hadoop@centos202 hadoop]$ sudo chown -R hadoop:hadoop /hadoop
Edit hdfs-site.xml as follows:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>file:///hadoop/hdfs/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>file:///hadoop/hdfs/datanode</value>
</property>
</configuration>
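A mismatch between the directories created on disk and the paths written in hdfs-site.xml is an easy mistake to make. The following sketch extracts each file:// value from the config and checks that the directory exists. It runs against a scratch copy: $base plays the role of "/" and $conf the role of hdfs-site.xml, and both are assumptions made so that the example is self-contained.

```shell
# Scratch stand-ins: $base plays the role of "/" and $conf of hdfs-site.xml.
base=$(mktemp -d)
conf="$base/hdfs-site.xml"
mkdir -p "$base/hadoop/hdfs/namenode" "$base/hadoop/hdfs/datanode"
cat > "$conf" <<'EOF'
<configuration>
<property>
<name>dfs.name.dir</name>
<value>file:///hadoop/hdfs/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>file:///hadoop/hdfs/datanode</value>
</property>
</configuration>
EOF

# Pull every file:// value out of the config and test that the directory exists.
missing=0
for dir in $(sed -n 's#.*<value>file://\(/[^<]*\)</value>.*#\1#p' "$conf"); do
  if [ -d "$base$dir" ]; then
    echo "ok      $dir"
  else
    echo "missing $dir"
    missing=$((missing + 1))
  fi
done
echo "missing directories: $missing"
```

To run the same check on the real installation, set base to an empty string and conf to the hdfs-site.xml path under etc/hadoop.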
【5.3】mapred-site.xml
This file selects the MapReduce framework.
Edit it as follows:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
【5.4】yarn-site.xml
yarn-site.xml defines resource management and job scheduling. Edit it as follows:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
【6】Verify the Hadoop configuration (start Hadoop)
Switch to the hadoop user:
sudo su - hadoop
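【6.1】Format the HDFS NameNode
Formatting initializes the NameNode's storage directory. Any file system metadata already stored there is erased, so a previously populated HDFS comes up empty afterwards.
Summary: the NameNode keeps the metadata that maps files to blocks on the DataNodes; formatting resets that metadata so that a fresh file system can be created.
See also https://stackoverflow.com/questions/27143409/what-the-command-hadoop-namenode-format-will-do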
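The article does not show the format command itself. The standard one is hdfs namenode -format. Because formatting erases existing metadata, the sketch below wraps it in a guard that skips the step when the NameNode directory already looks formatted (a formatted directory contains current/VERSION). The function name needs_format and the NAME_DIR variable are illustrative, and the path is the one configured in hdfs-site.xml.

```shell
# Returns 0 (true) when the NameNode directory has not been formatted yet.
# A formatted directory contains current/VERSION.
needs_format() {
  [ ! -f "$1/current/VERSION" ]
}

NAME_DIR="${NAME_DIR:-/hadoop/hdfs/namenode}"   # path from hdfs-site.xml

if needs_format "$NAME_DIR"; then
  if command -v hdfs >/dev/null 2>&1; then
    hdfs namenode -format
  else
    echo "hdfs not found on PATH; would run: hdfs namenode -format"
  fi
else
  echo "NameNode at $NAME_DIR is already formatted; skipping"
fi
```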
【6.2】Start HDFS
[hadoop@centos202 ~]$ start-dfs.sh
Starting namenodes on [localhost]
Starting datanodes
Starting secondary namenodes [centos202]
centos202: Warning: Permanently added 'centos202,192.168.163.202' (ECDSA) to the list of known hosts.
2023-03-05 05:18:42,008 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@centos202 ~]$
【6.3】Start YARN
[hadoop@centos202 ~]$ start-yarn.sh
Starting resourcemanager
Starting nodemanagers
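The start scripts return even if a daemon dies shortly afterwards, so it helps to confirm that the expected processes are up. A minimal sketch using jps, the JDK tool that lists running JVMs as "<pid> <name>" lines. The check_daemons helper is hypothetical, and the fallback sample is only there so the example runs on a machine without a JDK.

```shell
# Daemons expected after start-dfs.sh and start-yarn.sh on a single node.
expected="NameNode DataNode SecondaryNameNode ResourceManager NodeManager"

check_daemons() {
  # $1: text in the format printed by jps ("<pid> <name>" per line)
  for name in $expected; do
    if printf '%s\n' "$1" | awk '{print $2}' | grep -qx "$name"; then
      echo "up $name"
    else
      echo "missing $name"
    fi
  done
}

# Use real jps output when the tool is available, otherwise an illustrative sample.
if command -v jps >/dev/null 2>&1; then
  check_daemons "$(jps)"
else
  check_daemons "14418 NameNode
14546 DataNode"
fi
```

The exact match (grep -x) matters here: without it, the NameNode check would also be satisfied by a SecondaryNameNode line.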
【6.4】Hadoop web UIs
1) Default web UI ports in Hadoop 3.x:
NameNode (Hadoop dashboard): 9870
ResourceManager (cluster overview): 8088
MapReduce JobHistory server: 19888
List the ports Hadoop is listening on:
[hadoop@centos202 ~]$ ss -tunelp | grep java
tcp LISTEN 0 128 0.0.0.0:8030 0.0.0.0:* users:(("java",pid=15007,fd=320)) uid:1000 ino:81561 sk:1 <->
tcp LISTEN 0 128 0.0.0.0:8031 0.0.0.0:* users:(("java",pid=15007,fd=310)) uid:1000 ino:80766 sk:2 <->
tcp LISTEN 0 128 0.0.0.0:8032 0.0.0.0:* users:(("java",pid=15007,fd=330)) uid:1000 ino:81957 sk:3 <->
tcp LISTEN 0 128 0.0.0.0:8033 0.0.0.0:* users:(("java",pid=15007,fd=299)) uid:1000 ino:80016 sk:4 <->
tcp LISTEN 0 128 0.0.0.0:41059 0.0.0.0:* users:(("java",pid=15158,fd=306)) uid:1000 ino:88126 sk:5 <->
tcp LISTEN 0 128 127.0.0.1:44965 0.0.0.0:* users:(("java",pid=14546,fd=279)) uid:1000 ino:74525 sk:6 <->
tcp LISTEN 0 128 0.0.0.0:8040 0.0.0.0:* users:(("java",pid=15158,fd=317)) uid:1000 ino:88014 sk:7 <->
tcp LISTEN 0 128 0.0.0.0:9864 0.0.0.0:* users:(("java",pid=14546,fd=308)) uid:1000 ino:74543 sk:8 <->
tcp LISTEN 0 128 127.0.0.1:9000 0.0.0.0:* users:(("java",pid=14418,fd=285)) uid:1000 ino:70479 sk:9 <->
tcp LISTEN 0 128 0.0.0.0:8042 0.0.0.0:* users:(("java",pid=15158,fd=328)) uid:1000 ino:88858 sk:a <->
tcp LISTEN 0 128 0.0.0.0:9866 0.0.0.0:* users:(("java",pid=14546,fd=278)) uid:1000 ino:74483 sk:b <->
tcp LISTEN 0 128 0.0.0.0:9867 0.0.0.0:* users:(("java",pid=14546,fd=309)) uid:1000 ino:74560 sk:c <->
tcp LISTEN 0 128 0.0.0.0:9868 0.0.0.0:* users:(("java",pid=14770,fd=279)) uid:1000 ino:77941 sk:d <->
tcp LISTEN 0 128 0.0.0.0:9870 0.0.0.0:* users:(("java",pid=14418,fd=274)) uid:1000 ino:70244 sk:e <->
tcp LISTEN 0 128 0.0.0.0:8088 0.0.0.0:* users:(("java",pid=15007,fd=289)) uid:1000 ino:78820 sk:10 <->
tcp LISTEN 0 128 0.0.0.0:13562 0.0.0.0:* users:(("java",pid=15158,fd=327)) uid:1000 ino:89419 sk:11 <->
2) Open centos202:9870 to view the Hadoop dashboard (centos202 is the VM's host name; the IP address works as well).
3) Open centos202:8088 to view the Hadoop cluster overview.
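When no browser is at hand, the two web UIs can also be probed from the command line. The sketch below uses curl against localhost, which assumes it is run on the Hadoop host itself; probe is a hypothetical helper name.

```shell
probe() {
  # $1: TCP port on localhost; succeeds when an HTTP server answers there.
  if curl -s -o /dev/null --max-time 3 "http://localhost:$1/"; then
    echo "port $1: answering"
  else
    echo "port $1: not reachable"
  fi
}

probe 9870   # NameNode web UI
probe 8088   # ResourceManager web UI
```

From another machine, replace localhost with centos202 or the VM's IP address.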
【6.5】Create an HDFS directory
[hadoop@centos202 ~]$ hadoop fs -mkdir /test
[hadoop@centos202 ~]$
[hadoop@centos202 ~]$ hadoop fs -ls /
drwxr-xr-x - hadoop supergroup 0 2023-03-05 05:29 /test
【Note】Stop the Hadoop services (HDFS and YARN):
[hadoop@centos202 ~]$ stop-dfs.sh
Stopping namenodes on [localhost]
Stopping datanodes
Stopping secondary namenodes [centos202]
[hadoop@centos202 ~]$ stop-yarn.sh
Stopping nodemanagers
Stopping resourcemanager
[hadoop@centos202 ~]$
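【7】Install HBase
【7.1】Download and install HBase
For HBase packages, see http://apache.mirror.gtcomm.net/hbase/
This article uses hbase-2.4.15. You can download it with wget, or download it locally through a proxy and transfer it to CentOS with rz (the approach used here).
Extract the archive: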
sudo tar -xzvf hbase-2.4.15-bin.tar.gz
[hadoop@centos202 hbase-2.4.15]$ pwd
/usr/local/hbase/hbase-2.4.15
[hadoop@centos202 hbase-2.4.15]$ ls
bin CHANGES.md conf docs hbase-webapps LEGAL lib LICENSE.txt NOTICE.txt README.txt RELEASENOTES.md
Update the $PATH environment variable:
cat <<EOF | sudo tee /etc/profile.d/hadoop_java.sh
export JAVA_HOME=\$(dirname \$(dirname \$(readlink \$(readlink \$(which javac)))))
export HADOOP_HOME=/usr/local/hadoop/hadoop-3.2.4
export HADOOP_HDFS_HOME=\$HADOOP_HOME
export HADOOP_MAPRED_HOME=\$HADOOP_HOME
export YARN_HOME=\$HADOOP_HOME
export HADOOP_COMMON_HOME=\$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=\$HADOOP_HOME/lib/native
export HBASE_HOME=/usr/local/hbase/hbase-2.4.15
export PATH=\$PATH:\$JAVA_HOME/bin:\$HADOOP_HOME/bin:\$HADOOP_HOME/sbin:\$HBASE_HOME/bin
EOF
Reload the shell environment and verify HBASE_HOME:
[hadoop@centos202 hbase-2.4.15]$ source /etc/profile.d/hadoop_java.sh
[hadoop@centos202 conf]$ echo $HBASE_HOME
/usr/local/hbase/hbase-2.4.15
Edit hbase-env.sh and set JAVA_HOME:
[hadoop@centos202 conf]$ pwd
/usr/local/hbase/hbase-2.4.15/conf
[hadoop@centos202 conf]$
[hadoop@centos202 conf]$
[hadoop@centos202 conf]$ vim hbase-env.sh
Change line 28 to:
export JAVA_HOME=$(dirname $(dirname $(readlink $(readlink $(which javac)))))
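【7.2】Configure HBase (standalone)
1) HBase is configured the same way as Hadoop; all HBase configuration files are in /usr/local/hbase/hbase-2.4.15/conf.
2) In standalone mode, all daemons (HMaster, HRegionServer, ZooKeeper) run in a single JVM.
【7.2.1】Create the HBase root directories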
[hadoop@centos202 conf]$ sudo mkdir -p /hadoop/hbase/hfile
[hadoop@centos202 conf]$ sudo mkdir -p /hadoop/zookeeper
[hadoop@centos202 conf]$ sudo chown -R hadoop:hadoop /hadoop/
【7.2.2】Edit hbase-site.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>file:/hadoop/hbase/hfile</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/hadoop/zookeeper</value>
</property>
</configuration>
【Note】Unless hbase.rootdir is configured, HBase stores its data under /tmp/ by default.
【7.3】Start HBase
Start HBase:
[hadoop@centos202 conf]$ start-hbase.sh
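【Note】To install HBase on a cluster, see option 2 in https://computingforgeeks.com/how-to-install-apache-hadoop-hbase-on-centos-7/
【7.4】Managing HMaster and HRegionServer (for reference only)
The HMaster server controls the HBase cluster. You can start up to 9 backup HMaster servers, for a total of 10 including the primary.
An HRegionServer manages the data in its StoreFiles as directed by the HMaster. Normally one HRegionServer runs on each node of the cluster.
HMaster and HRegionServer instances are started and stopped with local-master-backup.sh and local-regionservers.sh respectively:
local-master-backup.sh start 2 # start a backup HMaster with port offset 2
local-regionservers.sh start 3 # start an additional RegionServer with port offset 3
local-regionservers.sh stop 3 # stop the RegionServer with port offset 3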
【Note】
Each HMaster uses two ports (16000 and 16010 by default).
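As described in the HBase reference guide, the numeric argument passed to local-master-backup.sh is a port offset that is added to both default ports. A small sketch of that arithmetic follows; master_ports is a hypothetical helper, not part of HBase.

```shell
# Default HMaster ports are 16000 and 16010; a backup adds its offset to both.
master_ports() {
  echo "$((16000 + $1)) $((16010 + $1))"
}

for offset in 2 3 5; do
  echo "offset $offset -> $(master_ports "$offset")"
done
```

So a backup HMaster started with offset 2 would listen on 16002 and 16012.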
【8】Start the HBase shell
Hadoop and HBase must be running before you start the HBase shell:
start-dfs.sh
start-yarn.sh
start-hbase.sh
【Note】start-all.sh can be used in place of start-dfs.sh and start-yarn.sh.
Start the HBase shell:
hbase shell
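For a quick smoke test without typing into the interactive prompt, the shell can also read commands from standard input in non-interactive mode (the -n flag). The sketch below writes a few commands to a scratch file and feeds them to hbase shell when the binary is on the PATH. The table name test and column family cf are illustrative.

```shell
# Commands for a minimal round trip: create a table, write a cell, read it back.
script=$(mktemp)
cat > "$script" <<'EOF'
create 'test', 'cf'
put 'test', 'row1', 'cf:a', 'value1'
scan 'test'
list
EOF

# -n runs the shell non-interactively; it exits when the input is exhausted.
if command -v hbase >/dev/null 2>&1; then
  hbase shell -n < "$script"
else
  echo "hbase not found on PATH; commands are in $script"
fi
```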
Stop HBase:
[hadoop@centos202 conf]$ stop-hbase.sh