HBase Getting-Started Notes
Posted master-dragon
HBase installation
- Configuration file
conf/hbase-env.sh
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_171.jdk/Contents/Home
export HBASE_CLASSPATH=/Users/mubi/hadoop/hbase-2.2.6/conf
export HBASE_MANAGES_ZK=true
- Configuration file
hbase-site.xml
<property>
<name>hbase.rootdir</name>
<value>file:///Users/mubi/hadoop/data/hbase</value>
</property>
<!-- Or, to store the data on HDFS instead: -->
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:9000/hbase</value>
</property>
- Environment variables
export HBASE_HOME=/Users/mubi/hadoop/hbase-2.2.6
export PATH=$PATH:$HBASE_HOME/bin
https://hbase.apache.org/book.html#datamodel
- Commands
./bin/start-hbase.sh
./bin/stop-hbase.sh
hbase shell
- Normal startup
mubi@mubideMacBook-Pro hbase-2.2.6 $ hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/mubi/hadoop/hadoop-2.7.1/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/mubi/hadoop/hbase-2.2.6/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.2.6, r88c9a386176e2c2b5fd9915d0e9d3ce17d0e456e, Tue Sep 15 17:36:14 CST 2020
Took 0.0027 seconds
hbase(main):001:0> list
TABLE
0 row(s)
Took 0.4003 seconds
=> []
hbase(main):002:0>
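HBase basics
HBase architecture
Read path
- First fetch the metadata to learn where the data is stored
- Then issue the actual read against memory and/or disk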
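The two steps above can be sketched as a toy model. This is only an illustration, not HBase client code: the region start keys, the server names (`rs-a` and so on) and the dictionary layout are assumptions made up for the example. It also assumes that a read picks the newest version by timestamp across memory and disk.

```python
from bisect import bisect_right

# Hypothetical stand-in for the metadata: sorted region start keys and the
# server hosting each region. A real cluster keeps this in the meta table.
REGION_START_KEYS = ["", "201602", "201603"]
REGION_SERVERS = ["rs-a", "rs-b", "rs-c"]


def locate_region(row_key):
    """Step 1: use the metadata to find which region holds row_key."""
    index = bisect_right(REGION_START_KEYS, row_key) - 1
    return index, REGION_SERVERS[index]


def read_cell(region, column):
    """Step 2: look in memory and on disk, return the newest value found."""
    candidates = []
    for source in [region["memstore"], *region["store_files"]]:
        if column in source:
            candidates.append(source[column])  # (timestamp, value)
    if not candidates:
        return None
    return max(candidates, key=lambda cell: cell[0])[1]
```

For example, `locate_region("201602")` picks the second region, and `read_cell` prefers whichever copy of a column carries the larger timestamp.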
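Write path
A write is first appended to the HLog (the write-ahead log) and placed in the in-memory buffer (MemStore); the buffered data is later persisted to disk.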
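A minimal sketch of that ordering, under the assumption that the log is written before the in-memory buffer. `ToyRegion` and its fields are hypothetical names for this illustration, and a plain list stands in for the log on durable storage.

```python
class ToyRegion:
    """Toy model of the write path: log first, buffer second, flush later."""

    def __init__(self):
        self.wal = []          # stands in for the HLog
        self.memstore = {}     # in-memory buffer
        self.store_files = []  # immutable files produced by flushes

    def put(self, row, column, value, timestamp):
        edit = (row, column, timestamp, value)
        self.wal.append(edit)                            # 1. log the edit
        self.memstore[(row, column, timestamp)] = value  # 2. buffer it

    def flush(self):
        """Persist the buffer as one sorted file and start a fresh buffer."""
        if self.memstore:
            self.store_files.append(dict(sorted(self.memstore.items())))
            self.memstore = {}
```

Because the edit is in the log before it is buffered, the buffer could in principle be rebuilt from the log if the process died before a flush.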
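Flush
Settings in hbase-default.xml
- hbase.regionserver.global.memstore.size: the maximum total size of all MemStores in a RegionServer before flushes to StoreFiles are forced; default 40% of the heap (0.4)
- hbase.regionserver.optionalcacheflushinterval: the maximum time edits may stay in a MemStore of any region before a flush is triggered; default 1 hour
The two triggers above are evaluated for the RegionServer as a whole and can flush MemStores across its regions.
- hbase.hregion.memstore.flush.size: when a single MemStore reaches this size, that MemStore is flushed; default 128 MB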
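The three triggers can be summarised as a small decision function. This is a rough sketch using the default values listed above. The function name, its parameters and the use of `>=` for every comparison are assumptions for illustration, not the actual RegionServer logic.

```python
FLUSH_SIZE_BYTES = 128 * 1024 * 1024  # hbase.hregion.memstore.flush.size
GLOBAL_MEMSTORE_FRACTION = 0.4        # hbase.regionserver.global.memstore.size
FLUSH_INTERVAL_MS = 60 * 60 * 1000    # hbase.regionserver.optionalcacheflushinterval


def flush_reasons(memstore_bytes, all_memstores_bytes, heap_bytes, oldest_edit_age_ms):
    """Return the names of the flush triggers that currently apply."""
    reasons = []
    if memstore_bytes >= FLUSH_SIZE_BYTES:
        reasons.append("memstore size")
    if all_memstores_bytes >= GLOBAL_MEMSTORE_FRACTION * heap_bytes:
        reasons.append("global memstore size")
    if oldest_edit_age_ms >= FLUSH_INTERVAL_MS:
        reasons.append("flush interval")
    return reasons
```

With an 8 GiB heap, for instance, the global trigger in this model fires once all MemStores together hold about 3.2 GiB.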
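Compaction: merging small files
A MemStore is sometimes flushed while it holds very little data, so small files can accumulate on disk; these need to be merged periodically.
- hbase.hregion.majorcompaction: the interval between automatic major compactions; default 7 days. The operation is very resource-intensive, so in production the automatic run is usually disabled (set to 0) and a major compaction is triggered manually during idle periods
- hbase.hstore.compactionThreshold: when the number of StoreFiles in a Store (one column family of a region) exceeds this value, they are merged automatically; default 3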
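The threshold-based merge can be sketched as follows. It is a simplification: here every file is merged into one, whereas real compactions choose which files to merge. `maybe_compact` and the list-of-sorted-tuples file format are hypothetical.

```python
import heapq

COMPACTION_THRESHOLD = 3  # default of hbase.hstore.compactionThreshold


def maybe_compact(store_files):
    """Merge the sorted files into one if their count exceeds the threshold."""
    if len(store_files) <= COMPACTION_THRESHOLD:
        return store_files
    # Each file is already sorted, so a k-way merge keeps the result sorted.
    merged = list(heapq.merge(*store_files))
    return [merged]
```

Each store file is modelled as a sorted list of `(key, value)` tuples, so the merged file stays sorted without a full re-sort.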
HBase data model
Basic table operations
Create and list a table
hbase(main):002:0> create 'student','info','course'
Created table student
Took 1.3815 seconds
=> Hbase::Table - student
hbase(main):003:0>
hbase(main):004:0* list
TABLE
student
1 row(s)
Took 0.0295 seconds
=> ["student"]
hbase(main):005:0>
Alter / describe the table schema
hbase(main):002:0> alter 'student','NAME'=>'course','VERSIONS'=>'3'
Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 2.2785 seconds
hbase(main):003:0> desc 'student'
Table student is ENABLED
student
COLUMN FAMILIES DESCRIPTION
NAME => 'course', VERSIONS => '3', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION
_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'fals
e', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICA
TION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMO
RY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'fals
e', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'
NAME => 'info', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_B
EHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false'
, DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATI
ON_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY
=> 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false'
, COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'
2 row(s)
QUOTAS
0 row(s)
Took 0.1168 seconds
hbase(main):004:0>
Insert data
Column family info
Columns: name, age, sex, dept
Column family course
Columns: english, math, physics
put 'student','201601','info:name','liu',4
put 'student','201601','info:age',15
put 'student','201601','info:sex','nv'
put 'student','201601','info:dept','PE'
put 'student','201602','info:name','wang'
put 'student','201602','info:age',16,7
put 'student','201602','info:sex','nan'
put 'student','201602','info:dept','PC'
put 'student','201603','info:name','sun',6
put 'student','201603','info:age',19
put 'student','201603','info:sex','nv'
put 'student','201603','info:dept','JAVA'
put 'student','201601','course:english',72,3
put 'student','201601','course:math',79
put 'student','201601','course:physics',82
put 'student','201602','course:english',62
put 'student','201602','course:math',68,8
put 'student','201602','course:physics',49
put 'student','201603','course:english',73,8
put 'student','201603','course:math',69
put 'student','201603','course:physics',48,6
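In the put commands above, an optional trailing number is an explicit timestamp; without it the current time in milliseconds is used. The data model behind this can be pictured as a map from (row, column) to timestamped versions. The sketch below is a toy model only: `ToyTable` is a hypothetical name, and it prunes old versions immediately on write, which is an assumption made to keep the example short.

```python
import time


class ToyTable:
    """Toy model: (row, 'family:qualifier') -> {timestamp: value}."""

    def __init__(self, max_versions):
        # max_versions maps a column family name to the versions it keeps,
        # e.g. {"info": 1, "course": 3} as in the student table.
        self.max_versions = max_versions
        self.cells = {}

    def put(self, row, column, value, timestamp=None):
        if timestamp is None:
            timestamp = int(time.time() * 1000)  # current time in ms
        family = column.split(":", 1)[0]
        versions = self.cells.setdefault((row, column), {})
        versions[timestamp] = str(value)
        # Drop everything older than the newest max_versions timestamps.
        for old in sorted(versions, reverse=True)[self.max_versions[family]:]:
            del versions[old]

    def get(self, row, column):
        """Return (timestamp, value) of the newest version, or None."""
        versions = self.cells.get((row, column), {})
        if not versions:
            return None
        newest = max(versions)
        return newest, versions[newest]
```

A plain get returns only the newest version, which is why a later put with a larger timestamp looks like an update even though the older version may still be stored.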
View data with get
hbase(main):004:0> get 'student','201601'
COLUMN CELL
course:english timestamp=3, value=72
course:math timestamp=1610862600400, value=79
course:physics timestamp=1610862683257, value=82
info:age timestamp=1610862599809, value=15
info:dept timestamp=1610862599915, value=PE
info:name timestamp=4, value=liu
info:sex timestamp=1610862599862, value=nv
1 row(s)
Took 0.0528 seconds
hbase(main):005:0> get 'student','201602'
COLUMN CELL
course:physics timestamp=1610862689724, value=49
info:age timestamp=7, value=16
info:dept timestamp=1610862600086, value=PC
info:name timestamp=1610862599977, value=wang
info:sex timestamp=1610862600046, value=nan
1 row(s)
Took 0.0257 seconds
hbase(main):006:0> get 'student','201603'
COLUMN CELL
course:english timestamp=8, value=73
course:math timestamp=1610862600605, value=69
course:physics timestamp=6, value=48
info:age timestamp=1610862600198, value=19
info:dept timestamp=1610862600282, value=JAVA
info:name timestamp=6, value=sun
info:sex timestamp=1610862600239, value=nv
1 row(s)
Took 0.0182 seconds
hbase(main):007:0>
Update data with put
hbase(main):006:0> get 'student','201603'
COLUMN CELL
course:english timestamp=8, value=73
course:math timestamp=1610862600605, value=69
course:physics timestamp=6, value=48
info:age timestamp=1610862600198, value=19
info:dept timestamp=1610862600282, value=JAVA
info:name timestamp=6, value=sun
info:sex timestamp=1610862600239, value=nv
1 row(s)
Took 0.0182 seconds
hbase(main):007:0> put 'student','201603','course:physics',60
Took 0.0085 seconds
hbase(main):008:0> get 'student','201603'
COLUMN CELL
course:english timestamp=8, value=73
course:math timestamp=1610862600605, value=69
course:physics timestamp=1610862848284, value=60
info:age timestamp=1610862600198, value=19
info:dept timestamp=1610862600282, value=JAVA
info:name timestamp=6, value=sun
info:sex timestamp=1610862600239, value=nv
1 row(s)
Took 0.0361 seconds
hbase(main):009:0>
Query with get
hbase(main):012:0> get 'student','201603'
COLUMN CELL
course:english timestamp=8, value=73
course:math timestamp=1610862600605, value=69
course:physics timestamp=1610862848284, value=60
info:age timestamp=1610862600198, value=19
info:dept timestamp=1610862600282, value=JAVA
info:name timestamp=6, value=sun
info:sex timestamp=1610862600239, value=nv
1 row(s)
Took 0.0078 seconds
hbase(main):013:0> get 'student','201603',COLUMN=>'course',TIMERANGE=>[5,8]
COLUMN CELL
course:physics timestamp=6, value=48
1 row(s)
Took 0.0118 seconds
hbase(main):014:0>
hbase(main):014:0> get 'student','201603',COLUMN=>'course',TIMERANGE=>[7,8]
COLUMN CELL
0 row(s)
Took 0.0051 seconds
hbase(main):015:0>
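The two TIMERANGE results above are consistent with a half-open interval: the lower bound is included and the upper bound is excluded. So [5,8] matches timestamp 6 but not 8, and [7,8] matches nothing. A small sketch of that check, using the course cells of row 201603 from the output above (the function names are made up for the example):

```python
def in_time_range(timestamp, time_range):
    """Half-open check: min_stamp <= timestamp < max_stamp."""
    min_stamp, max_stamp = time_range
    return min_stamp <= timestamp < max_stamp


# Stored versions of row 201603 in family course, including the older
# physics value that is kept because course holds up to 3 versions.
COURSE_CELLS = [
    ("course:english", 8, "73"),
    ("course:math", 1610862600605, "69"),
    ("course:physics", 1610862848284, "60"),
    ("course:physics", 6, "48"),
]


def get_with_time_range(cells, time_range):
    """Keep only the cells whose timestamp falls inside the range."""
    return [cell for cell in cells if in_time_range(cell[1], time_range)]
```

To include the english cell with timestamp 8, the upper bound would have to be at least 9.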
Query with scan
hbase(main):015:0> scan 'student',COLUMN => 'info:name'
ROW COLUMN+CELL
201601 column=info:name, timestamp=4, value=liu
201602 column=info:name, timestamp=1610862599977, value=wang
201603 column=info:name, timestamp=6, value=sun
3 row(s)
Took 0.0206 seconds
hbase(main):016:0> scan 'student',COLUMN => 'info:dept'
ROW COLUMN+CELL
201601 column=info:dept, timestamp=1610862599915, value=PE
201602 column=info:dept, timestamp=1610862600086, value=PC
201603 column=info:dept, timestamp=1610862600282, value=JAVA
3 row(s)
Took 0.0107 seconds
hbase(main):017:0>
hbase(main):027:0> scan 'student',COLUMN => 'course'
ROW COLUMN+CELL
201601 column=course:english, timestamp=3, value=72
201601 column=course:math, timestamp=1610862600400, value=79
201601 column=course:physics, timestamp=1610862683257, value=82
201602 column=course:physics, timestamp=1610862689724, value=49
201603 column=course:english, timestamp=8, value=73
201603 column=course:math, timestamp=1610862600605, value=69
201603 column=course:physics, timestamp=1610862848284, value=60
3 row(s)
Took 0.0120 seconds
Filters
RowFilter
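RowFilter keeps the rows whose row key satisfies a comparison. As a rough model of the `substring` comparator used in this section, the sketch below keeps every row key that contains the given text. The function name is hypothetical, and the case-insensitive match is an assumption about the comparator's behaviour.

```python
ROW_KEYS = ["201601", "201602", "201603"]


def row_filter_substring(row_keys, needle):
    """Keep the row keys that contain needle, ignoring case."""
    needle = needle.lower()
    return [key for key in row_keys if needle in key.lower()]
```

Since every row key in the student table contains the character 2, a filter on `substring:2` lets all three rows through.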
hbase(main):020:0> scan 'student',FILTER=>"RowFilter(=,'substring:2')"
ROW COLUMN+CELL
201601 column=course:english, timestamp=3, value=72
201601 column=course:math, timestamp=1610862600400, value=79
201601 column=course:physics, timestamp=1610862683257, value=82
201601 column=info:age, timestamp=1610862599809, value=15
201601 column=info:dept, timestamp=1610862599915, value=PE
201601 column=info:name, timestamp=4, value=liu
201601 column=info:sex, timestamp=1610862599862, value=nv
201602 column=course:physics, timestamp=1610862689724, value=49
201602 column=info:age, timestamp=7, value=16