HBase Shell Command Summary
Posted by fightingnoob
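HBase is a column-oriented NoSQL database, so its commands differ from those of a traditional relational database. The hbase shell command line is a good way to get familiar with the basic commands.
First, start the HBase shell: $HBASE_HOME/bin/hbase shell
Type help to see the basic command set. The most commonly used commands are:
whoami: show the current user
help: list the basic commands
help 'command': show help for a single command
list: list all tables
status: show the status of the running servers
version: show the HBase version
exists 'table_name': check whether a table exists
To delete characters in the hbase shell, use Ctrl + Backspace (pressing the delete key alone only removes what follows the cursor).
Some commonly used HBase commands are summarized below: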
1. DDL
i. [Namespace]
1) Create a namespace
create_namespace 'ns1'
2) Describe a namespace
describe_namespace 'ns1'
3) Drop a namespace
drop_namespace 'ns1'
ii. [Table]
1) Create a table (this example has two column families)
create 'ns1:t1', {NAME => 'f1', VERSIONS => 5}, {NAME => 'f2'}
or: create 'ns1:t1', 'f1', 'f2'
2) Alter the table schema (disable -> alter -> enable)
disable 't1'
alter 't1', {NAME => 'f1'}, {NAME => 'f2', METHOD => 'delete'}
enable 't1'
3) Drop a table (disable -> drop)
disable 't1'
drop 't1'
4) Describe a table
describe 't1'
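A note on the HBase column family attributes (attribute names must be upper case when specified):
NAME: column family name
BLOOMFILTER: Bloom filter used to skip store files on reads; usually one of:
  ROW: filter on the row key
  ROWCOL: filter on the row key plus column
  NONE: no Bloom filter
VERSIONS: number of versions to keep
MIN_VERSIONS: minimum number of versions to keep
TTL: time to live of a version, in seconds; the default is FOREVER
BLOCKSIZE: data block size
IN_MEMORY: gives the column family higher priority (true/false)
BLOCKCACHE: data block caching; usually set to true for frequently used column families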
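To make the VERSIONS attribute concrete, here is a minimal sketch in plain Ruby (the language the HBase shell is built on). It is a toy in-memory model, not HBase code: the class ToyCell and its methods are hypothetical names, and it prunes old versions immediately, whereas a real store cleans them up later.

```ruby
# Toy in-memory model of a single cell; this is NOT HBase code.
# ToyCell, put and get are hypothetical names used only for illustration.
class ToyCell
  def initialize(max_versions)
    @max_versions = max_versions
    @versions = {} # timestamp => value
  end

  # Store a value at a timestamp, then keep only the newest max_versions entries.
  def put(timestamp, value)
    @versions[timestamp] = value
    newest = @versions.keys.sort.reverse.first(@max_versions)
    @versions.select! { |ts, _| newest.include?(ts) }
    self
  end

  # Return up to `limit` versions, newest first, as [timestamp, value] pairs.
  def get(limit = 1)
    @versions.sort_by { |ts, _| -ts }.first(limit)
  end
end

cell = ToyCell.new(2) # behaves like a family created with VERSIONS => 2
cell.put(100, 'a').put(200, 'b').put(300, 'c')
p cell.get(5) # prints [[300, "c"], [200, "b"]]
```

Asking for five versions still returns only two, because the toy cell was created with a limit of two.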
2. DML
i. [put]
Write data to a row (each put writes one cell)
Put a cell 'value' at specified table/row/column and optionally timestamp coordinates. To put a cell value into table 'ns1:t1' or 't1' at row 'r1' under column 'c1' marked with the time 'ts1', do:
hbase> put 'ns1:t1', 'r1', 'c1', 'value'
hbase> put 't1', 'r1', 'c1', 'value'
hbase> put 't1', 'r1', 'c1', 'value', ts1
hbase> put 't1', 'r1', 'c1', 'value', {ATTRIBUTES=>{'mykey'=>'myvalue'}}
hbase> put 't1', 'r1', 'c1', 'value', ts1, {ATTRIBUTES=>{'mykey'=>'myvalue'}}
hbase> put 't1', 'r1', 'c1', 'value', ts1, {VISIBILITY=>'PRIVATE|SECRET'}
The same commands also can be run on a table reference. Suppose you had a reference t to table 't1', the corresponding command would be:
hbase> t.put 'r1', 'c1', 'value', ts1, {ATTRIBUTES=>{'mykey'=>'myvalue'}}
ii. [get]
Fetch one row; the column family, column, and versions can be specified
Get row or cell contents; pass table name, row, and optionally a dictionary of column(s), timestamp, timerange and versions. Examples:
hbase> get 'ns1:t1', 'r1'
hbase> get 't1', 'r1'
hbase> get 't1', 'r1', {TIMERANGE => [ts1, ts2]}
hbase> get 't1', 'r1', {COLUMN => 'c1'}
hbase> get 't1', 'r1', {COLUMN => ['c1', 'c2', 'c3']}
hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
hbase> get 't1', 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
hbase> get 't1', 'r1', {FILTER => "ValueFilter(=, 'binary:abc')"}
hbase> get 't1', 'r1', 'c1'
hbase> get 't1', 'r1', 'c1', 'c2'
hbase> get 't1', 'r1', ['c1', 'c2']
hbase> get 't1', 'r1', {COLUMN => 'c1', ATTRIBUTES => {'mykey'=>'myvalue'}}
hbase> get 't1', 'r1', {COLUMN => 'c1', AUTHORIZATIONS => ['PRIVATE','SECRET']}
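The ts1 and ts2 placeholders in the help text are cell timestamps. By default HBase stamps cells with the server's current time in epoch milliseconds, and a TIMERANGE includes its lower bound and excludes its upper bound. The sketch below is plain Ruby and can be pasted into the shell; timerange_ending_at is a hypothetical helper name, and it assumes the table uses the default millisecond timestamps.

```ruby
# Hypothetical helper: build a [min, max) pair of epoch-millisecond timestamps
# covering the `seconds` leading up to `now`.
# Assumes cells carry the default server-assigned millisecond timestamps.
def timerange_ending_at(now, seconds)
  max_ms = (now.to_f * 1000).to_i
  [max_ms - seconds * 1000, max_ms]
end

ts1, ts2 = timerange_ending_at(Time.now, 3600)
# Print a get command restricted to cells written during the last hour.
puts "get 't1', 'r1', {COLUMN => 'c1', TIMERANGE => [#{ts1}, #{ts2}]}"
```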
iii. [scan]
Scan a table, optionally restricting the rows returned with filter conditions
Scan a table; pass table name and optionally a dictionary of scanner specifications. Scanner specifications may include one or more of: TIMERANGE, FILTER, LIMIT, STARTROW, STOPROW, TIMESTAMP, MAXLENGTH, or COLUMNS, CACHE
If no columns are specified, all columns will be scanned. To scan all members of a column family, leave the qualifier empty as in 'col_family:'.
The filter can be specified in two ways:
1. Using a filterString - more information on this is available in the Filter Language document attached to the HBASE-4176 JIRA
2. Using the entire package name of the filter.
Some examples:
hbase> scan 'hbase:meta'
hbase> scan 'hbase:meta', {COLUMNS => 'info:regioninfo'}
hbase> scan 'ns1:t1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
hbase> scan 't1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
hbase> scan 't1', {COLUMNS => 'c1', TIMERANGE => [1303668804, 1303668904]}
hbase> scan 't1', {REVERSED => true}
hbase> scan 't1', {FILTER => "(PrefixFilter ('row2') AND (QualifierFilter (>=, 'binary:xyz'))) AND (TimestampsFilter ( 123, 456))"}
hbase> scan 't1', {FILTER => org.apache.hadoop.hbase.filter.ColumnPaginationFilter.new(1, 0)}
For setting the Operation Attributes
hbase> scan 't1', { COLUMNS => ['c1', 'c2'], ATTRIBUTES => {'mykey' => 'myvalue'}}
hbase> scan 't1', { COLUMNS => ['c1', 'c2'], AUTHORIZATIONS => ['PRIVATE','SECRET']}
For experts, there is an additional option -- CACHE_BLOCKS -- which switches block caching for the scanner on (true) or off (false). By default it is enabled. Examples:
hbase> scan 't1', {COLUMNS => ['c1', 'c2'], CACHE_BLOCKS => false}
Also for experts, there is an advanced option -- RAW -- which instructs the scanner to return all cells (including delete markers and uncollected deleted cells). This option cannot be combined with requesting specific COLUMNS. Disabled by default. Example:
hbase> scan 't1', {RAW => true, VERSIONS => 10}
Besides the default 'toStringBinary' format, 'scan' supports custom formatting by column. A user can define a FORMATTER by adding it to the column name in the scan specification. The FORMATTER can be stipulated:
1. either as a org.apache.hadoop.hbase.util.Bytes method name (e.g, toInt, toString)
2. or as a custom class followed by method name: e.g. 'c(MyFormatterClass).format'.
Example formatting cf:qualifier1 and cf:qualifier2 both as Integers:
hbase> scan 't1', {COLUMNS => ['cf:qualifier1:toInt', 'cf:qualifier2:c(org.apache.hadoop.hbase.util.Bytes).toInt'] }
Example: scan 'stu_info', {COLUMNS => 'info:name', STARTROW => '20180525_10001', STOPROW => '20180525_10005'}
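A detail that is easy to miss in a range scan like the one above: rows come back in lexicographic row-key order, the start row is included, and by default the stop row is excluded. The following plain Ruby sketch models only that behaviour on an array of keys. toy_scan is a hypothetical name and nothing here talks to HBase.

```ruby
# Toy model of a row-key range scan; this is NOT HBase code.
# Keys are compared as strings, the start row is inclusive and the stop row
# is exclusive, mirroring the default scan semantics.
def toy_scan(row_keys, startrow: nil, stoprow: nil, limit: nil)
  rows = row_keys.sort.select do |key|
    (startrow.nil? || key >= startrow) && (stoprow.nil? || key < stoprow)
  end
  limit ? rows.first(limit) : rows
end

keys = %w[20180525_10003 20180525_10001 20180525_10005 20180525_10004 20180524_09999]
p toy_scan(keys, startrow: '20180525_10001', stoprow: '20180525_10005')
# prints ["20180525_10001", "20180525_10003", "20180525_10004"]
```

Row 20180525_10005 is not returned, so a scan that should include it needs a stop row that sorts just after it.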
iv. [count]
Count the rows in a table. By default the running count is printed every 1000 rows and the scan cache size is 10 rows; if your rows are small, consider raising the cache size.
Count the number of rows in a table. Return value is the number of rows. This operation may take a LONG time (Run '$HADOOP_HOME/bin/hadoop jar hbase.jar rowcount' to run a counting mapreduce job). Current count is shown every 1000 rows by default. Count interval may be optionally specified. Scan caching is enabled on count scans by default. Default cache size is 10 rows. If your rows are small in size, you may want to increase this parameter. Examples:
hbase> count 'ns1:t1'
hbase> count 't1'
hbase> count 't1', INTERVAL => 100000
hbase> count 't1', CACHE => 1000
hbase> count 't1', INTERVAL => 10, CACHE => 1000
v. [deleteall]
Delete a row
Delete all cells in a given row; pass a table name, row, and optionally a column and timestamp. Examples:
hbase> deleteall 'ns1:t1', 'r1'
hbase> deleteall 't1', 'r1'
hbase> deleteall 't1', 'r1', 'c1'
hbase> deleteall 't1', 'r1', 'c1', ts1
hbase> deleteall 't1', 'r1', 'c1', ts1, {VISIBILITY=>'PRIVATE|SECRET'}
If no column is specified, the whole row is deleted.
3. Region management
i. [Move a region] move
Move a region. Optionally specify target regionserver else we choose one at random. NOTE: You pass the encoded region name, not the region name so this command is a little different to the others. The encoded region name is the hash suffix on region names: e.g. if the region name were
TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396.
then the encoded region name portion is
527db22f95c8a9e0116f0cc13c680396
A server name is its host, port plus startcode. For example:
host187.example.com,60020,1289493121758
Examples:
hbase> move 'ENCODED_REGIONNAME'
hbase> move 'ENCODED_REGIONNAME', 'SERVER_NAME'
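Because move takes the encoded name rather than the full region name, it can help to extract the hash suffix programmatically. Here is a minimal plain Ruby sketch, assuming region names follow the format shown in the help text (the encoded name sits between the last two dots). encoded_region_name is a hypothetical helper, not a shell command.

```ruby
# Hypothetical helper: extract the encoded region name (the hash suffix)
# from a full region name of the form 'table,startKey,id.encodedName.'.
def encoded_region_name(region_name)
  trimmed = region_name.chomp('.') # drop the trailing dot, if present
  dot = trimmed.rindex('.')
  raise ArgumentError, "no encoded name in #{region_name.inspect}" if dot.nil?
  trimmed[(dot + 1)..-1]
end

full = 'TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396.'
puts encoded_region_name(full) # prints 527db22f95c8a9e0116f0cc13c680396
```

The printed value is what would be passed as 'ENCODED_REGIONNAME' to move.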
ii. [Enable/disable the balancer] balance_switch
Enable/Disable balancer. Returns previous balancer state. Examples:
hbase> balance_switch true
hbase> balance_switch false
iii. [Manual split] split
Split entire table or pass a region to split individual region. With the second parameter, you can specify an explicit split key for the region. Examples:
split 'tableName'
split 'namespace:tableName'
split 'regionName' # format: 'tableName,startKey,id'
split 'tableName', 'splitKey'
split 'regionName', 'splitKey'
iv. [Manually trigger a major compaction] major_compact
Run major compaction on passed table or pass a region row to major compact an individual region. To compact a single column family within a region specify the region name followed by the column family name. Examples:
Compact all regions in a table:
hbase> major_compact 't1'
hbase> major_compact 'ns1:t1'
Compact an entire region:
hbase> major_compact 'r1'
Compact a single column family within a region:
hbase> major_compact 'r1', 'c1'
Compact a single column family within a table:
hbase> major_compact 't1', 'c1'
This post is drawn mainly from the hbase shell help output and is only a personal summary kept for later reference.