What does rows_merged mean in compactionhistory?
Posted: 2015-02-18 10:04:14
When I issue
$ nodetool compactionhistory
I get
. . . compacted_at bytes_in bytes_out rows_merged
. . . 1404936947592 8096 7211 1:3, 3:1
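What does 1:3, 3:1 mean? The only documentation I can find is this, which states
the number of partitions merged
which does not explain why there are multiple values or what the colon means.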
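Answer 1:
Basically, each entry means sstables:rows. For example, 1:3, 3:1 means that 3 rows were each found in only one sstable (1:3) and 1 row was merged from 3 sstables (3:1), all of them going into the single sstable produced by that compaction. ("Rows" here are storage-level rows, i.e. partitions.)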
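To make the sstables:rows reading concrete, here is a minimal Python sketch that parses a rows_merged value into a mapping. The function name parse_rows_merged is made up for illustration and is not part of nodetool; the input format is assumed to be exactly what the question shows.

```python
def parse_rows_merged(value):
    """Parse a rows_merged string such as '1:3, 3:1' into {sstable_count: row_count}."""
    histogram = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        sstables, rows = entry.split(":")
        histogram[int(sstables)] = int(rows)
    return histogram


merged = parse_rows_merged("1:3, 3:1")
for sstables, rows in sorted(merged.items()):
    print(f"{rows} row(s) were each found in {sstables} sstable(s)")

# Distinct rows (partitions) the compaction processed.
distinct_rows = sum(merged.values())
# Row copies read across all input sstables.
row_copies_in = sum(s * r for s, r in merged.items())
print(distinct_rows, row_copies_in)
```

For the value in the question this gives 4 distinct rows built from 6 row copies.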
I tried it myself, so here is an example that I hope helps:
Create the keyspace and table:
cqlsh> create keyspace space1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
cqlsh> create TABLE space1.tb1 ( key text, val1 text, primary KEY (key));
cqlsh> INSERT INTO space1.tb1 (key, val1 ) VALUES ( 'key1','111');
cqlsh> INSERT INTO space1.tb1 (key, val1 ) VALUES ( 'key2','222');
cqlsh> INSERT INTO space1.tb1 (key, val1 ) VALUES ( 'key3','333');
cqlsh> INSERT INTO space1.tb1 (key, val1 ) VALUES ( 'key4','444');
cqlsh> INSERT INTO space1.tb1 (key, val1 ) VALUES ( 'key5','555');
cqlsh> exit
Now we flush to create the sstable
$ nodetool flush space1
We can see that only one sstable was created
$ sudo ls -lR /var/lib/cassandra/data/space1
/var/lib/cassandra/data/space1:
total 4
drwxr-xr-x. 2 cassandra cassandra 4096 Feb 3 12:51 tb1
/var/lib/cassandra/data/space1/tb1:
total 32
-rw-r--r--. 1 cassandra cassandra 43 Feb 3 12:51 space1-tb1-jb-1-CompressionInfo.db
-rw-r--r--. 1 cassandra cassandra 146 Feb 3 12:51 space1-tb1-jb-1-Data.db
-rw-r--r--. 1 cassandra cassandra 24 Feb 3 12:51 space1-tb1-jb-1-Filter.db
-rw-r--r--. 1 cassandra cassandra 90 Feb 3 12:51 space1-tb1-jb-1-Index.db
-rw-r--r--. 1 cassandra cassandra 4389 Feb 3 12:51 space1-tb1-jb-1-Statistics.db
-rw-r--r--. 1 cassandra cassandra 80 Feb 3 12:51 space1-tb1-jb-1-Summary.db
-rw-r--r--. 1 cassandra cassandra 79 Feb 3 12:51 space1-tb1-jb-1-TOC.txt
Checking with sstable2json, we see the data
$ sudo -u cassandra /usr/bin/sstable2json /var/lib/cassandra/data/space1/tb1/space1-tb1-jb-1-Data.db
[
{"key": "6b657935","columns": [["","",1422967847005000], ["val1","555",1422967847005000]]},
{"key": "6b657931","columns": [["","",1422967817740000], ["val1","111",1422967817740000]]},
{"key": "6b657934","columns": [["","",1422967840622000], ["val1","444",1422967840622000]]},
{"key": "6b657933","columns": [["","",1422967832341000], ["val1","333",1422967832341000]]},
{"key": "6b657932","columns": [["","",1422967825116000], ["val1","222",1422967825116000]]}
]
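The keys in the dump are the hex encoding of the key bytes. A small Python sketch, assuming the text keys are UTF-8 encoded, maps them back to the values that were inserted (decode_key is a hypothetical helper name):

```python
def decode_key(hex_key):
    """Turn a hex-encoded partition key from sstable2json back into text."""
    return bytes.fromhex(hex_key).decode("utf-8")


# Keys in the order they appear in the dump above.
hex_keys = ["6b657935", "6b657931", "6b657934", "6b657933", "6b657932"]
decoded = [decode_key(k) for k in hex_keys]
print(decoded)
```

The decoded order (key5, key1, key4, key3, key2) differs from the insertion order, presumably because rows are stored in token order rather than in the order they were written.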
At this point, "nodetool compactionhistory" shows nothing for this table, but let's run a compaction and see what we get (scroll right)
$ nodetool compactionhistory | awk 'NR == 2 || /space1/'
id keyspace_name columnfamily_name compacted_at bytes_in bytes_out rows_merged
5725f890-aba4-11e4-9f73-351725b0ac5b space1 tb1 1422968305305 146 146 1:5
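If you want to process this output in a script, note that rows_merged can itself contain spaces (for example "1:3, 3:1"), so a plain whitespace split is not enough. The sketch below is one way to handle it in Python; parse_history_line and COLUMNS are illustrative names, and it assumes compacted_at is a Unix timestamp in milliseconds, which agrees with the file times in the directory listings.

```python
from datetime import datetime, timezone

COLUMNS = ["id", "keyspace_name", "columnfamily_name",
           "compacted_at", "bytes_in", "bytes_out", "rows_merged"]


def parse_history_line(line):
    """Split one compactionhistory data line into a dict keyed by column name."""
    # Split on the first six whitespace runs only, so that the trailing
    # rows_merged value keeps its internal ", " separators.
    fields = line.strip().split(None, len(COLUMNS) - 1)
    row = dict(zip(COLUMNS, fields))
    for name in ("compacted_at", "bytes_in", "bytes_out"):
        row[name] = int(row[name])
    return row


row = parse_history_line(
    "5725f890-aba4-11e4-9f73-351725b0ac5b space1 tb1 1422968305305 146 146 1:5"
)
# Assumption: compacted_at is milliseconds since the Unix epoch.
when = datetime.fromtimestamp(row["compacted_at"] // 1000, tz=timezone.utc)
print(when.isoformat(), row["bytes_in"], row["bytes_out"], row["rows_merged"])
```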
Now let's delete two rows and then flush
cqlsh> delete from space1.tb1 where key='key1';
cqlsh> delete from space1.tb1 where key='key2';
cqlsh> exit
$ nodetool flush space1
$ sudo ls -l /var/lib/cassandra/data/space1/tb1/
[sudo] password for datastax:
total 64
-rw-r--r--. 1 cassandra cassandra 43 Feb 3 12:58 space1-tb1-jb-2-CompressionInfo.db
-rw-r--r--. 1 cassandra cassandra 146 Feb 3 12:58 space1-tb1-jb-2-Data.db
-rw-r--r--. 1 cassandra cassandra 336 Feb 3 12:58 space1-tb1-jb-2-Filter.db
-rw-r--r--. 1 cassandra cassandra 90 Feb 3 12:58 space1-tb1-jb-2-Index.db
-rw-r--r--. 1 cassandra cassandra 4393 Feb 3 12:58 space1-tb1-jb-2-Statistics.db
-rw-r--r--. 1 cassandra cassandra 80 Feb 3 12:58 space1-tb1-jb-2-Summary.db
-rw-r--r--. 1 cassandra cassandra 79 Feb 3 12:58 space1-tb1-jb-2-TOC.txt
-rw-r--r--. 1 cassandra cassandra 43 Feb 3 13:02 space1-tb1-jb-3-CompressionInfo.db
-rw-r--r--. 1 cassandra cassandra 49 Feb 3 13:02 space1-tb1-jb-3-Data.db
-rw-r--r--. 1 cassandra cassandra 16 Feb 3 13:02 space1-tb1-jb-3-Filter.db
-rw-r--r--. 1 cassandra cassandra 36 Feb 3 13:02 space1-tb1-jb-3-Index.db
-rw-r--r--. 1 cassandra cassandra 4413 Feb 3 13:02 space1-tb1-jb-3-Statistics.db
-rw-r--r--. 1 cassandra cassandra 80 Feb 3 13:02 space1-tb1-jb-3-Summary.db
-rw-r--r--. 1 cassandra cassandra 79 Feb 3 13:02 space1-tb1-jb-3-TOC.txt
Let's check the sstable contents
$ sudo -u cassandra /usr/bin/sstable2json /var/lib/cassandra/data/space1/tb1/space1-tb1-jb-2-Data.db
[
{"key": "6b657935","columns": [["","",1422967847005000], ["val1","555",1422967847005000]]},
{"key": "6b657931","columns": [["","",1422967817740000], ["val1","111",1422967817740000]]},
{"key": "6b657934","columns": [["","",1422967840622000], ["val1","444",1422967840622000]]},
{"key": "6b657933","columns": [["","",1422967832341000], ["val1","333",1422967832341000]]},
{"key": "6b657932","columns": [["","",1422967825116000], ["val1","222",1422967825116000]]}
]
$ sudo -u cassandra /usr/bin/sstable2json /var/lib/cassandra/data/space1/tb1/space1-tb1-jb-3-Data.db
[
{"key": "6b657931","metadata": {"deletionInfo": {"markedForDeleteAt":1422968551313000,"localDeletionTime":1422968551}},"columns": []},
{"key": "6b657932","metadata": {"deletionInfo": {"markedForDeleteAt":1422968553322000,"localDeletionTime":1422968553}},"columns": []}
]
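The two numbers in each deletionInfo entry are the same instant at different resolutions, as far as this dump shows: markedForDeleteAt looks like microseconds since the epoch and localDeletionTime like seconds. A quick Python check on the first tombstone, under that assumption:

```python
from datetime import datetime, timezone

# Values copied from the tombstone for key 6b657931 above.
tombstone = {"markedForDeleteAt": 1422968551313000, "localDeletionTime": 1422968551}

# Assumption: localDeletionTime is in seconds, markedForDeleteAt in microseconds.
deleted_at = datetime.fromtimestamp(tombstone["localDeletionTime"], tz=timezone.utc)
same_second = tombstone["markedForDeleteAt"] // 1_000_000 == tombstone["localDeletionTime"]
print(deleted_at.isoformat(), same_second)
```

The resulting time falls in the same minute as the 13:02 timestamp on the jb-3 files in the listing.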
Now let's compact
$ nodetool compact space1
Now there is only one sstable
$ sudo ls -l /var/lib/cassandra/data/space1/tb1/
total 32
-rw-r--r--. 1 cassandra cassandra 43 Feb 3 13:05 space1-tb1-jb-4-CompressionInfo.db
-rw-r--r--. 1 cassandra cassandra 133 Feb 3 13:05 space1-tb1-jb-4-Data.db
-rw-r--r--. 1 cassandra cassandra 656 Feb 3 13:05 space1-tb1-jb-4-Filter.db
-rw-r--r--. 1 cassandra cassandra 90 Feb 3 13:05 space1-tb1-jb-4-Index.db
-rw-r--r--. 1 cassandra cassandra 4429 Feb 3 13:05 space1-tb1-jb-4-Statistics.db
-rw-r--r--. 1 cassandra cassandra 80 Feb 3 13:05 space1-tb1-jb-4-Summary.db
-rw-r--r--. 1 cassandra cassandra 79 Feb 3 13:05 space1-tb1-jb-4-TOC.txt
Now let's check the contents of the new sstable; we can see the tombstones
$ sudo -u cassandra /usr/bin/sstable2json /var/lib/cassandra/data/space1/tb1/space1-tb1-jb-4-Data.db
[
{"key": "6b657935","columns": [["","",1422967847005000], ["val1","555",1422967847005000]]},
{"key": "6b657931","metadata": {"deletionInfo": {"markedForDeleteAt":1422968551313000,"localDeletionTime":1422968551}},"columns": []},
{"key": "6b657934","columns": [["","",1422967840622000], ["val1","444",1422967840622000]]},
{"key": "6b657933","columns": [["","",1422967832341000], ["val1","333",1422967832341000]]},
{"key": "6b657932","metadata": {"deletionInfo": {"markedForDeleteAt":1422968553322000,"localDeletionTime":1422968553}},"columns": []}
]
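Why do key1 and key2 come out as tombstones with empty columns? The following is a simplified Python model of the reconciliation, not Cassandra's actual code: a deletion shadows any cell whose write timestamp is not newer than markedForDeleteAt. The function name reconcile and the dictionaries are illustrative; the timestamps are copied from the dumps above.

```python
# Write timestamp (microseconds) of each row in the data sstable.
live_writes = {
    "key1": 1422967817740000,
    "key2": 1422967825116000,
    "key3": 1422967832341000,
    "key4": 1422967840622000,
    "key5": 1422967847005000,
}
# markedForDeleteAt (microseconds) of each row in the tombstone sstable.
deletions = {
    "key1": 1422968551313000,
    "key2": 1422968553322000,
}


def reconcile(live_writes, deletions):
    """Return {key: 'live' or 'tombstone'} using a last-write-wins comparison."""
    result = {}
    for key, written_at in live_writes.items():
        deleted_at = deletions.get(key)
        # The deletion wins when it is at least as recent as the write.
        if deleted_at is not None and deleted_at >= written_at:
            result[key] = "tombstone"
        else:
            result[key] = "live"
    return result


print(reconcile(live_writes, deletions))
```

The tombstones are still written to the new sstable rather than dropped, most likely because the table's gc_grace_seconds had not yet elapsed.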
Finally, let's check the compaction history (scroll right)
$ nodetool compactionhistory | awk 'NR == 2 || /space1/'
id keyspace_name columnfamily_name compacted_at bytes_in bytes_out rows_merged
5725f890-aba4-11e4-9f73-351725b0ac5b space1 tb1 1422968305305 146 146 1:5
46112600-aba5-11e4-9f73-351725b0ac5b space1 tb1 1422968706144 195 133 1:3, 2:2
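The last line can be reproduced by hand. The sketch below, with the hypothetical helper rows_merged_histogram, counts in how many input sstables each key appeared and then counts how many keys share each of those numbers, which is my reading of what rows_merged reports. The key lists come from the sstable2json dumps of jb-2 and jb-3.

```python
from collections import Counter

# Keys present in each input sstable of the second compaction.
sstables = {
    "jb-2": ["key5", "key1", "key4", "key3", "key2"],
    "jb-3": ["key1", "key2"],
}


def rows_merged_histogram(sstables):
    """Build a rows_merged-style string from the keys found in each input sstable."""
    # Number of input sstables in which each key appears.
    appearances = Counter(key for keys in sstables.values() for key in keys)
    # Number of keys that appear in exactly n sstables.
    histogram = Counter(appearances.values())
    return ", ".join(f"{n}:{count}" for n, count in sorted(histogram.items()))


print(rows_merged_histogram(sstables))

# bytes_in should equal the combined size of the two input Data.db files.
bytes_in = 146 + 49
print(bytes_in)
```

key3, key4 and key5 exist only in jb-2, giving 1:3, while key1 and key2 exist in both jb-2 and jb-3, giving 2:2. The Data.db sizes of 146 and 49 bytes add up to the reported bytes_in of 195.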
Comments:
Wow, this is a brilliant answer!
Couldn't agree more. Thanks!