MonetDB Slow Query with Index
【Posted】: 2014-03-15 21:59:24
【Question】: I created a 10 GB table (60M records) and, after loading the data, manually added a unique index on hidden_id. Even the simplest query takes about a minute to complete:
select hidden_id from netflow where hidden_id = 350000;
The following query took tens of minutes:
select * from netflow order by hidden_id limit 12500 offset 212500;
This really confuses me.
Below is the trace of the first query. Any idea why it is so slow?
trace select hidden_id from netflow where hidden_id = 350000;
+----------+------------------------------------------------------------------+
| ticks | stmt |
+==========+==================================================================+
| 3 | X_3 := sql.mvc(); |
| 15 | X_7=<tmp_2510>[69396995] := sql.bind(X_3=0,"sys","netflow","hidd |
: : en_id",0); :
| 227 | X_4:bat[:oid,:oid] =<tmp_13332>[69396995] := sql.tid(X_3=0,"sys" |
: : ,"netflow"); :
| 72978741 | X_36=<tmp_4053>[1] := algebra.subselect(X_7=<tmp_2510>[69396995] |
: : ,X_4=<tmp_13332>:bat[:oid,:oid][69396995],A0=350000:lng,A0=35000 :
: : 0:lng,true,true,false); :
| 17 | (X_10=<tmp_2175>[0],r1_10=<tmp_3416>[0]) := sql.bind(X_3=0,"sys" |
: : ,"netflow","hidden_id",2); :
| 14 | X_37=<tmp_13337>[0] := algebra.subselect(r1_10=<tmp_3416>[0],A0= |
: : 350000:lng,A0=350000:lng,true,true,false); :
| 6 | X_13=<tmp_3416>[0] := sql.bind(X_3=0,"sys","netflow","hidden_id" |
: : ,1); :
| 15 | X_38=<tmp_11053>[0] := algebra.subselect(X_13=<tmp_3416>[0],X_4= |
: : <tmp_13332>:bat[:oid,:oid][69396995],A0=350000:lng,A0=350000:lng :
: : ,true,true,false); :
| 4 | X_15=<tmp_4053>[1] := sql.subdelta(X_36=<tmp_4053>[1],X_4=<tmp_1 |
: : 3332>:bat[:oid,:oid][69396995],X_10=<tmp_2175>[0],X_37=<tmp_1333 :
: : 7>[0],X_38=<tmp_11053>[0]); :
| 20 | X_17=<tmp_11053>[1] := sql.projectdelta(X_15=<tmp_4053>[1],X_7=< |
: : tmp_2510>[69396995],X_10=<tmp_2175>[0],r1_10=<tmp_3416>[0],X_13= :
: : <tmp_3416>[0]); :
| 6 | X_18 := sql.resultSet(1,1,X_17=<tmp_11053>[1]); |
| 7 | sql.rsColumn(X_18=1,"sys.netflow","hidden_id","bigint",64,0,X_17 |
: : =<tmp_11053>[1]); :
| 2 | X_23 := io.stdout(); |
| 25 | sql.exportResult(X_23=="104d2":streams,X_18=1); |
| 1 | end s1_3; |
| 73011629 | X_5:void := user.s1_3(350000:lng); |
+----------+------------------------------------------------------------------+
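A quick way to read a trace like the one above is to rank the statements by their tick count. The sketch below is only an illustration: it parses three rows copied from the trace, skips continuation rows, and reports the most expensive step. The helper names `parse_trace` and `summarize` are hypothetical, and the conversion to seconds assumes that ticks are microseconds, which is consistent with the roughly one-minute runtime reported in the question.

```python
import re

# Three rows copied from the trace above. Continuation rows (those starting
# with ':') are left out, so long statements appear truncated.
TRACE_ROWS = """\
| 15 | X_7=<tmp_2510>[69396995] := sql.bind(X_3=0,"sys","netflow","hidd |
| 72978741 | X_36=<tmp_4053>[1] := algebra.subselect(X_7=<tmp_2510>[69396995] |
| 73011629 | X_5:void := user.s1_3(350000:lng); |
"""

# A data row looks like: | <ticks> | <statement text> |
ROW = re.compile(r"^\|\s*(\d+)\s*\|\s*(.*?)\s*\|$")


def parse_trace(text):
    """Return (ticks, statement) pairs for every data row in the trace text."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows.append((int(match.group(1)), match.group(2)))
    return rows


def summarize(rows):
    """Return (ticks, statement, share_of_total) for the most expensive step.

    The row that calls user.<plan name> wraps the whole query, so it is used
    as the total and excluded from the ranking.
    """
    total = next(t for t, s in rows if ":= user." in s)
    steps = [(t, s) for t, s in rows if ":= user." not in s]
    ticks, stmt = max(steps)
    return ticks, stmt, ticks / total


if __name__ == "__main__":
    ticks, stmt, share = summarize(parse_trace(TRACE_ROWS))
    # Assumption: one tick is one microsecond.
    print(f"{ticks / 1_000_000:.1f} s ({share:.1%} of total): {stmt}")
```

On these rows, nearly all of the time is spent in the first `algebra.subselect` over the 69,396,995-row `hidden_id` column; every other statement finishes in a few hundred ticks or fewer.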
This is how the table was created.
CREATE TABLE "netflow" (
"time_seconds" double DEFAULT NULL,
"parsed_date" timestamp DEFAULT NULL,
"date_time_str" varchar(45) DEFAULT NULL,
"ip_layer_protocol" bigint DEFAULT NULL,
"ip_layer_protocol_code" varchar(45) DEFAULT NULL,
"first_seen_src_ip" varchar(45) DEFAULT NULL,
"first_seen_dest_ip" varchar(45) DEFAULT NULL,
"first_seen_src_port" bigint DEFAULT NULL,
"first_seen_dest_port" bigint DEFAULT NULL,
"more_fragments" varchar(45) DEFAULT NULL,
"cont_fragments" varchar(45) DEFAULT NULL,
"duration_seconds" bigint DEFAULT NULL,
"first_seen_src_payload_bytes" bigint DEFAULT NULL,
"first_seen_dest_payload_bytes" bigint DEFAULT NULL,
"first_seen_src_total_bytes" bigint DEFAULT NULL,
"first_seen_dest_total_bytes" bigint DEFAULT NULL,
"first_seen_src_packet_count" bigint DEFAULT NULL,
"first_seen_dest_packet_count" bigint DEFAULT NULL,
"record_force_out" varchar(45) DEFAULT NULL
);
Update:
Platform: Windows 7, not running in parallel
MonetDB version: MonetDB 5 server v11.15.19 "Feb2013-SP6"
Storage description of the table:
select * from storage() where "table"='netflow';
+--------+---------+-------------------------------+-----------+----------+----------+-----------+------------+------------+---------+--------+
| schema | table | column | type | location | count | typewidth | columnsize | heapsize | indices | sorted |
+========+=========+===============================+===========+==========+==========+===========+============+============+=========+========+
| sys | netflow | time_seconds | double | 17\1711 | 69396995 | 8 | 555175960 | 0 | 0 | false |
| sys | netflow | parsed_date | timestamp | 20\2054 | 69396995 | 8 | 555175960 | 0 | 0 | false |
| sys | netflow | date_time_str | varchar | 07\734 | 69396995 | 21 | 277587980 | 2684354560 | 0 | false |
| sys | netflow | ip_layer_protocol | bigint | 62\6261 | 69396995 | 8 | 555175960 | 0 | 0 | false |
| sys | netflow | ip_layer_protocol_code | varchar | 62\6213 | 69396995 | 3 | 69396995 | 524288 | 0 | false |
| sys | netflow | first_seen_src_ip | varchar | 63\6342 | 69396995 | 11 | 138793990 | 524288 | 0 | false |
| sys | netflow | first_seen_dest_ip | varchar | 23\2324 | 69396995 | 8 | 138793990 | 524288 | 0 | false |
| sys | netflow | first_seen_src_port | bigint | 15\1574 | 69396995 | 8 | 555175960 | 0 | 0 | false |
| sys | netflow | first_seen_dest_port | bigint | 23\2370 | 69396995 | 8 | 555175960 | 0 | 0 | false |
| sys | netflow | more_fragments | varchar | 65\6521 | 69396995 | 1 | 69396995 | 524288 | 0 | false |
| sys | netflow | cont_fragments | varchar | 65\6524 | 69396995 | 1 | 69396995 | 524288 | 0 | false |
| sys | netflow | duration_seconds | bigint | 65\6560 | 69396995 | 8 | 555175960 | 0 | 0 | false |
| sys | netflow | first_seen_src_payload_bytes | bigint | 65\6561 | 69396995 | 8 | 555175960 | 0 | 0 | false |
| sys | netflow | first_seen_dest_payload_bytes | bigint | 65\6562 | 69396995 | 8 | 555175960 | 0 | 0 | false |
| sys | netflow | first_seen_src_total_bytes | bigint | 65\6563 | 69396995 | 8 | 555175960 | 0 | 0 | false |
| sys | netflow | first_seen_dest_total_bytes | bigint | 65\6564 | 69396995 | 8 | 555175960 | 0 | 0 | false |
| sys | netflow | first_seen_src_packet_count | bigint | 65\6565 | 69396995 | 8 | 555175960 | 0 | 0 | false |
| sys | netflow | first_seen_dest_packet_count | bigint | 65\6566 | 69396995 | 8 | 555175960 | 0 | 0 | false |
| sys | netflow | record_force_out | varchar | 65\6567 | 69396995 | 1 | 69396995 | 524288 | 0 | false |
| sys | netflow | hidden_id | bigint | 25\2510 | 69396995 | 8 | 555175960 | 0 | 0 | false |
| sys | netflow | index_id | oid | 73\7375 | 69396995 | 8 | 555175960 | 0 | 0 | true |
+--------+---------+-------------------------------+-----------+----------+----------+-----------+------------+------------+---------+--------+
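【Comments】:
How did you add the hidden_id column? Also: which version of MonetDB are you using? Did you compile it yourself?
One more thing: can you run this command: select * from storage() where "table" = 'netflow'; and post the result?
"alter table ips_sample add hidden_id bigint not null auto_increment;". After that, I updated the table to make it a unique index. I did not compile MonetDB myself; I am using the Windows installer version.
【Answer 1】: Note that the SQL statement CREATE INDEX is silently ignored in MonetDB. It creates the necessary hash indices itself when they are needed. Your first query triggers this.
I also noticed that you do not seem to be running in parallel. Please be sure to state the platform and the MonetDB version you are using.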
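To make the "index is built on first use" behaviour concrete, here is a toy model in Python. It is not MonetDB's implementation: the `Column` class and its `select_eq` method are hypothetical names invented for this sketch. It only shows the general pattern the answer describes, where the first equality lookup pays for one full pass over the column to build a hash table, and later lookups reuse it.

```python
class Column:
    """Toy column that builds a hash index lazily, on the first equality lookup."""

    def __init__(self, values):
        self.values = list(values)
        self._hash = None  # no index exists until a query needs one
        self.builds = 0    # counts how many times the index was built

    def select_eq(self, key):
        """Return the positions whose value equals key."""
        if self._hash is None:
            # First lookup: one full pass over the column to build the index.
            self._hash = {}
            for pos, value in enumerate(self.values):
                self._hash.setdefault(value, []).append(pos)
            self.builds += 1
        # Every lookup, including the first, is answered from the hash table.
        return self._hash.get(key, [])


if __name__ == "__main__":
    hidden_id = Column(range(1, 1_000_001))  # stand-in for the real column
    print(hidden_id.select_eq(350000))       # first call builds the index
    print(hidden_id.select_eq(350001))       # second call reuses it
    print(hidden_id.builds)
```

If the real system behaves like this model, running the same lookup a second time in the same server session should be much faster than the first run, which is an easy thing to check with another `trace`.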
【Comments】:
How can I tell whether I am running MonetDB queries in parallel?