使用TPC-H 进行GreatSQL并行查询测试
Posted GreatSQL
tags:
篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了使用TPC-H 进行GreatSQL并行查询测试相关的知识,希望对你有一定的参考价值。
准备工作
数据库版本
生成数据
使用 TPC-H 生成数据
#TPC-H Population Generator (Version 3.0.0)
#生成10G的数据
$ ./dbgen -vf -s 10
修改my.cnf
vim /etc/my.cnf
#设置IPB为8G
innodb_buffer_pool_size = 8G
#设置并行查询的使用最大内存(此处为8G,根据具体配置设置)
parallel_memory_limit= 8G
#打开并行查询
force_parallel_execute=1
#设置双1(方便导入数据)
innodb_flush_log_at_trx_commit = 1
sync_binlog = 1
#关闭binlog
skip-log_bin
datadir = /data/GreatSQL
socket = mysql.sock
启动数据库后,可以检查配置是否生效
mysql> show variables like \'%double%\';
mysql> show variables like \'log_bin\';
mysql> show variables like \'sync_binlog\';
mysql> show variables like \'innodb_flush_log_at_trx_commit\';
mysql> show variables like \'innodb_buffer_pool_size\';
并行查询相关参数
mysql> show global variables like \'%parall%\';
+----------------------------------+----------------+
| force_parallel_execute | ON |
| innodb_parallel_dblwr_encrypt | OFF |
| innodb_parallel_doublewrite_path | xb_doublewrite |
| innodb_parallel_read_threads | 4 |
| parallel_cost_threshold | 1000 |
| parallel_default_dop | 4 |
| parallel_max_threads | 64 |
| parallel_memory_limit | 8589934592 |
| parallel_queue_timeout | 0 |
| slave_parallel_type | LOGICAL_CLOCK |
| slave_parallel_workers | 2 |
+----------------------------------+----------------+
11 rows in set (0.01 sec)
启动数据库
启动数据库:
$ systemctl start greatsql.service
文件准备
本次的工作在/data/tpch
可执行程序为dbgen,依赖一个数据分布文件dists.dss。可以将dbgen和dists.dss拷贝到同一目录使用
dss.ddl
和 dss.ri
文件
准备表结构和索引文件 dss.ddl
和 dss.ri
到工作目录
$ cd /data/tpch/tpch_2.18.0/dbgen
$ cp dss.ri /data/tpch/
$ cp dss.ddl /data/tpch/
load.sql
修改 load.sql 文件 修改文件的路径
$ cd /data/tpch/
$ cp load.sql loadfix.sql
$ vim loadfix.sql
导入数据
$ /usr/localGreatSQL-8.0.25-17/bin/mysql -uroot -S /data/GreatSQL/mysql.sock
#创建数据库
mysql> create database tpch;
mysql> use tpch;
#导入表结构
mysql> source /data/tpch/dss.ddl;
#导入数据(文件见附录)
mysql> sh loadfile
#导入索引、外键等
mysql> source /data/tpch/dssfix.ri
注:binlog要关再导入,否则binlog会爆
Multi-statement transaction required more than \'max_binlog_cache_size\' bytes of storage; increase this mysqld variable and try again
查看导入的表(20G数据)
mysql> select table_name,table_rows
from information_schema.tables
where table_name in
(\'customer\',\'lineitem\',\'nation\',\'orders\',\'part\',\'partsupp\',\'region\',\'supplier\');
+------------+------------+
| TABLE_NAME | TABLE_ROWS |
+------------+------------+
| region | 5 |
| nation | 25 |
| part | 3860136 |
| supplier | 197853 |
| customer | 2884322 |
| partsupp | 17084176 |
| orders | 29678499 |
| lineitem | 87786966 |
+------------+------------+
8 rows in set (0.00 sec)
编写并运行测试脚本
#测试脚本见附录
#这里是在tmux中运行,避免因为终端关闭导致测试终止
$ sh auto.sh&
#测试结束后在当前脚本的目录查看生成的日志
$ cat tpch-PQ-******.log
测试运行时,观察相关指标。
mysql> show global status like \'%PQ%\';
+--------------------+-------+
| Variable_name | Value |
+--------------------+-------+
| PQ_memory_refused | 0 |
| PQ_memory_used | 0 |
| PQ_threads_refused | 0 |
| PQ_threads_running | 0 |
+--------------------+-------+
4 rows in set (0.00 sec)
mysql> show processlist;
mysql> explain for connection **;
测试结果
开启并行查询(16线程)的执行时间,与不开启并行查询的执行时间如下:
SQL1 | SQL3 | SQL5 | SQL6 | SQL10 | SQL12 | SQL19 | |
---|---|---|---|---|---|---|---|
PQ16 | 1m25.645s | 1m5.514s | 8m56.306s | 35.451s | 44.564s | 59.115s | 5.771s |
NOPQ | 6m1.724s | 5m19.083s | 37m42.078s | 2m16.331s | 1m57.998s | 2m39.672s | 24.907s |
注:本文章重点讲测试过程,具体的测试结果就不展开了。
**
**
附录-相关文件
导入脚本
$ cat loadfile
/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch -f -e "set session foreign_key_checks=0;load data infile \'/data/tpch/data/region.tbl\' into table region FIELDS TERMINATED BY \'|\';" tpch &
/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch -f -e "set session foreign_key_checks=0;load data infile \'/data/tpch/data/nation.tbl\' into table nation FIELDS TERMINATED BY \'|\';" tpch &
/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch-f -e "set session foreign_key_checks=0;load data infile \'/data/tpch/data/supplier.tbl\' into table supplier FIELDS TERMINATED BY \'|\';" tpch &
/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch -f -e "set session foreign_key_checks=0;load data infile \'/data/tpch/data/part.tbl\' into table part FIELDS TERMINATED BY \'|\';" tpch &
/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch -f -e "set session foreign_key_checks=0;load data infile \'/data/tpch/data/customer.tbl\' into table customer FIELDS TERMINATED BY \'|\';" tpch &
/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch -f -e "set session foreign_key_checks=0;load data infile \'/data/tpch/data/partsupp.tbl\' into table partsupp FIELDS TERMINATED BY \'|\';" tpch &
/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch-f -e "set session foreign_key_checks=0;load data infile \'/data/tpch/data/orders.tbl\' into table orders FIELDS TERMINATED BY \'|\';" tpch &
/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch-f -e "set session foreign_key_checks=0;load data infile \'/data/tpch/data/lineitem.tbl\' into table lineitem FIELDS TERMINATED BY \'|\';" tpch &
测试脚本:
脚本是东拼西凑的,写的不好,希望有大佬能指导一下
$ cat auto.sh
#include <iostream>TH=$PATH:/usr/local/bin
export PATH
#set -u
#set -x
#set -e
. ~/.bash_profile > /dev/null 2>&1
exec 3>&1 4>&2 1>> tpch-PQ-`date +\'%Y%m%d%H%M%S\'`.log 2>&1
# 定义要执行的SQL文件存放的目录
SQL_DIR="/data/tpch/SQLs"
# 判断目录是否存在
if [ ! -d "$SQL_DIR" ]; then
echo "SQL文件目录不存在!"
exit 1
fi
# 进入SQL文件目录
cd $SQL_DIR
I=1
II=3
while [ $I -le $II ]
do
# 执行SQL文件
for file in `ls *.sql`
do
echo "正在执行:$file"
time /usr/localGreatSQL-8.0.25-17/bin/mysql -uroot -S /data/GreatSQL/mysql.sock -Dtpch < $file
echo "SQL:$file,执行完成"
echo -e
echo "休息100s"
sleep 100
echo -e
done
echo "第$I次循环执行完成!"
I=`expr $I + 1`
done
echo "脚本结束"
dss.ddl
-- Sccsid: @(#)dss.ddl 2.1.8.1
drop database tpch;
create database tpch;
use tpch;
CREATE TABLE NATION ( N_NATIONKEY INTEGER NOT NULL,
N_NAME CHAR(25) NOT NULL,
N_REGIONKEY INTEGER NOT NULL,
N_COMMENT VARCHAR(152));
CREATE TABLE REGION ( R_REGIONKEY INTEGER NOT NULL,
R_NAME CHAR(25) NOT NULL,
R_COMMENT VARCHAR(152));
CREATE TABLE PART ( P_PARTKEY INTEGER NOT NULL,
P_NAME VARCHAR(55) NOT NULL,
P_MFGR CHAR(25) NOT NULL,
P_BRAND CHAR(10) NOT NULL,
P_TYPE VARCHAR(25) NOT NULL,
P_SIZE INTEGER NOT NULL,
P_CONTAINER CHAR(10) NOT NULL,
P_RETAILPRICE DECIMAL(15,2) NOT NULL,
P_COMMENT VARCHAR(23) NOT NULL );
CREATE TABLE SUPPLIER ( S_SUPPKEY INTEGER NOT NULL,
S_NAME CHAR(25) NOT NULL,
S_ADDRESS VARCHAR(40) NOT NULL,
S_NATIONKEY INTEGER NOT NULL,
S_PHONE CHAR(15) NOT NULL,
S_ACCTBAL DECIMAL(15,2) NOT NULL,
S_COMMENT VARCHAR(101) NOT NULL);
CREATE TABLE PARTSUPP ( PS_PARTKEY INTEGER NOT NULL,
PS_SUPPKEY INTEGER NOT NULL,
PS_AVAILQTY INTEGER NOT NULL,
PS_SUPPLYCOST DECIMAL(15,2) NOT NULL,
PS_COMMENT VARCHAR(199) NOT NULL );
CREATE TABLE CUSTOMER ( C_CUSTKEY INTEGER NOT NULL,
C_NAME VARCHAR(25) NOT NULL,
C_ADDRESS VARCHAR(40) NOT NULL,
C_NATIONKEY INTEGER NOT NULL,
C_PHONE CHAR(15) NOT NULL,
C_ACCTBAL DECIMAL(15,2) NOT NULL,
C_MKTSEGMENT CHAR(10) NOT NULL,
C_COMMENT VARCHAR(117) NOT NULL);
CREATE TABLE ORDERS ( O_ORDERKEY INTEGER NOT NULL,
O_CUSTKEY INTEGER NOT NULL,
O_ORDERSTATUS CHAR(1) NOT NULL,
O_TOTALPRICE DECIMAL(15,2) NOT NULL,
O_ORDERDATE DATE NOT NULL,
O_ORDERPRIORITY CHAR(15) NOT NULL,
O_CLERK CHAR(15) NOT NULL,
O_SHIPPRIORITY INTEGER NOT NULL,
O_COMMENT VARCHAR(79) NOT NULL);
CREATE TABLE LINEITEM ( L_ORDERKEY INTEGER NOT NULL,
L_PARTKEY INTEGER NOT NULL,
L_SUPPKEY INTEGER NOT NULL,
L_LINENUMBER INTEGER NOT NULL,
L_QUANTITY DECIMAL(15,2) NOT NULL,
L_EXTENDEDPRICE DECIMAL(15,2) NOT NULL,
L_DISCOUNT DECIMAL(15,2) NOT NULL,
L_TAX DECIMAL(15,2) NOT NULL,
L_RETURNFLAG CHAR(1) NOT NULL,
L_LINESTATUS CHAR(1) NOT NULL,
L_SHIPDATE DATE NOT NULL,
L_COMMITDATE DATE NOT NULL,
L_RECEIPTDATE DATE NOT NULL,
L_SHIPINSTRUCT CHAR(25) NOT NULL,
L_SHIPMODE CHAR(10) NOT NULL,
L_COMMENT VARCHAR(44) NOT NULL);
dss.ri
-- Sccsid: @(#)dss.ri 2.1.8.1
-- tpch Benchmark Version 8.0
-- For table REGION
ALTER TABLE tpch.REGION
ADD PRIMARY KEY (R_REGIONKEY);
-- For table NATION
ALTER TABLE tpch.NATION
ADD PRIMARY KEY (N_NATIONKEY);
ALTER TABLE tpch.NATION
ADD FOREIGN KEY NATION_FK1 (N_REGIONKEY) references tpch.REGION(R_REGIONKEY);
COMMIT WORK;
-- For table PART
ALTER TABLE tpch.PART
ADD PRIMARY KEY (P_PARTKEY);
COMMIT WORK;
-- For table SUPPLIER
ALTER TABLE tpch.SUPPLIER
ADD PRIMARY KEY (S_SUPPKEY);
ALTER TABLE tpch.SUPPLIER
ADD FOREIGN KEY SUPPLIER_FK1 (S_NATIONKEY) references tpch.NATION(N_NATIONKEY);
COMMIT WORK;
-- For table PARTSUPP
ALTER TABLE tpch.PARTSUPP
ADD PRIMARY KEY (PS_PARTKEY,PS_SUPPKEY);
COMMIT WORK;
-- For table CUSTOMER
ALTER TABLE tpch.CUSTOMER
ADD PRIMARY KEY (C_CUSTKEY);
ALTER TABLE tpch.CUSTOMER
ADD FOREIGN KEY CUSTOMER_FK1 (C_NATIONKEY) references tpch.NATION(N_NATIONKEY);
COMMIT WORK;
-- For table LINEITEM
ALTER TABLE tpch.LINEITEM
ADD PRIMARY KEY (L_ORDERKEY,L_LINENUMBER);
COMMIT WORK;
-- For table ORDERS
ALTER TABLE tpch.ORDERS
ADD PRIMARY KEY (O_ORDERKEY);
COMMIT WORK;
-- For table PARTSUPP
ALTER TABLE tpch.PARTSUPP
ADD FOREIGN KEY PARTSUPP_FK1 (PS_SUPPKEY) references tpch.SUPPLIER(S_SUPPKEY);
COMMIT WORK;
ALTER TABLE tpch.PARTSUPP
ADD FOREIGN KEY PARTSUPP_FK2 (PS_PARTKEY) references tpch.PART(P_PARTKEY);
COMMIT WORK;
-- For table ORDERS
ALTER TABLE tpch.ORDERS
ADD FOREIGN KEY ORDERS_FK1 (O_CUSTKEY) references tpch.CUSTOMER(C_CUSTKEY);
COMMIT WORK;
-- For table LINEITEM
ALTER TABLE tpch.LINEITEM
ADD FOREIGN KEY LINEITEM_FK1 (L_ORDERKEY) references tpch.ORDERS(O_ORDERKEY);
COMMIT WORK;
ALTER TABLE tpch.LINEITEM
ADD FOREIGN KEY LINEITEM_FK2 (L_PARTKEY,L_SUPPKEY) references
tpch.PARTSUPP(PS_PARTKEY,PS_SUPPKEY);
COMMIT WORK;
SQL语句
--SQL1
select /*+ PQ(16) */
l_returnflag,
l_linestatus,
sum(l_quantity) as sum_qty,
sum(l_extendedprice) as sum_base_price,
sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
avg(l_quantity) as avg_qty,
avg(l_extendedprice) as avg_price,
avg(l_discount) as avg_disc,
count(*) as count_order
from
lineitem
where
l_shipdate <= date \'1998-12-01\' - interval \'88\' day
group by
l_returnflag,
l_linestatus
order by
l_returnflag,
l_linestatus limit 1;
--SQL3
select /*+ PQ(16) */
l_orderkey,
sum(l_extendedprice * (1 - l_discount)) as revenue,
o_orderdate,
o_shippriority
from
customer,
orders,
lineitem
where
c_mktsegment = \'MACHINERY\'
and c_custkey = o_custkey
and l_orderkey = o_orderkey
and o_orderdate < date \'1995-03-01\'
and l_shipdate > date \'1995-03-01\'
group by
l_orderkey,
o_orderdate,
o_shippriority
order by
revenue desc,
o_orderdate limit 10;
--SQL6
select /*+ PQ(16) */
sum(l_extendedprice * l_discount) as revenue
from
lineitem
where
l_shipdate >= date \'1993-01-01\'
and l_shipdate < date \'1993-01-01\' + interval \'1\' year
and l_discount between 0.02 - 0.01 and 0.02 + 0.01
and l_quantity < 24 limit 1;
--SQL10
select /*+ PQ(16) */
c_custkey,
c_name,
sum(l_extendedprice * (1 - l_discount)) as revenue,
c_acctbal,
n_name,
c_address,
c_phone,
c_comment
from
customer,
orders,
lineitem,
nation
where
c_custkey = o_custkey
and l_orderkey = o_orderkey
and o_orderdate >= date \'1994-05-01\'
and o_orderdate < date \'1994-05-01\' + interval \'3\' month
and l_returnflag = \'R\'
and c_nationkey = n_nationkey
group by
c_custkey,
c_name,
c_acctbal,
c_phone,
n_name,
c_address,
c_comment
order by
revenue desc limit 20;
--SQL12
select /*+ PQ(16) */
l_shipmode,
sum(case
when o_orderpriority = \'1-URGENT\'
or o_orderpriority = \'2-HIGH\'
then 1
else 0
end) as high_line_count,
sum(case
when o_orderpriority <> \'1-URGENT\'
and o_orderpriority <> \'2-HIGH\'
then 1
else 0
end) as low_line_count
from
orders,
lineitem
where
o_orderkey = l_orderkey
and l_shipmode in (\'TRUCK\', \'FOB\')
and l_commitdate < l_receiptdate
and l_shipdate < l_commitdate
and l_receiptdate >= date \'1996-01-01\'
and l_receiptdate < date \'1996-01-01\' + interval \'1\' year
group by
l_shipmode
order by
l_shipmode limit 1 ;
--SQL19
select /*+ PQ(16) */
sum(l_extendedprice* (1 - l_discount)) as revenue
from
lineitem,
part
where
(
p_partkey = l_partkey
and p_brand = \'Brand#12\'
and p_container in (\'SM CASE\', \'SM BOX\', \'SM PACK\', \'SM PKG\')
and l_quantity >= 10 and l_quantity <= 10 + 10
and p_size between 1 and 5
and l_shipmode in (\'AIR\', \'AIR REG\')
and l_shipinstruct = \'DELIVER IN PERSON\'
)
or
(
p_partkey = l_partkey
and p_brand = \'Brand#22\'
and p_container in (\'MED BAG\', \'MED BOX\', \'MED PKG\', \'MED PACK\')
and l_quantity >= 15 and l_quantity <= 15 + 10
and p_size between 1 and 10
and l_shipmode in (\'AIR\', \'AIR REG\')
and l_shipinstruct = \'DELIVER IN PERSON\'
)
or
(
p_partkey = l_partkey
and p_brand = \'Brand#43\'
and p_container in (\'LG CASE\', \'LG BOX\', \'LG PACK\', \'LG PKG\')
and l_quantity >= 22 and l_quantity <= 22 + 10
and p_size between 1 and 15
and l_shipmode in (\'AIR\', \'AIR REG\')
and l_shipinstruct = \'DELIVER IN PERSON\'
) limit 1 ;
相关问题
参考资料
Enjoy GreatSQL
GreatSQL 8.0.25-17今日发布
1.新增特性
2.稳定性提升
3.其他调整
4.bug修复
5. GreatSQL VS MySQL社区版
6. GreatSQL Release Notes
Changes in GreatSQL 8.0.25-17(2023-3-13)
GreatSQL 8.0.25-17 是一个微小改进版本,主要是修复GreatSQL中InnoDB并行查询可能导致查询hang住,甚至crash的问题,其他方面和GreatSQL 8.0.25-16版本是一样的。
直播预告:今晚老叶会上线和大家一起聊聊为什么这次新版本背后的故事,以及下个新版本的计划。
扫码预约发布会或点击底部阅读原文预约:
1.新增特性
2.稳定性提升
3.其他调整
4.bug修复
修复GreatSQL中InnoDB并行查询可能导致查询hang住,甚至crash的问题。
5. GreatSQL VS MySQL社区版
特性 | GreatSQL 8.0.25-17 | MySQL 8.0.25 社区版 |
---|---|---|
投票节点/仲裁节点 | ✅ | ❎ |
快速单主模式 | ✅ | ❎ |
地理标签 | ✅ | ❎ |
全新流控算法 | ✅ | ❎ |
InnoDB并行查询优化 | ✅ | ❎ |
线程池(Thread Pool) | ✅ | ❎ |
审计 | ✅ | ❎ |
InnoDB事务锁优化 | ✅ | ❎ |
SEQUENCE_TABLE(N)函数 | ✅ | ❎ |
InnoDB表损坏异常处理 | ✅ | ❎ |
强制只能使用InnoDB引擎表 | ✅ | ❎ |
杀掉空闲事务,避免长时间锁等待 | ✅ | ❎ |
Data Masking(数据脱敏/打码) | ✅ | ❎ |
InnoDB碎片页统计增强 | ✅ | ❎ |
支持MyRocks引擎 | ✅ | ❎ |
InnoDB I/O性能提升 | ⭐️⭐️⭐️⭐️⭐️ | ⭐️⭐️ |
网络分区异常应对 | ⭐️⭐️⭐️⭐️⭐️ | ⭐️ |
完善节点异常退出处理 | ⭐️⭐️⭐️⭐️⭐️ | ⭐️ |
一致性读性能 | ⭐️⭐️⭐️⭐️⭐️ | ⭐️ |
提升MGR吞吐量 | ⭐️⭐️⭐️⭐️⭐️ | ⭐️ |
统计信息增强 | ⭐️⭐️⭐️⭐️⭐️ | ⭐️ |
slow log增强 | ⭐️⭐️⭐️⭐️⭐️ | ⭐️ |
大事务处理 | ⭐️⭐️⭐️⭐️ | ⭐️ |
修复多写模式下可能丢数据风险 | ⭐️⭐️⭐️⭐️⭐️ | / |
修复单主模式下切主丢数据风险 | ⭐️⭐️⭐️⭐️⭐️ | / |
MGR集群启动效率提升 | ⭐️⭐️⭐️⭐️⭐️ | / |
集群节点磁盘满处理 | ⭐️⭐️⭐️⭐️⭐️ | / |
修复TCP self-connect问题 | ⭐️⭐️⭐️⭐️⭐️ | / |
PROCESSLIST增强 | ⭐️⭐️⭐️⭐️⭐️ | / |
6. GreatSQL Release Notes
Changes in GreatSQL 8.0.25-17 (2023-3-13)
(https://gitee.com/GreatSQL/GreatSQL-Doc/blob/master/relnotes/changes-greatsql-8-0-25-17-20230313.md)
Enjoy GreatSQL :)
《零基础学习MySQL》视频课程
戳此小程序即可直达B站
https://www.bilibili.com/video/BV1Da411W7Va?
文章推荐:
关于 GreatSQL
GreatSQL是由万里数据库维护的MySQL分支,专注于提升MGR可靠性及性能,支持InnoDB并行查询特性,是适用于金融级应用的MySQL分支版本。
GreatSQL社区官网: https://greatsql.cn/
Gitee: https://gitee.com/GreatSQL/GreatSQL
GitHub: https://github.com/GreatSQL/GreatSQL
Bilibili:https://space.bilibili.com/1363850082
社区有奖建议反馈:
https://greatsql.cn/thread-54-1-1.html
社区博客有奖征稿详情:
https://greatsql.cn/thread-100-1-1.html
(对文章有疑问或者有独到见解都可以去社区官网提出或分享哦~)
微信&QQ群:
可扫码添加GreatSQL社区助手微信好友,发送验证信息“加群”加入GreatSQL/MGR交流微信群,亦可直接扫码加入GreatSQL/MGR交流QQ群。
微信
想看更多技术好文,点个“在看”吧!
以上是关于使用TPC-H 进行GreatSQL并行查询测试的主要内容,如果未能解决你的问题,请参考以下文章