PostgreSQL on Azure.cn: Performance Testing and Tuning


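The previous article walked through installing and deploying PostgreSQL step by step. This one measures the performance PostgreSQL can reach on Azure, then adjusts the database parameters to see how that performance can be improved.

For a database newcomer, the first question when planning a performance test is which tool to use. Fortunately, PostgreSQL ships with pgbench, which is what these tests use.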

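The test virtual machines are configured as follows:

Standard DS4 v2 (8 vcpus, 28 GB memory), 4 x 1023 GB disks, 4 x 5000 IOPS, RAID 0, referred to as VM-SSD

Standard DS4 v2 (8 vcpus, 28 GB memory), 16 x 128 GB disks, 16 x 500 IOPS, RAID 0, referred to as VM-HDD

Both VMs are in the same VNET, and each has a public IP address.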

 

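Testing these two VMs separately shows how PostgreSQL performs under each configuration.

First, create the test database and load 50 million records:

Create the test database

$ createdb pgbench

Initialize the test database; this creates 4 tables

$ pgbench -i pgbench

Re-initialize at scale factor 500, which loads 50 million records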

$ pgbench -i -s 500 pgbench
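The 50 million figure follows from the scale factor. As a minimal sketch, the helper below (`pgbench_rows` is a name made up for this illustration) applies pgbench's documented per-scale row counts, which also match the `\set` lines in the output further down (`100000 * :scale`, `10 * :scale`, `1 * :scale`):

```python
def pgbench_rows(scale):
    """Initial row counts created by `pgbench -i -s <scale>`."""
    return {
        "pgbench_branches": 1 * scale,
        "pgbench_tellers": 10 * scale,
        "pgbench_accounts": 100_000 * scale,
        "pgbench_history": 0,  # filled only while the benchmark runs
    }


rows = pgbench_rows(500)
for table, count in rows.items():
    print(f"{table:18s} {count:>12,d}")
```

At scale 500, pgbench_accounts is the only large table; the other three stay tiny, which is why every client contends on the same few hundred branch rows.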

 

Step 1: PostgreSQL as installed, without any tuning

 

VM-SSD:

$ pgbench -r -c 16 -j 16 -n -T 180 pgbench

 

transaction type: <builtin: TPC-B (sort of)>

scaling factor: 500

query mode: simple

number of clients: 16

number of threads: 16

duration: 180 s

number of transactions actually processed: 333504

latency average = 8.636 ms

tps = 1852.672083 (including connections establishing)

tps = 1852.742252 (excluding connections establishing)

script statistics:

 - statement latencies in milliseconds:

         0.002  \set aid random(1, 100000 * :scale)

         0.001  \set bid random(1, 1 * :scale)

         0.001  \set tid random(1, 10 * :scale)

         0.000  \set delta random(-5000, 5000)

         0.071  BEGIN;

         0.248  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;

         0.150  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

         0.171  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;

         0.265  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;

         0.117  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);

         7.607  END;

 

VM-HDD:

$ pgbench -r -c 16 -j 16 -n -T 180 pgbench

 

transaction type: <builtin: TPC-B (sort of)>

scaling factor: 500

query mode: simple

number of clients: 16

number of threads: 16

duration: 180 s

number of transactions actually processed: 223638

latency average = 12.879 ms

tps = 1242.328936 (including connections establishing)

tps = 1242.365821 (excluding connections establishing)

script statistics:

 - statement latencies in milliseconds:

         0.002  \set aid random(1, 100000 * :scale)

         0.001  \set bid random(1, 1 * :scale)

         0.001  \set tid random(1, 10 * :scale)

         0.000  \set delta random(-5000, 5000)

         0.070  BEGIN;

         0.250  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;

         0.150  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

         0.174  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;

         0.331  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;

         0.116  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);

        11.679  END;

 

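The SSD-backed VM delivers nearly 50% more throughput than the HDD-backed VM (1852 vs. 1242 tps), and the HDD VM's average latency is correspondingly about 49% higher (12.879 ms vs. 8.636 ms). The gap is substantial.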
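The percentages can be checked from the reported numbers. The sketch below also shows that, for these runs, the reported average latency is simply the number of clients divided by the throughput; that relationship is an observation about the outputs above, not a statement about pgbench internals. The function names are made up for this illustration.

```python
def pct_change(new, old):
    """Relative change of `new` versus `old`, in percent."""
    return (new / old - 1.0) * 100.0


def latency_ms(clients, tps):
    """Average latency implied by `clients` concurrent sessions at `tps`."""
    return clients / tps * 1000.0


ssd_tps, hdd_tps = 1852.672083, 1242.328936
ssd_lat, hdd_lat = 8.636, 12.879

print(f"SSD throughput vs HDD: {pct_change(ssd_tps, hdd_tps):+.1f}%")
print(f"HDD latency vs SSD:    {pct_change(hdd_lat, ssd_lat):+.1f}%")
print(f"implied SSD latency:   {latency_ms(16, ssd_tps):.3f} ms")
print(f"implied HDD latency:   {latency_ms(16, hdd_tps):.3f} ms")
```

Because the client count is fixed at 16, throughput and latency are two views of the same measurement here, which is why both differ by the same ratio.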

 

Step 2: Tuning PostgreSQL parameters

Edit the postgresql.conf file in the database data directory:

max_connections = 2000

shared_buffers = 4GB

temp_buffers = 2GB 

max_prepared_transactions = 2000

work_mem = 512MB            

maintenance_work_mem = 2GB

 

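Here is what each of these parameters means:

max_connections: the maximum number of concurrent connections to the database. The default is typically 100, but it may be lower if your kernel settings do not support that many (this is determined during initdb). The upper bound is also limited by the operating system's limit on concurrent connections. Changing this parameter requires restarting the PostgreSQL service.

shared_buffers: the most important parameter. PostgreSQL reads data from disk into memory before operating on it, updates it there, and later writes it back to disk, so a larger value lets more data stay cached in shared_buffers. The documentation suggests a value of roughly 25% to 40% of physical memory. Changing this parameter requires restarting the PostgreSQL service.

temp_buffers: the temporary buffers a database session uses to access temporary tables. The default is 8MB. It can be set within an individual session, and it brings a noticeable improvement when a session needs to access large temporary tables.

max_prepared_transactions: the maximum number of transactions that can be in the "prepared" state at the same time. Setting it to zero disables prepared transactions. If you do use them, set max_prepared_transactions at least as large as max_connections so that the prepare step does not fail. Changing this parameter requires restarting the PostgreSQL service.

work_mem: working memory for internal sort and hash operations. An adequate value keeps these operations in memory. If it is too small, sorts and hashes spill to temporary files on disk, which greatly reduces performance. If it is too large, concurrent operations can together claim more memory than the machine has, pushing the system into swapping, which increases I/O and lowers performance. The default is small (4MB in current releases, 1MB in older ones); in production, choose the value by analyzing monitoring data. Concurrency also has to be taken into account: however work_mem is adjusted, max_connections * work_mem + shared_buffers + temp_buffers + maintenance_work_mem + the memory the operating system needs must not exceed total RAM.

maintenance_work_mem: memory for maintenance operations, mainly VACUUM, CREATE INDEX, and REINDEX. A larger value lets more of this work run in memory.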
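The rule of thumb above is easy to apply to the settings used in this test. The sketch below is only that arithmetic; the 2 GB operating-system reserve is an assumed figure, and the formula itself is a rough bound (real usage can be lower when sessions are idle, or higher because a single query may run several sorts or hashes, each of which may use up to work_mem).

```python
MB = 1
GB = 1024 * MB

RAM = 28 * GB          # DS4 v2 memory, from the VM description
OS_RESERVE = 2 * GB    # assumption for illustration only


def worst_case(connections, work_mem=512 * MB, shared_buffers=4 * GB,
               temp_buffers=2 * GB, maintenance_work_mem=2 * GB,
               os_reserve=OS_RESERVE):
    """Rule-of-thumb memory bound from the text, in MB."""
    return (connections * work_mem + shared_buffers + temp_buffers
            + maintenance_work_mem + os_reserve)


for connections in (16, 2000):
    need = worst_case(connections)
    verdict = "fits" if need <= RAM else "exceeds RAM"
    print(f"{connections:5d} connections: {need / GB:7.1f} GB ({verdict})")
```

With the 16 clients used by pgbench the bound stays within 28 GB, but at the configured max_connections = 2000 the same work_mem would far exceed it, so these values suit this benchmark rather than a server that actually expects thousands of connections.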

 

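The main goal of these changes is to keep as much of the data and as many operations as possible in memory, which is how they improve performance.

 

After adjusting the parameters, the test results are as follows: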

VM-SSD

$ pgbench -r -c 16 -j 16 -n -T 180 pgbench

 

transaction type: <builtin: TPC-B (sort of)>

scaling factor: 500

query mode: simple

number of clients: 16

number of threads: 16

duration: 180 s

number of transactions actually processed: 457226

latency average = 6.300 ms

tps = 2539.724092 (including connections establishing)

tps = 2539.832161 (excluding connections establishing)

script statistics:

 - statement latencies in milliseconds:

         0.002  \set aid random(1, 100000 * :scale)

         0.001  \set bid random(1, 1 * :scale)

         0.001  \set tid random(1, 10 * :scale)

         0.001  \set delta random(-5000, 5000)

         0.077  BEGIN;

         0.204  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;

         0.155  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

         0.168  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;

         0.231  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;

         0.120  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);

         5.340  END;

 

VM-HDD

$ pgbench -r -c 16 -j 16 -n -T 180 pgbench

 

transaction type: <builtin: TPC-B (sort of)>

scaling factor: 500

query mode: simple

number of clients: 16

number of threads: 16

duration: 180 s

number of transactions actually processed: 313329

latency average = 9.193 ms

tps = 1740.403152 (including connections establishing)

tps = 1740.455620 (excluding connections establishing)

script statistics:

 - statement latencies in milliseconds:

         0.002  \set aid random(1, 100000 * :scale)

         0.001  \set bid random(1, 1 * :scale)

         0.001  \set tid random(1, 10 * :scale)

         0.000  \set delta random(-5000, 5000)

         0.073  BEGIN;

         0.195  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;

         0.155  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

         0.169  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;

         0.273  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;

         0.118  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);

         8.208  END;

 

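After tuning, throughput improves noticeably: by about 37% on VM-SSD (1852 to 2539 tps) and about 40% on VM-HDD (1242 to 1740 tps). The SSD-backed VM still far outperforms the HDD-backed VM.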

 

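Step 3: Client/Server testing

The earlier tests all ran pgbench on the database VM itself, measuring local performance. In practice, the client is usually not on the same server as the database, so the next step is to measure performance over the network.

As the previous article noted, a freshly deployed PostgreSQL accepts only local connections. Before testing, it must be configured to accept remote connections, and for security only the specified IP addresses should be allowed to access the database.

Edit postgresql.conf and set the listen address to all addresses: listen_addresses = '0.0.0.0'

Edit pg_hba.conf and add the allowed client IP addresses in the format shown in the file's comments; remember to include the mask.

Restart the PostgreSQL service after making these changes, then start testing.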
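The mask determines how many addresses an entry admits: /32 matches a single IPv4 host, while /24 matches a whole subnet. As a minimal sketch of that matching only, the snippet below uses Python's ipaddress module; the allow list is hypothetical (the article does not show its pg_hba.conf entries), and real pg_hba.conf matching also considers connection type, database, user, and authentication method.

```python
import ipaddress

# Hypothetical allow list: only the two test VMs, one host each.
ALLOWED = ["10.3.0.4/32", "10.3.0.5/32"]


def is_allowed(client_ip, allowed_cidrs=ALLOWED):
    """True if client_ip falls inside any of the address/mask entries."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


for ip in ("10.3.0.5", "10.3.0.6"):
    print(ip, "allowed" if is_allowed(ip) else "rejected")

# A /24 entry would admit every host in the subnet instead.
print(is_allowed("10.3.0.6", ["10.3.0.0/24"]))
```

Keeping each entry at /32 is the narrowest choice and fits the stated goal of admitting only specific client addresses.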

 

First, test access between the two VMs over their VNET private addresses.

VM-SSD as the client, connecting to VM-HDD's private address

$ pgbench -h 10.3.0.5 -p 1999 -r -c 16 -j 16 -n -T 180 pgbench

 

transaction type: <builtin: TPC-B (sort of)>

scaling factor: 500

query mode: simple

number of clients: 16

number of threads: 16

duration: 180 s

number of transactions actually processed: 340623

latency average = 8.456 ms

tps = 1892.056710 (including connections establishing)

tps = 1892.119631 (excluding connections establishing)

script statistics:

 - statement latencies in milliseconds:

         0.002  \set aid random(1, 100000 * :scale)

         0.001  \set bid random(1, 1 * :scale)

         0.001  \set tid random(1, 10 * :scale)

         0.001  \set delta random(-5000, 5000)

         0.308  BEGIN;

         0.432  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;

         0.335  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

         0.348  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;

         0.412  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;

         0.290  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);

         6.329  END;

 

Ping latency from VM-SSD to VM-HDD during this test:

64 bytes from 10.3.0.5: icmp_seq=1 ttl=64 time=0.459 ms

64 bytes from 10.3.0.5: icmp_seq=2 ttl=64 time=0.447 ms

64 bytes from 10.3.0.5: icmp_seq=3 ttl=64 time=0.226 ms

64 bytes from 10.3.0.5: icmp_seq=4 ttl=64 time=0.231 ms

64 bytes from 10.3.0.5: icmp_seq=5 ttl=64 time=0.264 ms

64 bytes from 10.3.0.5: icmp_seq=6 ttl=64 time=0.300 ms

64 bytes from 10.3.0.5: icmp_seq=7 ttl=64 time=0.260 ms

64 bytes from 10.3.0.5: icmp_seq=8 ttl=64 time=0.302 ms

64 bytes from 10.3.0.5: icmp_seq=9 ttl=64 time=0.226 ms

 

 

VM-HDD as the client, connecting to VM-SSD's private address

$ pgbench -h 10.3.0.4 -p 1999 -r -c 16 -j 16 -n -T 180 pgbench

 

transaction type: <builtin: TPC-B (sort of)>

scaling factor: 500

query mode: simple

number of clients: 16

number of threads: 16

duration: 180 s

number of transactions actually processed: 429534

latency average = 6.708 ms

tps = 2385.353651 (including connections establishing)

tps = 2385.425889 (excluding connections establishing)

script statistics:

 - statement latencies in milliseconds:

         0.002  \set aid random(1, 100000 * :scale)

         0.001  \set bid random(1, 1 * :scale)

         0.001  \set tid random(1, 10 * :scale)

         0.001  \set delta random(-5000, 5000)

         0.294  BEGIN;

         0.442  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;

         0.336  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

         0.349  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;

         0.391  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;

         0.289  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);

         4.599  END;

 

Ping latency from VM-HDD to VM-SSD during this test:

64 bytes from 10.3.0.4: icmp_seq=1 ttl=64 time=0.228 ms

64 bytes from 10.3.0.4: icmp_seq=2 ttl=64 time=0.219 ms

64 bytes from 10.3.0.4: icmp_seq=3 ttl=64 time=0.240 ms

64 bytes from 10.3.0.4: icmp_seq=4 ttl=64 time=0.353 ms

64 bytes from 10.3.0.4: icmp_seq=5 ttl=64 time=0.445 ms

64 bytes from 10.3.0.4: icmp_seq=6 ttl=64 time=0.278 ms

64 bytes from 10.3.0.4: icmp_seq=7 ttl=64 time=0.407 ms

64 bytes from 10.3.0.4: icmp_seq=8 ttl=64 time=0.196 ms

64 bytes from 10.3.0.4: icmp_seq=9 ttl=64 time=0.271 ms
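One way to read these numbers is to compare per-statement latencies for the same server (VM-SSD, tuned) when driven locally and when driven from VM-HDD over the private network. The sketch below uses only figures from the outputs and ping samples above; the interpretation that the difference is mostly one network round trip per statement is an inference, not something pgbench reports.

```python
from statistics import mean

# Ping samples to 10.3.0.4 (ms), copied from the output above.
ping_ms = [0.228, 0.219, 0.240, 0.353, 0.445, 0.278, 0.407, 0.196, 0.271]

# Statement latencies (ms): VM-SSD local (step 2) vs. VM-HDD -> VM-SSD.
local_ms = {"BEGIN": 0.077, "UPDATE accounts": 0.204, "SELECT": 0.155,
            "UPDATE tellers": 0.168, "UPDATE branches": 0.231,
            "INSERT": 0.120}
remote_ms = {"BEGIN": 0.294, "UPDATE accounts": 0.442, "SELECT": 0.336,
             "UPDATE tellers": 0.349, "UPDATE branches": 0.391,
             "INSERT": 0.289}

overhead = {stmt: remote_ms[stmt] - local_ms[stmt] for stmt in local_ms}

print(f"mean ping RTT: {mean(ping_ms):.3f} ms")
for stmt, extra in overhead.items():
    print(f"{stmt:16s} +{extra:.3f} ms over the network")
```

Each statement costs roughly 0.16 to 0.24 ms more over the network, the same order as the ping round trip, while the commit (END) still dominates the total.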

 

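That covers performance inside the VNET. What happens when the two VMs reach each other through their public addresses?

 

VM-SSD as the client, connecting to VM-HDD's public address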

$ pgbench -h 139.219.110.250 -p 1999 -r -c 16 -j 16 -n -T 180 pgbench

 

transaction type: <builtin: TPC-B (sort of)>

scaling factor: 500

query mode: simple

number of clients: 16

number of threads: 16

duration: 180 s

number of transactions actually processed: 353836

latency average = 8.141 ms

tps = 1965.306199 (including connections establishing)

tps = 1965.368417 (excluding connections establishing)

script statistics:

 - statement latencies in milliseconds:

         0.002  \set aid random(1, 100000 * :scale)

         0.001  \set bid random(1, 1 * :scale)

         0.001  \set tid random(1, 10 * :scale)

         0.001  \set delta random(-5000, 5000)

         0.312  BEGIN;

         0.446  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;

         0.340  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

         0.351  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;

         0.413  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;

         0.292  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);

         5.985  END;

 

VM-HDD as the client, connecting to VM-SSD's public address

$ pgbench -h 139.219.96.68 -p 1999 -r -c 16 -j 16 -n -T 180 pgbench

 

transaction type: <builtin: TPC-B (sort of)>

scaling factor: 500

query mode: simple

number of clients: 16

number of threads: 16

duration: 180 s

number of transactions actually processed: 433479

latency average = 6.646 ms

tps = 2407.470097 (including connections establishing)

tps = 2407.548375 (excluding connections establishing)

script statistics:

 - statement latencies in milliseconds:

         0.002  \set aid random(1, 100000 * :scale)

         0.001  \set bid random(1, 1 * :scale)

         0.001  \set tid random(1, 10 * :scale)

         0.001  \set delta random(-5000, 5000)

         0.291  BEGIN;

         0.431  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;

         0.331  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

         0.344  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;

         0.385  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;

         0.287  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);

         4.570  END;

 

The test results are summarized in the following table:

Test                        latency average    tps
VM-SSD, untuned             8.636 ms           1852
VM-HDD, untuned             12.879 ms          1242
VM-SSD, tuned               6.300 ms           2539
VM-HDD, tuned               9.193 ms           1740
SSD to HDD, private IP      8.456 ms           1892
HDD to SSD, private IP      6.708 ms           2385
SSD to HDD, public IP       8.141 ms           1965
HDD to SSD, public IP       6.646 ms           2407

 

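A careful read of the PostgreSQL documentation shows that work_mem is a very important parameter. The default is small (4MB in current releases, 1MB in older ones), so in production you need to analyze monitoring data to choose a good value. Two approaches are recommended: estimation and calculation.

The first is a rough estimate based on the size and type of the workload and on typical statement run times. The second is to monitor the database, collect data, and compute the value from it. Either way, an appropriate size matters a great deal for performance.
In day-to-day maintenance, EXPLAIN ANALYZE shows whether work_mem is adequate for a statement: a sort reported with "Sort Method: external merge  Disk: ..." has spilled to disk. Setting work_mem for an individual session or statement (SET work_mem = ...) lets a demanding query use more memory and run more efficiently without raising the global value.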
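As a small illustration of that check, the sketch below scans EXPLAIN ANALYZE text for sorts that went to disk. The sample lines are hypothetical and written only in the shape of the "Sort Method" lines PostgreSQL prints; `spilled_sorts` is a helper name invented here, not part of any tool.

```python
import re

# Hypothetical excerpt, reduced to the lines of interest.
SAMPLE_PLAN = (
    "  Sort Method: external merge  Disk: 102400kB\n"
    "  Sort Method: quicksort  Memory: 25kB\n"
)

SORT_LINE = re.compile(
    r"Sort Method:\s+(?P<method>.+?)\s+(?P<where>Memory|Disk):\s+(?P<kb>\d+)kB"
)


def spilled_sorts(plan_text):
    """Return the on-disk size in kB of every sort that spilled to disk."""
    return [int(m.group("kb"))
            for m in SORT_LINE.finditer(plan_text)
            if m.group("where") == "Disk"]


for size_kb in spilled_sorts(SAMPLE_PLAN):
    print(f"sort spilled {size_kb} kB to disk; work_mem may be too small")
```

A statement whose sorts all report "Memory" gains nothing from a larger work_mem, so this kind of check helps avoid raising the value needlessly.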

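Because work_mem matters so much, it would be ideal for it to track the database's workload in real time, but that is not practical. A workable alternative is to monitor the database over its operating cycles, summarize the data, and write a dedicated script that adjusts work_mem periodically so that it fits the current workload.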

 

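One last note: disabling the synchronous_commit parameter produces a dramatic improvement, as the run below shows. With it off, a commit returns before its log records have been flushed to disk. The database remains consistent after a crash, but the most recent transactions that were reported as committed may be lost, so changing this parameter is not recommended unless that loss is acceptable.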

$ pgbench -r -c 16 -j 16 -n -T 180 pgbench

 

transaction type: <builtin: TPC-B (sort of)>

scaling factor: 500

query mode: simple

number of clients: 16

number of threads: 16

duration: 180 s

number of transactions actually processed: 1667171

latency average = 1.728 ms

tps = 9260.473969 (including connections establishing)

tps = 9260.967802 (excluding connections establishing)

script statistics:

 - statement latencies in milliseconds:

         0.002  \set aid random(1, 100000 * :scale)

         0.001  \set bid random(1, 1 * :scale)

         0.001  \set tid random(1, 10 * :scale)

         0.001  \set delta random(-5000, 5000)

         0.064  BEGIN;

         0.209  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;

         0.252  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

         0.315  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;

         0.411  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;

         0.317  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);

         0.148  END;
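The per-statement figures show where the gain comes from: in every synchronous run, END (the commit) accounts for most of the transaction time. The sketch below computes that share from the numbers reported above; the run labels are just names chosen for this illustration.

```python
# (END latency ms, average transaction latency ms) from the outputs above.
runs = {
    "VM-SSD untuned": (7.607, 8.636),
    "VM-HDD untuned": (11.679, 12.879),
    "VM-SSD tuned": (5.340, 6.300),
    "VM-HDD tuned": (8.208, 9.193),
    "synchronous_commit off": (0.148, 1.728),
}


def commit_share(end_ms, avg_ms):
    """Fraction of the average transaction time spent in END."""
    return end_ms / avg_ms


for name, (end_ms, avg_ms) in runs.items():
    print(f"{name:24s} END = {commit_share(end_ms, avg_ms):5.1%} of latency")
```

With synchronous commits, 85% or more of each transaction is spent waiting on END, which is why disk speed dominated the earlier comparisons; with synchronous_commit off that share falls below 10%.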

 

