Geoserver WFS + PostgreSQL with large table impossibly slow
【Posted】: 2019-11-19 15:31:01
【Question】: I have a PostgreSQL table of point locations with 9.5 million rows. I'm trying to run this query:
/geoserver/workspace/ows?service=WFS&version=1.0.0&request=GetFeature&typeName=workspace:layer&maxFeatures=50&outputFormat=application/json
but it takes several minutes to respond.
When I look at the GeoServer log, I see:
Request: getFeature
service = WFS
version = 1.0.0
baseUrl = http://my_geoserver_url/geoserver/
query[0]:
typeName[0] = workspacelayer
maxFeatures = 50
outputFormat = application/json
resultType = results
19 Nov 15:26:04 INFO [wfs.json] - about to encode JSON
At this point it stalls for many minutes.
When I look at the currently active queries on my PostgreSQL server, I see:
SELECT count(*) FROM "public"."layer"
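That query alone takes 335 seconds to return. First of all, wtf? Even if it has to count row by row, 9.5 million rows is not that many. Is there any way to speed this up?
Second, why is it running SELECT count(*) FROM "public"."layer" at all, and is there any way to stop it? I specified maxFeatures = 50, so why count every row?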
Version: PostgreSQL 11.2 (Debian 11.2-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit
Machine: n1-standard-2 (2 vCPUs, 7.5 GB memory) on Google Cloud
Settings:
allow_system_table_mods off
application_name Navicat
archive_command (disabled)
archive_mode off
archive_timeout 0
array_nulls on
authentication_timeout 60
autovacuum on
autovacuum_analyze_scale_factor 0.1
autovacuum_analyze_threshold 50
autovacuum_freeze_max_age 200000000
autovacuum_max_workers 3
autovacuum_multixact_freeze_max_age 400000000
autovacuum_naptime 60
autovacuum_vacuum_cost_delay 20
autovacuum_vacuum_cost_limit -1
autovacuum_vacuum_scale_factor 0.2
autovacuum_vacuum_threshold 50
autovacuum_work_mem -1
backend_flush_after 0
backslash_quote safe_encoding
bgwriter_delay 200
bgwriter_flush_after 64
bgwriter_lru_maxpages 100
bgwriter_lru_multiplier 2
block_size 8192
bonjour off
bonjour_name
bytea_output hex
check_function_bodies on
checkpoint_completion_target 0.5
checkpoint_flush_after 32
checkpoint_timeout 300
checkpoint_warning 30
client_encoding UNICODE
client_min_messages notice
cluster_name
commit_delay 0
commit_siblings 5
config_file /var/lib/postgresql/data/postgresql.conf
constraint_exclusion partition
cpu_index_tuple_cost 0.005
cpu_operator_cost 0.0025
cpu_tuple_cost 0.01
cursor_tuple_fraction 0.1
data_checksums off
data_directory /var/lib/postgresql/data
data_directory_mode 0700
data_sync_retry off
DateStyle ISO, MDY
db_user_namespace off
deadlock_timeout 1000
debug_assertions off
debug_pretty_print on
debug_print_parse off
debug_print_plan off
debug_print_rewritten off
default_statistics_target 100
default_tablespace
default_text_search_config pg_catalog.english
default_transaction_deferrable off
default_transaction_isolation read committed
default_transaction_read_only off
default_with_oids off
dynamic_library_path $libdir
dynamic_shared_memory_type posix
effective_cache_size 524288
effective_io_concurrency 1
enable_bitmapscan on
enable_gathermerge on
enable_hashagg on
enable_hashjoin on
enable_indexonlyscan on
enable_indexscan on
enable_material on
enable_mergejoin on
enable_nestloop on
enable_parallel_append on
enable_parallel_hash on
enable_partition_pruning on
enable_partitionwise_aggregate off
enable_partitionwise_join off
enable_seqscan on
enable_sort on
enable_tidscan on
escape_string_warning on
event_source PostgreSQL
exit_on_error off
external_pid_file
extra_float_digits 0
force_parallel_mode off
from_collapse_limit 8
fsync on
full_page_writes on
geqo on
geqo_effort 5
geqo_generations 0
geqo_pool_size 0
geqo_seed 0
geqo_selection_bias 2
geqo_threshold 12
gin_fuzzy_search_limit 0
gin_pending_list_limit 4096
hba_file /var/lib/postgresql/data/pg_hba.conf
hot_standby on
hot_standby_feedback off
huge_pages try
ident_file /var/lib/postgresql/data/pg_ident.conf
idle_in_transaction_session_timeout 0
ignore_checksum_failure off
ignore_system_indexes off
integer_datetimes on
IntervalStyle postgres
jit off
jit_above_cost 100000
jit_debugging_support off
jit_dump_bitcode off
jit_expressions on
jit_inline_above_cost 500000
jit_optimize_above_cost 500000
jit_profiling_support off
jit_provider llvmjit
jit_tuple_deforming on
join_collapse_limit 8
krb_caseins_users off
krb_server_keyfile FILE:/etc/postgresql-common/krb5.keytab
lc_collate en_US.utf8
lc_ctype en_US.utf8
lc_messages en_US.utf8
lc_monetary en_US.utf8
lc_numeric en_US.utf8
lc_time en_US.utf8
listen_addresses *
lo_compat_privileges off
local_preload_libraries
lock_timeout 0
log_autovacuum_min_duration -1
log_checkpoints off
log_connections off
log_destination stderr
log_directory log
log_disconnections off
log_duration off
log_error_verbosity default
log_executor_stats off
log_file_mode 0600
log_filename postgresql-%Y-%m-%d_%H%M%S.log
log_hostname off
log_line_prefix %m [%p]
log_lock_waits off
log_min_duration_statement -1
log_min_error_statement error
log_min_messages warning
log_parser_stats off
log_planner_stats off
log_replication_commands off
log_rotation_age 1440
log_rotation_size 10240
log_statement none
log_statement_stats off
log_temp_files -1
log_timezone UTC
log_truncate_on_rotation off
logging_collector off
maintenance_work_mem 65536
max_connections 100
max_files_per_process 1000
max_function_args 100
max_identifier_length 63
max_index_keys 32
max_locks_per_transaction 64
max_logical_replication_workers 4
max_parallel_maintenance_workers 2
max_parallel_workers 8
max_parallel_workers_per_gather 2
max_pred_locks_per_page 2
max_pred_locks_per_relation -2
max_pred_locks_per_transaction 64
max_prepared_transactions 0
max_replication_slots 10
max_stack_depth 2048
max_standby_archive_delay 30000
max_standby_streaming_delay 30000
max_sync_workers_per_subscription 2
max_wal_senders 10
max_wal_size 1024
max_worker_processes 8
min_parallel_index_scan_size 64
min_parallel_table_scan_size 1024
min_wal_size 80
old_snapshot_threshold -1
operator_precedence_warning off
parallel_leader_participation on
parallel_setup_cost 1000
parallel_tuple_cost 0.1
password_encryption md5
port 5432
post_auth_delay 0
postgis.backend geos
pre_auth_delay 0
quote_all_identifiers off
random_page_cost 4
restart_after_crash on
row_security on
search_path public, "$user", public
segment_size 131072
seq_page_cost 1
server_encoding UTF8
server_version 11.2 (Debian 11.2-1.pgdg90+1)
server_version_num 110002
session_preload_libraries
session_replication_role origin
shared_buffers 16384
shared_preload_libraries
ssl off
ssl_ca_file
ssl_cert_file server.crt
ssl_ciphers HIGH:MEDIUM:+3DES:!aNULL
ssl_crl_file
ssl_dh_params_file
ssl_ecdh_curve prime256v1
ssl_key_file server.key
ssl_passphrase_command
ssl_passphrase_command_supports_reload off
ssl_prefer_server_ciphers on
standard_conforming_strings on
statement_timeout 0
stats_temp_directory pg_stat_tmp
superuser_reserved_connections 3
synchronize_seqscans on
synchronous_commit on
synchronous_standby_names
syslog_facility local0
syslog_ident postgres
syslog_sequence_numbers on
syslog_split_messages on
tcp_keepalives_count 9
tcp_keepalives_idle 7200
tcp_keepalives_interval 75
temp_buffers 1024
temp_file_limit -1
temp_tablespaces
TimeZone UTC
timezone_abbreviations Default
trace_notify off
trace_recovery_messages log
trace_sort off
track_activities on
track_activity_query_size 1024
track_commit_timestamp off
track_counts on
track_functions none
track_io_timing off
transaction_deferrable off
transaction_isolation read committed
transaction_read_only off
transform_null_equals off
unix_socket_directories /var/run/postgresql
unix_socket_group
unix_socket_permissions 0777
update_process_title on
vacuum_cleanup_index_scale_factor 0.1
vacuum_cost_delay 0
vacuum_cost_limit 200
vacuum_cost_page_dirty 20
vacuum_cost_page_hit 1
vacuum_cost_page_miss 10
vacuum_defer_cleanup_age 0
vacuum_freeze_min_age 50000000
vacuum_freeze_table_age 150000000
vacuum_multixact_freeze_min_age 5000000
vacuum_multixact_freeze_table_age 150000000
wal_block_size 8192
wal_buffers 512
wal_compression off
wal_consistency_checking
wal_keep_segments 0
wal_level replica
wal_log_hints off
wal_receiver_status_interval 10
wal_receiver_timeout 60000
wal_retrieve_retry_interval 5000
wal_segment_size 16777216
wal_sender_timeout 60000
wal_sync_method fdatasync
wal_writer_delay 200
wal_writer_flush_after 128
work_mem 4096
xmlbinary base64
xmloption content
zero_damaged_pages off
explain (analyze, buffers, format text):
Buffers: shared hit=2940 read=1601607
-> Gather (cost=1655905.64..1655905.85 rows=2 width=8) (actual time=191335.534..191349.259 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=2940 read=1601607
-> Partial Aggregate (cost=1654905.64..1654905.65 rows=1 width=8) (actual time=191277.227..191277.227 rows=1 loops=3)
Buffers: shared hit=2940 read=1601607
-> Parallel Seq Scan on forecast (cost=0.00..1644833.91 rows=4028691 width=0) (actual time=8.037..190681.354 rows=3222220 loops=3)
Buffers: shared hit=2940 read=1601607
Planning Time: 0.351 ms
Execution Time: 191349.424 ms
Indexes:
CREATE INDEX forecast_init_utc ON public.forecast USING btree (forecast_init_utc)
CREATE UNIQUE INDEX forecast_pkey ON public.forecast USING btree (ogc_fid)
CREATE INDEX forecast_wkb_geometry_geom_idx ON public.forecast USING gist (wkb_geometry)
CREATE INDEX forecast_init_local ON public.forecast USING btree (forecast_init_local)
CREATE INDEX country_code ON public.forecast USING btree (country_code)
CREATE INDEX store_num ON public.forecast USING btree (store_num)
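Edit: It looks like VACUUM ANALYZE did the trick. It ran for 728 seconds, but the query now returns quickly. How often do I need to run it?
Here is the new explain (analyze, buffers, format text):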
Finalize Aggregate (cost=204220.77..204220.78 rows=1 width=8) (actual time=7116.754..7116.755 rows=1 loops=1)
Buffers: shared hit=525705 read=25977
-> Gather (cost=204220.55..204220.77 rows=2 width=8) (actual time=7115.613..7123.979 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=525705 read=25977
-> Partial Aggregate (cost=203220.55..203220.57 rows=1 width=8) (actual time=7051.177..7051.178 rows=1 loops=3)
Buffers: shared hit=525705 read=25977
-> Parallel Index Only Scan using store_num on forecast (cost=0.43..193125.85 rows=4037882 width=0) (actual time=18.684..6564.663 rows=3229577 loops=3)
Heap Fetches: 0
Buffers: shared hit=525705 read=25977
Planning Time: 0.896 ms
Execution Time: 7124.113 ms
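For a rough comparison of the two plans, the buffer and timing figures can be run through a few lines of Python. This is only a sketch using the numbers pasted above and the block_size of 8192 from the settings list; the variable and function names are made up for illustration.

```python
# Numbers copied from the two EXPLAIN (ANALYZE, BUFFERS) outputs in the question.
BLOCK_SIZE = 8192  # bytes, from block_size in the settings list

# "before" is the parallel seq scan, "after" is the parallel index-only scan.
before = {"ms": 191349.424, "hit": 2940, "read": 1601607}
after = {"ms": 7124.113, "hit": 525705, "read": 25977}


def summarize(plan):
    """Return total blocks touched, their size in GiB, and the cache hit ratio."""
    blocks = plan["hit"] + plan["read"]
    return {
        "blocks": blocks,
        "gib": blocks * BLOCK_SIZE / 2**30,
        "hit_ratio": plan["hit"] / blocks,
    }


before_stats = summarize(before)
after_stats = summarize(after)
speedup = before["ms"] / after["ms"]

for label, stats in (("before", before_stats), ("after", after_stats)):
    print(f"{label}: {stats['blocks']} blocks, {stats['gib']:.1f} GiB, "
          f"{stats['hit_ratio']:.1%} served from shared buffers")
print(f"speedup: {speedup:.1f}x")
```

By this arithmetic the count touches roughly a third as many blocks after the vacuum, almost all of them already cached, and finishes about 27 times faster. Heap Fetches: 0 in the new plan means the index-only scan never had to visit the table, which is only possible once vacuum has marked the pages all-visible.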
【Comments】:
Which Postgres version? Is there a primary key on the table layer? Is that table well vacuumed? What kind of hardware are you using (disk type, memory, CPU)?
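Please edit your question and add the execution plan generated using explain (analyze, buffers, format text) (not just a "simple" explain) as formatted text, and make sure you preserve the indentation of the plan. Paste the text, then put ``` on the line before the plan and on the line after the plan. Please also include the complete create index statements for all indexes.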
PostgreSQL 11.2 (Debian 11.2-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit
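I'm surprised Postgres isn't choosing a scan on the PK index, which should be smaller than the Seq Scan on the table. Throughput in this environment is about 60 MB/second. There are only about 4 rows per block on average. That can indicate either a bloated table (vacuum full would fix that) or a very wide table. If you run vacuum analyze on the table (note: without the full option), does Postgres start using an index-only scan?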
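The throughput and row-density estimates can be reproduced from the first plan. The sketch below assumes the plan's per-process row count times three processes approximates the table size and uses the whole execution time as the read time, so the results are ballpark figures only; all variable names are illustrative.

```python
# Figures taken from the first EXPLAIN (ANALYZE, BUFFERS) output in the question.
BLOCK_SIZE = 8192          # bytes, block_size from the settings list
rows_per_process = 3222220  # "rows=3222220 loops=3" on the Parallel Seq Scan
processes = 3               # leader plus two workers
blocks_hit = 2940
blocks_read = 1601607
execution_ms = 191349.424

total_rows = rows_per_process * processes
total_blocks = blocks_hit + blocks_read

# Data pulled from outside shared buffers, and the rate at which it arrived.
mib_read = blocks_read * BLOCK_SIZE / 2**20
mib_per_second = mib_read / (execution_ms / 1000)

# Average number of rows stored in each 8 kB block.
rows_per_block = total_rows / total_blocks

print(f"read {mib_read:.0f} MiB at about {mib_per_second:.0f} MiB/s")
print(f"about {rows_per_block:.1f} rows per block")
```

This gives a read rate in the mid 60s of MiB/s, in line with the estimate above, and roughly six rows per block, the same order of magnitude as the figure in the comment. Either way, only a handful of rows fit in each 8 kB block, which is what points to bloat or very wide rows.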
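You should configure your system so that vacuum is more aggressive, either for all tables or just for that one. Or check whether you have sessions that are idle in transaction, which prevent regular vacuuming.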
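To see what "more aggressive" means in numbers: PostgreSQL documents the autovacuum trigger as threshold + scale_factor * number of rows. The sketch below plugs in the settings listed in the question and the row count from the plan. The 0.01 scale factor is a hypothetical value chosen only for illustration, not a recommendation from the thread.

```python
def autovacuum_trigger(reltuples, threshold, scale_factor):
    """Changed rows required before autovacuum processes a table.

    Follows the documented formula: threshold + scale_factor * reltuples.
    """
    return threshold + scale_factor * reltuples


ROWS = 3222220 * 3  # approximate table size, taken from the plan in the question

# Values from the settings list in the question.
vacuum_default = autovacuum_trigger(ROWS, threshold=50, scale_factor=0.2)
analyze_default = autovacuum_trigger(ROWS, threshold=50, scale_factor=0.1)

# Hypothetical, more aggressive per-table scale factor (assumption, not from the thread).
vacuum_tuned = autovacuum_trigger(ROWS, threshold=50, scale_factor=0.01)

print(f"autovacuum vacuum after ~{vacuum_default:,.0f} dead rows (current settings)")
print(f"autovacuum analyze after ~{analyze_default:,.0f} changed rows (current settings)")
print(f"autovacuum vacuum after ~{vacuum_tuned:,.0f} dead rows (scale factor 0.01)")
```

With the current settings nearly two million rows have to be updated or deleted before autovacuum vacuums this table. The scale factor can be overridden for a single table as a storage parameter (ALTER TABLE ... SET (autovacuum_vacuum_scale_factor = ...)). Note also that in PostgreSQL 11 inserts alone do not trigger an autovacuum vacuum, so a table that is only ever loaded may still need a manual VACUUM after large loads.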
【Answer 1】:
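@a_horse_with_no_name's suggestion to run vacuum analyze on the table made a huge difference, but there is also a setting that turns off these select count(*) queries entirely. It is in the admin panel under Layers -> <Layer Name> -> Publishing -> Skip the counting of the numberMatched attribute.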