Geoserver WFS + PostgreSQL with large table impossibly slow

【Title】Geoserver WFS + PostgreSQL with large table impossibly slow 【Posted】2019-11-19 15:31:01 【Question】:

I have a PostgreSQL table of point locations with 9.5 million rows. I am trying to run this query: /geoserver/workspace/ows?service=WFS&version=1.0.0&request=GetFeature&typeName=workspace:layer&maxFeatures=50&outputFormat=application/json but it takes several minutes to respond.
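For reference, the same request can be assembled programmatically. This is only a minimal sketch: the host `my_geoserver_url` is the placeholder that appears in the GeoServer log, and the helper name `build_getfeature_url` is made up for illustration.

```python
from urllib.parse import urlencode

# Placeholder host taken from the log output in the question; replace with a real one.
BASE_URL = "http://my_geoserver_url/geoserver/workspace/ows"


def build_getfeature_url(type_name: str, max_features: int) -> str:
    """Build a WFS 1.0.0 GetFeature URL that asks for GeoJSON output."""
    params = {
        "service": "WFS",
        "version": "1.0.0",
        "request": "GetFeature",
        "typeName": type_name,
        "maxFeatures": max_features,
        "outputFormat": "application/json",
    }
    # safe=":/" leaves "workspace:layer" and "application/json" unescaped,
    # so the result matches the URL quoted in the question.
    return f"{BASE_URL}?{urlencode(params, safe=':/')}"


url = build_getfeature_url("workspace:layer", 50)
print(url)
```

Note that `maxFeatures` only limits how many features are returned; it says nothing about whether the server also counts the total number of matching rows.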

When I look at the GeoServer log, I see:

Request: getFeature
    service = WFS
    version = 1.0.0
    baseUrl = http://my_geoserver_url/geoserver/
    query[0]:
        typeName[0] = workspacelayer
    maxFeatures = 50
    outputFormat = application/json
    resultType = results
19 Nov 15:26:04 INFO [wfs.json] - about to encode JSON

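At this point it stalls for many minutes.

When I look at the currently active queries on my PostgreSQL server, I see: SELECT count(*) FROM "public"."layer"

That query alone takes 335 seconds to return. First of all, wtf? Even if it has to count row by row, 9.5 million rows is not that many. Is there any way to speed this operation up?

Secondly, why is it trying to run SELECT count(*) FROM "public"."layer" at all, and is there any way to stop it? I specified maxFeatures = 50, so why does it need to count them?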

Version: PostgreSQL 11.2 (Debian 11.2-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit

Machine: n1-standard-2 (2 vCPUs, 7.5 GB memory) (from Google Cloud)

Variables:

allow_system_table_mods off
application_name    Navicat
archive_command (disabled)
archive_mode    off
archive_timeout 0
array_nulls on
authentication_timeout  60
autovacuum  on
autovacuum_analyze_scale_factor 0.1
autovacuum_analyze_threshold    50
autovacuum_freeze_max_age   200000000
autovacuum_max_workers  3
autovacuum_multixact_freeze_max_age 400000000
autovacuum_naptime  60
autovacuum_vacuum_cost_delay    20
autovacuum_vacuum_cost_limit    -1
autovacuum_vacuum_scale_factor  0.2
autovacuum_vacuum_threshold 50
autovacuum_work_mem -1
backend_flush_after 0
backslash_quote safe_encoding
bgwriter_delay  200
bgwriter_flush_after    64
bgwriter_lru_maxpages   100
bgwriter_lru_multiplier 2
block_size  8192
bonjour off
bonjour_name    
bytea_output    hex
check_function_bodies   on
checkpoint_completion_target    0.5
checkpoint_flush_after  32
checkpoint_timeout  300
checkpoint_warning  30
client_encoding UNICODE
client_min_messages notice
cluster_name    
commit_delay    0
commit_siblings 5
config_file /var/lib/postgresql/data/postgresql.conf
constraint_exclusion    partition
cpu_index_tuple_cost    0.005
cpu_operator_cost   0.0025
cpu_tuple_cost  0.01
cursor_tuple_fraction   0.1
data_checksums  off
data_directory  /var/lib/postgresql/data
data_directory_mode 0700
data_sync_retry off
DateStyle   ISO, MDY
db_user_namespace   off
deadlock_timeout    1000
debug_assertions    off
debug_pretty_print  on
debug_print_parse   off
debug_print_plan    off
debug_print_rewritten   off
default_statistics_target   100
default_tablespace  
default_text_search_config  pg_catalog.english
default_transaction_deferrable  off
default_transaction_isolation   read committed
default_transaction_read_only   off
default_with_oids   off
dynamic_library_path    $libdir
dynamic_shared_memory_type  posix
effective_cache_size    524288
effective_io_concurrency    1
enable_bitmapscan   on
enable_gathermerge  on
enable_hashagg  on
enable_hashjoin on
enable_indexonlyscan    on
enable_indexscan    on
enable_material on
enable_mergejoin    on
enable_nestloop on
enable_parallel_append  on
enable_parallel_hash    on
enable_partition_pruning    on
enable_partitionwise_aggregate  off
enable_partitionwise_join   off
enable_seqscan  on
enable_sort on
enable_tidscan  on
escape_string_warning   on
event_source    PostgreSQL
exit_on_error   off
external_pid_file   
extra_float_digits  0
force_parallel_mode off
from_collapse_limit 8
fsync   on
full_page_writes    on
geqo    on
geqo_effort 5
geqo_generations    0
geqo_pool_size  0
geqo_seed   0
geqo_selection_bias 2
geqo_threshold  12
gin_fuzzy_search_limit  0
gin_pending_list_limit  4096
hba_file    /var/lib/postgresql/data/pg_hba.conf
hot_standby on
hot_standby_feedback    off
huge_pages  try
ident_file  /var/lib/postgresql/data/pg_ident.conf
idle_in_transaction_session_timeout 0
ignore_checksum_failure off
ignore_system_indexes   off
integer_datetimes   on
IntervalStyle   postgres
jit off
jit_above_cost  100000
jit_debugging_support   off
jit_dump_bitcode    off
jit_expressions on
jit_inline_above_cost   500000
jit_optimize_above_cost 500000
jit_profiling_support   off
jit_provider    llvmjit
jit_tuple_deforming on
join_collapse_limit 8
krb_caseins_users   off
krb_server_keyfile  FILE:/etc/postgresql-common/krb5.keytab
lc_collate  en_US.utf8
lc_ctype    en_US.utf8
lc_messages en_US.utf8
lc_monetary en_US.utf8
lc_numeric  en_US.utf8
lc_time en_US.utf8
listen_addresses    *
lo_compat_privileges    off
local_preload_libraries 
lock_timeout    0
log_autovacuum_min_duration -1
log_checkpoints off
log_connections off
log_destination stderr
log_directory   log
log_disconnections  off
log_duration    off
log_error_verbosity default
log_executor_stats  off
log_file_mode   0600
log_filename    postgresql-%Y-%m-%d_%H%M%S.log
log_hostname    off
log_line_prefix %m [%p] 
log_lock_waits  off
log_min_duration_statement  -1
log_min_error_statement error
log_min_messages    warning
log_parser_stats    off
log_planner_stats   off
log_replication_commands    off
log_rotation_age    1440
log_rotation_size   10240
log_statement   none
log_statement_stats off
log_temp_files  -1
log_timezone    UTC
log_truncate_on_rotation    off
logging_collector   off
maintenance_work_mem    65536
max_connections 100
max_files_per_process   1000
max_function_args   100
max_identifier_length   63
max_index_keys  32
max_locks_per_transaction   64
max_logical_replication_workers 4
max_parallel_maintenance_workers    2
max_parallel_workers    8
max_parallel_workers_per_gather 2
max_pred_locks_per_page 2
max_pred_locks_per_relation -2
max_pred_locks_per_transaction  64
max_prepared_transactions   0
max_replication_slots   10
max_stack_depth 2048
max_standby_archive_delay   30000
max_standby_streaming_delay 30000
max_sync_workers_per_subscription   2
max_wal_senders 10
max_wal_size    1024
max_worker_processes    8
min_parallel_index_scan_size    64
min_parallel_table_scan_size    1024
min_wal_size    80
old_snapshot_threshold  -1
operator_precedence_warning off
parallel_leader_participation   on
parallel_setup_cost 1000
parallel_tuple_cost 0.1
password_encryption md5
port    5432
post_auth_delay 0
postgis.backend geos
pre_auth_delay  0
quote_all_identifiers   off
random_page_cost    4
restart_after_crash on
row_security    on
search_path public, "$user", public
segment_size    131072
seq_page_cost   1
server_encoding UTF8
server_version  11.2 (Debian 11.2-1.pgdg90+1)
server_version_num  110002
session_preload_libraries   
session_replication_role    origin
shared_buffers  16384
shared_preload_libraries    
ssl off
ssl_ca_file 
ssl_cert_file   server.crt
ssl_ciphers HIGH:MEDIUM:+3DES:!aNULL
ssl_crl_file    
ssl_dh_params_file  
ssl_ecdh_curve  prime256v1
ssl_key_file    server.key
ssl_passphrase_command  
ssl_passphrase_command_supports_reload  off
ssl_prefer_server_ciphers   on
standard_conforming_strings on
statement_timeout   0
stats_temp_directory    pg_stat_tmp
superuser_reserved_connections  3
synchronize_seqscans    on
synchronous_commit  on
synchronous_standby_names   
syslog_facility local0
syslog_ident    postgres
syslog_sequence_numbers on
syslog_split_messages   on
tcp_keepalives_count    9
tcp_keepalives_idle 7200
tcp_keepalives_interval 75
temp_buffers    1024
temp_file_limit -1
temp_tablespaces    
TimeZone    UTC
timezone_abbreviations  Default
trace_notify    off
trace_recovery_messages log
trace_sort  off
track_activities    on
track_activity_query_size   1024
track_commit_timestamp  off
track_counts    on
track_functions none
track_io_timing off
transaction_deferrable  off
transaction_isolation   read committed
transaction_read_only   off
transform_null_equals   off
unix_socket_directories /var/run/postgresql
unix_socket_group   
unix_socket_permissions 0777
update_process_title    on
vacuum_cleanup_index_scale_factor   0.1
vacuum_cost_delay   0
vacuum_cost_limit   200
vacuum_cost_page_dirty  20
vacuum_cost_page_hit    1
vacuum_cost_page_miss   10
vacuum_defer_cleanup_age    0
vacuum_freeze_min_age   50000000
vacuum_freeze_table_age 150000000
vacuum_multixact_freeze_min_age 5000000
vacuum_multixact_freeze_table_age   150000000
wal_block_size  8192
wal_buffers 512
wal_compression off
wal_consistency_checking    
wal_keep_segments   0
wal_level   replica
wal_log_hints   off
wal_receiver_status_interval    10
wal_receiver_timeout    60000
wal_retrieve_retry_interval 5000
wal_segment_size    16777216
wal_sender_timeout  60000
wal_sync_method fdatasync
wal_writer_delay    200
wal_writer_flush_after  128
work_mem    4096
xmlbinary   base64
xmloption   content
zero_damaged_pages  off

explain (analyze, buffers, format text):

  Buffers: shared hit=2940 read=1601607
  ->  Gather  (cost=1655905.64..1655905.85 rows=2 width=8) (actual time=191335.534..191349.259 rows=3 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=2940 read=1601607
        ->  Partial Aggregate  (cost=1654905.64..1654905.65 rows=1 width=8) (actual time=191277.227..191277.227 rows=1 loops=3)
              Buffers: shared hit=2940 read=1601607
              ->  Parallel Seq Scan on forecast  (cost=0.00..1644833.91 rows=4028691 width=0) (actual time=8.037..190681.354 rows=3222220 loops=3)
                    Buffers: shared hit=2940 read=1601607
Planning Time: 0.351 ms
Execution Time: 191349.424 ms
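The Buffers and Execution Time figures in the plan above give a rough idea of how fast the sequential scan pulled data in. The following is a back-of-the-envelope sketch, assuming that every block counted as `read` actually came from disk (the plan cannot distinguish disk reads from OS cache hits) and using `block_size = 8192` from the settings list.

```python
BLOCK_SIZE = 8192            # block_size from the settings list, in bytes
BLOCKS_READ = 1_601_607      # "read=" value in the Buffers line
BLOCKS_HIT = 2_940           # "hit=" value in the Buffers line
EXECUTION_MS = 191_349.424   # Execution Time reported by the plan


def read_throughput_mib_s(blocks: int, block_size: int, elapsed_ms: float) -> float:
    """Average read rate in MiB/s over the whole query."""
    mib = blocks * block_size / (1024 * 1024)
    return mib / (elapsed_ms / 1000)


# Total data touched by the scan, which approximates the on-disk table size.
table_gib = (BLOCKS_READ + BLOCKS_HIT) * BLOCK_SIZE / 1024**3
throughput = read_throughput_mib_s(BLOCKS_READ, BLOCK_SIZE, EXECUTION_MS)

print(f"scanned about {table_gib:.1f} GiB at about {throughput:.0f} MiB/s")
```

So the count had to read roughly 12 GiB of table data, which is why it is slow on this machine no matter how few rows the WFS request asks for.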

Indexes:

CREATE INDEX forecast_init_utc ON public.forecast USING btree (forecast_init_utc)
CREATE UNIQUE INDEX forecast_pkey ON public.forecast USING btree (ogc_fid)
CREATE INDEX forecast_wkb_geometry_geom_idx ON public.forecast USING gist (wkb_geometry)
CREATE INDEX forecast_init_local ON public.forecast USING btree (forecast_init_local)
CREATE INDEX country_code ON public.forecast USING btree (country_code)
CREATE INDEX store_num ON public.forecast USING btree (store_num)

Edit: it looks like VACUUM ANALYZE did the trick. It ran for 728 seconds, but the query now returns quickly. How often do I need to run it?
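On the "how often" question, autovacuum is already on in the settings list, and PostgreSQL decides when to run it from a simple formula: base threshold plus scale factor times the table's row count. The sketch below plugs in the values from the settings list and assumes roughly 9.5 million rows, as stated in the question. As far as I know, on PostgreSQL 11 the vacuum trigger counts dead rows from updates and deletes only, so a table that is mostly inserted into may rarely be vacuumed automatically; whether that applies to this table is not stated.

```python
def autovacuum_trigger(reltuples: int, base_threshold: int, scale_factor: float) -> int:
    """Number of changed rows after which autovacuum acts on a table."""
    return round(base_threshold + scale_factor * reltuples)


ROWS = 9_500_000  # approximate row count from the question

# autovacuum_vacuum_threshold = 50, autovacuum_vacuum_scale_factor = 0.2
vacuum_trigger = autovacuum_trigger(ROWS, 50, 0.2)
# autovacuum_analyze_threshold = 50, autovacuum_analyze_scale_factor = 0.1
analyze_trigger = autovacuum_trigger(ROWS, 50, 0.1)

print(f"autovacuum VACUUM after about {vacuum_trigger:,} dead rows")
print(f"autovacuum ANALYZE after about {analyze_trigger:,} changed rows")
```

With the default scale factors, close to two million rows have to become dead before an automatic vacuum starts. A lower per-table scale factor, or a scheduled manual VACUUM ANALYZE after bulk loads, would be one way to keep the visibility map current.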

Here is the new explain (analyze, buffers, format text):

Finalize Aggregate  (cost=204220.77..204220.78 rows=1 width=8) (actual time=7116.754..7116.755 rows=1 loops=1)
  Buffers: shared hit=525705 read=25977
  ->  Gather  (cost=204220.55..204220.77 rows=2 width=8) (actual time=7115.613..7123.979 rows=3 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=525705 read=25977
        ->  Partial Aggregate  (cost=203220.55..203220.57 rows=1 width=8) (actual time=7051.177..7051.178 rows=1 loops=3)
              Buffers: shared hit=525705 read=25977
              ->  Parallel Index Only Scan using store_num on forecast  (cost=0.43..193125.85 rows=4037882 width=0) (actual time=18.684..6564.663 rows=3229577 loops=3)
                    Heap Fetches: 0
                    Buffers: shared hit=525705 read=25977
Planning Time: 0.896 ms
Execution Time: 7124.113 ms
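To put the two plans side by side, here is a small sketch that compares the figures they report. The numbers are copied from the plans in the question. The `read` counters are blocks that were not found in shared buffers; they may still have come from the OS cache, so treat the ratios as rough.

```python
# Figures copied from the two EXPLAIN (ANALYZE, BUFFERS) outputs.
seq_scan = {"read": 1_601_607, "hit": 2_940, "ms": 191_349.424}
index_only = {"read": 25_977, "hit": 525_705, "ms": 7_124.113}

# How much faster the query ran after VACUUM ANALYZE.
speedup = seq_scan["ms"] / index_only["ms"]
# How many fewer blocks had to be fetched from outside shared buffers.
read_ratio = seq_scan["read"] / index_only["read"]

print(f"about {speedup:.0f}x faster, about {read_ratio:.0f}x fewer blocks read")
```

"Heap Fetches: 0" in the second plan shows that the index-only scan never had to visit the table itself, which is only possible once vacuum has marked the table's pages as all-visible.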

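【Comments】:

Which Postgres version? Is there a primary key on the table layer? Is that table vacuumed properly? What kind of hardware are you using (disk type, memory, CPU)?

Please edit your question and add the execution plan generated using explain (analyze, buffers, format text) (not just a "simple" explain) as formatted text, and make sure you preserve the indentation of the plan. Paste the text, then put ``` on the line before the plan and on a line after the plan. Please also include the complete create index statements for all indexes.

PostgreSQL 11.2 (Debian 11.2-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit

I am surprised that Postgres did not choose a scan on the PK index, which should be smaller than a Seq Scan on the table. In an environment like this, the throughput is about 60MB/sec. Each block holds 4 rows on average. That indicates either a bloated table (vacuum full would fix that) or a very wide table.

If you run vacuum analyze on the table, does Postgres start using an index only scan? (Note: without the full option.)

You should configure your system to make vacuum more aggressive, either for all tables or just for that one. Or check whether you have sessions that are idle in transaction, which prevent the regular vacuum from cleaning up.

【Answer 1】:

@a_horse_with_no_name's suggestion to run vacuum analyze on the table made a huge difference, but there is also a setting that turns off these select count(*) queries entirely. It is in the admin panel under Layers -> <Layer Name> -> Publishing -> Skip the counting of the numberMatched attribute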
