17. ClustrixDB Log Management

Posted by yuxiaohao


 

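ClustrixDB logs detailed information about significant and problematic queries. These logs help identify:

  • Slow queries
  • Resource contention
  • SQL errors
  • Queries that read an unexpected number of rows
  • Schema changes
  • Changes to global variables
  • Changes to the cluster

Query logging is enabled by default, and the logs are stored in /data/clustrix/log/.

Each node logs information about the queries it ran while acting as the Global Transaction Manager (GTM). Diagnosing a cluster-wide problem therefore often requires combining the logs from all nodes. Use clx logdump to merge and evaluate the logs.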
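To illustrate what combining per-node logs involves, here is a minimal Python sketch that interleaves already-sorted log lines from several nodes by timestamp. It is not how clx logdump works internally. The function names and the sample lines are hypothetical, and the sketch assumes every line starts with a 26-character timestamp of the form YYYY-MM-DD HH:MM:SS.ffffff in the same time zone on every node.

```python
import heapq


def timestamp_key(line):
    # Assumption: the first 26 characters are 'YYYY-MM-DD HH:MM:SS.ffffff',
    # which sorts correctly as plain text.
    return line[:26]


def merge_node_logs(*node_logs):
    # Each input must already be in timestamp order, as a log file normally is.
    return list(heapq.merge(*node_logs, key=timestamp_key))


# Hypothetical lines from two nodes; the trailing letter marks the true order.
node1 = [
    "2024-01-15 10:00:01.000000 UTC node1 entry A",
    "2024-01-15 10:00:05.000000 UTC node1 entry C",
]
node2 = [
    "2024-01-15 10:00:03.000000 UTC node2 entry B",
]

merged = merge_node_logs(node1, node2)
for line in merged:
    print(line)
```

Ordering by timestamp only gives a meaningful result when the clocks on all nodes are synchronized.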

 

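Managing Query Logs

Query Types

Each entry in query.log is classified as one of the following types. Logging for each query type is controlled by the global or session variable indicated.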

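Query Type      Description
SQLERR          Database errors, including syntax errors, timeout notifications, and permission problems. All SQLERR queries are logged by default (session_log_error_queries).
SLOW            Queries whose execution time exceeds the threshold set by session_log_slow_threshold_ms.
DDL             DDL statements (CREATE, DROP, ALTER) and SET GLOBAL or SET SESSION commands. All DDL is logged by default (session_log_ddl).
BAD             Queries that read more rows than are needed to return the expected result, which may indicate a poor plan or a missing index. Off by default (session_log_bad_queries).
ALTER CLUSTER   Changes made to the cluster through the ALTER CLUSTER command are always logged to query.log automatically. This logging is not controlled by a global variable.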

 

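Global and Session Variables

Some of the variables that control query logging can be set per session, while others are available only system-wide. To set the value of any logging-related variable, use the following syntax: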

SET [GLOBAL | SESSION] variable_name = desired_value; 

These variables control query and user logging. The defaults shown here are acceptable for most installations.

Name                            Description                                                          Default Value   Session Variable
session_log_bad_queries         Log BAD queries to query.log                                         false           Yes
session_log_ddl                 Log DDL statements to query.log                                      true
session_log_error_queries       Log ERROR statements to query.log                                    true
session_log_slow_queries        Log SLOW statements to query.log                                     true
session_log_slow_threshold_ms   Query duration threshold in milliseconds before logging this query   10000           Yes
session_log_users               Log LOGIN/LOGOUT to user.log                                         false
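As a small illustration of the SET syntax applied to these variables, the Python sketch below renders statements for them. The helper name build_set_statement is hypothetical, and the sketch does not check whether a given variable actually accepts SESSION scope. It only validates the variable name against the list above.

```python
# Variable names taken from the table above.
LOGGING_VARIABLES = {
    "session_log_bad_queries",
    "session_log_ddl",
    "session_log_error_queries",
    "session_log_slow_queries",
    "session_log_slow_threshold_ms",
    "session_log_users",
}


def build_set_statement(scope, name, value):
    # Hypothetical helper: renders SET [GLOBAL | SESSION] name = value;
    scope = scope.upper()
    if scope not in ("GLOBAL", "SESSION"):
        raise ValueError(f"unknown scope: {scope}")
    if name not in LOGGING_VARIABLES:
        raise ValueError(f"not a logging variable: {name}")
    if isinstance(value, bool):
        rendered = "true" if value else "false"
    else:
        rendered = str(int(value))
    return f"SET {scope} {name} = {rendered};"


# Lower the slow-query threshold from the default 10000 ms to 5000 ms.
print(build_set_statement("global", "session_log_slow_threshold_ms", 5000))
# Turn on BAD query logging for the current session only.
print(build_set_statement("session", "session_log_bad_queries", True))
```

The generated statements can then be run from any SQL client connected to the cluster.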

 

 

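Reading query.log

Components

Each log entry begins with identifying data and contains key information to help troubleshoot problems in the cluster. The layout of a log entry is shown below.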

[timestamp] [hostname] clxnode INSTR [query type] [sid] [db] [user] [ac] [xid] [sql] [status] [time and breakdowns] [internal counters]

 

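Label          Description
timestamp      Date and time, including the time zone. Keeping time synchronized across all nodes is very important.
hostname       The node ID and name of the host that logged the entry. This node acted as the GTM for the transaction.
process name   The ClustrixDB process name (clxnode).
INSTR          A fixed literal that appears on every line, before the query type.
query type     The type of log entry: SLOW, DDL, BAD, SQLERR, or ALTER CLUSTER.
sid            Session ID, used to group the activity of a given session.
db             Database name.
user           The user who executed the query. With statement-based replication, search for the replication account when troubleshooting statements that came from the master.
ac             Autocommit indicator (Y/N). Useful for determining whether the query ran inside a user-defined explicit transaction. DDL uses an internally generated explicit transaction and is always N.
xid            Transaction ID. Linking a session to an XID is useful when troubleshooting locking problems.
sql            The full text of the query. An ellipsis indicates that the text was truncated to fit the 4 KB limit.
status         The result of the query, enclosed in square brackets, for example the number of rows affected or an error message.
time           The total time taken to receive, compile, and process the query, up to the point where output was returned or an error occurred. This is especially useful when analyzing slow queries. For any query that runs longer than one millisecond, the time is broken down further:
  translate    Time spent in translate_dml().
  prefetch     Time spent building the Sierra stub.
  plan         Time spent planning and normalizing the query.
  compile      Time spent compiling Sierra.
  execute      Time spent in invocation.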
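The fields above can be pulled out of an entry with a regular expression. The Python sketch below is only an illustration: the sample line is made up, and its exact punctuation (SID:, db=, user=, sql="...", and a time value ending in ms) is an assumption rather than something the layout above specifies. Adjust the pattern to match the lines in your own query.log.

```python
import re

# Hypothetical sample entry. The field order follows the layout above, but the
# delimiters used here are assumptions, not a documented format.
SAMPLE_ENTRY = (
    "2024-01-15 10:23:45.123456 UTC nid 2 db-node2 clxnode: INSTR SLOW "
    "SID:4711 db=shop user=app@10.0.0.5 ac=Y xid=65a5b1c2d3e40001 "
    "sql=\"SELECT * FROM orders WHERE status = 'open'\" "
    "[Ok: 42 rows] time 12034.5ms; reads: 1200; rows_read: 90000"
)

ENTRY_RE = re.compile(
    r"^(?P<timestamp>\S+ \S+ \S+) "          # date, time, time zone
    r"(?P<hostname>.+?) clxnode:? INSTR "    # node ID and host name
    r"(?P<query_type>SLOW|DDL|BAD|SQLERR|ALTER CLUSTER) "
    r"SID:(?P<sid>\d+) db=(?P<db>\S+) user=(?P<user>\S+) "
    r"ac=(?P<ac>[YN]) xid=(?P<xid>\S+) "
    r'sql="(?P<sql>.*)" '                    # query text, may contain spaces
    r"\[(?P<status>[^\]]*)\] "               # result in square brackets
    r"time (?P<time_ms>[\d.]+)ms"            # total time in milliseconds
)


def parse_entry(line):
    # Returns a dict of fields, or None when the line does not match.
    match = ENTRY_RE.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["time_ms"] = float(entry["time_ms"])
    return entry


entry = parse_entry(SAMPLE_ENTRY)
print(entry["query_type"], entry["sid"], entry["xid"], entry["time_ms"])
```

Grouping parsed entries by sid or xid is a convenient way to follow one session or one transaction through the log.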

 

Internal Counters

 
 
Label   Description
reads   The number of times the database reads from a container. This may differ from the number of rows_read.
inserts   The number of times the database inserts into a container. This includes both the number of calls and the number of rows written.
deletes   The number of times the database deletes from a container. This includes both the number of calls and the number of rows deleted.
updates   The number of times the database updates a container. This includes both the number of calls and the number of rows updated.
counts   Number of calls by the query execution engine to operators BARRIER_ADD and BARRIER_FETCHADD.
rows_read   Total number of rows read to get all needed data for the query, including reads from indices. Essentially, the total number of rows processed by the last query. This may differ from the number of rows_output by the query.

forwards Number of rows forwarded to specific nodes.
broadcasts   Number of rows that were broadcast to all nodes.
rows_output   Total number of rows returned or output by the last query. This is usually the same as the number of rows returned from a query but may occasionally contain counts from internal processes.
semaphore_matches   Number of calls by the query execution engine to operator SEM_ACQUIRE.
fragment_executions   Number of query fragments executed for the query.
cpu_runtime_ns   This represents the aggregate total CPU time spent by all nodes to run the query.
cpu_waits   The number of times the query waited for another query to finish due to the Fair Scheduler.
cpu_waittime_ns   The amount of time spent waiting for CPU due to the Fair Scheduler.
barriers   Number of barriers created for the query. This is used to synchronize message communication between nodes.
barrier_forwards   Number of barriers created to synchronize messaging for forwarded rows.
barrier_flushes   Number of flush operations performed on barriers.
bm_fixes   Number of attempted page fixes by the Buffer Manager.
bm_loads   Number of pages loaded from disk by the Buffer Manager.
bm_waittime_ns   Nanoseconds spent blocked on Buffer Manager page fixes.
lockman_waits   Count of the number of times that the query had to wait for a lock to be released by another query.
lockman_waittime_ms   The total time spent waiting for other queries to release locks on needed rows.
trxstate_waits   Number of calls to trxstate_check that had to block.
trxstate_waittime_ms    Milliseconds spent blocked in trxstate_check.
wal_perm_waittime_ms Milliseconds spent waiting because the WAL is more than 75% full.
bm_perm_waittime_ms  Milliseconds spent waiting for the Buffer Manager to grant write permission for pages.
sigmas   The number of sigma containers used by the query.
sigma_fallbacks   The number of sigma containers that ran out of memory and had to fall back to disk.
row_count   The total number of rows updated, inserted or deleted by the last query.
found_rows   The number of rows affected by the last statement, but not necessarily output by that statement. A value of 0 or -1 means no rows were found.
insert_id   Not currently being used, always displayed as 0.
fanout   Y/N indicator that tells if fanout was used for this query.
attempts Number of attempts to automatically retry the query execution after it failed.
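Comparing rows_read with rows_output is a quick way to spot queries that read far more rows than they return, which is what the BAD query type flags. The Python sketch below assumes the counters appear as semicolon-separated "name: value" pairs. That format, the sample string, and the function names are all assumptions made for illustration.

```python
def parse_counters(text):
    # Assumption: counters are written as 'name: value' pairs separated by ';'.
    counters = {}
    for part in text.split(";"):
        name, sep, value = part.partition(":")
        if not sep:
            continue
        value = value.strip()
        # Keep non-numeric values such as the fanout Y/N flag as strings.
        is_int = value.lstrip("-").isdigit()
        counters[name.strip()] = int(value) if is_int else value
    return counters


def read_amplification(counters):
    # Rows read per row returned; guard against dividing by zero.
    rows_read = counters.get("rows_read", 0)
    rows_output = counters.get("rows_output", 0)
    return rows_read / max(rows_output, 1)


# Hypothetical counter section of a log entry.
sample = (
    "reads: 1200; rows_read: 90000; rows_output: 42; "
    "lockman_waits: 3; lockman_waittime_ms: 250; fanout: Y; attempts: 0"
)

counters = parse_counters(sample)
print(counters["rows_read"], counters["rows_output"])
print(round(read_amplification(counters)))
```

A high ratio is only a hint. Aggregate queries legitimately read many rows to return a few, so the result is best read alongside the query text and plan.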

 

 

 

 
