cassandra-stress "Failed to connect over JMX; not collecting these stats"
【Posted】: 2015-06-02 12:05:35
【Question】: I tried the cassandra-stress tool for the first time today. The tool runs, but the output contains many "Failed to connect over JMX; not collecting these stats" messages.
Command:
cassandra-stress user \
profile=./stress_write.yaml ops\(insert=1\) \
n=1000000 \
-log file=./stress_write.log \
-node node1,node2,node3,node4,node5,node6
Output:
WARN 19:44:25 Found host with 0.0.0.0 as rpc_address, using listen_address (/node5) to contact it instead. If this is incorrect you should avoid the use of 0.0.0.0 server side.
WARN 19:44:25 Found host with 0.0.0.0 as rpc_address, using listen_address (/node1) to contact it instead. If this is incorrect you should avoid the use of 0.0.0.0 server side.
WARN 19:44:25 Found host with 0.0.0.0 as rpc_address, using listen_address (/node2) to contact it instead. If this is incorrect you should avoid the use of 0.0.0.0 server side.
WARN 19:44:25 Found host with 0.0.0.0 as rpc_address, using listen_address (/node4) to contact it instead. If this is incorrect you should avoid the use of 0.0.0.0 server side.
WARN 19:44:25 Found host with 0.0.0.0 as rpc_address, using listen_address (/node3) to contact it instead. If this is incorrect you should avoid the use of 0.0.0.0 server side.
WARN 19:44:26 Found host with 0.0.0.0 as rpc_address, using listen_address (/node5) to contact it instead. If this is incorrect you should avoid the use of 0.0.0.0 server side.
WARN 19:44:26 Found host with 0.0.0.0 as rpc_address, using listen_address (/node1) to contact it instead. If this is incorrect you should avoid the use of 0.0.0.0 server side.
WARN 19:44:26 Found host with 0.0.0.0 as rpc_address, using listen_address (/node2) to contact it instead. If this is incorrect you should avoid the use of 0.0.0.0 server side.
WARN 19:44:26 Found host with 0.0.0.0 as rpc_address, using listen_address (/node4) to contact it instead. If this is incorrect you should avoid the use of 0.0.0.0 server side.
WARN 19:44:26 Found host with 0.0.0.0 as rpc_address, using listen_address (/node3) to contact it instead. If this is incorrect you should avoid the use of 0.0.0.0 server side.
INFO 19:44:26 Using data-center name 'DC2' for DCAwareRoundRobinPolicy (if this is incorrect, please provide the correct datacenter name with DCAwareRoundRobinPolicy constructor)
INFO 19:44:26 New Cassandra host /node2:9042 added
INFO 19:44:26 New Cassandra host /node5:9042 added
Connected to cluster: MyCluster
INFO 19:44:26 New Cassandra host /node4:9042 added
INFO 19:44:26 New Cassandra host /node1:9042 added
INFO 19:44:26 New Cassandra host /node6:9042 added
Datatacenter: DC2; Host: /node4; Rack: rack1
Datatacenter: DC2; Host: /node3; Rack: rack1
Datatacenter: DC2; Host: /node6; Rack: rack1
Datatacenter: DC2; Host: /node5; Rack: rack1
Datatacenter: DC2; Host: /node1; Rack: rack1
Datatacenter: DC2; Host: /node2; Rack: rack1
INFO 19:44:26 New Cassandra host /node3:9042 added
Created schema. Sleeping 6s for propagation.
Failed to connect over JMX; not collecting these stats
Generating batches with [1..1] partitions and [1..1] rows (of [1..1] total rows in the partitions)
Failed to connect over JMX; not collecting these stats
Failed to connect over JMX; not collecting these stats
Improvement over 4 threadCount: 36%
Failed to connect over JMX; not collecting these stats
Improvement over 8 threadCount: 138%
Failed to connect over JMX; not collecting these stats
Improvement over 16 threadCount: 48%
Failed to connect over JMX; not collecting these stats
Improvement over 24 threadCount: 33%
Failed to connect over JMX; not collecting these stats
Improvement over 36 threadCount: 27%
Failed to connect over JMX; not collecting these stats
Improvement over 54 threadCount: 39%
Failed to connect over JMX; not collecting these stats
Improvement over 81 threadCount: 37%
Failed to connect over JMX; not collecting these stats
Improvement over 121 threadCount: 16%
Failed to connect over JMX; not collecting these stats
Improvement over 181 threadCount: 1%
Failed to connect over JMX; not collecting these stats
Improvement over 271 threadCount: 15%
Failed to connect over JMX; not collecting these stats
Improvement over 406 threadCount: 3%
Failed to connect over JMX; not collecting these stats
Improvement over 609 threadCount: -3%
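Do I need to specify any command-line or file-based configuration parameters for JMX? I have tested and confirmed that connectivity between the stress machine and my nodes is not the problem, since I was able to establish a connection between them with jmxsh.
Another problem with the output, which may or may not be related to the JMX error, is that it is missing some key sections. Below is the sample output from this DataStax documentation page, showing the sections that are missing from mine: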
WARNING: uncertainty mode (err<) results in uneven workload between thread runs, so should be used for high level analysis only
Running with 4 threadCount
Running WRITE with 4 threads until stderr of mean < 0.02
total ops , adj row/s, op/s, pk/s, row/s, mean, med, .95, .99, .999, max, time, stderr, gc: #, max ms, sum ms, sdv ms, mb
2552 , 2553, 2553, 2553, 2553, 1.5, 1.4, 2.5, 6.0, 12.6, 18.0, 1.0, 0.00000, 0, 0, 0, 0, 0
5173 , 2634, 2613, 2613, 2613, 1.5, 1.5, 1.8, 2.6, 8.6, 9.2, 2.0, 0.00000, 0, 0, 0, 0, 0
...
Results:
op rate : 3954
partition rate : 3954
row rate : 3954
latency mean : 1.0
latency median : 0.8
latency 95th percentile : 1.5
latency 99th percentile : 1.8
latency 99.9th percentile : 2.2
latency max : 73.6
total gc count : 25
total gc mb : 1826
total gc time (s) : 1
avg gc time(ms) : 37
stdev gc time(ms) : 10
Total operation time : 00:00:59
Sleeping for 15s
Running with 4 threadCount
Notes:
My cluster is running DSE 4.6.1 (Cassandra 2.0.12).
I am running the stress tool from a separate machine.
The stress tool version is from DSC 2.1 (Cassandra 2.1).
【Comments】:
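I am seeing the same problem... I have not had time to investigate it.
【Answer 1】: I have the same setup (Cassandra 2.0.12, with the stress tool from 2.1) and saw a similar problem. I finally had time to investigate.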
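I downloaded the source code and ran it in a debugger. What I found is that this error message is misleading. The tool does connect over JMX, but one of the MBeans (org.apache.cassandra.service:type=GCInspector) is the problem.
When I ran the stress test with the option -log level=verbose, I saw the following exception: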
java.lang.reflect.UndeclaredThrowableException
at com.sun.proxy.$Proxy11.getAndResetStats(Unknown Source)
at org.apache.cassandra.tools.NodeProbe.getAndResetGCStats(NodeProbe.java:385)
at org.apache.cassandra.stress.util.JmxCollector.<init>(JmxCollector.java:86)
at org.apache.cassandra.stress.StressMetrics.<init>(StressMetrics.java:64)
at org.apache.cassandra.stress.StressAction.run(StressAction.java:187)
at org.apache.cassandra.stress.StressAction.warmup(StressAction.java:97)
at org.apache.cassandra.stress.StressAction.run(StressAction.java:61)
at org.apache.cassandra.stress.Stress.main(Stress.java:109)
Caused by: javax.management.InstanceNotFoundException: org.apache.cassandra.service:type=GCInspector
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(Unknown Source)
at ....
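I connected to Cassandra with jConsole, and version 2.0.12 does not have this MBean.
However, my output does contain most of the data shown in the referenced example (everything except the garbage collection statistics).
Have you tried running cassandra-stress with the default configuration? Also try turning on verbose logging; it may give you some ideas.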
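To make the "error message is misleading" point concrete, here is a minimal Python sketch that reads a verbose stack trace and reports the innermost "Caused by:" exception, which is what separates a missing MBean from a network failure. The function names and the wording of the diagnoses are my own illustration, not part of cassandra-stress, and the mapping only covers exception classes of the kind seen in traces like the one above.

```python
import re

# Hypothetical mapping from root-cause exception class to a rough diagnosis.
# The wording is illustrative; extend it for other exceptions as needed.
DIAGNOSES = {
    "javax.management.InstanceNotFoundException":
        "JMX connected, but the requested MBean is not registered on the node",
    "java.rmi.ConnectException":
        "the JMX/RMI endpoint could not be reached from the stress machine",
}


def root_cause(trace):
    """Return the exception class on the last 'Caused by:' line, or None."""
    causes = re.findall(r"^\s*Caused by:\s*([\w.$]+)", trace, flags=re.MULTILINE)
    return causes[-1] if causes else None


def diagnose(trace):
    """Translate the root cause of a trace into a short human-readable hint."""
    return DIAGNOSES.get(root_cause(trace), "unrecognized cause; read the full trace")


# Abbreviated copy of the trace shown in this answer.
sample = """java.lang.reflect.UndeclaredThrowableException
at com.sun.proxy.$Proxy11.getAndResetStats(Unknown Source)
Caused by: javax.management.InstanceNotFoundException: org.apache.cassandra.service:type=GCInspector
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(Unknown Source)
"""
print(diagnose(sample))
```

The last "Caused by:" line is used because Java prints nested causes from the outermost to the innermost, so the final one is the closest to the real failure.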
【Comments】:
How did you solve this problem?
【Answer 2】: I ran into the same problem (Cassandra 3.7). I ran my cassandra-stress client with -log level=verbose and saw the following exception:
java.lang.RuntimeException: java.io.IOException: Failed to retrieve RMIServer stub: javax.naming.ServiceUnavailableException [Root exce4; nested exception is:
java.net.ConnectException: Connection timed out]
at org.apache.cassandra.stress.util.JmxCollector.connect(JmxCollector.java:99)
at org.apache.cassandra.stress.util.JmxCollector.<init>(JmxCollector.java:85)
at org.apache.cassandra.stress.StressMetrics.<init>(StressMetrics.java:62)
at org.apache.cassandra.stress.StressAction.run(StressAction.java:211)
at org.apache.cassandra.stress.StressAction.warmup(StressAction.java:107)
at org.apache.cassandra.stress.StressAction.run(StressAction.java:60)
at org.apache.cassandra.stress.Stress.run(Stress.java:133)
at org.apache.cassandra.stress.Stress.main(Stress.java:61)
Caused by: java.io.IOException: Failed to retrieve RMIServer stub: javax.naming.ServiceUnavailableException [Root exception is java.rmion is:
java.net.ConnectException: Connection timed out]
at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:369)
at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:270)
at org.apache.cassandra.tools.NodeProbe.connect(NodeProbe.java:188)
at org.apache.cassandra.tools.NodeProbe.<init>(NodeProbe.java:155)
at org.apache.cassandra.stress.util.JmxCollector.connect(JmxCollector.java:95)
... 7 more
Caused by: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException: Connection refused to host: 1.2.3.4;
java.net.ConnectException: Connection timed out]
at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:122)
at com.sun.jndi.toolkit.url.GenericURLContext.lookup(GenericURLContext.java:205)
at javax.naming.InitialContext.lookup(InitialContext.java:417)
at javax.management.remote.rmi.RMIConnector.findRMIServerJNDI(RMIConnector.java:1957)
at javax.management.remote.rmi.RMIConnector.findRMIServer(RMIConnector.java:1924)
at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:287)
... 11 more
Caused by: java.rmi.ConnectException: Connection refused to host: 1.2.3.4; nested exception is:
java.net.ConnectException: Connection timed out
at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:619)
at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216)
at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202)
at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:342)
at sun.rmi.registry.RegistryImpl_Stub.lookup(Unknown Source)
at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:118)
... 16 more
So, to fix this, I set the rpc_address property in the cassandra.yaml file to
That worked for me, and I no longer get the error.
【Comments】:
【Answer 3】: Edit conf/cassandra.yaml.
Change rpc_address: localhost to
rpc_address: 0.0.0.0
Restart the database.
【Comments】:
【Answer 4】: In the cassandra-env.sh file, open the JMX port to the world, then restart the Cassandra service. Once the stress test is finished, you can revert the JMX port change.
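After the restart, one way to confirm that the port is actually reachable from the stress machine is a plain TCP check. The sketch below is only an illustration: it assumes Cassandra's default JMX port, 7199, and the helper names are hypothetical. Note that a successful check is necessary but not sufficient, because JMX over RMI also hands the client a second address and port through the RMI stub, and that endpoint must be reachable too.

```python
import socket

# Assumption: Cassandra's default JMX port. Adjust if cassandra-env.sh sets another.
JMX_PORT = 7199


def port_reachable(host, port=JMX_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_nodes(nodes, port=JMX_PORT, timeout=3.0):
    """Return a dict mapping each node name to whether its port is reachable."""
    return {node: port_reachable(node, port, timeout) for node in nodes}
```

Running `check_nodes(["node1", "node2", "node3", "node4", "node5", "node6"])` on the stress machine, with the node names from the question's -node list, shows at a glance which nodes still block the port.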
【Comments】: