Migrating from a single-master to a multi-master Apache Kudu configuration
【Posted】: 2017-07-26 15:10:39 【Question】: We changed our Apache Kudu configuration by adding 2 new Kudu masters alongside the original one.
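Problem: when we start Kudu, it comes up with the old leader (the original master) and everything works at first. After a while, though, leadership moves to one of the newly added masters and all queries start failing.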
> I0726 16:47:11.372854 99507 consensus_queue.cc:695] T 00000000000000000000000000000000 P dbe19a36bd1f466ca87b08ebb97f28dc [LEADER]: Connected to new peer: Peer: b12b38a0d21c4ceda72b40571c34ec52, Is new: false, Last received: 28.11387, Next index: 11388, Last known committed idx: 11387, Last exchange result: ERROR, Needs tablet copy: false
> W0726 16:47:12.373445 98703 consensus_peers.cc:357] T 00000000000000000000000000000000 P dbe19a36bd1f466ca87b08ebb97f28dc -> Peer f47ef1fccc0949b68db09f30e430c3eb (namenode-01.datalab:7051): Couldn't send request to peer f47ef1fccc0949b68db09f30e430c3eb for tablet 00000000000000000000000000000000. Status: Timed out: UpdateConsensus RPC to 172.26.217.133:7051 timed out after 1.000s (ON_OUTBOUND_QUEUE). Retrying in the next heartbeat period. Already tried 1 times.
> W0726 16:47:13.123589 98703 leader_election.cc:272] T 00000000000000000000000000000000 P dbe19a36bd1f466ca87b08ebb97f28dc [CANDIDATE]: Term 29 election: RPC error from VoteRequest() call to peer f47ef1fccc0949b68db09f30e430c3eb: Timed out: RequestConsensusVote RPC to 172.26.217.133:7051 timed out after 1.761s (ON_OUTBOUND_QUEUE)
> W0726 16:47:13.323909 98703 leader_election.cc:272] T 00000000000000000000000000000000 P dbe19a36bd1f466ca87b08ebb97f28dc [CANDIDATE]: Term 29 pre-election: RPC error from VoteRequest() call to peer f47ef1fccc0949b68db09f30e430c3eb: Timed out: RequestConsensusVote RPC to 172.26.217.133:7051 timed out after 1.969s (ON_OUTBOUND_QUEUE)
> W0726 16:47:13.864181 98703 consensus_peers.cc:357] T 00000000000000000000000000000000 P dbe19a36bd1f466ca87b08ebb97f28dc -> Peer f47ef1fccc0949b68db09f30e430c3eb (namenode-01.datalab:7051): Couldn't send request to peer f47ef1fccc0949b68db09f30e430c3eb for tablet 00000000000000000000000000000000. Status: Timed out: UpdateConsensus RPC to 172.26.217.133:7051 timed out after 1.000s (ON_OUTBOUND_QUEUE). Retrying in the next heartbeat period. Already tried 2 times.
> I0726 16:47:14.424320 98727 raft_consensus.cc:887] T 00000000000000000000000000000000 P dbe19a36bd1f466ca87b08ebb97f28dc [term 29 LEADER]: Rejecting Update request from peer f47ef1fccc0949b68db09f30e430c3eb for earlier term 28. Current term is 29. Ops: []
> W0726 16:47:15.204483 98703 consensus_peers.cc:357] T 00000000000000000000000000000000 P dbe19a36bd1f466ca87b08ebb97f28dc -> Peer f47ef1fccc0949b68db09f30e430c3eb (namenode-01.datalab:7051): Couldn't send request to peer f47ef1fccc0949b68db09f30e430c3eb for tablet 00000000000000000000000000000000. Status: Timed out: UpdateConsensus RPC to 172.26.217.133:7051 timed out after 1.000s (SENT). Retrying in the next heartbeat period. Already tried 3 times.
> I0726 16:47:15.536121 99517 consensus_queue.cc:695] T 00000000000000000000000000000000 P dbe19a36bd1f466ca87b08ebb97f28dc [LEADER]: Connected to new peer: Peer: f47ef1fccc0949b68db09f30e430c3eb, Is new: false, Last received: 28.11387, Next index: 11388, Last known committed idx: 11387, Last exchange result: ERROR, Needs tablet copy: false
> W0726 16:47:16.537894 98703 consensus_peers.cc:357] T 00000000000000000000000000000000 P dbe19a36bd1f466ca87b08ebb97f28dc -> Peer f47ef1fccc0949b68db09f30e430c3eb (namenode-01.datalab:7051): Couldn't send request to peer f47ef1fccc0949b68db09f30e430c3eb for tablet 00000000000000000000000000000000. Status: Timed out: UpdateConsensus RPC to 172.26.217.133:7051 timed out after 1.000s (SENT). Retrying in the next heartbeat period. Already tried 1 times.
> I0726 16:47:28.560550 98698 delta_tracker.cc:686] T 00000000000000000000000000000000 P dbe19a36bd1f466ca87b08ebb97f28dc: Flushing 303 deltas from DMS 11...
> I0726 16:47:28.562281 98698 delta_tracker.cc:628] T 00000000000000000000000000000000 P dbe19a36bd1f466ca87b08ebb97f28dc: Flushed delta block: 0062344469241244 ts range: [6148425451429212160, 6148425454747340800]
> I0726 16:47:28.562363 98698 delta_tracker.cc:641] T 00000000000000000000000000000000 P dbe19a36bd1f466ca87b08ebb97f28dc: Reopened delta block for read: 0062344469241244
> I0726 16:47:28.564522 98698 maintenance_manager.cc:419] Time spent running FlushDeltaMemStoresOp(00000000000000000000000000000000): real 0.004s user 0.002s sys 0.000s
> I0726 16:47:28.564554 98698 maintenance_manager.cc:425] P dbe19a36bd1f466ca87b08ebb97f28dc: FlushDeltaMemStoresOp(00000000000000000000000000000000) metrics: "fdatasync":2,"fdatasync_us":1601
> I0726 16:49:28.614974 98698 delta_tracker.cc:686] T 00000000000000000000000000000000 P dbe19a36bd1f466ca87b08ebb97f28dc: Flushing 1 deltas from DMS 0...
> I0726 16:49:28.616822 98698 delta_tracker.cc:628] T 00000000000000000000000000000000 P dbe19a36bd1f466ca87b08ebb97f28dc: Flushed delta block: 0062344469241245 ts range: [6148424539781664768, 6148424539781664768]
> I0726 16:49:28.616896 98698 delta_tracker.cc:641] T 00000000000000000000000000000000 P dbe19a36bd1f466ca87b08ebb97f28dc: Reopened delta block for read: 0062344469241245
> I0726 16:49:28.619011 98698 maintenance_manager.cc:419] Time spent running FlushDeltaMemStoresOp(00000000000000000000000000000000): real 0.004s user 0.001s sys 0.001s
> I0726 16:49:28.619043 98698 maintenance_manager.cc:425] P dbe19a36bd1f466ca87b08ebb97f28dc: FlushDeltaMemStoresOp(00000000000000000000000000000000) metrics: "fdatasync":2,"fdatasync_us":2192
> W0726 16:52:36.772328 98703 connection.cc:462] client connection to 172.26.217.133:7051 recv error: Network error: failed to read from TLS socket: Connection reset by peer (error 104)
> W0726 16:52:36.911276 98703 consensus_peers.cc:357] T 00000000000000000000000000000000 P dbe19a36bd1f466ca87b08ebb97f28dc -> Peer f47ef1fccc0949b68db09f30e430c3eb (namenode-01.datalab:7051): Couldn't send request to peer f47ef1fccc0949b68db09f30e430c3eb for tablet 00000000000000000000000000000000. Status: Network error: Client connection negotiation failed: client connection to 172.26.217.133:7051: connect: Connection refused (error 111). Retrying in the next heartbeat period. Already tried 1 times.
> W0726 16:52:37.411356 98703 consensus_peers.cc:357] T 00000000000000000000000000000000 P dbe19a36bd1f466ca87b08ebb97f28dc -> Peer f47ef1fccc0949b68db09f30e430c3eb (namenode-01.datalab:7051): Couldn't send request to peer f47ef1fccc0949b68db09f30e430c3eb for tablet 00000000000000000000000000000000. Status: Network error: Client connection negotiation failed: client connection to 172.26.217.133:7051: connect: Connection refused (error 111). Retrying in the next heartbeat period. Already tried 2 times.
Any ideas? Anyone?
Please, and thank you!
【Answer 1】: You may have to update the table's TBLPROPERTIES in Impala, like this:
ALTER TABLE table_name SET TBLPROPERTIES(
'kudu.master_addresses' = 'master-01:7051,master-02:7051,master-03:7051'
)
Let me know if that helps! See also: https://issues.apache.org/jira/browse/IMPALA-5399
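If several Impala tables still point at the old single master, the same statement has to be issued once per table. The following is only a minimal sketch of generating those statements in Python. The master addresses are the ones from the example above; the table names and the helper `alter_master_addresses` are hypothetical and not part of Impala or Kudu. The script only prints the SQL, which you would then review and run through your usual Impala client.

```python
def alter_master_addresses(tables, masters):
    """Build one Impala ALTER TABLE statement per table (illustrative helper)."""
    # Impala expects the masters as a single comma-separated string.
    addresses = ",".join(masters)
    return [
        f"ALTER TABLE {table} SET TBLPROPERTIES("
        f"'kudu.master_addresses' = '{addresses}');"
        for table in tables
    ]


# Master addresses as used in the answer above.
masters = ["master-01:7051", "master-02:7051", "master-03:7051"]
# Hypothetical table names; replace with your own Kudu-backed Impala tables.
tables = ["db1.events", "db1.users"]

statements = alter_master_addresses(tables, masters)
for statement in statements:
    print(statement)
```

Printing rather than executing keeps the sketch independent of any particular Impala client library and lets you check each statement before applying it.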
【Comments】:
Yes, that helped a lot!