MongoDB: ***aborting after fassert() failure error while converting a standalone to a replica set


【Posted】2021-03-11 11:27:44 【Question】:

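Mongo version: mongo:4.2.6

I am following the manual Convert a Standalone to a Replica Set to run MongoDB in replica set mode.

When I try to start MongoDB with the command mongod --replSet rs0, I get the following log, which ends with ***aborting after fassert() failure: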

mongodb-hotbot_1    | 2020-11-29T07:54:06.233+0000 I  CONTROL  [main] Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'
mongodb-hotbot_1    | 2020-11-29T07:54:06.235+0000 W  ASIO     [main] No TransportLayer configured during NetworkInterface startup
mongodb-hotbot_1    | 2020-11-29T07:54:06.235+0000 I  CONTROL  [initandlisten] MongoDB starting : pid=1 port=27017 dbpath=/data/db 64-bit host=8c7762d33a84
mongodb-hotbot_1    | 2020-11-29T07:54:06.235+0000 I  CONTROL  [initandlisten] db version v4.2.6
mongodb-hotbot_1    | 2020-11-29T07:54:06.235+0000 I  CONTROL  [initandlisten] git version: 20364840b8f1af16917e4c23c1b5f5efd8b352f8
mongodb-hotbot_1    | 2020-11-29T07:54:06.235+0000 I  CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.1.1  11 Sep 2018
mongodb-hotbot_1    | 2020-11-29T07:54:06.235+0000 I  CONTROL  [initandlisten] allocator: tcmalloc
mongodb-hotbot_1    | 2020-11-29T07:54:06.235+0000 I  CONTROL  [initandlisten] modules: none
mongodb-hotbot_1    | 2020-11-29T07:54:06.235+0000 I  CONTROL  [initandlisten] build environment:
mongodb-hotbot_1    | 2020-11-29T07:54:06.235+0000 I  CONTROL  [initandlisten]     distmod: ubuntu1804
mongodb-hotbot_1    | 2020-11-29T07:54:06.235+0000 I  CONTROL  [initandlisten]     distarch: x86_64
mongodb-hotbot_1    | 2020-11-29T07:54:06.235+0000 I  CONTROL  [initandlisten]     target_arch: x86_64
mongodb-hotbot_1    | 2020-11-29T07:54:06.235+0000 I  CONTROL  [initandlisten] options:  net:  bindIp: "*" , replication:  replSet: "rs0"  
mongodb-hotbot_1    | 2020-11-29T07:54:06.237+0000 W  STORAGE  [initandlisten] Detected unclean shutdown - /data/db/mongod.lock is not empty.
mongodb-hotbot_1    | 2020-11-29T07:54:06.240+0000 I  STORAGE  [initandlisten] Detected data files in /data/db created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
mongodb-hotbot_1    | 2020-11-29T07:54:06.241+0000 W  STORAGE  [initandlisten] Recovering data from the last clean checkpoint.
mongodb-hotbot_1    | 2020-11-29T07:54:06.242+0000 I  STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=483M,cache_overflow=(file_max=0M),session_max=33000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000,close_scan_interval=10,close_handle_minimum=250),statistics_log=(wait=0),verbose=[recovery_progress,checkpoint_progress],
mongodb-hotbot_1    | 2020-11-29T07:54:06.737+0000 I  STORAGE  [initandlisten] WiredTiger message [1606636446:737260][1:0x7fb7b9e5bb00], txn-recover: Recovering log 548 through 549
mongodb-hotbot_1    | 2020-11-29T07:54:07.233+0000 I  STORAGE  [initandlisten] WiredTiger message [1606636447:233737][1:0x7fb7b9e5bb00], txn-recover: Recovering log 549 through 549
mongodb-hotbot_1    | 2020-11-29T07:54:07.735+0000 I  STORAGE  [initandlisten] WiredTiger message [1606636447:735489][1:0x7fb7b9e5bb00], txn-recover: Main recovery loop: starting at 548/256 to 549/256
mongodb-hotbot_1    | 2020-11-29T07:54:07.739+0000 I  STORAGE  [initandlisten] WiredTiger message [1606636447:739240][1:0x7fb7b9e5bb00], txn-recover: Recovering log 548 through 549
mongodb-hotbot_1    | 2020-11-29T07:54:07.792+0000 I  STORAGE  [initandlisten] WiredTiger message [1606636447:792369][1:0x7fb7b9e5bb00], txn-recover: Recovering log 549 through 549
mongodb-hotbot_1    | 2020-11-29T07:54:07.826+0000 I  STORAGE  [initandlisten] WiredTiger message [1606636447:826673][1:0x7fb7b9e5bb00], txn-recover: Set global recovery timestamp: (1598047236, 1)
mongodb-hotbot_1    | 2020-11-29T07:54:07.849+0000 I  RECOVERY [initandlisten] WiredTiger recoveryTimestamp. Ts: Timestamp(1598047236, 1)
mongodb-hotbot_1    | 2020-11-29T07:54:07.890+0000 I  STORAGE  [initandlisten] Starting OplogTruncaterThread local.oplog.rs
mongodb-hotbot_1    | 2020-11-29T07:54:07.890+0000 I  STORAGE  [initandlisten] The size storer reports that the oplog contains 1372748 records totaling to 335339428 bytes
mongodb-hotbot_1    | 2020-11-29T07:54:07.890+0000 I  STORAGE  [initandlisten] Sampling the oplog to determine where to place markers for truncation
mongodb-hotbot_1    | 2020-11-29T07:54:07.901+0000 I  STORAGE  [initandlisten] Sampling from the oplog between Aug  6 11:40:28:1 and Aug 22 01:00:20:2 to determine where to place markers for truncation
mongodb-hotbot_1    | 2020-11-29T07:54:07.901+0000 I  STORAGE  [initandlisten] Taking 24 samples and assuming that each section of oplog contains approximately 554226 records totaling to 135388162 bytes
mongodb-hotbot_1    | 2020-11-29T07:54:07.992+0000 I  STORAGE  [initandlisten] Placing a marker at optime Feb 17 12:37:42:1
mongodb-hotbot_1    | 2020-11-29T07:54:07.992+0000 I  STORAGE  [initandlisten] Placing a marker at optime Jun 16 13:50:38:524
mongodb-hotbot_1    | 2020-11-29T07:54:07.992+0000 I  STORAGE  [initandlisten] WiredTiger record store oplog processing took 101ms
mongodb-hotbot_1    | 2020-11-29T07:54:07.995+0000 I  STORAGE  [initandlisten] Timestamp monitor starting
mongodb-hotbot_1    | 2020-11-29T07:54:07.997+0000 I  CONTROL  [initandlisten]
mongodb-hotbot_1    | 2020-11-29T07:54:07.997+0000 I  CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.
mongodb-hotbot_1    | 2020-11-29T07:54:07.997+0000 I  CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.
mongodb-hotbot_1    | 2020-11-29T07:54:07.997+0000 I  CONTROL  [initandlisten]
mongodb-hotbot_1    | 2020-11-29T07:54:08.057+0000 I  SHARDING [initandlisten] Marking collection local.system.replset as collection version: <unsharded>
mongodb-hotbot_1    | 2020-11-29T07:54:08.074+0000 I  STORAGE  [initandlisten] Flow Control is enabled on this deployment.
mongodb-hotbot_1    | 2020-11-29T07:54:08.074+0000 I  SHARDING [initandlisten] Marking collection admin.system.roles as collection version: <unsharded>
mongodb-hotbot_1    | 2020-11-29T07:54:08.074+0000 I  SHARDING [initandlisten] Marking collection admin.system.version as collection version: <unsharded>
mongodb-hotbot_1    | 2020-11-29T07:54:08.083+0000 I  SHARDING [initandlisten] Marking collection local.startup_log as collection version: <unsharded>
mongodb-hotbot_1    | 2020-11-29T07:54:08.084+0000 I  FTDC     [initandlisten] Initializing full-time diagnostic data capture with directory '/data/db/diagnostic.data'
mongodb-hotbot_1    | 2020-11-29T07:54:08.085+0000 I  SHARDING [initandlisten] Marking collection local.replset.minvalid as collection version: <unsharded>
mongodb-hotbot_1    | 2020-11-29T07:54:08.086+0000 I  SHARDING [initandlisten] Marking collection local.replset.election as collection version: <unsharded>
mongodb-hotbot_1    | 2020-11-29T07:54:08.096+0000 I  REPL     [initandlisten] Rollback ID is 1
mongodb-hotbot_1    | 2020-11-29T07:54:08.096+0000 F  REPL     [initandlisten] This instance has been repaired and may contain modified replicated data that would not match other replica set members. To see your repaired data, start mongod without the --replSet option. When you are finished recovering your data and would like to perform a complete re-sync, please refer to the documentation here: https://docs.mongodb.com/manual/tutorial/resync-replica-set-member/
mongodb-hotbot_1    | 2020-11-29T07:54:08.096+0000 F  -        [initandlisten] Fatal Assertion 50923 at src/mongo/db/repl/replication_coordinator_impl.cpp 527
mongodb-hotbot_1    | 2020-11-29T07:54:08.096+0000 F  -        [initandlisten]
mongodb-hotbot_1    |
mongodb-hotbot_1    | ***aborting after fassert() failure
mongodb-hotbot_1    |
mongodb-hotbot_1    |

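This is the message at the end of the log, just before the process aborts:

This instance has been repaired and may contain modified replicated data that would not match other replica set members. To see your repaired data, start mongod without the --replSet option. When you are finished recovering your data and would like to perform a complete re-sync, please refer to the documentation here: https://docs.mongodb.com/manual/tutorial/resync-replica-set-member/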
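In a startup log this long, the lines that matter are the few with severity F. The following is a minimal sketch, not part of the original post, that filters them out. It assumes the plain-text log layout shown above (timestamp, one-letter severity, component, bracketed context, message) with an optional docker-compose service prefix; the function name `fatal_messages` and the regular expression are illustrative choices, not a MongoDB tool.

```python
import re

# Matches the plain-text log layout seen in the question (MongoDB 4.2),
# optionally preceded by a docker-compose "<service> | " prefix.
LOG_LINE = re.compile(
    r"^(?:\S+\s*\|\s*)?"                 # optional docker-compose prefix
    r"(?P<ts>\d{4}-\d{2}-\d{2}T\S+)\s+"  # ISO-like timestamp
    r"(?P<severity>[FEWID])\s+"          # F = fatal, E, W, I, D
    r"(?P<component>\S+)\s+"             # e.g. REPL, STORAGE, or "-"
    r"\[(?P<context>[^\]]+)\]\s*"        # e.g. [initandlisten]
    r"(?P<message>.*)$"
)


def fatal_messages(lines):
    """Return (component, message) for every fatal log line with a message."""
    found = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match and match.group("severity") == "F" and match.group("message"):
            found.append((match.group("component"), match.group("message")))
    return found


if __name__ == "__main__":
    sample = [
        "mongodb-hotbot_1    | 2020-11-29T07:54:08.096+0000 I  REPL     [initandlisten] Rollback ID is 1",
        "mongodb-hotbot_1    | 2020-11-29T07:54:08.096+0000 F  -        [initandlisten] Fatal Assertion 50923 at src/mongo/db/repl/replication_coordinator_impl.cpp 527",
        "mongodb-hotbot_1    | 2020-11-29T07:54:08.096+0000 F  -        [initandlisten]",
    ]
    for component, message in fatal_messages(sample):
        print(component, "|", message)
```

Lines that do not follow this layout, such as the bare `***aborting after fassert() failure` line, are simply skipped, so the script only narrows the log down; it does not interpret the error.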

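So I tried to repair my data by following another manual, Recover a Standalone after an Unexpected Shutdown.

In the end I got an empty mongod.lock file in my dbPath directory, which, as far as I understand, signals that the recovery completed successfully.

Then I started mongodb in standalone mode and everything worked fine, so I shut it down gracefully and double-checked that the mongod.lock file was still empty afterwards.

Finally, with mongod.lock empty, I tried the command mongod --replSet rs0 again, but I got the same error, and the mongod.lock file was updated so that its first line is 1...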
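The lock-file check described above can be scripted. This is a small sketch under stated assumptions: the helper name `lock_state` is hypothetical, and it only reports what is on disk. A non-empty mongod.lock is also expected while mongod is running, so "non-empty" points to an unclean shutdown only when no mongod process is using that directory. The 1 that appeared in the file is consistent with mongod writing its process ID there (the log above shows pid=1 inside the container), though the post does not confirm this.

```python
import os


def lock_state(dbpath):
    """Classify <dbpath>/mongod.lock as 'missing', 'empty' or 'non-empty'."""
    lock_file = os.path.join(dbpath, "mongod.lock")
    if not os.path.exists(lock_file):
        return "missing"
    # The log in the question treats a non-empty lock file found at startup
    # as the sign of an unclean shutdown.
    return "empty" if os.path.getsize(lock_file) == 0 else "non-empty"


if __name__ == "__main__":
    # /data/db is the dbpath reported in the question's log.
    print(lock_state("/data/db"))
```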

Any ideas on how to fix this and start mongo in replica set mode with my data?

【Comments】:

【Answer 1】:

Here is what I did to solve the problem:

1. Start a repair mongod process on my data directory: mongod --dbpath /data/db --repair
2. Start a standalone mongod on my data directory: mongod --dbpath /data/db
3. Take a dump: mongodump --host=localhost --port=27017 --out=/tmp/dumps/1
4. Connect to the db locally via the mongo shell and shut it down gracefully: mongo "mongodb://localhost:27017/admin", then db.shutdownServer()
5. Run a new mongod in RS mode with a new, clean data directory: mongod --dbpath /data/db_recovered --replSet rs0
6. Connect locally via the mongo shell to the db running in RS mode and run the commands: mongo "mongodb://localhost:27017/admin", then rs.initiate() and db.isMaster()
7. Restore the data from the dump: mongorestore --host=localhost --port=27017 /tmp/dumps/1

Finally, I had mongod running in RS mode with my data.
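The steps above can be written down as an ordered command plan. This is only a sketch: it prints the commands rather than running them, because the two plain mongod invocations stay in the foreground and have to be kept running in a separate terminal or container. The paths, host, port, and replica set name are the ones from the answer. The function name `recovery_plan` and the use of the mongo shell's `--eval` option (in place of typing the commands interactively, as the answer does) are assumptions added here.

```python
import shlex


def recovery_plan(old_dbpath="/data/db", new_dbpath="/data/db_recovered",
                  dump_dir="/tmp/dumps/1", rs_name="rs0",
                  host="localhost", port=27017):
    """Return the answer's steps as ordered (description, argv) pairs."""
    uri = f"mongodb://{host}:{port}/admin"
    return [
        ("repair the old data directory",
         ["mongod", "--dbpath", old_dbpath, "--repair"]),
        ("start a standalone mongod on the old data (leave it running)",
         ["mongod", "--dbpath", old_dbpath]),
        ("dump all data from the standalone",
         ["mongodump", f"--host={host}", f"--port={port}", f"--out={dump_dir}"]),
        # Assumption: --eval replaces the interactive shell session.
        ("shut the standalone down gracefully",
         ["mongo", uri, "--eval", "db.shutdownServer()"]),
        ("start a new mongod in RS mode on a clean directory (leave it running)",
         ["mongod", "--dbpath", new_dbpath, "--replSet", rs_name]),
        ("initiate the replica set",
         ["mongo", uri, "--eval", "rs.initiate()"]),
        ("restore the dump into the new replica set member",
         ["mongorestore", f"--host={host}", f"--port={port}", dump_dir]),
    ]


if __name__ == "__main__":
    for number, (description, argv) in enumerate(recovery_plan(), start=1):
        print(f"{number}. {description}:")
        print("   " + " ".join(shlex.quote(arg) for arg in argv))
```

The answer also runs db.isMaster() after rs.initiate(), presumably to confirm that the node has become primary before restoring; the plan leaves that check to the reader.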

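【Comments】:

【Answer 2】:

It looks to me like you performed an unclean shutdown of mongod, possibly after it had already been running as an RS node, and you got a rollback. At that point the database refuses to start as an RS node because any rolled-back data would be lost.

Since you don't have any nodes to resync from, I suggest restoring the standalone data directory you started with from a backup (you did take one, right?) and doing the conversion again, this time taking care to shut mongod down gracefully until you have an operational replica set with multiple nodes.

Alternatively, you can try starting a standalone mongod from that data directory, taking a full data dump with mongodump, creating a new standalone deployment, loading the data into it with mongorestore, and then repeating the conversion to an RS node.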

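【Comments】:

Thanks a lot! I chose the alternative (second) way you suggested. I posted the list of steps I took, based on it, in my answer above. I hope it helps someone with a similar problem.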
