Experience on Namenode backup and restore --- checkpoint
Posted by yangykaifa
Hadoop version: Hadoop 2.2.0.2.0.6.0-0009
A NameNode can be backed up by running a Secondary NameNode, a Checkpoint node, or a Backup node.
Example (assuming you already have a Secondary NameNode):
1. Check the Secondary NameNode checkpoint settings. These properties are all defined in %HADOOP_CONF_DIR%/hdfs-site.xml:
- dfs.namenode.secondary.http-address
- dfs.namenode.checkpoint.dir
- dfs.namenode.checkpoint.edits.dir
- dfs.namenode.checkpoint.period
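To check these four settings without opening the file by hand, a small script can read them out of hdfs-site.xml. The following is only a minimal sketch: the function name read_checkpoint_settings is made up for illustration, and it assumes the usual Hadoop layout of <property> elements that contain <name> and <value> children.

```python
import xml.etree.ElementTree as ET

# The four properties listed in step 1.
CHECKPOINT_KEYS = (
    "dfs.namenode.secondary.http-address",
    "dfs.namenode.checkpoint.dir",
    "dfs.namenode.checkpoint.edits.dir",
    "dfs.namenode.checkpoint.period",
)


def read_checkpoint_settings(hdfs_site_path):
    """Return {property name: value} for the checkpoint keys.

    A key that does not appear in the file maps to None.
    (Hypothetical helper, not part of Hadoop.)
    """
    root = ET.parse(hdfs_site_path).getroot()
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name in CHECKPOINT_KEYS:
            found[name] = (prop.findtext("value") or "").strip()
    return {key: found.get(key) for key in CHECKPOINT_KEYS}
```

Calling read_checkpoint_settings with the path to %HADOOP_CONF_DIR%/hdfs-site.xml returns a dictionary of the four values. A value of None only means the property is not set in this file, in which case Hadoop uses its built-in default.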
2. Back up an up-to-date checkpoint by hand:
On the Secondary NameNode, stop the Hadoop Secondary NameNode service.
Run cmd.exe as user hadoop (or as another user with full permissions):
- Runas /user:hadoop cmd.exe
Force a checkpoint now:
- cmd>%hadoop_home%/bin/hadoop secondarynamenode -checkpoint force
3. Stop the NameNode services or reboot the NameNode (if the Hadoop services are set to manual startup, they will all stay stopped after the reboot).
For this test, I first backed up my dfs.namenode.name.dir (i.e. C:\hdpdata\hdfs\nn) so that I could use it in my next test (restoring from a backup of the NameNode dir).
Delete all files in C:\hdpdata\hdfs\nn.
On the Secondary NameNode, open dfs.namenode.checkpoint.dir (see %HADOOP_CONF_DIR%/hdfs-site.xml), i.e. c:\hdpdata\hdfs\snn.
Make sure the NameNode's checkpoint dir (dfs.namenode.checkpoint.dir, the same path as on the Secondary NameNode) is empty before you copy anything into it!
Copy all of the Secondary NameNode's checkpoint files (except the lock file) from that folder to the NameNode's checkpoint dir.
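The copy in step 3 can also be scripted. The sketch below is only an illustration, not a tested recovery tool: copy_checkpoint is a hypothetical helper, and it assumes the lock file is named in_use.lock (the name Hadoop normally uses in its storage directories). It refuses to run if the destination already contains files, which matches the "empty dir" requirement above.

```python
import os
import shutil

# Assumption: Hadoop's storage-directory lock file is called in_use.lock.
LOCK_FILE_NAME = "in_use.lock"


def copy_checkpoint(src_dir, dst_dir):
    """Copy the checkpoint tree from src_dir into dst_dir, skipping the lock file.

    dst_dir must be empty or must not exist yet.
    Returns the sorted list of copied files, relative to src_dir.
    (Hypothetical helper, not part of Hadoop.)
    """
    if os.path.isdir(dst_dir) and os.listdir(dst_dir):
        raise RuntimeError("destination is not empty: %s" % dst_dir)
    os.makedirs(dst_dir, exist_ok=True)

    copied = []
    for root, _dirs, files in os.walk(src_dir):
        rel = os.path.relpath(root, src_dir)
        target_root = dst_dir if rel == "." else os.path.join(dst_dir, rel)
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            if name == LOCK_FILE_NAME:
                continue  # never carry the lock file over
            shutil.copy2(os.path.join(root, name), os.path.join(target_root, name))
            copied.append(os.path.normpath(os.path.join(rel, name)))
    return sorted(copied)
```

In this walkthrough the source would be the Secondary NameNode's c:\hdpdata\hdfs\snn folder and the destination the same path on the NameNode. The two-machine transfer itself (network share, removable disk, and so on) is outside what this sketch covers.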
4. Restore from the checkpoint dir
Run cmd.exe as user hadoop (or as another user with full permissions):
- Runas /user:hadoop cmd.exe
Use this command to start the NameNode and import the checkpoint from the checkpoint dir:
- cmd>%hadoop_home%/bin/hdfs namenode -importCheckpoint
When the import has completed, press Ctrl+C to stop the NameNode, then delete the NameNode's checkpoint dir (dfs.namenode.checkpoint.dir, the same path as on the Secondary NameNode).
Start the services with this command:
- cmd>start_local_hdp_services.cmd
Leave safe mode:
- cmd>%hadoop_home%/bin/hdfs dfsadmin -safemode leave
Balance your HDFS:
- cmd>%hadoop_home%/bin/hdfs balancer -threshold 5
5. Confirm that your Hadoop service has been restored successfully.
Open http://namenode:50070/ and check whether any blocks are missing. If there are, find out where they are and which files they belong to.
A restore from the Secondary NameNode is not a real-time restore, so changes made after the last checkpoint (for example, what your most recent JobTracker jobs wrote) may be lost. That does not matter; just delete the affected files.
Tip: If you want to restore from a real-time backup, configure multiple NameNode directories (more than one dfs.namenode.name.dir location). See the next post.