#Using wget to download an entire website for offline use


##The best way to download a website for offline use, using wget

There are two ways - the first is a single command that runs plainly in front of you; the second runs in the background in a separate instance, so you can log out of your ssh session and the download will keep going.

First make a folder to download the website into, then begin your downloading. (Note: if you download `www.SOME_WEBSITE.com`, you will end up with a folder like this: `~/websitedl/www.SOME_WEBSITE.com/`)

<br>

###STEP 1:

````bash
mkdir ~/websitedl/
cd ~/websitedl/
````

Now, for Step 2, choose whether you want to keep it simple (1st way) or get fancy (2nd way).

<br>

###STEP 2:

####1st way:

````bash
wget --limit-rate=200k --no-clobber --convert-links --random-wait -r -p -E -e robots=off -U mozilla http://www.SOME_WEBSITE.com
````

####2nd way:

#####TO RUN IN THE BACKGROUND:

````bash
nohup wget --limit-rate=200k --no-clobber --convert-links --random-wait -r -p -E -e robots=off -U mozilla http://www.SOME_WEBSITE.com &
````

#####THEN TO VIEW OUTPUT (there will be a nohup.out file in whichever directory you ran the command from):

````bash
tail -f nohup.out
````
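
This isn't part of the original write-up, but if you reconnect to the server later and want to confirm the background download is still going, a quick process check works:

````bash
# Check whether the background wget is still running
# (the [w] trick keeps the grep command itself out of the results)
ps aux | grep "[w]get"
````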

<br>

####WHAT DO ALL THE SWITCHES MEAN:

`--limit-rate=200k` limits the download speed to 200 KB/sec (kilobytes)

`--no-clobber` doesn't overwrite any existing files (useful in case the download is interrupted and resumed)

`--convert-links` converts links so that they work locally, offline, instead of pointing to the website online

`--random-wait` waits a random amount of time between downloads - websites don't like being mirrored, and this makes it less obvious

`-r` recursive - downloads the full website

`-p` downloads everything, even pictures (same as `--page-requisites`; grabs the images, CSS and so on)

`-E` saves files with the right extension (e.g. `.html`) - without it, most HTML and other files would have no extension

`-e robots=off` ignores the site's robots.txt, i.e. act like we are not a robot/crawler - websites don't like robots/crawlers unless they are Google or another famous search engine

`-U mozilla` sets the User-Agent to `mozilla`, so it looks like a Mozilla browser is viewing the page instead of a crawler like wget
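
For readability, here is the same command written with long-form options - a sketch of an equivalent invocation, assuming wget 1.12 or newer (where `-E` is spelled `--adjust-extension`):

````bash
wget --limit-rate=200k \
     --no-clobber \
     --convert-links \
     --random-wait \
     --recursive \
     --page-requisites \
     --adjust-extension \
     --execute robots=off \
     --user-agent=mozilla \
     http://www.SOME_WEBSITE.com
````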

####PURPOSELY **DIDN'T** INCLUDE THE FOLLOWING:

`-o /websitedl/wget1.txt` logs everything to that file - didn't do this because it gave me no output on the screen, and I don't like that.

`-b` runs it in the background so I can't see progress... I like `nohup <command> &` better.

`--domains=steviehoward.com` didn't include this because the site is hosted by Google, so wget might need to step into Google's domains to fetch everything.

`--restrict-file-names=windows` modifies filenames so that they will also work on Windows. Seems to work okay without this.
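
If you do want a log file without losing the on-screen output, one alternative (my own suggestion, not from the original guide) is to pipe wget's output through `tee` instead of using `-o`:

````bash
# wget writes its progress to stderr, so redirect stderr into the pipe
# to get both a log file and live output on the screen
wget --limit-rate=200k --no-clobber --convert-links --random-wait \
     -r -p -E -e robots=off -U mozilla \
     http://www.SOME_WEBSITE.com 2>&1 | tee ~/websitedl/wget1.txt
````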

<br>
<br>

######tested with zsh 5.0.5 (x86_64-apple-darwin14.0) on Apple MacBook Pro (Late 2011) running OS X 10.10.3

[credit](http://www.kossboss.com/linux---wget-full-website)
