Using Wget

Posted by 规格严格-功夫到家-哈工大威海人

http://www.tuicool.com/articles/A7BRny

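wget and curl are two convenient command-line tools for testing HTTP behavior. In most cases, testing HTTP comes down to inspecting the request and response headers, and both tools can show them once given the right command-line options. The options are all documented in the man pages, but they are noted here for quick reference.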

wget's debug option:
-d/--debug
Turn on debug output, meaning various information important to the developers of
Wget if it does not work properly. Your system administrator may have chosen to
compile Wget without debug support, in which case -d will not work. Please note
that compiling with debug support is always safe—Wget compiled with the debug
support will not print any debug info unless requested with -d.

Example (note the request line: in this session Wget 1.12 sends the request as HTTP/1.0):

# wget 127.0.0.1 --debug
DEBUG output created by Wget 1.12 on linux-gnu.

--2012-05-26 12:32:08--  http://127.0.0.1/
Connecting to 127.0.0.1:80... connected.
Created socket 3.
Releasing 0x09cdfb18 (new refcount 0).
Deleting unused 0x09cdfb18.

---request begin---
GET / HTTP/1.0
User-Agent: Wget/1.12 (linux-gnu)
Accept: */*
Host: 127.0.0.1
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response...
---response begin---
HTTP/1.1 200 OK
Server: nginx/1.2.0
Date: Sat, 26 May 2012 04:32:08 GMT
Content-Type: text/html
Content-Length: 186
Last-Modified: Fri, 25 May 2012 02:41:59 GMT
Connection: keep-alive
Accept-Ranges: bytes

---response end---
200 OK
Registered socket 3 for persistent reuse.
Length: 186 [text/html]
Saving to: “index.html.42”

100%[================================================================>] 186         --.-K/s   in 0s      

2012-05-26 12:32:08 (4.72 MB/s) - “index.html.42” saved [186/186]

#
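
When only the request that wget sent is of interest, the debug output can be filtered down to the block between the two request markers seen above. The following is a minimal sketch, not part of wget itself: extract_request is a hypothetical helper name, and it assumes the markers appear exactly as "---request begin---" and "---request end---", as in the Wget 1.12 session shown here.

```shell
# extract_request: read wget --debug output on stdin and print only the
# lines between the request markers. The helper name is hypothetical; the
# marker strings are taken from the Wget 1.12 transcript in this post.
extract_request() {
  awk '
    /^---request end---/   { inside = 0 }   # stop before the end marker
    inside                 { print }        # print lines inside the block
    /^---request begin---/ { inside = 1 }   # start after the begin marker
  '
}

# Intended use (wget writes its debug output to stderr, hence 2>&1):
#   wget --debug -O /dev/null http://127.0.0.1/ 2>&1 | extract_request
```

The -O /dev/null in the usage line discards the downloaded page, so repeated test runs do not leave index.html.N files behind.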

Without --debug, you can use the -S or --save-headers options instead, although these show only the response headers:
-S
--server-response
Print the headers sent by HTTP servers and responses sent by FTP servers.

--save-headers
Save the headers sent by the HTTP server to the file, preceding the actual contents,
with an empty line as the separator.

Example:

# wget -S 127.0.0.1
--2012-05-26 12:38:32--  http://127.0.0.1/
Connecting to 127.0.0.1:80... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  Server: nginx/1.2.0
  Date: Sat, 26 May 2012 04:38:32 GMT
  Content-Type: text/html
  Content-Length: 186
  Last-Modified: Fri, 25 May 2012 02:41:59 GMT
  Connection: keep-alive
  Accept-Ranges: bytes
Length: 186 [text/html]
Saving to: “index.html.44”

100%[================================================================>] 186         --.-K/s   in 0s      

2012-05-26 12:38:32 (4.52 MB/s) - “index.html.44” saved [186/186]

#
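
A file written with --save-headers holds the headers, an empty line, and then the document, so it can be split again at the first empty line. The sketch below shows one way to do that; saved_headers and saved_body are hypothetical helper names, and the handling of a trailing carriage return is a defensive assumption in case the saved header lines end in CRLF.

```shell
# saved_headers FILE: print the header block, i.e. everything before the
# first empty line. A trailing CR is stripped in case lines end in CRLF.
saved_headers() {
  awk '{ sub(/\r$/, "") } /^$/ { exit } { print }' "$1"
}

# saved_body FILE: print everything after the first empty line, unchanged.
saved_body() {
  awk '
    found { print; next }                     # already past the separator
    { line = $0; sub(/\r$/, "", line) }       # compare without a trailing CR
    line == "" { found = 1 }                  # separator reached
  ' "$1"
}

# Intended use (page.txt is an arbitrary file name chosen for this example):
#   wget --save-headers -O page.txt http://127.0.0.1/
#   saved_headers page.txt
#   saved_body page.txt > index.html
```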

Use curl's -v option to view both the request and the response headers:
-v/--verbose
Makes the fetching more verbose/talkative. Mostly useful for debugging. A line
starting with '>' means "header data" sent by curl, '<' means "header data"
received by curl that is hidden in normal cases, and a line starting with '*'
means additional info provided by curl.

Note that if you only want HTTP headers in the output, -i/--include might be the
option you’re looking for.

If you think this option still doesn’t give you enough details, consider using
--trace or --trace-ascii instead.

This option overrides previous uses of --trace-ascii or --trace.

Use -s/--silent to make curl quiet.

Example (note the request line: curl sends the request as HTTP/1.1 by default):

# curl -v 127.0.0.1
* About to connect() to 127.0.0.1 port 80 (#0)
*   Trying 127.0.0.1... connected
* Connected to 127.0.0.1 (127.0.0.1) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.19.7 (i686-pc-linux-gnu) libcurl/7.19.7 NSS/3.12.7.0 zlib/1.2.3 libidn/1.18 libssh2/1.2.2
> Host: 127.0.0.1
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx/1.2.0
< Date: Sat, 26 May 2012 04:45:12 GMT
< Content-Type: text/html
< Content-Length: 186
< Last-Modified: Fri, 25 May 2012 02:41:59 GMT
< Connection: keep-alive
< Accept-Ranges: bytes
<
<html>
<head>
<title>Welcome to nginx!</title>
</head>
<body bgcolor="white" text="black">
<center><h1>Welcome to nginx!</h1></center>
<center><h1>root:web</h1></center>
</body>
</html>
* Connection #0 to host 127.0.0.1 left intact
* Closing connection #0
#
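
Since curl marks sent header lines with '>' and received ones with '<', the verbose output can be reduced to just the two header blocks. This is a rough sketch under that assumption; header_lines is a hypothetical helper name. The pattern requires a space or end of line after the marker, so body lines such as <html> are not matched.

```shell
# header_lines: read `curl -v` output on stdin and keep only the lines that
# curl marks as sent ('>') or received ('<') header data.
header_lines() {
  grep -E '^[<>]( |$)'
}

# Intended use (curl writes its verbose output to stderr, hence 2>&1;
# -s hides the progress meter and -o /dev/null discards the body):
#   curl -sv -o /dev/null http://127.0.0.1/ 2>&1 | header_lines
```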

Use curl's -I option to view only the response headers:
-I/--head
(HTTP/FTP/FILE) Fetch the HTTP-header only! HTTP-servers feature the command HEAD
which this uses to get nothing but the header of a document. When used on a FTP
or FILE file, curl displays the file size and last modification time only.

Example:

# curl -I 127.0.0.1
HTTP/1.1 200 OK
Server: nginx/1.2.0
Date: Sat, 26 May 2012 04:43:12 GMT
Content-Type: text/html
Content-Length: 186
Last-Modified: Fri, 25 May 2012 02:41:59 GMT
Connection: keep-alive
Accept-Ranges: bytes

#
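
In a test script it is often a single header value that matters rather than the whole block. The sketch below pulls one value out of a header block such as the one printed by curl -I; header_value is a hypothetical helper name, and the header name is matched case-insensitively because HTTP header names are not case-sensitive.

```shell
# header_value NAME: read a header block on stdin and print the value of
# every header whose name equals NAME, ignoring case.
header_value() {
  awk -v name="$1" '
    {
      sub(/\r$/, "")                  # drop a trailing CR if present
      i = index($0, ":")              # position of the first colon
      if (i == 0) next                # status line or empty line
      if (tolower(substr($0, 1, i - 1)) == tolower(name)) {
        value = substr($0, i + 1)
        sub(/^[ \t]+/, "", value)     # trim leading whitespace
        print value
      }
    }'
}

# Intended use:
#   curl -sI http://127.0.0.1/ | header_value Content-Length
```

Only the first colon on a line is treated as the separator, so values that contain colons themselves, such as the time in a Date header, come through intact.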
