To fetch only the response headers of an HTTP request, use curl:
xyz@bogon ~ $ curl -I www.baidu.com
HTTP/1.1 200 OK
Server: bfe/1.0.8.14
Date: Tue, 19 Jul 2016 03:55:19 GMT
Content-Type: text/html
Content-Length: 277
Last-Modified: Mon, 13 Jun 2016 02:50:02 GMT
Connection: Keep-Alive
ETag: "575e1f5a-115"
Cache-Control: private, no-cache, no-store, proxy-revalidate, no-transform
Pragma: no-cache
Accept-Ranges: bytes
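Under the hood this is just a request made with the HEAD method. The verbose output shows the request format (with -v, each response header appears twice: once prefixed with "<" in curl's trace, and once as the normal -I output):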
xyz@bogon ~ $ curl -v -I www.baidu.com
* Rebuilt URL to: www.baidu.com/
* Trying 61.135.169.121...
* Connected to www.baidu.com (61.135.169.121) port 80 (#0)
> HEAD / HTTP/1.1
> Host: www.baidu.com
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
< Server: bfe/1.0.8.14
Server: bfe/1.0.8.14
< Date: Tue, 19 Jul 2016 03:57:29 GMT
Date: Tue, 19 Jul 2016 03:57:29 GMT
< Content-Type: text/html
Content-Type: text/html
< Content-Length: 277
Content-Length: 277
< Last-Modified: Mon, 13 Jun 2016 02:50:04 GMT
Last-Modified: Mon, 13 Jun 2016 02:50:04 GMT
< Connection: Keep-Alive
Connection: Keep-Alive
< ETag: "575e1f5c-115"
ETag: "575e1f5c-115"
< Cache-Control: private, no-cache, no-store, proxy-revalidate, no-transform
Cache-Control: private, no-cache, no-store, proxy-revalidate, no-transform
< Pragma: no-cache
Pragma: no-cache
< Accept-Ranges: bytes
Accept-Ranges: bytes
<
* Connection #0 to host www.baidu.com left intact
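The request lines marked with ">" above can be reproduced by hand. The following is a minimal sketch, not what curl does internally: it writes the same kind of HEAD request onto a plain socket and returns the raw response. To avoid depending on the network, it talks to a throwaway local server. DemoHandler, raw_head and demo are hypothetical names made up for this illustration, and the header values are placeholders.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical local handler: answers HEAD with headers only."""

    def do_HEAD(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", "277")  # placeholder value
        self.end_headers()  # nothing is written after the headers

    def log_message(self, *args):
        pass  # silence the default request logging


def raw_head(host, port, path="/"):
    """Send a hand-written HEAD request and return the raw response text."""
    request = (
        f"HEAD {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"  # blank line terminates the request headers
    )
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("iso-8859-1")


def demo():
    """Start the local server on a free port, send one HEAD, shut down."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return raw_head("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

The response ends right after the blank line that closes the headers, which is the same shape as the "<" lines in the curl trace.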
Using Python's requests library:
In [1]: import requests
In [2]: url = 'http://www.baidu.com/'
In [3]: r = requests.head(url)
In [4]: r.headers
Out[4]: {'Date': 'Tue, 19 Jul 2016 03:58:22 GMT', 'Cache-Control': 'private, no-cache, no-store, proxy-revalidate, no-transform', 'Server': 'bfe/1.0.8.14', 'Last-Modified': 'Mon, 13 Jun 2016 02:50:02 GMT', 'Content-Encoding': 'gzip', 'Pragma': 'no-cache', 'Connection': 'Keep-Alive', 'Content-Type': 'text/html'}
In [5]: r.text
Out[5]: ''
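The empty r.text above is expected: a HEAD response carries headers but no body. As a rough illustration using only the standard library (http.client instead of requests, so it is a different client from the one in the session above), the sketch below sends HEAD and GET to a throwaway local server and compares the results. DemoHandler, BODY, fetch and demo are hypothetical names introduced for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"<html><body>hello</body></html>"  # placeholder page content


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: HEAD and GET share headers, only GET has a body."""

    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_HEAD(self):
        self._send_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(BODY)

    def log_message(self, *args):
        pass  # silence the default request logging


def fetch(method, host, port, path="/"):
    """Return (status, headers dict, body bytes) for one request."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request(method, path)
        response = conn.getresponse()
        return response.status, dict(response.getheaders()), response.read()
    finally:
        conn.close()


def demo():
    """Run one HEAD and one GET against a local server on a free port."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return fetch("HEAD", "127.0.0.1", port), fetch("GET", "127.0.0.1", port)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    head_result, get_result = demo()
    print("HEAD:", head_result)
    print("GET: ", get_result)
```

In this sketch both requests report the same Content-Length, but only the GET returns any bytes; the HEAD body is empty even though the header advertises a length.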