##### Working with HTTP via urllib
- Basic usage
```
import urllib.request
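# urlopen sends a GET request and returns a response object
response = urllib.request.urlopen('http://www.baidu.com')
# read() returns the raw body as bytes, not str
html = response.read()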
```
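In practice it helps to close the response and decode the bytes explicitly. A minimal sketch, assuming a hypothetical helper name `fetch_text`, a 10-second default timeout, and a UTF-8 fallback (none of these come from the notes above):

```python
import urllib.request


def fetch_text(url, timeout=10, fallback_charset='utf-8'):
    """Fetch a URL and return the body decoded to str.

    fetch_text, timeout, and fallback_charset are illustrative choices,
    not part of urllib itself.
    """
    # The with-block closes the response even if read() raises
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw = response.read()  # bytes
        # Use the charset declared in Content-Type, if one was sent
        charset = response.headers.get_content_charset() or fallback_charset
    return raw.decode(charset, errors='replace')
```

For example, `fetch_text('http://www.baidu.com')` would return the page as a string instead of bytes.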
- Using a Request object
```
import urllib.request
# Wrap the URL in a Request object, which can later carry data and headers
req = urllib.request.Request('http://www.baidu.com')
response = urllib.request.urlopen(req)
res = response.read()
```
- Sending data
```
import urllib.parse
import urllib.request
url = 'http://127.0.0.1:8881/l/consumer/save'
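values = {'name': 'name', 'password': '123456'}
# urlencode builds the form string; Request requires the body as bytes
data = urllib.parse.urlencode(values).encode('utf-8')
# Passing data makes this a POST request
req = urllib.request.Request(url, data)
req.add_header('Referer', 'http://127.0.0.1')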
response = urllib.request.urlopen(req)
result = response.read()
```
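To see what would actually be sent, the request can be built and inspected without opening a connection. A small sketch reusing the URL and form values above; `build_form_request` is a hypothetical helper name:

```python
import urllib.parse
import urllib.request


def build_form_request(url, values):
    """Build (but do not send) a form-encoded request. Illustrative helper."""
    # urlencode produces 'key=value&key=value'; Request needs bytes
    body = urllib.parse.urlencode(values).encode('utf-8')
    return urllib.request.Request(url, data=body)


req = build_form_request('http://127.0.0.1:8881/l/consumer/save',
                         {'name': 'name', 'password': '123456'})
print(req.data)          # the encoded request body
print(req.get_method())  # POST, because data was supplied
```

When the request is sent, `urlopen` adds `Content-Type: application/x-www-form-urlencoded` if no Content-Type header was set.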
- Sending data and headers
```
import urllib.parse
import urllib.request
url = 'http://127.0.0.1:8881/l/consumer/save'
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
values = {'name': 'name', 'password': '123456'}
headers = {
    'User-Agent': user_agent,
    'Cookie': 'ISMAPP_USER_AUTH_ID=8C92A2C3E19B9D7B28EE6554C3A4F61CE9648CFCFB843AAA;',
}
data = urllib.parse.urlencode(values).encode('utf-8')
# The third positional argument of Request is the headers dict
req = urllib.request.Request(url, data, headers)
response = urllib.request.urlopen(req)
result = response.read()
```
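Headers passed to `Request` can be checked before sending. Note that urllib stores header names with only the first letter capitalized, so lookups use `User-agent` rather than `User-Agent`. A sketch with a placeholder cookie (the name `SESSION` and its value are assumptions for illustration):

```python
import urllib.request

headers = {
    'User-Agent': 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)',
    'Cookie': 'SESSION=placeholder;',  # placeholder, not a real cookie
}
req = urllib.request.Request('http://127.0.0.1:8881/l/consumer/save',
                             data=b'name=name', headers=headers)
# add_header can still be used after construction
req.add_header('Referer', 'http://127.0.0.1')

# Names are normalized with str.capitalize(): 'User-Agent' -> 'User-agent'
print(req.get_header('User-agent'))
print(req.has_header('Referer'))
print(sorted(req.header_items()))
```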
- Setting the request method
> `req.get_method = lambda: 'PUT'`
> Overrides the method so that the request is submitted as PUT.
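```
# ..... (url, data, and headers set up as in the previous example)
req = urllib.request.Request(url, data, headers)
# Override get_method so urlopen sends PUT instead of the default POST
req.get_method = lambda: 'PUT'
response = urllib.request.urlopen(req)
```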
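Since Python 3.3, `Request` also accepts a `method` argument, which avoids overriding `get_method` by hand. A sketch of that alternative, with an illustrative request body:

```python
import urllib.request

url = 'http://127.0.0.1:8881/l/consumer/save'
data = b'name=name&password=123456'  # illustrative body

# method= sets the HTTP verb directly (Python 3.3+)
put_req = urllib.request.Request(url, data=data, method='PUT')
print(put_req.get_method())

# The same argument works for verbs that carry no body
delete_req = urllib.request.Request(url, method='DELETE')
print(delete_req.get_method())
```

Nothing is sent until the request object is passed to `urllib.request.urlopen`.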
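- Extracting element content from the response
1. With BeautifulSoup: follow the tags in order, one level at a time, for example `table > tfoot >` and so on.
2. With a JSON object, if the response is in JSON format.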
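Both approaches can be sketched as follows. The helper names and the sample JSON are made up for illustration. The first function assumes the third-party `beautifulsoup4` package is installed, and imports it lazily so that the JSON helper works without it.

```python
import json


def tfoot_text(html):
    """Walk table > tfoot > td and return the cell text (needs beautifulsoup4)."""
    from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4
    soup = BeautifulSoup(html, 'html.parser')
    # Attribute access returns the first matching descendant tag at each step
    cell = soup.table.tfoot.td
    return cell.get_text(strip=True)


def json_field(body, key):
    """Decode a JSON response body (bytes) and return one top-level field."""
    obj = json.loads(body.decode('utf-8'))
    return obj[key]


sample_json = b'{"name": "name", "status": "ok"}'  # made-up response body
print(json_field(sample_json, 'status'))
```

Either helper would take the bytes returned by `response.read()`. `tfoot_text` raises `AttributeError` if the page has no `table` or `tfoot` element.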