Week 2: I started using MongoDB. This is my first time using a non-relational database.
Final result
Code
from bs4 import BeautifulSoup
from pymongo import MongoClient
import requests

# Connect to the local MongoDB server; 'pig' is the database, 'house' the collection
client = MongoClient('localhost', 27017)
pig = client['pig']
house = pig['house']


def save_info():
    # Scrape the listing pages and store each title/price pair
    urls = ['http://bj.xiaozhu.com/search-duanzufang-p{}-0/'.format(i) for i in range(0, 4)]
    for url in urls:
        wb_data = requests.get(url)
        if wb_data.status_code != 200:
            # status_code is an int, so convert it before concatenating
            print("http code:" + str(wb_data.status_code))
            return
        soup = BeautifulSoup(wb_data.text, 'lxml')
        titles = soup.select('span.result_title')
        prices = soup.select('span.result_price > i')
        for title, price in zip(titles, prices):
            data = {
                'title': title.get_text(),
                'price': int(price.get_text()),
            }
            house.insert_one(data)


def lookfor():
    # Print every stored listing priced at 500 or more
    for info in house.find():
        if info['price'] >= 500:
            print(info)


save_info()
lookfor()
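The `lookfor()` function fetches every document and then filters by price in Python. As a rough sketch of an alternative, the same condition can be passed to `find()` as a query filter using MongoDB's `$gte` operator, so the server only returns matching documents. The names `build_price_filter` and `lookfor_server_side` are made up for this example and are not part of the original script; `collection` is assumed to be a pymongo collection such as `house`.

```python
def build_price_filter(min_price):
    """Return a MongoDB filter matching documents whose price is >= min_price."""
    return {'price': {'$gte': min_price}}


def lookfor_server_side(collection, min_price=500):
    """Print matching documents, letting the server apply the price filter.

    `collection` is assumed to be a pymongo collection (for example `house`).
    """
    for info in collection.find(build_price_filter(min_price)):
        print(info)
```

With the script's collection this would be called as `lookfor_server_side(house)`. For a small collection like this one the difference is negligible, but it avoids transferring listings that will be discarded anyway.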
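Summary
- MongoDB takes up about 600 MB of disk space. After installing it, I used the `mongod` command to set it up. If the service is not shut down properly, though, it can stop working and has to be configured again.
- MongoDB is a non-relational database.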
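Since the service can end up not running after an unclean shutdown, it can help to check whether anything is listening on the MongoDB port before running the scraper. The following is a minimal sketch using only the standard library; `mongod_is_listening` is a hypothetical helper name, and the host and port are the `localhost` and `27017` values used in the script. It only tests that the TCP port accepts connections, not that the process behind it is really `mongod`.

```python
import socket


def mongod_is_listening(host='localhost', port=27017, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError if the connection is refused or times out
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A script could call `mongod_is_listening()` first and print a reminder to start `mongod` when it returns `False`, instead of failing later inside `insert_one`.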