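Let's take a look at the comment word cloud for 《明天你好》. We use Python to crawl the song's comments from NetEase Cloud Music (the script below fetches 21 pages of 20 comments each) and then build a word cloud for the song from those comments.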
Comment word cloud for 《明天你好》
First, here is what each wordcloud parameter means:
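- background_color='white', # background color of the image
- max_words=1000, # maximum number of words to draw
- width=800, # width of the word cloud image; default 400 pixels
- height=600, # height of the word cloud image; default 200 pixels
- font_path='.\simhei.TTF', # full path to the font file used for the words
- max_font_size=80, # largest font size used in the word cloud
- min_font_size=20, # smallest font size used in the word cloud
- mask=mask, # image array that defines the shape of the cloud; default None, i.e. a rectangle. When it is set, width and height are ignored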
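As a minimal sketch of how the parameters above fit together, the snippet below collects them in a dictionary and hands them to `wordcloud.WordCloud`. The helper names `build_wordcloud_kwargs` and `render_wordcloud` are hypothetical and not part of the original script, and the font path is whatever Chinese-capable font exists on your machine. The `wordcloud` import is deferred so the parameter logic can be checked without the library installed.

```python
def build_wordcloud_kwargs(font_path, mask=None):
    """Collect the WordCloud parameters listed above into one dict.

    font_path should point to a font that contains Chinese glyphs.
    """
    kwargs = {
        "background_color": "white",
        "max_words": 1000,
        "font_path": font_path,
        "max_font_size": 80,
        "min_font_size": 20,
    }
    if mask is not None:
        # With a mask, the image shape decides the canvas size,
        # so width and height are left out.
        kwargs["mask"] = mask
    else:
        kwargs["width"] = 800
        kwargs["height"] = 600
    return kwargs


def render_wordcloud(text, out_path, **kwargs):
    """Render space-separated text to an image file (needs wordcloud)."""
    from wordcloud import WordCloud  # third-party: pip install wordcloud

    cloud = WordCloud(**kwargs).generate(text)
    cloud.to_file(out_path)
    return cloud
```

A call might look like `render_wordcloud(text, "demo.png", **build_wordcloud_kwargs(r".\simhei.TTF"))`, where `text` is a string of space-separated words.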
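The comments come from the following API endpoint, which returns limit comments per request starting at position offset:
http://music.163.com/api/v1/resource/comments/R_SO_4_33756016?limit=20&offset=0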
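Paging through this endpoint only requires changing the offset. The sketch below builds the URL for each page using the standard library. It assumes that the trailing number in the path (33756016) is the song ID; the names `comment_page_url` and `page_offsets` are hypothetical helpers introduced for illustration.

```python
from urllib.parse import urlencode

# Assumption: the number after R_SO_4_ identifies the song.
BASE_URL = "http://music.163.com/api/v1/resource/comments/R_SO_4_{song_id}"


def comment_page_url(song_id, offset, limit=20):
    """Build the comment API URL for one page of results."""
    query = urlencode({"limit": limit, "offset": offset})
    return f"{BASE_URL.format(song_id=song_id)}?{query}"


def page_offsets(total, limit=20):
    """Offsets needed to cover `total` comments, `limit` per page."""
    return list(range(0, total, limit))


if __name__ == "__main__":
    for offset in page_offsets(60):
        print(comment_page_url(33756016, offset))
```

Keeping the URL construction separate from the request makes it easy to change the page size or the song without touching the download code.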
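Modules used:
requests, jieba, and wordcloud, plus numpy and PIL (Pillow) for loading the mask image, and a few modules from the Python standard library.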
import requests
import wordcloud
import json
import jieba
import numpy as np
from PIL import Image

# Request headers: a browser User-Agent and a Referer pointing at the site.
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.79 Safari/537.36',
    'Referer': 'https://music.163.com/'
}


def get_one_com(a):
    """Fetch one page of comments starting at offset a."""
    res = requests.get('http://music.163.com/api/v1/resource/comments/R_SO_4_33756016?limit=20&offset=%d' % a, headers=headers)
    data = res.json()
    comments = []
    # Loop over the list itself so the first comment is included
    # and a page with fewer than 20 comments does not raise IndexError.
    for item in data['comments']:
        comments.append(item['content'])
    all_comments(comments, a)


def next_():
    """Walk offsets 0, 20, ..., 400 (21 pages of 20 comments)."""
    for a in range(0, 420, 20):
        get_one_com(a)


def all_comments(comments, a):
    """Add one page of comments to com_list; draw the cloud after the last page."""
    content = ''.join(str(c) for c in comments)
    com_list.append(content)
    if a == 400:
        wordcloud_comments(com_list)


def wordcloud_comments(comments):
    """Segment the collected comments with jieba and render the word cloud."""
    content = ''.join(str(c) for c in comments)
    # Join the jieba tokens with spaces so WordCloud can tell the words apart.
    con_cut = jieba.lcut(content)
    con_str = ' '.join(con_cut)
    # The mask image (pix.png) defines the shape of the cloud.
    mask = np.array(Image.open('pix.png'))
    w = wordcloud.WordCloud(background_color='white', font_path='msyhbd.ttc', mask=mask, max_words=300)
    w.generate(con_str)
    w.to_file("pywordcloud.png")


if __name__ == '__main__':
    com_list = []  # module-level list that all_comments appends to
    next_()
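The parsing step can also be isolated into a small pure function, which makes it possible to test without network access. The following is a sketch under assumptions: `extract_contents` is a hypothetical helper, and the sample payload is made up to mirror only the two fields the script reads (`comments` and `content`), not the full API response.

```python
import json


def extract_contents(payload):
    """Return the text of every non-empty comment in one API response.

    Works on whatever the page actually contains, so a short or
    missing comment list yields fewer (or zero) results instead of
    raising an error.
    """
    return [
        item["content"]
        for item in payload.get("comments", [])
        if item.get("content")
    ]


# Hypothetical response body, shaped like the fields the script reads.
sample = json.loads(
    '{"comments": [{"content": "好听"}, {"content": ""}, {"content": "明天你好"}]}'
)

if __name__ == "__main__":
    print(extract_contents(sample))
```

The list returned for each page can then be collected across pages and passed to the jieba and WordCloud step in one go.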