API endpoints:
http://comment.bilibili.com/72036817.xml
https://api.bilibili.com/x/v1/dm/list.so?oid=9931722
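The number in each URL is the video ID (these endpoints take the cid of a video part, not the av number shown in the page URL).
Note that this does not return every danmaku comment, only about a thousand.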
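from bs4 import BeautifulSoup
import pandas as pd
import requests

# Download the danmaku XML for the ID embedded in the URL.
url = 'http://comment.bilibili.com/72036817.xml'
response = requests.get(url, timeout=10)
response.raise_for_status()  # stop early on an HTTP error
html_data = response.content.decode('utf-8')

# Each danmaku entry is stored in a <d> element.
soup = BeautifulSoup(html_data, 'lxml')
results = soup.find_all('d')
comments = [comment.text for comment in results]

# Save the comment text to a CSV file.
comments_dict = {'comments': comments}
df = pd.DataFrame(comments_dict)
df.to_csv('bilibili.csv', encoding='utf-8')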
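The script above only uses the first endpoint. As a rough sketch, the second endpoint listed at the top could be handled the same way, here with the standard library's XML parser instead of BeautifulSoup. It assumes that endpoint returns the same XML layout, with one <d> element per comment. The names LIST_URL, parse_danmaku, and fetch_danmaku are made up for illustration and are not part of any Bilibili library.

```python
import xml.etree.ElementTree as ET

# Endpoint template copied from the list at the top; `oid` is the numeric ID.
LIST_URL = "https://api.bilibili.com/x/v1/dm/list.so?oid={oid}"


def parse_danmaku(xml_data):
    """Return the text of every <d> element in a danmaku XML document.

    Accepts either bytes or str. Assumes each comment sits in a <d> element.
    """
    root = ET.fromstring(xml_data)
    # iter() walks the whole tree, so the nesting depth of <d> does not matter.
    return [d.text or "" for d in root.iter("d")]


def fetch_danmaku(oid, timeout=10):
    """Hypothetical helper: download and parse the danmaku list for one ID."""
    # Imported here so parse_danmaku() can be used without requests installed.
    import requests

    response = requests.get(LIST_URL.format(oid=oid), timeout=timeout)
    response.raise_for_status()
    # Pass the raw bytes so the parser honours the XML encoding declaration.
    return parse_danmaku(response.content)
```

Calling fetch_danmaku(9931722) would request the second URL from the list; the returned list can be passed to pd.DataFrame in the same way as comments in the script above.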