Crawler practice: scraping download links from 电影天堂 (dy2018.com)
Author: 孤独唯心 | Published: 2018-07-23 09:09
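import requests
import regex

for m in range(31):
    # Listing page m of the category index
    url = 'https://www.dy2018.com/html/gndy/dyzz/index_' + str(m) + '.html'
    html = requests.get(url)
    html.encoding = 'gb2312'  # set the page encoding
    # Relative links to each movie's detail page
    data = regex.findall('<a href="(.*?)" class="ulink"', html.text)
    # print(data)
    for n in data:
        url2 = 'https://www.dy2018.com' + n
        html2 = requests.get(url2)
        html2.encoding = 'gb2312'
        # Download links are anchors that close right before </td>
        ftp = regex.findall('<a href="(.*?)">.*?</a></td>', html2.text)
        # print(ftp)
        if not ftp:
            continue  # no download link matched on this page, so skip it
        # Append the first matched link to the output file
        with open(r'C:\Users\Administrator\Desktop\dy\dytt.txt', 'a', encoding='gb2312') as f:
            f.write(ftp[0] + '\n')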
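
The two regular expressions in the script can be tried without touching the network. The sketch below is only an illustration: it uses the standard-library re module instead of the third-party regex package the script imports, the helper names extract_detail_paths and first_download_link are hypothetical, and the sample HTML strings are invented stand-ins, not real markup from the site.

```python
import re

# Same patterns as in the script above, compiled once.
DETAIL_LINK = re.compile(r'<a href="(.*?)" class="ulink"')
DOWNLOAD_LINK = re.compile(r'<a href="(.*?)">.*?</a></td>')


def extract_detail_paths(listing_html):
    """Return the relative detail-page paths found in a listing page."""
    return DETAIL_LINK.findall(listing_html)


def first_download_link(detail_html):
    """Return the first download link on a detail page, or None if none matched."""
    links = DOWNLOAD_LINK.findall(detail_html)
    return links[0] if links else None


# Invented sample markup (an assumption, for illustration only).
sample_listing = (
    '<b><a href="/i/100001.html" class="ulink" title="A">Movie A</a></b>'
    '<b><a href="/i/100002.html" class="ulink" title="B">Movie B</a></b>'
)
sample_detail = (
    '<td><a href="ftp://example.invalid/movie_a.mkv">'
    'ftp://example.invalid/movie_a.mkv</a></td>'
)

paths = extract_detail_paths(sample_listing)
link = first_download_link(sample_detail)
missing = first_download_link('<p>no links here</p>')
print(paths, link, missing)
```

Returning None for a page with no match makes the empty case explicit, so the caller can skip that page instead of failing on ftp[0].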
Article link: https://www.haomeiwen.com/subject/yhjdmftx.html