Scraping information from knewone

Author: 宁静消失何如 | Published: 2017-04-25 12:48 | Views: 17

<pre>
# -*- coding: utf-8 -*-
__author__ = 'LEE'

import io
import sys
import time

import requests
from bs4 import BeautifulSoup

# Re-wrap stdout as gb18030 so Chinese titles print correctly
# on a Windows console
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='gb18030')

url = 'https://knewone.com/things/?page='


def get_page(url, data=None):
    """Fetch one listing page and print each item's image, title and link."""
    wb_data = requests.get(url)
    soup = BeautifulSoup(wb_data.text, 'lxml')
    imgs = soup.select('a.cover-inner > img')
    # Titles and links are read from the same anchor elements
    titles = soup.select('section.content > h4 > a')
    links = soup.select('section.content > h4 > a')
    # print(soup)
    if data is None:
        for img, title, link in zip(imgs, titles, links):
            data = {
                'img': img.get('src'),
                'title': title.get('title'),
                'link': link.get('href'),
            }
            print(data)


def get_more_page(start, end):
    """Scrape pages start..end-1, pausing one second between requests."""
    for one in range(start, end):
        get_page(url + str(one))
        time.sleep(1)


# range() excludes its upper bound, so this scrapes pages 1 through 9
get_more_page(1, 10)

</pre>
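
One detail of the listing above is easy to miss: `get_more_page(1, 10)` requests pages 1 through 9, because `range` excludes its upper bound. The following is a minimal sketch of the same pagination loop, assuming one would rather collect the records than print them. The names `page_urls`, `crawl` and `fake_fetch` are hypothetical and do not appear in the original post; `fake_fetch` stands in for a real function such as a `get_page` variant that returns a list of dicts, so the sketch runs without network access.

```python
import time
from typing import Callable, Dict, List

BASE_URL = 'https://knewone.com/things/?page='


def page_urls(start: int, end: int, base: str = BASE_URL) -> List[str]:
    """Build listing URLs for pages start..end-1, mirroring range(start, end)."""
    return [base + str(n) for n in range(start, end)]


def crawl(urls: List[str],
          fetch: Callable[[str], List[Dict[str, str]]],
          delay: float = 1.0,
          sleep: Callable[[float], None] = time.sleep) -> List[Dict[str, str]]:
    """Call fetch(url) for every URL and return all records in one list.

    The pause is applied between requests only, not before the first one.
    """
    results: List[Dict[str, str]] = []
    for index, page_url in enumerate(urls):
        if index > 0:
            sleep(delay)
        results.extend(fetch(page_url))
    return results


def fake_fetch(page_url: str) -> List[Dict[str, str]]:
    """Hypothetical stand-in for a real page fetcher (no network access)."""
    return [{'img': '', 'title': 'placeholder', 'link': page_url}]


if __name__ == '__main__':
    # delay=0 keeps the offline demonstration instant
    records = crawl(page_urls(1, 4), fetch=fake_fetch, delay=0)
    for record in records:
        print(record['link'])
```

Passing the fetcher and the sleep function as arguments is a design choice made only for this sketch: it lets the loop be exercised offline, and a real run would pass a function that performs the `requests.get` and `BeautifulSoup` parsing shown earlier.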

Article link: https://www.haomeiwen.com/subject/rfrvzttx.html