爬取it之家新闻

作者: Gaolex | 来源:发表于2016-10-16 08:51 被阅读138次

    随意转载,注明出处

    • requests库 连接网络,处理http协议
    • beautifulsoup库 将网页变成结构化数据
    # -*- coding: utf-8 -*-
    from bs4 import BeautifulSoup
    import requests
    headers = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36 QIHU 360EE'}
    page = requests.get('http://www.ithome.com/',headers=headers)
    page.encoding = 'gb2312'
    soup =  BeautifulSoup(page.text,"html.parser")
    news = soup.find_all("span",class_="title")
    for new in news:
        a=new.a
        print(a.string)
    

    结果如图:

    爬虫结果

    相关文章

      网友评论

        本文标题:爬取it之家新闻

        本文链接:https://www.haomeiwen.com/subject/unrnyttx.html