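Since I am on Python 2.7.11, I chose to write a program based on my own understanding and knowledge. It does not use def, because I am a beginner and not yet comfortable with it. A shout-out to the author @采蝶袖, whose post I found very inspiring. Some parts of my code are probably more roundabout than they need to be, so I would welcome any corrections from the author. Thanks.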
Run results
Source code attached:
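# -*- coding: utf-8 -*-
# title: Scrape all article links from a page and save the text behind each link into a folder
import re
import requests
import sys
import urllib2
from bs4 import BeautifulSoup

# Python 2 only: make utf-8 the default encoding so unicode text can be written to files
reload(sys)
sys.setdefaultencoding('utf-8')

# Define the page URL and fetch the page's HTML
r = urllib2.Request("http://www.jianshu.com/")
content = urllib2.urlopen(r).read()
#print content
soup = BeautifulSoup(content, 'html.parser')
# Each match is an article path: "/p", one more character, then 12 characters
link_list = re.findall(r'class="title" target="_blank" href="(/p.+?.{12})', content)

# Start looping over the article links
d = 0
for i in link_list:
    d = d + 1
    ii = str(d)
    url = 'http://www.jianshu.com' + i
    r = requests.get(url)
    data = r.text
    soup_b = BeautifulSoup(data, 'html.parser')
    for x in soup_b.find_all('h1', class_="title"):
        thistitle = x.text
        # Output file is named "<index>.<title>.txt"
        f = file('d:/pythonWorkSpace/Python27PygamePy2exe-master/Python27PygamePy2exe-master/a/' + ii + '.' + thistitle + '.txt', "w")
        # Use a separate loop variable so the outer "i" is not overwritten
        for block in soup_b.find_all('div', class_='show-content'):
            thisdata = block.text
            f.write(thisdata)
        # Close the file here so f is never referenced when no title was found
        f.close()
        # Only the first title on the page is needed
        break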
Now I am off to learn how to write functions with def. If you don't like it, please go easy on me...
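As a first step in that direction, here is a rough sketch of how two pieces of the script might look once wrapped in def. The function names extract_links and build_filename are made up for illustration, and the sample HTML is invented rather than taken from the real page. Replacing characters that Windows does not allow in file names is an extra assumption on my part; the script above does not do it.

```python
import re

BASE_URL = 'http://www.jianshu.com'

# Same pattern as in the script above.
LINK_PATTERN = re.compile(r'class="title" target="_blank" href="(/p.+?.{12})')


def extract_links(html):
    """Return the absolute URLs of all article links found in html."""
    return [BASE_URL + path for path in LINK_PATTERN.findall(html)]


def build_filename(index, title):
    """Build '<index>.<title>.txt' (hypothetical helper).

    Characters that Windows forbids in file names are replaced with '_'.
    """
    safe_title = re.sub(r'[\\/:*?"<>|]', '_', title.strip())
    return '%d.%s.txt' % (index, safe_title)


# Invented sample input, not real page content.
sample_html = '<a class="title" target="_blank" href="/p/0123456789ab">Hello</a>'
links = extract_links(sample_html)
name = build_filename(1, 'What is def?')
print(links)
print(name)
```

Each function does one job and returns a value, so it can be tried out on a small string without touching the network or the disk. The download and write steps from the loop could be wrapped the same way.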