![](https://img.haomeiwen.com/i2245653/105251053580b6b6.jpg)
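I. Motivation
Every time I repost a WeChat official-account article, downloading the original cover image is a nuisance.
A manual workaround already exists:
1) Right-click the article page and view its source: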
![](https://img.haomeiwen.com/i2245653/c889039b9f1ca720.jpg)
2) Press Ctrl + F and search for "msg_cdn_url":
![](https://img.haomeiwen.com/i2245653/3139b5415b426a4a.jpg)
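The text inside the double quotes that follows is the cover image URL. Copy it, open it in a browser, and save the image from there.
Still, this is tedious, and the search term is not worth memorizing, so I wrote a one-click download tool to share. It is packaged as an exe file and runs without a Python installation. To get it, follow the official account 风巢 (wind-nest) and reply 20180618 in the account's chat.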
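The manual lookup described above can be sketched in a few lines of Python. This is only an illustration: the function name `find_cover_url` and the sample page fragment are made up for the example, and it assumes the page source contains `msg_cdn_url = "..."` as shown in the screenshot.

```python
import re
from typing import Optional

# Matches msg_cdn_url = "<value>"; the non-greedy group stops at the
# first closing double quote.
COVER_PATTERN = re.compile(r'msg_cdn_url\s*=\s*"(.*?)"')


def find_cover_url(page_source: str) -> Optional[str]:
    """Return the quoted value that follows msg_cdn_url, or None if absent."""
    match = COVER_PATTERN.search(page_source)
    return match.group(1) if match else None


# Made-up page fragment, used only to illustrate the lookup.
sample = (
    'var msg_title = "Hello"; '
    'var msg_cdn_url = "http://example.com/cover.jpg?wx_fmt=jpeg"; '
    'var other = "y";'
)
print(find_cover_url(sample))
```

Returning `None` when the variable is missing lets the caller decide how to report a page that has no cover entry.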
II. Usage
1. Double-click the tool icon; a dialog window opens
![](https://img.haomeiwen.com/i2245653/b868eaf1fd51c5af.jpg)
2. Paste the official-account article link and press Enter
![](https://img.haomeiwen.com/i2245653/883d718b0aef2e1f.jpg)
3. The article cover is downloaded automatically into this folder and named in the pattern "article title - 封面" (封面 means "cover").
![](https://img.haomeiwen.com/i2245653/36c0959767e43ac2.jpg)
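Because the file name is built from the article title, a title containing characters that Windows forbids in file names (`\ / : * ? " < > |`) would make the save fail. A minimal sketch of one way to guard against that follows; the helper `cover_filename` and the choice of `_` as the replacement character are assumptions for illustration, not part of the tool.

```python
import re

# Characters that Windows does not allow in file names.
FORBIDDEN = re.compile(r'[\\/:*?"<>|]')


def cover_filename(title: str, suffix: str = " - 封面") -> str:
    """Build '<title><suffix>.jpg', replacing forbidden characters with '_'."""
    safe_title = FORBIDDEN.sub("_", title).strip()
    return f"{safe_title}{suffix}.jpg"


print(cover_filename('What is "AI"?'))
```

The suffix is a parameter so the same helper can produce a name in another language without touching the sanitizing step.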
III. Source code
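import re
import urllib.request

import requests


def getHTMLText(url):
    """Fetch the article page and return its HTML, or "" on a request error."""
    try:
        r = requests.get(url, timeout=30)
        r.raise_for_status()
        r.encoding = 'utf-8'
        return r.text
    except requests.RequestException:
        return ""


def articleTitle(html):
    """Return the text between the quotes of `var msg_title = "..."`."""
    # A capture group is used instead of lstrip(), which strips a set of
    # characters rather than a prefix and could eat into the title.
    match = re.search(r'var msg_title = "(.*?)"', html)
    return match.group(1) if match else ""


def imgURL(html):
    """Return the cover URL from `var msg_cdn_url = "..."`, or "" if absent."""
    match = re.search(r'var msg_cdn_url = "(http.*?)"', html)
    return match.group(1) if match else ""


def main():
    url = input("Paste the link of the official-account article whose cover you want to download:\n")
    html = getHTMLText(url)
    img_url = imgURL(html)
    if not img_url:
        print("Cover URL not found; check the link and try again.")
        return
    img = urllib.request.urlopen(img_url).read()
    # "封面" means "cover"; the file is written to the current folder.
    with open(articleTitle(html) + ' - 封面.jpg', 'wb') as f:
        f.write(img)


if __name__ == "__main__":
    main()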
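One detail worth keeping in mind when pulling the title out of `var msg_title = "..."`: `str.lstrip` treats its argument as a set of characters, not as a prefix, so it can remove leading characters of the title itself. The short comparison below uses a made-up title to show the difference from a regex capture group.

```python
import re

# Made-up line in the same shape as the page's title variable.
line = 'var msg_title = "martial arts"'

# lstrip removes any leading characters found in the given set, so it keeps
# going past the prefix and into the title.
stripped = line.lstrip('var msg_title = "').rstrip('"')

# A capture group returns exactly the text between the quotes.
match = re.search(r'var msg_title = "(.*?)"', line)
captured = match.group(1) if match else ""

print(repr(stripped))  # ''
print(repr(captured))  # 'martial arts'
```

Here every character of the title also appears in the prefix, so the stripped result is empty, while the capture group returns the title intact.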