Baidu Wenku wants download vouchers? Let Python download for you!


Author: 14e61d025165 | Published 2019-07-23 15:44

    Build a WeChat bot in 80 lines of code that downloads Baidu Wenku documents



    Overview

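    Sooner or later you need to download something from Baidu Wenku, and the annoying part is that Baidu asks for download vouchers or payment. (The method here does not work for paid documents.) So I used WeChat's search, which is very capable, to look for ways to download Wenku documents for free, and joined a few groups. One of them had a dedicated bot: post a link in the group and the bot replies with a download link.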

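    I first read some CSDN articles about group bots and learned that Python offers two main options: the qqbot library (built on SmartQQ) for QQ groups and the itchat library (built on the WeChat web client) for WeChat groups. Tencent has shut down SmartQQ, so qqbot can no longer drive a QQ group bot, which leaves a WeChat group bot as the only choice.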

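    The plan:

    1. Log in to WeChat through itchat and listen to the target group's messages in real time. When someone posts a Wenku link, extract it (fairly simple).

    2. Save the extracted Wenku link and send it to the download site, which returns a download link. You have to capture the traffic yourself and work through it step by step; it took me a full day to understand the whole download flow, and it is tedious. This is today's focus, and you can try the same approach on other sites. In my tests the returned link was very long, so I shorten it with Baidu's short-URL service; to keep this post short, I leave that part out.

    3. Send the link returned by the site to the group and @ the person who asked (fairly simple).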
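
    Step 1 comes down to finding a Wenku link inside a chat message. Below is a minimal sketch of that step. It assumes document links have the form https://wenku.baidu.com/view/...; the pattern and the name extract_wenku_link are illustrative and not taken from the original bot.

```python
import re

# Assumption: Wenku document links look like https://wenku.baidu.com/view/<id>...
# Only ASCII URL characters are matched, so Chinese text glued to the link is left out.
WENKU_LINK = re.compile(r"https?://wenku\.baidu\.com/view/[A-Za-z0-9\-._~%/?=&#]+")


def extract_wenku_link(text):
    """Return the first Wenku document link found in a chat message, or None."""
    match = WENKU_LINK.search(text)
    return match.group(0) if match else None


print(extract_wenku_link("please fetch https://wenku.baidu.com/view/0123abcd.html thanks"))
print(extract_wenku_link("hello everyone"))
```

    Searching the whole message, rather than only checking how it starts, also catches links that come with a greeting in front of them.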

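    The site is http://139.224.236.108/1.html (consider this a free plug for them). It is a paid service, and the files it returns are the original source documents. An account costs only a few yuan, but each account has a daily download limit. You can buy several accounts and, when one is capped, iterate over the others. Downloading documents is not something you do every day, but when you need one and have no vouchers it is a hassle. To make this easier to reproduce, I am sharing the account I bought.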
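
    The account rotation described above might look like the following sketch. The article does not show how the site reports an exhausted account, so the QuotaExceeded exception, the fetch callback, and the name download_with_fallback are all hypothetical; you would adapt them to whatever the site actually returns.

```python
class QuotaExceeded(Exception):
    """Hypothetical signal that an account has hit its daily download limit."""


def download_with_fallback(doc_url, accounts, fetch):
    """Try each (usrname, usrpass) pair in turn until one of them succeeds.

    fetch(doc_url, usrname, usrpass) is supplied by the caller. It returns a
    download link, or raises QuotaExceeded when the account is used up.
    """
    for usrname, usrpass in accounts:
        try:
            return fetch(doc_url, usrname, usrpass)
        except QuotaExceeded:
            continue  # this account is capped for today, move on to the next one
    raise RuntimeError("every account has reached its daily limit")


def demo_fetch(doc_url, usrname, usrpass):
    """Stand-in for the real request: pretends the first account is capped."""
    if usrname == "account1":
        raise QuotaExceeded(usrname)
    return "link-for-%s-via-%s" % (doc_url, usrname)


print(download_with_fallback("doc", [("account1", "pw1"), ("account2", "pw2")], demo_fetch))
```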

    Details


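    Pick any Baidu Wenku document that requires download vouchers and copy its link.

    Clicking download and capturing the traffic again shows that the page sends requests to post and nocode,

    and the browser then jumps to a new page.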


    Clicking download on that page sends one more request (see down() below). I won't go into more detail; here is the code:

    <pre spellcheck="false" style="box-sizing: border-box; margin: 5px 0px; padding: 5px 10px; border: 0px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-variant-numeric: inherit; font-variant-east-asian: inherit; font-weight: 400; font-stretch: inherit; font-size: 16px; line-height: inherit; font-family: inherit; vertical-align: baseline; cursor: text; counter-reset: list-1 0 list-2 0 list-3 0 list-4 0 list-5 0 list-6 0 list-7 0 list-8 0 list-9 0; background-color: rgb(240, 240, 240); border-radius: 3px; white-space: pre-wrap; color: rgb(34, 34, 34); letter-spacing: normal; orphans: 2; text-align: left; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">
    import requests
    from urllib.parse import quote

    first_url = input('Enter the document link: ')
    # Two endpoints are used: one looks up the document ID, the other requests the download
    url1 = "http://139.224.236.108/post.php"
    url3 = "http://139.224.236.108/downdoc.php"
    # Percent-encode the link ("/" and ":" as before, plus "?", "=" and "&")
    # so it can be embedded safely in the form body
    download_url = quote(first_url, safe="")

    # Headers shared by both requests. requests fills in Host and Content-Length
    # itself, so the hard-coded values copied from the browser capture are dropped.
    base_headers = {
        "Accept": "*/*",
        "Origin": "http://139.224.236.108",
        "X-Requested-With": "XMLHttpRequest",
        "User-Agent": "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36",
        "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
        "Accept-Encoding": "gzip, deflate",
        "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
        "Cookie": "usrname=901961495; usrpwd=559448",
    }


    def query():
        """Submit the document link and return the document ID from the reply."""
        head1 = dict(base_headers, Referer="http://139.224.236.108/1.html")
        data1 = 'usrname=901961495&usrpass=559448&docinfo={}&taskid=up_down_doc1'.format(download_url)
        response = requests.post(url1, data=data1, headers=head1).json()
        # Keep only the ID: 37 is the length of "http://139.224.236.108/nocode.php?id=",
        # presumably the prefix of the returned 'url' field
        return response['url'][37:]


    def down(doc_id):
        """Request the download link; vid is the document ID returned by query()."""
        head3 = dict(base_headers, Referer="http://139.224.236.108/nocode.php?id={}".format(doc_id))
        data3 = 'vid={}&taskid=directDown'.format(doc_id)
        response = requests.post(url3, data=data3, headers=head3).json()
        # Remove the backslashes that escape the slashes in the returned link
        downurl = response["dlink"].replace("\\", "")
        print(downurl)  # opening this URL starts the file download
        return downurl


    down(query())
    </pre>
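
    The document link has to be percent-encoded before it goes into the form body. As a small illustration (the helper name encode_link and the sample link are made up for this example), the standard library's quote() covers the same characters as a manual chain of replace() calls and also handles "?", "=" and "&", which would otherwise be read as part of the form itself.

```python
from urllib.parse import quote


def encode_link(link):
    """Percent-encode a link so it can sit inside a form value."""
    # safe="" makes quote() encode "/" too, which it leaves untouched by default
    return quote(link, safe="")


link = "https://wenku.baidu.com/view/abc.html?from=search"  # sample link, not a real document
manual = link.replace("/", "%2F").replace(":", "%3A")

print(manual)             # "?" and "=" are still raw here
print(encode_link(link))  # every reserved character is encoded
```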

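    The code above works as a standalone downloader. To hook it up to WeChat as a bot, merge query() and down() into a single function that takes a link and returns the download address, then call that function whenever it is needed.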
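
    One way to do that merge is sketched below. The endpoints, form fields, and JSON keys come from the code above; the names BASE, post_form, and get_download_url are illustrative, and the sketch assumes the reply's url field ends in ?id= followed by the document ID. It sends fewer headers than the browser capture and has not been tried against the live site, which may require the full set.

```python
import json
from urllib import parse, request

BASE = "http://139.224.236.108"  # download site used in the article


def post_form(url, fields, headers):
    """POST url-encoded form fields and decode the JSON reply (stdlib only)."""
    req = request.Request(url, data=parse.urlencode(fields).encode(), headers=headers)
    with request.urlopen(req, timeout=15) as resp:
        return json.loads(resp.read().decode("utf-8"))


def get_download_url(doc_url, usrname, usrpass, post=post_form):
    """Merge the query and download steps: Wenku link in, download link out."""
    cookie = "usrname=%s; usrpwd=%s" % (usrname, usrpass)
    # Step 1: submit the link. Assumption: the reply's 'url' ends in ?id=<doc id>.
    reply = post(
        BASE + "/post.php",
        {"usrname": usrname, "usrpass": usrpass, "docinfo": doc_url, "taskid": "up_down_doc1"},
        {"Cookie": cookie, "Referer": BASE + "/1.html"},
    )
    doc_id = parse.parse_qs(parse.urlsplit(reply["url"]).query)["id"][0]
    # Step 2: ask for the direct link and drop the escaping backslashes.
    reply = post(
        BASE + "/downdoc.php",
        {"vid": doc_id, "taskid": "directDown"},
        {"Cookie": cookie, "Referer": "%s/nocode.php?id=%s" % (BASE, doc_id)},
    )
    return reply["dlink"].replace("\\", "")
```

    Passing the HTTP call in as the post parameter keeps the function easy to try out with canned replies before pointing it at the real site.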

    Next, log in to WeChat and monitor the target group. Run the script on a server and you have a bot that handles downloads 24 hours a day.

    Implementation:

    <pre spellcheck="false" style="box-sizing: border-box; margin: 5px 0px; padding: 5px 10px; border: 0px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-variant-numeric: inherit; font-variant-east-asian: inherit; font-weight: 400; font-stretch: inherit; font-size: 16px; line-height: inherit; font-family: inherit; vertical-align: baseline; cursor: text; counter-reset: list-1 0 list-2 0 list-3 0 list-4 0 list-5 0 list-6 0 list-7 0 list-8 0 list-9 0; background-color: rgb(240, 240, 240); border-radius: 3px; white-space: pre-wrap; color: rgb(34, 34, 34); letter-spacing: normal; orphans: 2; text-align: left; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">
    import itchat
    import itchat.content  # makes itchat.content.TEXT available

    qun = 'your group name'  # nickname of the group chat the bot should watch

    @itchat.msg_register([itchat.content.TEXT], isGroupChat=True)  # register a handler for group text messages
    def print_content(msg):
        if msg.User["NickName"] == qun:  # add more groups with: or msg.User["NickName"] == 'another group name'
            if msg['Text'].startswith("https"):  # treat the message as a link if it starts with "https"
                # GET_SHORTURL is the function I described above (my own): link in, download address out
                huifubdwk = GET_SHORTURL(msg['Text'])
                print(msg.User['NickName'] + ":" + msg['Text'])  # log which group sent what
                print("%s\n" % huifubdwk)  # log the bot's reply
                itchat.send(u'@%s\u2005 %s' % (msg['ActualNickName'], huifubdwk), msg['FromUserName'])
            else:  # not a link: just log it
                print(msg['Text'])
        # messages from any other group are ignored

    itchat.auto_login(hotReload=True)  # scan the QR code to log in; the session is cached for restarts
    itchat.run()
    </pre>
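
    The handler mixes the decision of whether to answer with the itchat calls, which makes it hard to try out without a logged-in account. As a rough sketch, that decision can be pulled into a plain function. The name build_reply, its parameters, and the sample group names are hypothetical; resolve stands for whatever function turns a Wenku link into a download address.

```python
def build_reply(group_name, sender, text, watched_groups, resolve):
    """Return the reply text for a group message, or None to stay silent."""
    if group_name not in watched_groups:
        return None  # message comes from a group the bot does not serve
    if not text.startswith("https"):
        return None  # not a link
    # \u2005 is the separator the handler above puts after the @-mention
    return "@%s\u2005 %s" % (sender, resolve(text))


watched = {"wenku helpers", "study group"}  # sample group names
reply = build_reply("study group", "Ann", "https://wenku.baidu.com/view/x.html",
                    watched, lambda link: "http://example.com/f.doc")
print(reply)
```

    Keeping the watched groups in a set replaces the growing chain of or comparisons, and the itchat handler only has to send the reply when it is not None.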

    That's all for today.

    Testing


    Just drop a Wenku link into the group and the bot replies with the download address. Sweet.
