Cracking and Downloading Paid Videos on Renrenjiang (人人讲)


Author: 杨赟快跑 | Published: 2020-04-15 12:25

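    Renrenjiang (人人讲) is an education app with a large catalog of instructional videos covering music, calligraphy, clothing, yoga, and more. Some of the videos are free, but most are paid. Here we capture the app's traffic to analyze the Renrenjiang API, and then crack and download these videos.

    Disclaimer: this tutorial is for learning purposes only. The crawled videos belong to Renrenjiang, and using them commercially is strictly prohibited.

    1. Analyzing the Renrenjiang API

    First, open the Renrenjiang app, pick a video you are interested in, copy its link, and open it on a computer (the link below serves as the example).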

    http://ke.renrenjiang.cn/#/video?activityId=1147066&su=0
    

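    The opened page looks like this:


    [Screenshot: charles-ssl-proxying-certificate.png]

    We use the Charles capture tool to see which requests are made when the page loads.

    [Screenshot]

    As shown, there are two requests:

    # Get the details of the video
    https://api.renrenjiang.cn/api/v3/activities/1147066/show?include=creator,columns,service
    # Get the details of every video in the column that contains this video
    https://api.renrenjiang.cn/api/v2/columns/20890/activities
    

    Here we only need the second endpoint, which fetches the video column; its response includes the password required to watch the video.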

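    Summary:

    • Get the details of every video in the column that contains this video

    Request URL:

    • https://api.renrenjiang.cn/api/v2/columns/20890/activities

    Request method:

    • GET

    Request headers:

    head = {
        "Referer": "http://ke.renrenjiang.cn/",
        "Authorization": "taken from your own packet capture, as shown below"
    }
    
    [Screenshot]

    Parameters:

    Name            Type     Description                                  Example
    u               int      User id                                      1022949
    activity_sort   string   Sort order of the videos                     ASC or DESC
    page            int      Page number, for columns with many videos    1

    Sample response: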

    {
        "activities": [{
            "id": 1147066,
            "title": "国画技法课——撞水撞粉(第四讲)",
            "status": "结束",
            "video_status": 2,
            "background": "http://image.renrenjiang.cn/uploads/activity/background/1147066/2020_af9598e950780754cdee6956684f9524.jpeg@640w",
            "password": "7939",
            "started_at": 1550883600,
            "charge": true,
            "price": 29.90,
            "reservation_count": 6,
            "reservation": null,
            "user_id": 5011557,
            "creator": {
                "user_id": 5011557,
                "uid": "29269207",
                "nickname": "麦芽老师的艺术课堂",
                "displayname": null,
                "description": "       麦芽老师有着近十年的一线教学经验,所开设课程秉着“艺术美化生活,生活滋养艺术”的课程理念。直播间主要开设课程有儿童趣味水墨画、初级国画、线描、色彩等课程,在这里有专业老师的讲解,课题解答,课后作业辅导。\n      麦芽直播课堂诚邀每一位喜欢画画的朋友一起分享,这里没有年龄界限,只有您对生活、对艺术满满的热爱和期待。老师喜欢与学员交流互动,在轻松愉悦的课堂中,\n感受传统绘画艺术的魅力。\n咨询课程,请扫文末二维码,加微信,老师会耐心解答。麦芽老师的艺术课堂诚邀您随时加入我们!",
                "avatar": "https://image.renrenjiang.cn/uploads/user/avatar_url/5011557/2019_db0d6a4906c039fdc9d9b4b5aea3c880.jpg",
                "background": "https://image.renrenjiang.cn/uploads/user/background/5011557/2019_4f69866d6d9825fc127827cdcfe28098.jpg",
                "channel_name": "无",
                "user_level": 2,
                "proposal_status": 2,
                "fans_count": 26
            },
            "column_id": 20890,
            "column": {
                "column_id": 20890,
                "title": "试听课系列(不定时更新)",
                "price": 20.00,
                "background": "https://image.renrenjiang.cn/uploads/column/background/20890/2019_117d5b509ad52b726bf58089f002dbc4.jpg@640w",
                "activities_count": 5,
                "ctype": 1,
                "max_subscription": 0,
                "subscriptions": 0,
                "activity_allow_buy": true,
                "activity_sort": "DESC"
            },
            "isinvited": false,
            "locked": true,
            "share_url": "https://h5.renrenjiang.cn/#/activity?aid=1147066&su=14134251",
            "description": "课程简介\n本节课衔接上节课程,首先,将花头部分处理完整,莲蓬可以和叶子一起处理。其次,本节课将学习撞水撞粉系列课程荷花叶子的画法,调色调墨技巧,其中将色、墨、水的用法在画面中展现出来。<img src=\"http://image.renrenjiang.cn/uploads/files/2019_0a131051936bd7227843b52a5e8707ab.jpg\"/>本节课适合人群:\n1、零基础国画爱好者;2、少儿美术培训机构教师;3、有绘画基础且能独立上课的小朋友;\n\n如需咨询课程请扫码入群\n<img src=\"http://image.renrenjiang.cn/uploads/files/2019_37d369479071f67b1bd1f2d617431c57.jpg\"/>",
            "popularity": 22,
            "replay": null,
            "reprinted_switch": null,
            "reprint_user_id": null,
            "media_type": null,
            "detail_name": null,
            "detail_nickname": null,
            "rtype": null,
            "wxtype": null,
            "group": null,
            "share_scale": 0.0000,
            "share_amount": 0.00,
            "visible": false,
            "acm_id": null,
            "position": null,
            "task": null,
            "pt_id": null
        },
      ...
      ],
        "total": null
    }
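
    The `page` parameter means a long column has to be read one page at a time. As a minimal sketch of that loop, the generator below keeps requesting pages until one comes back with an empty `activities` list. The names `iter_activities` and `fake_fetch` are hypothetical, and the HTTP call is replaced by an in-memory stand-in so the example runs on its own.

```python
from typing import Callable, Dict, Iterator


def iter_activities(fetch_page: Callable[[int], Dict]) -> Iterator[Dict]:
    """Yield activities page by page until a page comes back empty."""
    page = 1
    while True:
        body = fetch_page(page)
        # Treat a missing or null "activities" field like an empty page.
        activities = body.get("activities") or []
        if not activities:
            return
        yield from activities
        page += 1


def fake_fetch(page: int) -> Dict:
    # Hypothetical stand-in for the HTTP request: two pages, then an empty one.
    pages = {
        1: [{"id": 1, "title": "a"}, {"id": 2, "title": "b"}],
        2: [{"id": 3, "title": "c"}],
    }
    return {"activities": pages.get(page, []), "total": None}


ids = [activity["id"] for activity in iter_activities(fake_fetch)]
print(ids)
```

    Passing the fetch function in as an argument keeps the paging logic separate from the network code, which makes it easy to test.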
    

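    From this endpoint's response we can get each video's id, title, description, and password (if there is no password, brute force is needed; we discuss that later).

    Next, we enter the password 7939 and start watching the video.

    [Screenshot]

    Since the video now plays, the front end must have obtained the video's address, so we capture the traffic with Charles again.


    [Screenshots]

    As shown, going from entering the password to receiving the video takes four requests:

    # Verify the password
    https://api.renrenjiang.cn/api/v3/activities/1147066/reservation
    # Get the video's m3u8 address
    https://api.renrenjiang.cn/api/v3/activities/1147066/stream_url?user_id=14264889&timestamp=1586920041105
    # Get the m3u8 file
    http://video.renrenjiang.cn/record/alilive/2726981393-1550845168.m3u8
    # Fetch the short video segments listed in the m3u8 file
    http://video.renrenjiang.cn/record/alilive/2726981393/1550841839_1.ts
    

    I won't write out each endpoint's request parameters and response data here; let's go straight to the code.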
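
    The last two requests follow the usual HLS pattern: an m3u8 playlist is a text file in which lines starting with `#` are tags and the remaining non-empty lines are segment paths, often relative to the playlist URL. A minimal sketch of that parsing step, assuming a simple media playlist, is shown below. The function name `segment_urls`, the sample playlist text, and the `example.com` URL are all made up for illustration.

```python
from urllib.parse import urljoin


def segment_urls(playlist_url: str, playlist_text: str) -> list:
    """Return absolute URLs for every segment line in a simple media playlist."""
    urls = []
    for line in playlist_text.splitlines():
        line = line.strip()
        # Blank lines and lines starting with '#' (tags, comments) are not segments.
        if not line or line.startswith("#"):
            continue
        # Relative segment paths are resolved against the playlist's own URL.
        urls.append(urljoin(playlist_url, line))
    return urls


# Hypothetical playlist content, not captured from any real service.
sample = (
    "#EXTM3U\n"
    "#EXT-X-TARGETDURATION:10\n"
    "#EXTINF:10.0,\n"
    "seg_1.ts\n"
    "#EXTINF:8.5,\n"
    "seg_2.ts\n"
    "#EXT-X-ENDLIST\n"
)
urls = segment_urls("http://example.com/record/demo.m3u8", sample)
print(urls)
```

    Using `urljoin` instead of string slicing also handles playlists whose segment lines are already absolute URLs.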

    2. Writing the code

    config.py

    import platform
    import requests
    import time
    
    def is_window():
        # True when running on Windows
        return platform.system() == "Windows"
    
    user_id = "fill in your own value"
    authorization = "fill in your own value"
    
    root_path = "F:\\人人讲视频" if is_window() else "/Users/yy/Documents/照片/renrenjiang"
    
    head = {
        "Referer": "http://ke.renrenjiang.cn/",
        "Authorization": authorization
    }
    session = requests.session()
    
    def current_milli_time():
        # Current Unix time in milliseconds
        return int(round(time.time() * 1000))
    

    util.py

    import os
    import platform
    import sys
    from config import head
    
    
    def is_window():
        system = platform.system()
        if system == "Windows":
            return True
        else:
            return False
    
    
    def download_by_key():
        url = "https://api.renrenjiang.cn/api/v3/activities/{0}/stream_url?user_id={1}&timestamp={2}"
        res = head
        res = res
        os.rmdir("../renrenjiang")
        exit(1)
        if "status" in res.keys() and res["status"] == 2:
            hls_url = res["hls_url"]
            return hls_url
        return None
    
    
    def show_process(curr, total):
        curr = curr / total * 100
        total = 100
        i = int(curr)
        process = '>' * (i // 2) + ' ' * ((total - i) // 2)
        if curr == total:
            ss = '\r' + process + "{0}%\n".format(i)
        else:
            ss = '\r' + process + "{0}%".format(i)
        sys.stdout.write(ss)
        sys.stdout.flush()
    
    
    def show_process2(curr, total):
        i = int(curr / total * 100)
        process = '>' * (i // 2) + ' ' * ((100 - i) // 2)
        if curr == total:
            ss = '\r' + process + "[{0}/{1}]\n".format(curr, total)
        else:
            ss = '\r' + process + "[{0}/{1}]".format(curr, total)
        sys.stdout.write(ss)
        sys.stdout.flush()
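
    One detail of the two progress helpers above: the bar is built from two separate floor divisions, so its total width alternates between 49 and 50 characters as the percentage changes parity. A possible alternative, sketched here under the hypothetical name `render_bar`, computes the filled part once and pads the rest, and returns the string instead of writing it so it can be tested.

```python
def render_bar(curr: int, total: int, width: int = 50) -> str:
    """Build a fixed-width text progress bar followed by a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    # Clamp the ratio so out-of-range counters cannot break the layout.
    ratio = min(max(curr / total, 0.0), 1.0)
    filled = int(ratio * width)
    return ">" * filled + " " * (width - filled) + " {0}%".format(int(ratio * 100))


half = render_bar(5, 10, width=10)
done = render_bar(10, 10, width=4)
print(repr(half))
print(repr(done))
```

    A caller would write `'\r' + render_bar(curr, total)` to `sys.stdout` in the same way the original helpers do.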
    
    
    

    download.py

    import json
    import os
    from m3u8 import m3u8
    import util
    from config import *
    
    
    class download:
        def __init__(self, cid):
            self.cid = cid
            self.free_m3u8_url_list = []
            self.is_can_pojie = True
            self.free_videos = []
            self.vip_videos = []
    
        def _list_video(self):
            """
            List all course videos in the column (self.cid)
            :return: list of videos
            """
            video_list = []
            page = 0
            url_format = "https://h5.renrenjiang.cn/api/v2/columns/{0}/activities?u=1052944&activity_sort=ASC&page={1}"
            while True:
                page += 1
                url = url_format.format(self.cid, page)
                res = session.get(url, headers=head)
                res = json.loads(res.content)
                if "activities" in res.keys() and len(res["activities"]) > 0:
                    activities = res["activities"]
                    for activity in activities:
                        activity_id = activity["id"]
                        title = activity["title"]
                        password = activity["password"]
                        start_at = activity["started_at"]
                        description = activity["creator"]["description"]
                        video_list.append({
                            "id": activity_id,
                            "title": title,
                            "password": password,
                            "start_at": start_at,
                            "description": description
                        })
                else:
                    break
            return video_list
    
        def _get_ts_list(self, index, video):
            """
            Fetch the m3u8 file and parse the ts paths out of it
            :param index: position of the video in the column
            :param video: video info
            :return: (m3u8 url, list of ts paths)
            """
            obj = m3u8(video, index, self.cid)
            hls_url = obj.get_m3u8()
            if hls_url is None:
                return None, None
            res = session.get(hls_url)
            ts_list = []
            for line in res.text.splitlines():
                line = line.strip()
                # Skip blank lines and playlist tags/comments
                if not line or line.startswith("#"):
                    continue
                ts_list.append(line)
            return hls_url, ts_list
    
        def _download_by_ts_list(self, video, ts_list, m3u8):
            """
            Download the video from its list of ts files, then merge them
            :param video: video info
            :param ts_list: list of ts files
            :param m3u8: url of the m3u8 file
            :return: file path of the merged video
            """
            # Create the column folder and the per-video folder inside it
            path = root_path + os.sep + str(self.cid) + os.sep + str(video["id"])
            os.makedirs(path, exist_ok=True)
    
            # Download each ts file in the list
            url_format = m3u8[0: m3u8.rfind("/") + 1] + "{0}"
            curr = 0
            for ts in ts_list:
                curr += 1
                filename = path + os.sep + str(curr).zfill(6) + ".ts"
                is_exists = os.path.exists(filename)
                if is_exists:
                    continue
                url = url_format.format(ts)
                res = requests.get(url, headers=head)
                if res.status_code != 200:
                    print("下载ts文件失败:{0}".format(url))
                    continue
                with open(filename, "wb") as file:
                    file.write(res.content)
                util.show_process(curr, len(ts_list))
    
            # Merge the ts files into a single mp4 file, then delete the ts files
            # On Windows
            if util.is_window():
                exec_str = r'copy /b  "' + path + os.sep + r'*.ts" "' + path + os.sep + '{0}.mp4"'.format(video["title"])
                os.system(exec_str)  # merge with the cmd copy command
                exec_str = r'del  "' + path + os.sep + r'*.ts"'
                os.system(exec_str)  # delete the original segments
            # On Linux or macOS
            else:
                exec_str = "cat {0}*.ts > {1}{2}.mp4".format(path + os.sep, path + os.sep, video["title"])
                os.system(exec_str)  # merge with the cat command
                exec_str = "rm -rf {0}*.ts".format(path + os.sep)
                os.system(exec_str)  # delete the original segments
            return path + os.sep + '{0}.mp4'.format(video["title"])
    
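        def _is_downloaded(self, column_id, video):
            """
            Check whether the video has already been downloaded, to avoid downloading it again
            :param column_id: column id
            :param video: video info
            :return: whether it has been downloaded
            """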
            path = root_path + os.sep + str(column_id)
            is_exists = os.path.exists(path)
            if not is_exists:
                return False
            path = path + os.sep + str(video["id"])
            is_exists = os.path.exists(path)
            if not is_exists:
                return False
            path = path + os.sep + '{0}.mp4'.format(video["title"])
            is_exists = os.path.exists(path)
            if not is_exists:
                return False
            return True
    
        def download(self):
            """
            Download every video in the column identified by self.cid
            cid values fall in the range [20002, 49999]
            :return: None
            """
            if not self.before_download():
                return
            count = 0
            for video in self.free_videos:
                count += 1
                if self._is_downloaded(self.cid, video):
                    print("Video {0} already downloaded: {1}, skipping".format(count, str(video["title"])))
                    continue
                m3u8_url, ts_list = self._get_ts_list(count, video)
                while ts_list is None:
                    m3u8_url, ts_list = self._get_ts_list(count, video)
                print("Downloading video {0}: {1}".format(count, str(video["title"])))
                self._download_by_ts_list(video, ts_list, m3u8_url)
            for video in self.vip_videos:
                count += 1
                if self._is_downloaded(self.cid, video):
                    print("Video {0} already downloaded: {1}, skipping".format(count, str(video["title"])))
                    continue
                if self.is_can_pojie:
                    m3u8_url, ts_list = self._get_ts_list(count, video)
                    if ts_list is None:
                        print("Failed to get the ts list for video {0}".format(video["title"]))
                        continue
                    print("Downloading video {0}: {1}".format(count, str(video["title"])))
                    self._download_by_ts_list(video, ts_list, m3u8_url)
                else:
                    print("Video {0} is paid and cannot be cracked: {1}, skipping".format(count, str(video["title"])))
    
        def before_download(self):
            print("Checking whether the videos can be downloaded or cracked")
            # List all videos and split them into free and paid
            res = self._list_video()
            if type(res) == dict:
                print("Failed to download column {0}, reason: {1}".format(self.cid, res))
                exit(1)
            self._divide_videos(res)
            self._get_is_can_pojie()
            if self.is_can_pojie:
                print("Column {0} has {1} videos: {2} can be downloaded directly, {3} need cracking".
                      format(self.cid, len(res), len(self.free_videos), len(self.vip_videos)))
                return True
            else:
                if len(self.free_videos) == 0:
                    print("Column {0} has {1} videos; none can be downloaded or cracked".format(self.cid, len(res)))
                    return False
                else:
                    print("Column {0} has {1} videos: {2} can be downloaded, the rest can be neither downloaded nor cracked".
                          format(self.cid, len(res), len(self.free_videos)))
                    yes_no = input('Download only the available videos? (y|n): ')
                    return yes_no in ("y", "Y")
    
        def _divide_videos(self, videos):
            count = 0
            for video in videos:
                count += 1
                obj = m3u8(video, count, self.cid)
                obj.pay_for_video()
                m3u8_url = obj.get_m3u8_by_pay()
                if m3u8_url is not None:
                    self.free_videos.append(video)
                    self.free_m3u8_url_list.append(m3u8_url)
                else:
                    self.vip_videos.append(video)
    
        def _get_is_can_pojie(self):
            if len(self.free_m3u8_url_list) == 0:
                self.is_can_pojie = False
            for u in self.free_m3u8_url_list:
                if u.find("videocdn.renrenjiang.cn") < 0:
                    self.is_can_pojie = False
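
    The merge step in `_download_by_ts_list` shells out to `copy /b` or `cat`, which needs a separate branch per operating system and passes the video title through a shell command line. As a hedged alternative, the sketch below concatenates segment files in pure Python. `merge_segments` is a hypothetical helper, and the temporary files exist only to make the example self-contained. It relies on the zero-padded file names (as produced by `zfill(6)`) so that sorting by name preserves segment order. Note that byte-wise concatenation yields an MPEG-TS stream whatever extension the output file carries.

```python
import shutil
import tempfile
from pathlib import Path


def merge_segments(segment_dir: Path, output_file: Path) -> int:
    """Concatenate all .ts files in segment_dir, in name order, into output_file."""
    segments = sorted(segment_dir.glob("*.ts"))
    with output_file.open("wb") as out:
        for seg in segments:
            with seg.open("rb") as src:
                # Stream in chunks so large segments are never fully loaded in memory.
                shutil.copyfileobj(src, out)
    return len(segments)


# Demo with dummy byte content standing in for real segments.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "000002.ts").write_bytes(b"second")
    (folder / "000001.ts").write_bytes(b"first-")
    target = folder / "merged.mp4"
    count = merge_segments(folder, target)
    merged = target.read_bytes()

print(count, merged)
```

    The same function works unchanged on Windows, Linux, and macOS, and file names with spaces or quotes need no escaping.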
    

    m3u8.py

    import json
    import os
    import threading
    import math
    from time import sleep
    import util
    from config import *
    
    
    class m3u8:
        def __init__(self, video, index, cid):
            self.index = index
            self.video = video
            self.cid = cid
            self.vid = video["id"]
            self.start_at = int(str(video["start_at"])[0: 6])
            self.min = 0
            self.max = 10000000
            self.thread_num = 400
            self.step = math.floor((self.max - self.min) / self.thread_num)
            self.threads = []
            self.success = False
            self.result = None
            self.lock = threading.Lock()
            self.try_count = 0
            self.total_count = self.max - self.min
    
        def _func(self, a, b):
            for pos in range(a, b):
                if self.success:
                    return None
                self.try_count += 1
                stk_code = str(pos).zfill(7)
                ss = "{0}_{1}{2}".format(self.vid, self.start_at, stk_code)
                url_ff = "http://videocdn.renrenjiang.cn/Act-ss-m3u8-sd/{0}/{1}.m3u8".format(ss, ss)
                try:
                    res = session.get(url_ff, headers=head)
                    if res.status_code == 200:
                        self.lock.acquire()
                        self.success = True
                        self.lock.release()
                        self.write_m3u8_to_file(url_ff)
                        return url_ff
                except requests.exceptions.ReadTimeout:
                    pos -= 1
                except requests.exceptions.ConnectionError:
                    pos -= 1
                except ConnectionResetError:
                    pos -= 1
    
        def get_m3u8_by_force(self):
            start = time.time()
            for i in range(self.thread_num):
                t = threading.Thread(target=self._func, args=(self.min + self.step * i, self.min + self.step * (i + 1)))
                self.threads.append(t)
                t.start()
            while True:
                sleep(1)
                util.show_process2(self.try_count, self.total_count)
                for t in self.threads:
                    if not t.is_alive():
                        self.threads.remove(t)
                if len(self.threads) == 0:
                    break
            end = time.time()
            print("Result: {0}, total time: {1}s".format(self.result, end - start))
            return self.result
    
        def pay_for_video(self):
            """
            Purchase the video
            :return: whether the purchase succeeded
            """
            url = "https://api.renrenjiang.cn/api/v3/activities/{0}/reservation".format(self.vid)
            res = session.post(url, headers=head, data={
                "type": "password",
                "password": self.video["password"],
                "shareId": 0
            })
            res = json.loads(res.content)
            if "result" in res and res["result"] == "ok":
                return True
            else:
                return False
    
        def get_m3u8_by_pay(self):
            url = "https://api.renrenjiang.cn/api/v3/activities/{0}/stream_url?user_id={1}&timestamp={2}"
            url = url.format(self.vid, user_id, current_milli_time())
            res = session.get(url, headers=head)
            res = json.loads(res.content)
            if "status" in res.keys() and res["status"] == 2:
                hls_url = res["hls_url"]
                return hls_url
            return None
    
        def is_m3u8_exist(self):
            # Create the column folder and the per-video folder inside it
            path = root_path + os.sep + str(self.cid) + os.sep + str(self.vid)
            os.makedirs(path, exist_ok=True)
    
            path = path + os.sep + "m3u8.txt"
            is_exists = os.path.exists(path)
            if is_exists:
                return True
            return False
    
        def read_m3u8_from_file(self):
            path = root_path + os.sep + str(self.cid)
            path = path + os.sep + str(self.vid)
            path = path + os.sep + "m3u8.txt"
            with open(path, "r") as file:
                res = file.readline().replace("\n", "").replace("\r\n", "")
                file.close()
                return res
    
        def write_m3u8_to_file(self, m3u8_value):
            path = root_path + os.sep + str(self.cid)
            path = path + os.sep + str(self.vid)
            path = path + os.sep + "m3u8.txt"
            with open(path, "w") as file:
                file.write(m3u8_value)
                file.close()
    
        def get_m3u8(self):
            if self.is_m3u8_exist():
                print("m3u8 for video {0} already exists, downloading directly".format(self.index))
                return self.read_m3u8_from_file()
            if self.pay_for_video():
                print("Video {0} purchased successfully, downloading directly".format(self.index))
                hls_url = self.get_m3u8_by_pay()
                self.write_m3u8_to_file(hls_url)
            else:
                print("Purchase of video {0} failed, brute-forcing...".format(self.index))
                return self.get_m3u8_by_force()
    
    

    main.py

    import download
    
    if __name__ == '__main__':
        cid = input('Enter the ID of the Renrenjiang video column (cid): ')
        print("You entered column ID: {0}".format(cid))
        obj = download.download(int(cid))
        obj.download()
    

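    With this code, all it takes is a column ID to download every video in that column.

    3. Getting the code

    The complete code is available on my GitHub:

    https://github.com/15207135348/renrenjiang

    Finally, I hope you will follow my WeChat official account, where I regularly share learning materials on big data, Java, and related topics.

    大数据学堂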
