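Renrenjiang (人人讲) is an education app with a large catalog of instructional videos covering music, calligraphy, clothing, yoga, and more. Some of the videos are free, but most are paid. In this post we capture the app's network traffic to analyze the Renrenjiang API, then use what we learn to crack and download these videos.
Disclaimer: this tutorial is for learning purposes only. The scraped videos belong to Renrenjiang, and using them commercially is strictly prohibited.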
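1. Analyzing the Renrenjiang API
First, open the Renrenjiang app, pick a video you are interested in, copy its link, and open it on a computer (the link below is used as the example):
http://ke.renrenjiang.cn/#/video?activityId=1147066&su=0
This is what the page looks like once it is open:
[screenshot: charles-ssl-proxying-certificate.png]
We use the Charles packet-capture tool to see which requests are made when the page loads.
[screenshot: image.png]
As the capture shows, there are two requests:
# Get the details of the video
https://api.renrenjiang.cn/api/v3/activities/1147066/show?include=creator,columns,service
# Get the details of every video in the column that contains this video
https://api.renrenjiang.cn/api/v2/columns/20890/activities
Here we only need the second endpoint, the one that fetches the column. Its response includes the password required to watch each video.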
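Summary:
- Get the details of every video in the column that contains the given video
Request URL:
- https://api.renrenjiang.cn/api/v2/columns/20890/activities
Request method:
- GET
Request headers:
head = {
    "Referer": "http://ke.renrenjiang.cn/",
    "Authorization": "<taken from your own packet capture, as shown in the screenshot below>"
}
[screenshot: image.png]
Parameters:
Name | Required | Type | Description | Example |
---|---|---|---|---|
u | Yes | int | User ID | 1022949 |
activity_sort | Yes | string | Sort order of the videos | ASC or DESC |
page | No | int | Page number, for paginated queries when the column has many videos | 1 |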
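The `page` parameter implies that a client has to walk the pages one by one. The following is a minimal sketch of that loop, not the app's documented behavior: `fetch_page` and `fake_fetch_page` are hypothetical stand-ins for the real HTTP call, and the rule "stop at the first page with no activities" is an assumption taken from the downloader code later in this post.

```python
from typing import Callable, Dict, List


def collect_all_pages(fetch_page: Callable[[int], Dict]) -> List[Dict]:
    """Call fetch_page(1), fetch_page(2), ... until a page has no activities."""
    items: List[Dict] = []
    page = 1
    while True:
        body = fetch_page(page)
        activities = body.get("activities") or []
        if not activities:
            break  # assumption: an empty page marks the end of the column
        items.extend(activities)
        page += 1
    return items


def fake_fetch_page(page: int) -> Dict:
    """Hypothetical stand-in for the HTTP request: two pages of data, then empty."""
    pages = {1: [{"id": 1}, {"id": 2}], 2: [{"id": 3}]}
    return {"activities": pages.get(page, []), "total": None}


if __name__ == "__main__":
    print([a["id"] for a in collect_all_pages(fake_fetch_page)])
```

Passing the fetcher in as a function keeps the paging logic testable without touching the network.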
Sample response:
{
"activities": [{
"id": 1147066,
"title": "国画技法课——撞水撞粉(第四讲)",
"status": "结束",
"video_status": 2,
"background": "http://image.renrenjiang.cn/uploads/activity/background/1147066/2020_af9598e950780754cdee6956684f9524.jpeg@640w",
"password": "7939",
"started_at": 1550883600,
"charge": true,
"price": 29.90,
"reservation_count": 6,
"reservation": null,
"user_id": 5011557,
"creator": {
"user_id": 5011557,
"uid": "29269207",
"nickname": "麦芽老师的艺术课堂",
"displayname": null,
"description": " 麦芽老师有着近十年的一线教学经验,所开设课程秉着“艺术美化生活,生活滋养艺术”的课程理念。直播间主要开设课程有儿童趣味水墨画、初级国画、线描、色彩等课程,在这里有专业老师的讲解,课题解答,课后作业辅导。\n 麦芽直播课堂诚邀每一位喜欢画画的朋友一起分享,这里没有年龄界限,只有您对生活、对艺术满满的热爱和期待。老师喜欢与学员交流互动,在轻松愉悦的课堂中,\n感受传统绘画艺术的魅力。\n咨询课程,请扫文末二维码,加微信,老师会耐心解答。麦芽老师的艺术课堂诚邀您随时加入我们!",
"avatar": "https://image.renrenjiang.cn/uploads/user/avatar_url/5011557/2019_db0d6a4906c039fdc9d9b4b5aea3c880.jpg",
"background": "https://image.renrenjiang.cn/uploads/user/background/5011557/2019_4f69866d6d9825fc127827cdcfe28098.jpg",
"channel_name": "无",
"user_level": 2,
"proposal_status": 2,
"fans_count": 26
},
"column_id": 20890,
"column": {
"column_id": 20890,
"title": "试听课系列(不定时更新)",
"price": 20.00,
"background": "https://image.renrenjiang.cn/uploads/column/background/20890/2019_117d5b509ad52b726bf58089f002dbc4.jpg@640w",
"activities_count": 5,
"ctype": 1,
"max_subscription": 0,
"subscriptions": 0,
"activity_allow_buy": true,
"activity_sort": "DESC"
},
"isinvited": false,
"locked": true,
"share_url": "https://h5.renrenjiang.cn/#/activity?aid=1147066&su=14134251",
"description": "课程简介\n本节课衔接上节课程,首先,将花头部分处理完整,莲蓬可以和叶子一起处理。其次,本节课将学习撞水撞粉系列课程荷花叶子的画法,调色调墨技巧,其中将色、墨、水的用法在画面中展现出来。<img src=\"http://image.renrenjiang.cn/uploads/files/2019_0a131051936bd7227843b52a5e8707ab.jpg\"/>本节课适合人群:\n1、零基础国画爱好者;2、少儿美术培训机构教师;3、有绘画基础且能独立上课的小朋友;\n\n如需咨询课程请扫码入群\n<img src=\"http://image.renrenjiang.cn/uploads/files/2019_37d369479071f67b1bd1f2d617431c57.jpg\"/>",
"popularity": 22,
"replay": null,
"reprinted_switch": null,
"reprint_user_id": null,
"media_type": null,
"detail_name": null,
"detail_nickname": null,
"rtype": null,
"wxtype": null,
"group": null,
"share_scale": 0.0000,
"share_amount": 0.00,
"visible": false,
"acm_id": null,
"position": null,
"task": null,
"pt_id": null
},
...
],
"total": null
}
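From this endpoint's response we can read each video's ID, title, description, and password (if no password is returned, it has to be brute-forced, which we come back to later).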
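As a rough illustration of reading those fields, here is a small sketch that parses a trimmed-down copy of the sample response. The field names come from the sample above; the `SAMPLE` string, the `summarize` helper, and the shortened title and description are made up for the example.

```python
import json
from datetime import datetime, timezone

# Trimmed-down, hypothetical copy of the sample response shown above.
SAMPLE = """
{"activities": [{"id": 1147066, "title": "sample title", "password": "7939",
                 "started_at": 1550883600, "description": "sample description"}],
 "total": null}
"""


def summarize(body_text):
    """Pick out the fields the downloader cares about from a response body."""
    body = json.loads(body_text)
    rows = []
    for activity in body.get("activities", []):
        rows.append({
            "id": activity["id"],
            "title": activity["title"],
            "description": activity.get("description"),
            "password": activity.get("password"),
            # started_at is a Unix timestamp in seconds
            "started_at": datetime.fromtimestamp(activity["started_at"], tz=timezone.utc),
        })
    return rows


if __name__ == "__main__":
    for row in summarize(SAMPLE):
        print(row["id"], row["title"], row["started_at"].isoformat())
```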
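Next, we enter the password 7939 and start watching the video.
[screenshot: image.png]
Since the video now plays, the front end must have obtained the video's address, so we capture the traffic with Charles again.
[screenshots: image.png (four captures)]
As the captures show, going from entering the password to fetching the video takes four requests:
# Check whether the password is correct
https://api.renrenjiang.cn/api/v3/activities/1147066/reservation
# Get the video's m3u8 address
https://api.renrenjiang.cn/api/v3/activities/1147066/stream_url?user_id=14264889&timestamp=1586920041105
# Fetch the m3u8 file
http://video.renrenjiang.cn/record/alilive/2726981393-1550845168.m3u8
# Fetch the short video segments listed in the m3u8 file, one by one
http://video.renrenjiang.cn/record/alilive/2726981393/1550841839_1.ts
I won't write out the request parameters and response data for each of these endpoints here; let's go straight to the code.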
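The last two requests follow the usual HLS pattern: an m3u8 playlist is a text file whose non-comment lines name the `.ts` segments. The sketch below shows one way to turn playlist text into absolute segment URLs. The playlist content and the `example.com` URL are invented for illustration and are not taken from a real capture.

```python
from urllib.parse import urljoin

# Hypothetical playlist text; a real one is returned by the m3u8 request.
SAMPLE_PLAYLIST = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:10
#EXTINF:10.0,
1550841839_1.ts
#EXTINF:10.0,
1550841839_2.ts
#EXT-X-ENDLIST
"""


def segment_urls(playlist_text, playlist_url):
    """Return absolute URLs for every segment line in an m3u8 playlist."""
    urls = []
    for line in playlist_text.splitlines():
        line = line.strip()
        # Blank lines and lines starting with "#" are tags or comments
        if not line or line.startswith("#"):
            continue
        # Relative segment paths are resolved against the playlist URL
        urls.append(urljoin(playlist_url, line))
    return urls


if __name__ == "__main__":
    base = "http://video.example.com/record/alilive/demo.m3u8"
    for url in segment_urls(SAMPLE_PLAYLIST, base):
        print(url)
```

Using `urljoin` rather than string slicing also handles playlists that list absolute URLs, since an absolute segment URL is returned unchanged.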
2. Writing the code
config.py
import platform
import time

import requests


def is_window():
    # True when running on Windows
    return platform.system() == "Windows"


# Fill in both values from your own packet capture
user_id = "fill in your own value"
authorization = "fill in your own value"
# Folder where downloaded videos are stored
root_path = "F:\\人人讲视频" if is_window() else "/Users/yy/Documents/照片/renrenjiang"
head = {
    "Referer": "http://ke.renrenjiang.cn/",
    "Authorization": authorization
}
session = requests.session()


def current_milli_time():
    # Current Unix time in milliseconds
    return int(round(time.time() * 1000))
util.py
import os
import platform
import sys

from config import head


def is_window():
    # True when running on Windows
    return platform.system() == "Windows"


def download_by_key():
    # Not called by the other modules shown in this post
    url = "https://api.renrenjiang.cn/api/v3/activities/{0}/stream_url?user_id={1}&timestamp={2}"
    res = head
    res = res
    os.rmdir("../renrenjiang")
    exit(1)
    if "status" in res.keys() and res["status"] == 2:
        hls_url = res["hls_url"]
        return hls_url
    return None


def show_process(curr, total):
    # Progress bar that ends with a percentage, e.g. ">>>>    50%"
    curr = curr / total * 100
    total = 100
    i = int(curr)
    process = '>' * (i // 2) + ' ' * ((total - i) // 2)
    if curr == total:
        ss = '\r' + process + "{0}%\n".format(i)
    else:
        ss = '\r' + process + "{0}%".format(i)
    sys.stdout.write(ss)
    sys.stdout.flush()


def show_process2(curr, total):
    # Progress bar that ends with a counter, e.g. ">>>>    [50/100]"
    i = int(curr / total * 100)
    process = '>' * (i // 2) + ' ' * ((100 - i) // 2)
    if curr == total:
        ss = '\r' + process + "[{0}/{1}]\n".format(curr, total)
    else:
        ss = '\r' + process + "[{0}/{1}]".format(curr, total)
    sys.stdout.write(ss)
    sys.stdout.flush()
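The two progress helpers in util.py build a bar string and rewrite the current terminal line with `\r`. As an illustrative variant (not the author's code), the sketch below separates building the string from printing it, which makes the bar easy to check in isolation. `render_bar`, `show`, and the `width` parameter are names introduced here for the example; unlike the original, the bar keeps a fixed width for odd percentages.

```python
import sys


def render_bar(curr, total, width=50):
    """Build a fixed-width text progress bar followed by a percentage."""
    percent = int(curr / total * 100)
    filled = percent * width // 100
    return ">" * filled + " " * (width - filled) + "{0}%".format(percent)


def show(curr, total):
    """Redraw the bar on the current terminal line; end the line when done."""
    end = "\n" if curr == total else ""
    sys.stdout.write("\r" + render_bar(curr, total) + end)
    sys.stdout.flush()


if __name__ == "__main__":
    for step in range(1, 11):
        show(step, 10)
```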
download.py
import json
import os

from m3u8 import m3u8
import util
from config import *


class download:
    def __init__(self, cid):
        self.cid = cid  # column ID
        self.free_m3u8_url_list = []
        self.is_can_pojie = True  # "pojie" means "crack"
        self.free_videos = []
        self.vip_videos = []
    def _list_video(self):
        """
        List every course video in the column self.cid.
        :return: list of video dicts
        """
        video_list = []
        page = 0
        url_format = "https://h5.renrenjiang.cn/api/v2/columns/{0}/activities?u=1052944&activity_sort=ASC&page={1}"
        while True:
            page += 1
            url = url_format.format(self.cid, page)
            res = session.get(url, headers=head)
            res = json.loads(res.content)
            # Keep paging until a page comes back with no activities
            if "activities" in res.keys() and len(res["activities"]) > 0:
                activities = res["activities"]
                for activity in activities:
                    activity_id = activity["id"]
                    title = activity["title"]
                    password = activity["password"]
                    start_at = activity["started_at"]
                    description = activity["creator"]["description"]
                    video_list.append({
                        "id": activity_id,
                        "title": title,
                        "password": password,
                        "start_at": start_at,
                        "description": description
                    })
            else:
                break
        return video_list
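    def _get_ts_list(self, index, video):
        """
        Fetch the m3u8 file and parse the ts paths out of it.
        :param index: 1-based position of the video in the column
        :param video: video info
        :return: (m3u8 URL, list of ts paths), or (None, None) on failure
        """
        obj = m3u8(video, index, self.cid)
        hls_url = obj.get_m3u8()
        if hls_url is None:
            return None, None
        res = session.get(hls_url)
        # Read the playlist as text; blank lines and "#" tag lines are skipped
        ts_list = []
        for line in res.text.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            ts_list.append(line)
        return hls_url, ts_list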
    def _download_by_ts_list(self, video, ts_list, m3u8_url):
        """
        Download the video from its list of ts files, then merge them.
        :param video: video info
        :param ts_list: list of ts paths
        :param m3u8_url: URL of the m3u8 file the ts paths are relative to
        :return: path of the merged video file
        """
        # Create the column folder
        path = root_path + os.sep + str(self.cid)
        is_exists = os.path.exists(path)
        if not is_exists:
            os.makedirs(path)
        # Create the video folder inside the column folder
        path = path + os.sep + str(video["id"])
        is_exists = os.path.exists(path)
        if not is_exists:
            os.makedirs(path)
        # Download every ts file in the list
        url_format = m3u8_url[0: m3u8_url.rfind("/") + 1] + "{0}"
        curr = 0
        for ts in ts_list:
            curr += 1
            filename = path + os.sep + str(curr).zfill(6) + ".ts"
            is_exists = os.path.exists(filename)
            if is_exists:
                continue
            url = url_format.format(ts)
            res = requests.get(url, headers=head)
            if res.status_code != 200:
                print("Failed to download ts file: {0}".format(url))
                continue
            # The with block closes the file on exit
            with open(filename, "wb") as file:
                file.write(res.content)
            util.show_process(curr, len(ts_list))
        # Merge the ts files into a single mp4 file, then delete the ts files
        # On Windows
        if util.is_window():
            exec_str = r'copy /b "' + path + os.sep + r'*.ts" "' + path + os.sep + '{0}.mp4"'.format(video["title"])
            os.system(exec_str)  # merge the segments with the cmd copy command
            exec_str = r'del "' + path + os.sep + r'*.ts"'
            os.system(exec_str)  # delete the original segments
        # On Linux or macOS
        else:
            exec_str = "cat {0}*.ts > {1}{2}.mp4".format(path + os.sep, path + os.sep, video["title"])
            os.system(exec_str)  # merge the segments with cat
            exec_str = "rm -rf {0}*.ts".format(path + os.sep)
            os.system(exec_str)  # delete the original segments
        return path + os.sep + '{0}.mp4'.format(video["title"])
    def _is_downloaded(self, column_id, video):
        """
        Check whether the video is already downloaded, to avoid downloading it twice.
        :param column_id: column ID
        :param video: video info
        :return: True if the merged mp4 file already exists
        """
        path = root_path + os.sep + str(column_id)
        is_exists = os.path.exists(path)
        if not is_exists:
            return False
        path = path + os.sep + str(video["id"])
        is_exists = os.path.exists(path)
        if not is_exists:
            return False
        path = path + os.sep + '{0}.mp4'.format(video["title"])
        is_exists = os.path.exists(path)
        if not is_exists:
            return False
        return True
    def download(self):
        """
        Download every video in the column self.cid.
        cid values fall in the range [20002, 49999].
        """
        if not self.before_download():
            return
        count = 0
        for video in self.free_videos:
            count += 1
            if self._is_downloaded(self.cid, video):
                print("Video {0} is already downloaded: {1}; skipping".format(count, str(video["title"])))
                continue
            m3u8_url, ts_list = self._get_ts_list(count, video)
            while ts_list is None:
                m3u8_url, ts_list = self._get_ts_list(count, video)
            print("Downloading video {0}: {1}".format(count, str(video["title"])))
            self._download_by_ts_list(video, ts_list, m3u8_url)
        for video in self.vip_videos:
            count += 1
            if self._is_downloaded(self.cid, video):
                print("Video {0} is already downloaded: {1}; skipping".format(count, str(video["title"])))
                continue
            if self.is_can_pojie:
                m3u8_url, ts_list = self._get_ts_list(count, video)
                if ts_list is None:
                    print("Failed to get the ts list for video {0}".format(video["title"]))
                    continue
                print("Downloading video {0}: {1}".format(count, str(video["title"])))
                self._download_by_ts_list(video, ts_list, m3u8_url)
            else:
                print("Video {0} is paid and cannot be cracked: {1}; skipping".format(count, str(video["title"])))
    def before_download(self):
        print("Checking whether the videos can be downloaded or cracked")
        # List all videos and split them into free and paid
        res = self._list_video()
        if type(res) == dict:
            print("Failed to download column {0}, reason: {1}".format(self.cid, res))
            exit(1)
        self._divide_videos(res)
        self._get_is_can_pojie()
        if self.is_can_pojie:
            print("Column {0} has {1} videos: {2} can be downloaded directly, {3} need to be cracked".
                  format(self.cid, len(res), len(self.free_videos), len(self.vip_videos)))
            return True
        else:
            if len(self.free_videos) == 0:
                print("Column {0} has {1} videos, none of which can be downloaded or cracked".format(self.cid, len(res)))
                return False
            else:
                print("Column {0} has {1} videos: {2} can be downloaded, the rest cannot be downloaded or cracked".
                      format(self.cid, len(res), len(self.free_videos)))
                yes_no = input('Download the available videos only (y|n): ')
                if yes_no == "y" or yes_no == "Y":
                    return True
                else:
                    return False
    def _divide_videos(self, videos):
        count = 0
        for video in videos:
            count += 1
            obj = m3u8(video, count, self.cid)
            obj.pay_for_video()
            m3u8_url = obj.get_m3u8_by_pay()
            if m3u8_url is not None:
                self.free_videos.append(video)
                self.free_m3u8_url_list.append(m3u8_url)
            else:
                self.vip_videos.append(video)

    def _get_is_can_pojie(self):
        if len(self.free_m3u8_url_list) == 0:
            self.is_can_pojie = False
        for u in self.free_m3u8_url_list:
            if u.find("videocdn.renrenjiang.cn") < 0:
                self.is_can_pojie = False
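The merge step in `_download_by_ts_list` shells out to `copy /b` on Windows and `cat` elsewhere. As a hedged alternative, the same byte-for-byte concatenation can be done in pure Python, which avoids the platform branch and any quoting problems with video titles. `merge_ts_files` is a name made up for this sketch. Like the shell version, it only concatenates the segments, so the output is still an MPEG-TS stream regardless of the file extension.

```python
import os
import shutil
import tempfile


def merge_ts_files(folder, output_path):
    """Concatenate every .ts file in folder, in name order, into output_path."""
    # Zero-padded names such as 000001.ts sort correctly as plain strings
    names = sorted(n for n in os.listdir(folder) if n.endswith(".ts"))
    with open(output_path, "wb") as out:
        for name in names:
            with open(os.path.join(folder, name), "rb") as part:
                shutil.copyfileobj(part, out)
    return names


if __name__ == "__main__":
    # Demo with three tiny fake segments in a temporary folder
    with tempfile.TemporaryDirectory() as folder:
        for i, payload in enumerate([b"aa", b"bb", b"cc"], start=1):
            with open(os.path.join(folder, str(i).zfill(6) + ".ts"), "wb") as f:
                f.write(payload)
        merged = os.path.join(folder, "merged.mp4")
        print(merge_ts_files(folder, merged))
        with open(merged, "rb") as f:
            print(f.read())
```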
m3u8.py
import json
import os
import threading
import math
from time import sleep
import util
from config import *
class m3u8:
def __init__(self, video, index, cid):
self.index = index
self.video = video
self.cid = cid
self.vid = video["id"]
self.start_at = int(str(video["start_at"])[0: 6])
self.min = 0
self.max = 10000000
self.thread_num = 400
self.step = math.floor((self.max - self.min) / self.thread_num)
self.threads = []
self.success = False
self.result = None
self.lock = threading.Lock()
self.try_count = 0
self.total_count = self.max - self.min
def _func(self, a, b):
for pos in range(a, b):
if self.success:
return None
self.try_count += 1
stk_code = str(pos).zfill(7)
ss = "{0}_{1}{2}".format(self.vid, self.start_at, stk_code)
url_ff = "http://videocdn.renrenjiang.cn/Act-ss-m3u8-sd/{0}/{1}.m3u8".format(ss, ss)
try:
res = session.get(url_ff, headers=head)
if res.status_code == 200:
self.lock.acquire()
self.success = True
self.lock.release()
self.write_m3u8_to_file(url_ff)
return url_ff
except requests.exceptions.ReadTimeout:
pos -= 1
except requests.exceptions.ConnectionError:
pos -= 1
except ConnectionResetError:
pos -= 1
def get_m3u8_by_force(self):
start = time.time()
for i in range(self.thread_num):
t = threading.Thread(target=self._func, args=(self.min + self.step * i, self.min + self.step * (i + 1)))
self.threads.append(t)
t.start()
while True:
sleep(1)
util.show_process2(self.try_count, self.total_count)
for t in self.threads:
if not t.is_alive():
self.threads.remove(t)
if len(self.threads) == 0:
break
end = time.time()
print("获取到结果:{0} 总共耗时:{1}s".format(self.result, end - start))
return self.result
    def pay_for_video(self):
        """
        "Purchase" the video by submitting its password.
        :return: True on success
        """
        url = "https://api.renrenjiang.cn/api/v3/activities/{0}/reservation".format(self.vid)
        res = session.post(url, headers=head, data={
            "type": "password",
            "password": self.video["password"],
            "shareId": 0
        })
        res = json.loads(res.content)
        if "result" in res and res["result"] == "ok":
            return True
        else:
            return False

    def get_m3u8_by_pay(self):
        # Ask the API for the stream URL of a video we already have access to
        url = "https://api.renrenjiang.cn/api/v3/activities/{0}/stream_url?user_id={1}&timestamp={2}"
        url = url.format(self.vid, user_id, current_milli_time())
        res = session.get(url, headers=head)
        res = json.loads(res.content)
        if "status" in res.keys() and res["status"] == 2:
            hls_url = res["hls_url"]
            return hls_url
        return None
    def is_m3u8_exist(self):
        # Create the column folder
        path = root_path + os.sep + str(self.cid)
        is_exists = os.path.exists(path)
        if not is_exists:
            os.makedirs(path)
        # Create the video folder inside the column folder
        path = path + os.sep + str(self.vid)
        is_exists = os.path.exists(path)
        if not is_exists:
            os.makedirs(path)
        # The m3u8 URL is cached in m3u8.txt once it is known
        path = path + os.sep + "m3u8.txt"
        is_exists = os.path.exists(path)
        if is_exists:
            return True
        return False

    def read_m3u8_from_file(self):
        path = root_path + os.sep + str(self.cid)
        path = path + os.sep + str(self.vid)
        path = path + os.sep + "m3u8.txt"
        with open(path, "r") as file:
            # Drop the trailing line break, whether "\n" or "\r\n"
            res = file.readline().strip()
        return res

    def write_m3u8_to_file(self, m3u8_value):
        path = root_path + os.sep + str(self.cid)
        path = path + os.sep + str(self.vid)
        path = path + os.sep + "m3u8.txt"
        with open(path, "w") as file:
            file.write(m3u8_value)

    def get_m3u8(self):
        if self.is_m3u8_exist():
            print("The m3u8 for video {0} already exists; downloading directly".format(self.index))
            return self.read_m3u8_from_file()
        if self.pay_for_video():
            print("Video {0} purchased successfully; downloading directly".format(self.index))
            hls_url = self.get_m3u8_by_pay()
            self.write_m3u8_to_file(hls_url)
            return hls_url
        else:
            print("Failed to purchase video {0}; brute-forcing...".format(self.index))
            return self.get_m3u8_by_force()
main.py
import download

if __name__ == '__main__':
    cid = input('Enter the ID of the Renrenjiang video column (cid): ')
    print("The column ID you entered is: {0}".format(cid))
    obj = download.download(int(cid))
    obj.download()
With this code, the column ID is all we need to download every video in that column.
3. Getting the code
The complete code is available on my GitHub:
https://github.com/15207135348/renrenjiang
Finally, I hope you will follow my WeChat official account, where I regularly share learning material on big data, Java, and related topics.
大数据学堂