[Repost, updated] The FFmpeg decoding flow

Author: 谁与望天堂 | Published 2016-05-12 02:43 | 1433 views

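1. Starting from the basics

First, a few concepts that make the later analysis easier to follow.
Container: in audio/video, a container generally means a specific file format that describes the audio, video, subtitles, and related data it holds.
Stream: the word is overloaded and turns up in many places, such as TCP and SVR4 systems. In audio/video, think of it as one sequence of a single kind of data, for example pure audio or pure video.
Frame: hard to define precisely. It is one unit of data in a Stream, for example one picture of video or one block of audio samples. A real understanding of the term takes some background in audio/video coding theory.
Packet: the raw (still encoded) data of a Stream.
Codec: encoder + decoder
All of these concepts are reflected directly in FFmpeg, as the analysis below will show.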

2. The basic decoding flow

I'm lazy, so I'll borrow the overview of the flow from <An ffmpeg and SDL Tutorial>:

10 OPEN video_stream FROM video.avi
20 READ packet FROM video_stream INTO frame
30 IF frame NOT COMPLETE GOTO 20
40 DO SOMETHING WITH frame
50 GOTO 20

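That is the whole decoding process. At first glance it may not look like much :) But every topic has a shallow end and a deep end, and going from shallow to deep and then back to shallow again is where it gets interesting. Our story starts here.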
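To make the control flow of the pseudocode concrete, here is a minimal sketch in plain C that does not use FFmpeg at all. Every name in it (ToyPacket, ToyFrame, toy_feed, toy_decode_all) is hypothetical, and the rule that a frame is complete after two packets is an assumption chosen only so that the "frame NOT COMPLETE" branch is exercised.

```c
#include <stddef.h>

/* Hypothetical toy types, not FFmpeg's: a "packet" is one chunk of input and
 * a "frame" is complete once it has absorbed PACKETS_PER_FRAME chunks. */
enum { PACKETS_PER_FRAME = 2 };

typedef struct { int payload; } ToyPacket;
typedef struct { int sum; int parts; } ToyFrame;

/* Line 20: feed one packet into the frame being assembled.
 * Returns 1 when the frame is complete (the test on line 30), 0 otherwise. */
int toy_feed(ToyFrame *frame, const ToyPacket *pkt)
{
    frame->sum += pkt->payload;
    frame->parts++;
    return frame->parts == PACKETS_PER_FRAME;
}

/* Runs the whole loop over an in-memory "stream". Returns how many complete
 * frames reached the DO SOMETHING step; *last_sum receives the last one. */
int toy_decode_all(const ToyPacket *stream, size_t n_packets, int *last_sum)
{
    ToyFrame frame = {0, 0};
    int frames_done = 0;

    for (size_t i = 0; i < n_packets; i++) {     /* 20 READ packet          */
        if (!toy_feed(&frame, &stream[i]))       /* 30 IF frame NOT COMPLETE */
            continue;                            /*    GOTO 20              */
        if (last_sum)                            /* 40 DO SOMETHING         */
            *last_sum = frame.sum;
        frames_done++;
        frame.sum = 0;                           /* start the next frame    */
        frame.parts = 0;
    }                                            /* 50 GOTO 20              */
    return frames_done;
}
```

With five packets, the loop produces two frames and leaves the fifth packet sitting in an unfinished frame, which is exactly the situation line 30 guards against.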

3. Example code

<An ffmpeg and SDL Tutorial 1> presents a bare-bones decoder. Let's look closely at what goes on behind it. To make the discussion easier, here is the code first:

#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libswscale/swscale.h>

#include <stdio.h>

// compatibility with newer API
#if LIBAVCODEC_VERSION_INT < AV_VERSION_INT(55,28,1)
#define av_frame_alloc avcodec_alloc_frame
#define av_frame_free avcodec_free_frame
#endif

void SaveFrame(AVFrame *pFrame, int width, int height, int iFrame) {
  FILE *pFile;
  char szFilename[32];
  int  y;
  
  // Open file
  sprintf(szFilename, "frame%d.ppm", iFrame);
  pFile=fopen(szFilename, "wb");
  if(pFile==NULL)
    return;
  
  // Write header
  fprintf(pFile, "P6\n%d %d\n255\n", width, height);
  
  // Write pixel data
  for(y=0; y<height; y++)
    fwrite(pFrame->data[0]+y*pFrame->linesize[0], 1, width*3, pFile);
  
  // Close file
  fclose(pFile);
}

int main(int argc, char *argv[]) {
  // Initializing these to NULL prevents segfaults!
  AVFormatContext   *pFormatCtx = NULL;
  int               i, videoStream;
  AVCodecContext    *pCodecCtxOrig = NULL;
  AVCodecContext    *pCodecCtx = NULL;
  AVCodec           *pCodec = NULL;
  AVFrame           *pFrame = NULL;
  AVFrame           *pFrameRGB = NULL;
  AVPacket          packet;
  int               frameFinished;
  int               numBytes;
  uint8_t           *buffer = NULL;
  struct SwsContext *sws_ctx = NULL;

  if(argc < 2) {
    printf("Please provide a movie file\n");
    return -1;
  }
  // [1] Register all formats and codecs
  av_register_all();
  
  // [2] Open video file
  if(avformat_open_input(&pFormatCtx, argv[1], NULL, NULL)!=0)
    return -1; // Couldn't open file
  
  // [3] Retrieve stream information
  if(avformat_find_stream_info(pFormatCtx, NULL)<0)
    return -1; // Couldn't find stream information
  
  // Dump information about file onto standard error
  av_dump_format(pFormatCtx, 0, argv[1], 0);
  
  // Find the first video stream
  videoStream=-1;
  for(i=0; i<pFormatCtx->nb_streams; i++)
    if(pFormatCtx->streams[i]->codec->codec_type==AVMEDIA_TYPE_VIDEO) {
      videoStream=i;
      break;
    }
  if(videoStream==-1)
    return -1; // Didn't find a video stream
  
  // Get a pointer to the codec context for the video stream
  pCodecCtxOrig=pFormatCtx->streams[videoStream]->codec;
  // Find the decoder for the video stream
  pCodec=avcodec_find_decoder(pCodecCtxOrig->codec_id);
  if(pCodec==NULL) {
    fprintf(stderr, "Unsupported codec!\n");
    return -1; // Codec not found
  }
  // Copy context
  pCodecCtx = avcodec_alloc_context3(pCodec);
  if(avcodec_copy_context(pCodecCtx, pCodecCtxOrig) != 0) {
    fprintf(stderr, "Couldn't copy codec context");
    return -1; // Error copying codec context
  }

  // Open codec
  if(avcodec_open2(pCodecCtx, pCodec, NULL)<0)
    return -1; // Could not open codec
  
  // Allocate video frame
  pFrame=av_frame_alloc();
  
  // Allocate an AVFrame structure for the RGB image
  pFrameRGB=av_frame_alloc();
  if(pFrame==NULL || pFrameRGB==NULL)
    return -1;

  // Determine required buffer size and allocate buffer
  numBytes=avpicture_get_size(AV_PIX_FMT_RGB24, pCodecCtx->width,
                  pCodecCtx->height);
  buffer=(uint8_t *)av_malloc(numBytes*sizeof(uint8_t));
  
  // Assign appropriate parts of buffer to image planes in pFrameRGB
  // Note that pFrameRGB is an AVFrame, but AVFrame is a superset
  // of AVPicture
  avpicture_fill((AVPicture *)pFrameRGB, buffer, AV_PIX_FMT_RGB24,
         pCodecCtx->width, pCodecCtx->height);
  
  // initialize SWS context for software scaling
  sws_ctx = sws_getContext(pCodecCtx->width,
               pCodecCtx->height,
               pCodecCtx->pix_fmt,
               pCodecCtx->width,
               pCodecCtx->height,
               AV_PIX_FMT_RGB24,
               SWS_BILINEAR,
               NULL,
               NULL,
               NULL
               );

  // [4] Read frames and save first five frames to disk
  i=0;
  while(av_read_frame(pFormatCtx, &packet)>=0) {
    // Is this a packet from the video stream?
    if(packet.stream_index==videoStream) {
      // Decode video frame
      avcodec_decode_video2(pCodecCtx, pFrame, &frameFinished, &packet);
      
      // Did we get a video frame?
      if(frameFinished) {
        // Convert the image from its native format to RGB
        sws_scale(sws_ctx, (uint8_t const * const *)pFrame->data,
                  pFrame->linesize, 0, pCodecCtx->height,
                  pFrameRGB->data, pFrameRGB->linesize);
        
        // Save the frame to disk
        if(++i<=5)
          SaveFrame(pFrameRGB, pCodecCtx->width, pCodecCtx->height, i);
      }
    }
    
    // Free the packet that was allocated by av_read_frame
    av_free_packet(&packet);
  }
  
  // Free the RGB image
  av_free(buffer);
  av_frame_free(&pFrameRGB);
  
  // Free the YUV frame
  av_frame_free(&pFrame);
  
  // Close the codecs
  avcodec_close(pCodecCtx);
  avcodec_close(pCodecCtxOrig);

  // Close the video file
  avformat_close_input(&pFormatCtx);
  
  return 0;
}

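The code is well commented and needs little further explanation. If formats such as YUV420, RGB, and PPM are unfamiliar, please look them up; the related posts at http://barrypopy.cublog.cn/ are also worth reading. This code is a good demo of how to grab frames from a video, but we should also look at what the magician is doing backstage instead of simply enjoying the show.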
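For readers who have not met PPM before, the sketch below isolates what SaveFrame does, without any FFmpeg types. The function name write_ppm_rgb24 and its stride parameter are my own illustration; stride plays the role of pFrame->linesize[0], which may be larger than width*3 when rows are padded.

```c
#include <stdint.h>
#include <stdio.h>

/* Writes a packed RGB24 image as a binary PPM (P6) file.
 * stride is the distance in bytes between the starts of two rows; it may
 * exceed width * 3 when rows are padded. Returns 0 on success, -1 on error. */
int write_ppm_rgb24(const char *path, const uint8_t *pixels,
                    int width, int height, int stride)
{
    if (!path || !pixels || width <= 0 || height <= 0 || stride < width * 3)
        return -1;

    FILE *f = fopen(path, "wb");
    if (!f)
        return -1;

    /* Header: magic, dimensions, maximum channel value. */
    int ok = fprintf(f, "P6\n%d %d\n255\n", width, height) > 0;

    /* Copy row by row so that padding bytes never reach the file. */
    for (int y = 0; ok && y < height; y++) {
        const uint8_t *row = pixels + (size_t)y * (size_t)stride;
        ok = fwrite(row, 1, (size_t)width * 3, f) == (size_t)width * 3;
    }

    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Writing one row at a time is the reason SaveFrame loops over y instead of issuing a single fwrite: only the first width*3 bytes of each row are pixel data.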

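4. The story behind the code

The real difficulty lies in steps [1], [2], [3], and [4] above. Everything else is conversion between data structures, which is not hard to follow if you read the code carefully.

[1]: av_register_all

Registers all container formats and codecs, so that the later steps can look them up.

[2]: avformat_open_input

First, a word about the AVFormatContext *pFormatCtx structure. Taken literally, AVFormatContext is the context in which an AVFormat (that is, the Container format described above) lives, so it is naturally the master structure that holds the Container's information. As you will see later, almost everything else can be reached starting from it.
Let's see what avformat_open_input() does:

[Figure: Paste_Image.png (image not preserved)]

Seen this way, it does only two things:

1). Detect the container file format

This really means probing to determine the demuxer.
av_probe_input_format3 walks all registered demuxers, starting from first_iformat. Taking mkv as an example: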

AVInputFormat ff_matroska_demuxer = { 
  .name = "matroska,webm", 
  .long_name = NULL_IF_CONFIG_SMALL("Matroska / WebM"), 
  .extensions = "mkv,mk3d,mka,mks", 
  .priv_data_size = sizeof(MatroskaDemuxContext), 
  .read_probe = matroska_probe, 
  .read_header = matroska_read_header, 
  .read_packet = matroska_read_packet, 
  .read_close = matroska_read_close, 
  .read_seek = matroska_read_seek, 
  .mime_type = "audio/webm,audio/x-matroska,video/webm,video/x-matroska"
};

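It calls each demuxer's read_probe function in turn, keeps the one that reports the highest score, and so settles on the container format (the iformat field of AVFormatContext):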

typedef struct AVFormatContext { 
......
/** 
* The input container format. 
* 
* Demuxing only, set by avformat_open_input(). 
*/ 
struct AVInputFormat *iformat;
......
}
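The probing idea can be sketched in a few lines of plain C. This is not FFmpeg's implementation: ToyDemuxer, TOY_SCORE_MAX, and the toy_* functions are hypothetical stand-ins, and the two probes only check the leading magic bytes (the EBML header 1A 45 DF A3 for Matroska/WebM, and RIFF followed by "AVI " for AVI), which is far less than a real read_probe does.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for AVInputFormat: a name plus a probe
 * callback that returns a confidence score (0 means "not my format"). */
typedef struct ToyDemuxer {
    const char *name;
    int (*read_probe)(const uint8_t *buf, size_t size);
} ToyDemuxer;

enum { TOY_SCORE_MAX = 100 };  /* illustrative scale */

/* Matroska/WebM files begin with the EBML magic 1A 45 DF A3. */
int toy_matroska_probe(const uint8_t *buf, size_t size)
{
    static const uint8_t ebml[4] = {0x1A, 0x45, 0xDF, 0xA3};
    return (size >= 4 && memcmp(buf, ebml, 4) == 0) ? TOY_SCORE_MAX : 0;
}

/* AVI files are RIFF containers: "RIFF", a 4-byte size, then "AVI ". */
int toy_avi_probe(const uint8_t *buf, size_t size)
{
    return (size >= 12 && memcmp(buf, "RIFF", 4) == 0 &&
            memcmp(buf + 8, "AVI ", 4) == 0) ? TOY_SCORE_MAX : 0;
}

/* The "registered" demuxers, playing the role of the list that starts at
 * first_iformat. */
const ToyDemuxer toy_demuxers[] = {
    {"matroska,webm", toy_matroska_probe},
    {"avi",           toy_avi_probe},
};

/* Ask every demuxer for a score and keep the best one. */
const ToyDemuxer *toy_probe_input(const uint8_t *buf, size_t size)
{
    const ToyDemuxer *best = NULL;
    int best_score = 0;
    size_t n = sizeof(toy_demuxers) / sizeof(toy_demuxers[0]);

    for (size_t i = 0; i < n; i++) {
        int score = toy_demuxers[i].read_probe(buf, size);
        if (score > best_score) {
            best_score = score;
            best = &toy_demuxers[i];
        }
    }
    return best;  /* NULL if no demuxer recognised the data */
}
```

The returned pointer corresponds to what ends up in iformat: from then on, every read goes through the callbacks of the demuxer that won the probe.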

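2). Get Stream information from the container file

This is where the chosen demuxer is used to find out which Streams the file contains:
avformat_open_input calls the chosen demuxer's read_header function (older releases went through av_open_input_stream to do this) to obtain the information for every stream (the streams field of AVFormatContext):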

/** 
* Number of elements in AVFormatContext.streams. 
* 
* Set by avformat_new_stream(), must not be modified by any other code. 
*/
unsigned int nb_streams;
/** 
* A list of all streams in the file. New streams are created with 
* avformat_new_stream(). 
* 
* - demuxing: streams are created by libavformat in avformat_open_input(). 
*             If AVFMTCTX_NOHEADER is set in ctx_flags, then new streams may also 
*             appear in av_read_frame(). 
* - muxing: streams are created by the user before avformat_write_header(). 
* 
* Freed by libavformat in avformat_free_context(). 
*/
AVStream **streams;

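[3]: avformat_find_stream_info

Parses the Stream information further, reading some packets if the header alone did not provide enough. For example, based on the enum AVCodecID codec_id determined in the previous step, it finds the matching const struct AVCodec *codec.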
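The step from codec_id to codec is essentially a table lookup, which is also what avcodec_find_decoder() does for us in the example program. A minimal sketch of that idea follows; ToyCodecID, ToyCodec, and toy_find_decoder are hypothetical names used for illustration and are not FFmpeg's types.

```c
#include <stddef.h>

/* Hypothetical IDs and descriptor, loosely modelled on AVCodecID / AVCodec. */
typedef enum {
    TOY_CODEC_ID_NONE,
    TOY_CODEC_ID_H264,
    TOY_CODEC_ID_AAC
} ToyCodecID;

typedef struct ToyCodec {
    ToyCodecID id;
    const char *name;
} ToyCodec;

/* The "registered" decoders. */
const ToyCodec toy_codecs[] = {
    {TOY_CODEC_ID_H264, "h264"},
    {TOY_CODEC_ID_AAC,  "aac"},
};

/* Return the first registered decoder with a matching ID, or NULL. */
const ToyCodec *toy_find_decoder(ToyCodecID id)
{
    size_t n = sizeof(toy_codecs) / sizeof(toy_codecs[0]);

    for (size_t i = 0; i < n; i++)
        if (toy_codecs[i].id == id)
            return &toy_codecs[i];
    return NULL;
}
```

A NULL result corresponds to the "Unsupported codec!" branch in the example program.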

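[4]: av_read_frame, avcodec_decode_video2

A few words about ffmpeg first. In theory a Packet could carry only part of a frame, but to keep the implementation simple, ffmpeg arranges for av_read_frame to return whole frames: for video, each Packet holds exactly one frame, and audio is handled along similar lines. This is an implementation choice, not a protocol requirement. The code above therefore boils down to: read a Packet from the file; decode the corresponding frame from the Packet; if (frame decoding is complete) do something(). Note that frameFinished can still be 0 after a call, because a decoder may hold on to input for a while before it outputs a picture.
Let's look at how a Packet is obtained and how a frame is decoded from it.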

av_read_frame 
---> av_read_frame_internal  
---> ff_read_packet  
---> (AVInputFormat *) iformat->read_packet

avcodec_decode_video2 
---> avctx->codec->decode  (calls the decode function of the selected Codec)

So, as the call chains above show, the work splits into two parts:
The first part is demuxing: av_read_frame();
The second part is decoding: avcodec_decode_video2()
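The split can be illustrated with a toy model in plain C. All names here (MuxedPacket, ToyInput, ToyDecoder, and the toy_* functions) are hypothetical; only the field name stream_index is borrowed from AVPacket. The demux step hands out packets from every stream in file order, and the caller forwards only the packets of the stream it cares about to the decode step, just like the stream_index test in the example program.

```c
#include <stddef.h>

/* Hypothetical toy packet: which stream it belongs to and how big it is. */
typedef struct { int stream_index; int size; } MuxedPacket;

typedef struct {
    const MuxedPacket *packets;  /* interleaved packets, in file order */
    size_t count;
    size_t pos;
} ToyInput;

typedef struct { int packets; int bytes; } ToyDecoder;

/* Demux step (the role av_read_frame plays): hand out the next packet,
 * whatever stream it belongs to. Returns 0 on success, -1 at end of input. */
int toy_read_packet(ToyInput *in, MuxedPacket *out)
{
    if (in->pos >= in->count)
        return -1;
    *out = in->packets[in->pos++];
    return 0;
}

/* Decode step (the role avcodec_decode_video2 plays): this stand-in only
 * counts what it was given. */
void toy_decode_packet(ToyDecoder *dec, const MuxedPacket *pkt)
{
    dec->packets++;
    dec->bytes += pkt->size;
}

/* Read every packet, decode only those of wanted_stream.
 * Returns the total number of packets that were demuxed. */
int toy_pump(ToyInput *in, int wanted_stream, ToyDecoder *dec)
{
    MuxedPacket pkt;
    int total = 0;

    while (toy_read_packet(in, &pkt) == 0) {
        total++;
        if (pkt.stream_index == wanted_stream)
            toy_decode_packet(dec, &pkt);
    }
    return total;
}
```

The demuxer knows nothing about decoding and the decoder knows nothing about the file layout; the packet is the only thing that passes between them.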

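5. What to do next

Together with the reposted article on the ffmpeg framework, this should be enough to follow the decoding flow from end to end. The remaining questions concern specific container formats and specific codecs. For further study we will keep referring to: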
[1]. <An ffmpeg and SDL Tutorial>
http://dranger.com/ffmpeg/tutorial01.html
[2]. <FFMpeg框架代码阅读> (Reading the FFmpeg framework code)
http://blog.csdn.net/wstarx/archive/2007/04/20/1572393.aspx
