Using MediaCodec for H.264 Video Encoding and Decoding on the Android Platform
Introduction to Android Multimedia
MediaCodec
MediaExtractor
MediaMuxer
H.264 Key Terms
FAQ
screenrecord
When we talk about multimedia on Android, two classes are impossible to avoid: MediaPlayer and MediaRecorder, used respectively for audio/video playback and recording. Both are callable from Java, and both ultimately reach the multimedia services provided by the Android system.
Android's multimedia service is provided by MediaPlayerService. It is a local service, what is usually called a native service, implemented in C++; the main body of the service is the C++ class MediaPlayerService, which runs inside the mediaserver service process.
frameworks/av/media/mediaserver
The path above is where the mediaserver implementation lives. Its core is just a simple main function, which will not be dissected here; interested readers can study it on their own.
MediaPlayerService is the heart of the Android multimedia framework and involves many classes and a lot of complex processing. It runs in its own process, so how do other processes get at the functionality it provides? That brings us to another heavyweight in Android: binder. Binder is Android's inter-process communication mechanism, used as a form of IPC. Besides binder, Android also uses other IPC mechanisms, such as UNIX sockets. Why most of Android's core components use binder as their IPC mechanism, whether for efficiency or for other reasons, was ultimately Google's decision. This article is about multimedia, so binder will not be explored in depth.
As mentioned above, MediaPlayerService runs in a service process, so how does a client process access its functionality? Through binder. Binder has a Bn side and a Bp side: the Bn side usually acts as the server and the Bp side as the client. This blurs the notion of separate processes; an application developer does not need to dig into the communication in the middle, and only needs to know that the Bp side is used to invoke the functionality on the Bn side.
That brings us to MediaPlayer. It has a concrete implementation at the C++ layer, which wraps a BpMediaPlayer, while MediaPlayerService wraps a BnMediaPlayer; the two communicate through this pair. Note: the communication does not go through a BpMediaPlayerService. Why not? Perhaps because MediaPlayerService carries too heavy a load; internally it splits recording and playback into separate, modularized instances. It talks not only to MediaPlayer but also to MediaRecorder, so don't be misled by its name.
Similar to MediaPlayer, the C++ layer also implements a MediaRecorder class. With the native implementations of these two classes in place, the Java-layer MediaPlayer and MediaRecorder each have a native counterpart; all that remains is to bridge the Java and C++ implementations through JNI, and MediaPlayer and MediaRecorder can then be used from the upper layers.
With the introduction in the previous section, we should now have a rough picture of the Android multimedia framework.
We know that MediaPlayer and MediaRecorder let us use the multimedia services Android implements, but these two classes are not all-powerful.
MediaPlayer can play multimedia files, but it gives no control over the details of video decoding, such as which decoder is used or whether decoding is done in software or hardware; playback simply follows whatever the platform provides.
MediaRecorder is even more limited: it can only record from a camera source (this may not be strictly accurate, since I do system-level development and am not very familiar with the upper-layer APIs, but at least every case I have come across uses the camera as the recording source). So if I want to drive an encoder with a raw YUV420 video source, this interface is of no use.
Hence the MediaCodec class: it is the class for operating codecs on the Android platform, again implemented at both the Java layer and the C++ layer. In the C++ implementation, an OMXClient communicates with the Bn-side OMX; the OMX service is also implemented inside MediaPlayerService. OMX is a wrapper around the OMXMaster class, which is responsible for managing all the codecs on the platform, software and hardware alike.
Through the MediaCodec class we can therefore drive Android's software and hardware codecs directly. This way of working is much more blunt and low-level, and naturally demands more of the programmer.
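As a minimal illustration of that direct style of operation, here is a hedged sketch of the synchronous input/output buffer loop that both encoding and decoding follow. It is written against the pre-API-21 buffer-array interface (the same style used in the FAQ code later in this article) and assumes the codec has already been configured and started; the timeout value is an arbitrary placeholder.

import android.media.MediaCodec;
import java.nio.ByteBuffer;

public class CodecLoopSketch {
    private static final long TIMEOUT_US = 10_000;  // arbitrary 10 ms timeout

    // Pushes one raw frame in and drains whatever output is ready.
    public static void pushAndDrain(MediaCodec codec, byte[] frame, long ptsUs) {
        ByteBuffer[] inputBuffers = codec.getInputBuffers();
        ByteBuffer[] outputBuffers = codec.getOutputBuffers();

        // Input side: grab a free input buffer, fill it, queue it.
        int inIndex = codec.dequeueInputBuffer(TIMEOUT_US);
        if (inIndex >= 0) {
            ByteBuffer inBuf = inputBuffers[inIndex];
            inBuf.clear();
            inBuf.put(frame);
            codec.queueInputBuffer(inIndex, 0, frame.length, ptsUs, 0);
        }

        // Output side: drain everything that is currently available.
        MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
        int outIndex = codec.dequeueOutputBuffer(info, TIMEOUT_US);
        while (outIndex >= 0) {
            ByteBuffer outBuf = outputBuffers[outIndex];
            outBuf.position(info.offset);
            outBuf.limit(info.offset + info.size);
            // ... consume the encoded or decoded bytes in outBuf here ...
            codec.releaseOutputBuffer(outIndex, false);
            outIndex = codec.dequeueOutputBuffer(info, TIMEOUT_US);
        }
        // Negative return values (INFO_TRY_AGAIN_LATER, INFO_OUTPUT_FORMAT_CHANGED,
        // INFO_OUTPUT_BUFFERS_CHANGED) simply end the loop in this sketch.
    }
}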
Anyone reasonably familiar with multimedia codecs knows that the file format does not determine the encoding format. Different file formats are really just different containers, and what a container holds is the audio data and video data produced by encoders.
MediaExtractor's job is to separate the audio data and the video data inside the container; only once they are separated can audio and video be handed to their respective decoders.
For example, in an .mp4 file the video data might be encoded as H.264 and the audio as AAC. After demultiplexing with MediaExtractor, the video data can be fed to an H.264 decoder and the audio data to an AAC decoder. That demultiplexing is exactly what MediaExtractor is for.
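A rough sketch of that demultiplexing step is below; the file path is a placeholder and only the video track is handled, under the assumption that the encoded samples are then queued into an H.264 decoder.

import android.media.MediaExtractor;
import android.media.MediaFormat;
import java.io.IOException;
import java.nio.ByteBuffer;

public class ExtractorSketch {
    public static void readVideoSamples(String mp4Path) throws IOException {
        MediaExtractor extractor = new MediaExtractor();
        extractor.setDataSource(mp4Path);

        // Find and select the video track (e.g. "video/avc" for H.264 in an .mp4).
        for (int i = 0; i < extractor.getTrackCount(); i++) {
            MediaFormat format = extractor.getTrackFormat(i);
            String mime = format.getString(MediaFormat.KEY_MIME);
            if (mime != null && mime.startsWith("video/")) {
                extractor.selectTrack(i);
                break;
            }
        }

        // Pull encoded samples one by one; a real program would queue each
        // sample into the decoder's input buffer together with its timestamp.
        ByteBuffer sample = ByteBuffer.allocate(1 << 20);
        int size;
        while ((size = extractor.readSampleData(sample, 0)) >= 0) {
            long ptsUs = extractor.getSampleTime();
            // ... feed 'size' bytes at presentation time ptsUs to the H.264 decoder ...
            extractor.advance();
        }
        extractor.release();
    }
}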
Opposite to MediaExtractor, MediaMuxer is the class that combines separate audio data and video data, and it too is implemented at both the Java layer and in native C++. It acts as a packager and is mainly used in the encoding pipeline, for example to wrap everything up into an MP4 file at the end of encoding.
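A hedged sketch of that packaging step (API 18+): the output path is a placeholder, and videoFormat would normally be the MediaFormat delivered by the encoder's INFO_OUTPUT_FORMAT_CHANGED result.

import android.media.MediaCodec;
import android.media.MediaFormat;
import android.media.MediaMuxer;
import java.io.IOException;
import java.nio.ByteBuffer;

public class MuxerSketch {
    private final MediaMuxer muxer;
    private final int videoTrack;

    public MuxerSketch(String outputPath, MediaFormat videoFormat) throws IOException {
        muxer = new MediaMuxer(outputPath, MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4);
        videoTrack = muxer.addTrack(videoFormat);  // format from INFO_OUTPUT_FORMAT_CHANGED
        muxer.start();
    }

    // Called for each encoded buffer drained from the encoder. Buffers flagged
    // BUFFER_FLAG_CODEC_CONFIG should not be written; the muxer takes the CSD
    // from the track format instead.
    public void writeSample(ByteBuffer encodedData, MediaCodec.BufferInfo info) {
        muxer.writeSampleData(videoTrack, encodedData, info);
    }

    public void finish() {
        muxer.stop();
        muxer.release();
    }
}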
CSD: Codec-Specific Data, a blob of raw data containing things such as the Sequence Parameter Set and the Picture Parameter Set. It is generated by the MediaCodec encoder, and the MediaCodec decoder absolutely needs it when decoding. When reading data from the encoder, the BUFFER_FLAG_CODEC_CONFIG flag marks the arrival of the CSD data. When decoding, this data must be handed to the decoder first; the decoder uses it for some initial configuration.
SPS: Sequence Parameter Set, the first NALU of an H.264 stream
PPS: Picture Parameter Set, the second NALU of an H.264 stream
IDR frame: an IDR frame is a kind of I frame. When the decoder receives an IDR frame, it discards the entire reference frame queue; that much is common to all I frames. What the decoder additionally has to do on receiving an IDR frame is refresh all of its PPS and SPS parameters. Accordingly, on the encoder side, every IDR that is emitted is accompanied by a PPS & SPS NAL unit. (A sketch for spotting these NAL unit types in a raw stream follows this list.)
I frame: an intra-coded frame, a self-contained frame carrying all the information it needs; it can be decoded on its own without referencing any other picture. The first frame of a video sequence is always an I frame.
P frame: forward-predicted frame
B frame: bidirectionally predicted (interpolated) frame
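As a concrete illustration of the terms above, the sketch below walks an H.264 Annex-B byte stream (the raw format produced by MediaCodec's "video/avc" encoder), locates NAL unit start codes and reports whether each unit is an SPS (type 7), PPS (type 8) or IDR slice (type 5). It is simplified for illustration and ignores emulation-prevention bytes.

public class NalUnitSketch {
    public static void dumpNalTypes(byte[] stream) {
        for (int i = 0; i + 3 < stream.length; i++) {
            // A NAL unit begins after a 0x000001 start code (a 4-byte
            // 0x00000001 start code matches here one byte later).
            if (stream[i] == 0 && stream[i + 1] == 0 && stream[i + 2] == 1) {
                int nalType = stream[i + 3] & 0x1F;  // low 5 bits of the NAL header
                switch (nalType) {
                    case 7:  System.out.println("SPS at offset " + i); break;
                    case 8:  System.out.println("PPS at offset " + i); break;
                    case 5:  System.out.println("IDR slice at offset " + i); break;
                    default: System.out.println("NAL type " + nalType + " at offset " + i); break;
                }
            }
        }
    }
}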
FAQ
Q1. How do I play the video streams created by MediaCodec with the “video/avc” codec?
A1. The stream created is a raw H.264 elementary stream. The Totem Movie Player for Linux may work, but many other players won’t touch them. You need to use the MediaMuxer class to create an MP4 file instead. See the EncodeAndMuxTest sample.
Q2. Why does my call to MediaCodec.configure() fail with an IllegalStateException when I try to create an encoder?
A2. This is usually because you haven’t specified all of the mandatory keys required by the encoder. See this stackoverflow item for an example.
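As a hedged example of what "all of the mandatory keys" means for an AVC encoder, the fragment below sets the color format, bit rate, frame rate and I-frame interval before configure(); the numeric values are placeholders, and whether a given color format is accepted still depends on the device (see Q5).

MediaFormat format = MediaFormat.createVideoFormat("video/avc", 1280, 720);
format.setInteger(MediaFormat.KEY_COLOR_FORMAT,
        MediaCodecInfo.CodecCapabilities.COLOR_FormatYUV420SemiPlanar);
format.setInteger(MediaFormat.KEY_BIT_RATE, 2000000);
format.setInteger(MediaFormat.KEY_FRAME_RATE, 30);
format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 1);

MediaCodec encoder = MediaCodec.createEncoderByType("video/avc");
encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);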
Q3. My video decoder is configured but won’t accept data. What’s wrong?
A3. A common mistake is neglecting to set the Codec-Specific Data, mentioned briefly in the documentation, through the keys “csd-0” and “csd-1”. This is a bunch of raw data with things like Sequence Parameter Set and Picture Parameter Set; all you usually need to know is that the MediaCodec encoder generates them and the MediaCodec decoder wants them.
If you are feeding the output of the encoder to the decoder, you will note that the first packet you get from the encoder has the BUFFER_FLAG_CODEC_CONFIG flag set. You need to make sure you propagate this flag to the decoder, so that the first buffer the decoder receives does the setup. Alternatively, you can set the CSD data in the MediaFormat, and pass this into the decoder via configure(). You can see examples of both approaches in the EncodeDecodeTest sample.
If you’re not sure how to set this up, you should probably be using MediaExtractor, which will handle it all for you.
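A fragment-style sketch of the second approach, assuming sps and pps hold the raw parameter-set NAL units captured from the encoder's BUFFER_FLAG_CODEC_CONFIG buffer (width, height and outputSurface are likewise placeholders):

MediaFormat format = MediaFormat.createVideoFormat("video/avc", width, height);
format.setByteBuffer("csd-0", ByteBuffer.wrap(sps));  // Sequence Parameter Set
format.setByteBuffer("csd-1", ByteBuffer.wrap(pps));  // Picture Parameter Set
MediaCodec decoder = MediaCodec.createDecoderByType("video/avc");
decoder.configure(format, outputSurface, null, 0);    // outputSurface may be null
decoder.start();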
Q4. Can I stream data into the decoder?
A4. Yes and no. The decoder takes a stream of “access units”, which may not be a stream of bytes. For the video decoder, this means you need to preserve the “packet boundaries” established by the encoder (e.g. NAL units for H.264 video). For example, see how the VideoChunks class in the DecodeEditEncodeTest sample operates. You can’t just read arbitrary chunks of the file and pass them in.
Q5. I’m encoding the output of the camera through a YUV preview buffer. Why do the colors look wrong?
A5. The color formats for the camera output and the MediaCodec encoder input are different. Camera supports YV12 (planar YUV 4:2:0) and NV21 (semi-planar YUV 4:2:0). The MediaCodec encoders support one or more of:
•#19 COLOR_FormatYUV420Planar (I420)
•#20 COLOR_FormatYUV420PackedPlanar (also I420)
•#21 COLOR_FormatYUV420SemiPlanar (NV12)
•#39 COLOR_FormatYUV420PackedSemiPlanar (also NV12)
•#0x7f000100 COLOR_TI_FormatYUV420PackedSemiPlanar (also also NV12)
I420 has the same general data layout as YV12, but the Cr and Cb planes are reversed. Same with NV12 vs. NV21. So if you try to hand YV12 buffers from the camera to an encoder expecting something else, you'll see some odd color effects.
As of Android 4.4 (API 19), there is still no common input format. Nvidia Tegra 3 devices like the Nexus 7 (2012), and Samsung Exynos devices like the Nexus 10, want COLOR_FormatYUV420Planar. Qualcomm Adreno devices like the Nexus 4, Nexus 5, and Nexus 7 (2013) want COLOR_FormatYUV420SemiPlanar. TI OMAP devices like the Galaxy Nexus want COLOR_TI_FormatYUV420PackedSemiPlanar. (This is based on the format that is returned first when the AVC codec is queried.)
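If you want to see what your particular device advertises, you can query the codec capabilities yourself; the fragment below (using the pre-API-21 MediaCodecList interface, consistent with the era of this FAQ) logs the color formats reported for every AVC encoder, and the first one listed is what the behavior described above is based on.

for (int i = 0; i < MediaCodecList.getCodecCount(); i++) {
    MediaCodecInfo info = MediaCodecList.getCodecInfoAt(i);
    if (!info.isEncoder()) continue;
    for (String type : info.getSupportedTypes()) {
        if (!type.equalsIgnoreCase("video/avc")) continue;
        MediaCodecInfo.CodecCapabilities caps = info.getCapabilitiesForType(type);
        for (int colorFormat : caps.colorFormats) {
            Log.d("ColorFormats", info.getName() + " supports " + colorFormat);
        }
    }
}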
A more portable, and more efficient, approach is to use the API 18 Surface input API, demonstrated in the CameraToMpegTest sample. The down side of this is that you have to operate in RGB rather than YUV, which is a problem for image processing software. If you can implement the image manipulation in a fragment shader, perhaps by converting between RGB and YUV before and after your computations, you can take advantage of code execution on the GPU.
Note that the MediaCodec decoders may produce data in ByteBuffers using one of the above formats or in a proprietary format. For example, devices based on Qualcomm SoCs commonly use OMX_QCOM_COLOR_FormatYUV420PackedSemiPlanar32m (#2141391876 / 0x7FA30C04).
Surface input uses COLOR_FormatSurface, also known as OMX_COLOR_FormatAndroidOpaque (#2130708361 / 0x7F000789). For the full list, see OMX_COLOR_FORMATTYPE in OMX_IVCommon.h.
Q6. What’s this EGL_RECORDABLE_ANDROID flag?
A6. That tells EGL that the surface it creates must be compatible with the video codecs. Without this flag, EGL might use a buffer format that MediaCodec can’t understand.
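For reference, this is roughly how the flag is requested when choosing an EGL config in the API 18 samples (EncodeAndMuxTest and friends); the rest of the attribute list will vary with your rendering needs.

// EGL_RECORDABLE_ANDROID was not exposed as a platform constant at the time,
// so samples of this era define it by value.
private static final int EGL_RECORDABLE_ANDROID = 0x3142;

int[] attribList = {
        EGL14.EGL_RED_SIZE, 8,
        EGL14.EGL_GREEN_SIZE, 8,
        EGL14.EGL_BLUE_SIZE, 8,
        EGL14.EGL_RENDERABLE_TYPE, EGL14.EGL_OPENGL_ES2_BIT,
        EGL_RECORDABLE_ANDROID, 1,
        EGL14.EGL_NONE
};
EGLConfig[] configs = new EGLConfig[1];
int[] numConfigs = new int[1];
EGL14.eglChooseConfig(eglDisplay, attribList, 0, configs, 0, configs.length,
        numConfigs, 0);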
Q7. Can I use the ImageReader class with MediaCodec?
A7. No. The ImageReader class, added in Android 4.4 (API 19), provides a handy way to access data in a YUV surface. Unfortunately, as of API 19 it only works with buffers from Camera. Also, there is no corresponding ImageWriter class for creating content.
Q8. Do I have to set a presentation time stamp when encoding video?
A8. Yes. It appears that some devices will drop frames or encode them at low quality if the presentation time stamp isn’t set to a reasonable value (see this stackoverflow item).
Remember that the time required by MediaCodec is in microseconds. Most timestamps passed around in Java code are in milliseconds or nanoseconds.
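For example, when the frame timestamp starts out in nanoseconds (as from System.nanoTime() or a SurfaceTexture), convert before queueing the input buffer:

// MediaCodec expects microseconds: divide nanoseconds by 1000
// (or multiply milliseconds by 1000) before passing presentationTimeUs.
long ptsUs = frameTimestampNs / 1000L;
codec.queueInputBuffer(inputBufIndex, 0, size, ptsUs, 0);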
Q9. Most of the examples require API 18. I’m coding for API 16. Is there something I should know?
A9. Yes. Some key features aren’t available until API 18, and some basic features are more difficult to use in API 16.
If you’re decoding video, things don’t change much. As you can see from the two implementations of ExtractMpegFramesTest, the newer version of EGL isn’t available, but for many applications that won’t matter.
If you’re encoding video, things are much worse. Several key points:
1.The MediaCodec encoders don’t accept input from a Surface, so you have to provide the data as raw YUV frames.
2.The layout of the YUV frames varies from device to device, and in some cases you have to check for specific vendors by name to handle certain quirks.
3.Some devices may not advertise support for any usable YUV formats (i.e. they’re internal-use only).
4.The MediaMuxer class doesn’t exist, so there’s no way to convert the H.264 stream to something that MediaPlayer (or many desktop players) will accept. You have to use a 3rd-party library (perhaps mp4parser).
5.When the MediaMuxer class was introduced in API 18, the behavior of MediaCodec encoders was changed to emit INFO_OUTPUT_FORMAT_CHANGED at the start, so that you have a convenient MediaFormat to feed to the muxer (a sketch of that drain pattern follows this answer). On older versions of Android, this does not happen.
This stackoverflow item has additional links and commentary.
The CTS tests for MediaCodec were introduced with API 18 (Android 4.3), which in practice means that’s the first release where the basic features are likely to work consistently across devices. In particular, pre-4.3 devices have been known to drop the last frame or scramble PTS values when decoding.
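Point 5 is what a typical API 18+ encode-and-mux drain loop relies on; a fragment-style sketch of that pattern (codec, muxer, outputBuffers, bufferInfo and TIMEOUT_US are assumed to already exist) looks like this:

int outIndex = codec.dequeueOutputBuffer(bufferInfo, TIMEOUT_US);
if (outIndex == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
    // Emitted once at the start on API 18+; hand the format to MediaMuxer.
    MediaFormat newFormat = codec.getOutputFormat();
    videoTrack = muxer.addTrack(newFormat);
    muxer.start();
} else if (outIndex >= 0) {
    ByteBuffer encodedData = outputBuffers[outIndex];
    encodedData.position(bufferInfo.offset);
    encodedData.limit(bufferInfo.offset + bufferInfo.size);
    if ((bufferInfo.flags & MediaCodec.BUFFER_FLAG_CODEC_CONFIG) == 0) {
        muxer.writeSampleData(videoTrack, encodedData, bufferInfo);
    }
    codec.releaseOutputBuffer(outIndex, false);
}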
Q10. Can I use MediaCodec in the AOSP emulator?
A10. Maybe. The emulator provides a software AVC codec that lacks certain features, notably input from a Surface (although it appears that this may now be fixed in Android 5.0 “Lollipop”). Developing on a physical device will likely be less frustrating.
Q11. Why is the output messed up (all zeroes, too short, etc)?
A11. The most common mistake is failing to adjust the ByteBuffer position and limit values. As of API 19, MediaCodec does not do this for you.
You need to do something like:
int bufIndex = codec.dequeueOutputBuffer(info, TIMEOUT);
ByteBuffer outputData = outputBuffers[bufIndex];
if (info.size != 0) {
    outputData.position(info.offset);
    outputData.limit(info.offset + info.size);
}
On the input side, you want to call clear() on the buffer before copying data into it.
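That input-side step looks like this (fragment-style, matching the snippet above; inputBuffers, frameData and the other names are assumed to exist):

int inIndex = codec.dequeueInputBuffer(TIMEOUT);
ByteBuffer inputData = inputBuffers[inIndex];
inputData.clear();                 // reset position and limit before filling
inputData.put(frameData, 0, frameSize);
codec.queueInputBuffer(inIndex, 0, frameSize, presentationTimeUs, 0);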
Q12. Why am I seeing storeMetaDataInBuffers failures in the log?
A12. They look like this (example from a Nexus 5):
E OMXNodeInstance: OMX_SetParameter() failed for StoreMetaDataInBuffers: 0x8000101a
E ACodec : [OMX.qcom.video.encoder.avc] storeMetaDataInBuffers (output) failed w/ err -2147483648
You can ignore them; they're harmless.