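iOS Audio and Video (3) - Video Encoding: Implementing H264 Encoding and Decoding

Author: 顶级蜗牛 | Published 2024-06-02 13:46

What this article covers:
1. Getting to know VideoToolbox
2. H264 encoding
3. H264 decoding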

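I. Getting to know VideoToolbox (hardware encoding)

Apple has supported hardware encoding and decoding since iOS 4.0, but at the time it was a private API that developers could not use. At WWDC 2014 (iOS 8.0 and later), Apple opened the hardware codec API to developers as VideoToolbox.framework.

VideoToolbox.framework is a C API built on Core Foundation, CoreMedia, and CoreVideo. It provides three kinds of session: compression (encoding), decompression (decoding), and pixel transfer.

VideoToolbox relies on CoreMedia and CoreVideo data types for time and frame management, such as CMTime, CVPixelBuffer (uncompressed image data, for example a decoded frame), and CMFormatDescription (the video format description).

VideoToolbox gives direct access to the hardware encoder and decoder. It provides services for video compression and decompression, and for converting between formats of the data stored in pixel buffers.

Advantages of hardware encoding:
1. Better encoding performance (CPU usage drops sharply because the work moves to dedicated hardware)
2. Higher encoding efficiency (less time to encode each frame)
3. Longer battery life (much lower power consumption)

Encoder input and output

The three video frames on the left (in the original diagram) are the data before it is sent to the encoder. The developer must wrap the raw image data in a CVPixelBuffer, the central data structure when working with VideoToolbox.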

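II. H264 encoding

Encoding flow:
1. Create the session
2. Set the encoding parameters
3. Receive captured data and start encoding
4. Receive the encoded data

Before encoding, look at the configuration: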

import UIKit
import VideoToolbox
class VideoH264Encoder: NSObject {
    
    struct Config {
        var width: Int32 = 480 // capture width in pixels
        var height:Int32 = 640 // capture height in pixels
        var bitRate : Int32 = 480 * 640 * 3 * 4 // bit rate in bits per second
        var fps: Int32 = 30 // frames per second
    }
    
    var config: VideoH264Encoder.Config // configuration: width/height/bit rate/fps
    
    private var hasSpsPps = false // whether sps/pps have been sent
    private var frameID: Int64 = 0
    private var encodeSession: VTCompressionSession! // session
    private var encodeCallBack: VTCompressionOutputCallback? // called by the system when a frame is encoded; delivers the encoded data
    private var encodeQueue = DispatchQueue(label: "encode") // queue on which encoding is started
    private var callBackQueue = DispatchQueue(label: "callBack") // queue on which encoded data is handed out
    
    // Hands out the sps/pps
    var videoEncodeCallbackSPSAndPPS :((Data,Data)->Void)?
    func videoEncodeCallbackSPSAndPPS(block:@escaping (Data,Data)->Void) {
        videoEncodeCallbackSPSAndPPS = block
    }
    // Hands out each NALU
    var videoEncodeCallback : ((Data)-> Void)?
    func videoEncodeCallback(block:@escaping (Data)-> Void){
        self.videoEncodeCallback = block
    }
    
    // Initializer
    init(config: VideoH264Encoder.Config = VideoH264Encoder.Config()) {
        self.config = config
        super.init()
        
        setCallBack() // set the VTCompressionOutputCallback that receives the encoded data
        initVideoToolBox() // create the encoder
    }
}

1. Create the session with VTCompressionSessionCreate

    // Initialize the encoder
    private func initVideoToolBox() {
        // Creating the VTCompressionSession, Objective-C style:
        // OSStatus status = VTCompressionSessionCreate(NULL, (int32_t)(self.config.width), (int32_t)self.config.height, kCMVideoCodecType_H264, NULL, NULL, NULL, vtCompressionSessionCallback, (__bridge void * _Nullable)(self), &_vEnSession);
        /**
         * allocator: the allocator
         * width: width in pixels; if the value is invalid, the encoder changes it to a valid one
         * height: height in pixels, same rule as width
         * codecType: the codec type
         * encoderSpecification: requests a particular encoder; nil lets VideoToolbox choose
         * imageBufferAttributes: required attributes of the source pixel buffers, used to create a pixel buffer pool; pass nil if VideoToolbox should not create one
         * compressedDataAllocator: allocator for the compressed data
         * outputCallback: callback invoked when a frame has been encoded
         * refcon: self bridged to a raw pointer, because the callback is a C function
         * compressionSessionOut: receives the session
         */
        let state = VTCompressionSessionCreate(allocator: kCFAllocatorDefault,
                                               width: config.width,
                                               height: config.height,
                                               codecType: kCMVideoCodecType_H264,
                                               encoderSpecification: nil,
                                               imageBufferAttributes: nil,
                                               compressedDataAllocator: nil,
                                               outputCallback:encodeCallBack ,
                                               refcon: unsafeBitCast(self, to: UnsafeMutableRawPointer.self),
                                               compressionSessionOut: &self.encodeSession)
        // noErr is equal to 0
        if state != noErr { print("create VTCompressionSession failed"); return }
......
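
The `refcon` parameter is how a C callback, which cannot capture context, finds its Swift object again. As a minimal sketch of that round trip, the example below uses `Unmanaged` from the standard library instead of the `unsafeBitCast` calls in this article; the `Encoder` class and the `callback` closure are hypothetical stand-ins, not VideoToolbox types.

```swift
// Hypothetical stand-in for the encoder class.
final class Encoder {
    var framesSeen = 0
}

// A C-style callback cannot capture context, so the object arrives as a raw pointer.
let callback: @convention(c) (UnsafeMutableRawPointer?) -> Void = { refcon in
    guard let refcon = refcon else { return }
    // takeUnretainedValue() leaves the reference count untouched.
    let encoder = Unmanaged<Encoder>.fromOpaque(refcon).takeUnretainedValue()
    encoder.framesSeen += 1
}

let encoder = Encoder()
// passUnretained does not retain: the caller must keep `encoder` alive
// for as long as callbacks can still fire.
let refcon = Unmanaged.passUnretained(encoder).toOpaque()
callback(refcon)
callback(refcon)
print(encoder.framesSeen) // 2
```

Because nothing is retained, this pattern only stays safe if the session is invalidated before the object goes away, which is what the `deinit` later in this section is for.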

2. Set the encoding parameters with VTSessionSetProperty

/*
    session: the session
    propertyKey: the property name
    propertyValue: the property value
*/
VT_EXPORT OSStatus 
VTSessionSetProperty(
  CM_NONNULL VTSessionRef       session,
  CM_NONNULL CFStringRef        propertyKey,
  CM_NULLABLE CFTypeRef         propertyValue ) API_AVAILABLE(macosx(10.8), ios(8.0), tvos(10.2));

......
        // Encode in real time
        VTSessionSetProperty(encodeSession, key: kVTCompressionPropertyKey_RealTime, value: kCFBooleanTrue)
        // Set the profile and level
        // ProfileLevel is the H264 profile/level; different quality targets use different values
        VTSessionSetProperty(encodeSession, key: kVTCompressionPropertyKey_ProfileLevel, value: kVTProfileLevel_H264_Baseline_AutoLevel)
        // Whether to allow frame reordering, i.e. B-frames. Entertainment live streams can use B-frames (waiting is acceptable and compression is better); real-time streams should not
        VTSessionSetProperty(encodeSession, key: kVTCompressionPropertyKey_AllowFrameReordering, value: kCFBooleanFalse)
        // Keyframe interval (GOP size). At a fixed bit rate, too small an interval spends the bits on keyframes and blurs the picture; too large an interval slows recovery after a lost frame
        var frameInterval: Int32 = 30 // Int32 to match CFNumberType.intType
        let number = CFNumberCreate(kCFAllocatorDefault, CFNumberType.intType, &frameInterval)
        VTSessionSetProperty(encodeSession, key: kVTCompressionPropertyKey_MaxKeyFrameInterval, value: number)
        
        // Expected frame rate: a hint to the encoder, not the actual frame rate
        let fpscf = CFNumberCreate(kCFAllocatorDefault, CFNumberType.intType, &config.fps)
        VTSessionSetProperty(encodeSession, key: kVTCompressionPropertyKey_ExpectedFrameRate, value: fpscf)
        
        // Average bit rate in bps. A high bit rate gives a very sharp picture but larger output; a low bit rate sometimes blurs the picture but is still watchable
        // See the notes for the bit rate formula
        // var bitrate = width * height * 3 * 4
        let bitrateAverage = CFNumberCreate(kCFAllocatorDefault, CFNumberType.intType, &config.bitRate)
        VTSessionSetProperty(encodeSession, key: kVTCompressionPropertyKey_AverageBitRate, value: bitrateAverage)
        
        // Hard data-rate limit, given as (bytes, seconds) pairs: at most twice the average rate in any 1-second window
        let bitRatesLimit :CFArray = [config.bitRate * 2 / 8, 1] as CFArray
        VTSessionSetProperty(encodeSession, key: kVTCompressionPropertyKey_DataRateLimits, value: bitRatesLimit)
        
        // With the session created and its properties set, get ready to encode
        VTCompressionSessionPrepareToEncodeFrames(encodeSession)
    }

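kVTCompressionPropertyKey_RealTime: whether to encode in real time.
kVTProfileLevel_H264_Baseline_AutoLevel: the H264 profile and level to use; a High profile AutoLevel value can be set instead.
kVTCompressionPropertyKey_AllowFrameReordering: whether B-frames are produced (no other frame depends on a B-frame for decoding, so B-frames can be left out).
kVTCompressionPropertyKey_MaxKeyFrameInterval: the keyframe interval, commonly called the GOP size.
kVTCompressionPropertyKey_ExpectedFrameRate: the expected frame rate.
kVTCompressionPropertyKey_AverageBitRate/kVTCompressionPropertyKey_DataRateLimits: the bit rate of the encoded output (average, and hard limit).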
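
As a rough illustration of how these numbers relate, the sketch below recomputes the bit rate used in `Config` and derives a byte budget and the GOP duration from it. `EncoderBudget` and its members are hypothetical helpers, not part of VideoToolbox; the `width * height * 3 * 4` rule of thumb is the one used in this article, and the byte conversion assumes the limit is expressed in bytes per window, as kVTCompressionPropertyKey_DataRateLimits expects.

```swift
// Hypothetical helper for reasoning about encoder settings; not a VideoToolbox API.
struct EncoderBudget {
    let width: Int
    let height: Int
    let fps: Int
    let keyFrameInterval: Int   // frames between keyframes (GOP size)

    // Rule of thumb used in this article: width * height * 3 * 4 bits per second.
    var averageBitRate: Int { width * height * 3 * 4 }

    // Byte budget for a one-second window, allowing bursts of
    // `burstFactor` times the average bit rate.
    func byteLimitPerSecond(burstFactor: Int) -> Int {
        averageBitRate * burstFactor / 8
    }

    // Seconds between keyframes at the expected frame rate.
    var gopDurationSeconds: Double { Double(keyFrameInterval) / Double(fps) }
}

let budget = EncoderBudget(width: 480, height: 640, fps: 30, keyFrameInterval: 30)
print(budget.averageBitRate)                       // 3686400
print(budget.byteLimitPerSecond(burstFactor: 2))   // 921600
print(budget.gopDurationSeconds)                   // 1.0
```

With the defaults from `Config`, a keyframe interval of 30 at 30 fps means one keyframe per second.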

3. Receive captured data and start encoding

Video data captured through AVFoundation is delivered to the developer through the AVCaptureVideoDataOutputSampleBufferDelegate delegate method:

// Note: if the same delegate also receives audio output, this method gets both audio and video buffers, and they need to be told apart
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {  
    guard CMSampleBufferDataIsReady(sampleBuffer) else {
        print("data not ready")
        return
    }
    videoEncodeQueue.sync { [weak self] in
        guard let weakSelf = self else {return}
        weakSelf.videoEncoder.encodeVideo(sampleBuffer: sampleBuffer)
    }
}

Here is what encodeVideo does:

    // Start encoding
    func encodeVideo(sampleBuffer:CMSampleBuffer){
        if self.encodeSession == nil {
            initVideoToolBox()
        }
        encodeQueue.async { [weak self] in
            guard let weakSelf = self else { return }
            guard let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
            weakSelf.encodeVideo(imageBuffer: imageBuffer)
        }
        
    }
    func encodeVideo(imageBuffer: CVPixelBuffer) {
        // Advance the counter so that every frame gets a distinct, increasing timestamp
        frameID += 1
        let time = CMTime(value: self.frameID, timescale: 1000)
        /**
         * imageBuffer: the uncompressed frame
         * presentationTimeStamp: the timestamp
         * duration: how long the frame is shown; use CMTime.invalid if unknown
         * frameProperties: per-frame properties, nil
         * sourceFrameRefcon: a per-frame reference value that is passed through to the output callback
         * infoFlagsOut: receives flags about the encode operation (for example whether it ran asynchronously)
         */
        let state = VTCompressionSessionEncodeFrame(self.encodeSession,
                                                    imageBuffer: imageBuffer,
                                                    presentationTimeStamp: time,
                                                    duration: .invalid,
                                                    frameProperties: nil,
                                                    sourceFrameRefcon: nil,
                                                    infoFlagsOut: nil)
        if state != noErr {
            print("encode failure")
        }
    }
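
The value passed as `presentationTimeStamp` has to advance from one frame to the next. A minimal sketch of deriving it from a frame counter is shown below. It uses a plain struct as a stand-in for `CMTime(value:timescale:)` so that it runs without CoreMedia; `Timestamp` and `FrameClock` are hypothetical names, and using the frame rate as the timescale is one possible choice, not the one made in this article (which uses a timescale of 1000).

```swift
// Plain-Swift stand-in for CMTime(value:timescale:); for illustration only.
struct Timestamp: Equatable {
    let value: Int64
    let timescale: Int32
    var seconds: Double { Double(value) / Double(timescale) }
}

// Hands out one timestamp per frame at a fixed frame rate.
struct FrameClock {
    let fps: Int32
    private(set) var frameIndex: Int64 = 0

    init(fps: Int32) { self.fps = fps }

    // With the frame rate as the timescale, frame n is presented
    // at exactly n / fps seconds.
    mutating func next() -> Timestamp {
        defer { frameIndex += 1 }
        return Timestamp(value: frameIndex, timescale: fps)
    }
}

var clock = FrameClock(fps: 30)
let first = clock.next()
let second = clock.next()
print(first.seconds, second.seconds)
```

In real code the same `value` and `timescale` would be handed to `CMTime`, or the capture timestamp of the sample buffer could be reused instead.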

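4. Receive the encoded data

When a frame is encoded successfully, the system calls the encodeCallBack that was passed in when the encoder session was created. Here is how it is defined: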

    private func setCallBack()  {
        // Callback invoked when a frame has been encoded
        encodeCallBack = {(outputCallbackRefCon, sourceFrameRefCon, status, flag, sampleBuffer)  in
            // Bail out on error
            if status != noErr { return }
            // Make sure the data is ready
            guard let sampleBuffer = sampleBuffer, CMSampleBufferDataIsReady(sampleBuffer) else {return}
            // outputCallbackRefCon is the self pointer that was passed as refcon when the session was created
            let encoder :VideoH264Encoder = unsafeBitCast(outputCallbackRefCon, to: VideoH264Encoder.self)
            
            // Annex B start code, 4 bytes
            let buffer : [UInt8] = [0x00,0x00,0x00,0x01]
            
            // Is this a keyframe?
            let attachArray = CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, createIfNecessary: false)
            let strkey = unsafeBitCast(kCMSampleAttachmentKey_NotSync, to: UnsafeRawPointer.self)
            let cfDic = unsafeBitCast(CFArrayGetValueAtIndex(attachArray, 0), to: CFDictionary.self)
            let keyFrame = !CFDictionaryContainsKey(cfDic, strkey) // no NotSync key means a sync sample, i.e. a keyframe (I-frame)
            
            // Extract the sps and pps
            if keyFrame && !encoder.hasSpsPps{
                if let description = CMSampleBufferGetFormatDescription(sampleBuffer) { // format description of the encoded stream
                    var spsSize: Int = 0, spsCount :Int = 0, spsHeaderLength:Int32 = 0
                    var ppsSize: Int = 0, ppsCount: Int = 0, ppsHeaderLength:Int32 = 0
                    
                    // Start from nil; the calls below fill the pointers in
                    var spsDataPointer : UnsafePointer<UInt8>? = nil
                    var ppsDataPointer : UnsafePointer<UInt8>? = nil
                    // Get the sps data/size/count/header length
                    let spsstatus = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(description,
                                                                                       parameterSetIndex: 0, parameterSetPointerOut: &spsDataPointer, parameterSetSizeOut: &spsSize, parameterSetCountOut: &spsCount, nalUnitHeaderLengthOut: &spsHeaderLength)
                    if spsstatus != noErr {
                        print("failed to get sps")
                    }
                    // Get the pps data/size/count/header length
                    let ppsStatus = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(description,
                                                                                       parameterSetIndex: 1, parameterSetPointerOut: &ppsDataPointer, parameterSetSizeOut: &ppsSize, parameterSetCountOut: &ppsCount, nalUnitHeaderLengthOut: &ppsHeaderLength)
                    if ppsStatus != noErr {
                        print("failed to get pps")
                    }
                    
                    if let spsData = spsDataPointer,let ppsData = ppsDataPointer{
                        // start code + parameter set
                        var spsDataValue = Data(capacity: 4 + spsSize)
                        spsDataValue.append(buffer, count: 4)
                        spsDataValue.append(spsData, count: spsSize)
                        
                        var ppsDataValue = Data(capacity: 4 + ppsSize)
                        ppsDataValue.append(buffer, count: 4)
                        ppsDataValue.append(ppsData, count: ppsSize)
                        encoder.callBackQueue.async {
                            encoder.videoEncodeCallbackSPSAndPPS?(spsDataValue, ppsDataValue)
                        }
                    }
                }
            }
            
            // Get the encoded H264 data (the NAL units)
            guard let dataBuffer = CMSampleBufferGetDataBuffer(sampleBuffer) else { return }
            var dataPointer: UnsafeMutablePointer<Int8>?  = nil
            var totalLength :Int = 0
            /**
             Get the address of the data
             atOffset: offset into the block buffer
             lengthAtOffsetOut: length of the contiguous data available at that offset
             totalLengthOut: total length of the block buffer
             dataPointerOut: receives the start address
             (think of it as an array of bytes)
             */
            let blockState = CMBlockBufferGetDataPointer(dataBuffer,
                                                         atOffset: 0,
                                                         lengthAtOffsetOut: nil,
                                                         totalLengthOut: &totalLength,
                                                         dataPointerOut: &dataPointer)
            if blockState != noErr {
                print("failed to get data \(blockState)")
            } else {
                // NALU
                var offset :UInt32 = 0
                // The first 4 bytes of each returned NALU are not the 00 00 00 01 start code but the NALU length in big-endian byte order
                let lengthInfoSize = 4
                // Walk through every NALU in the buffer
                while offset < totalLength - lengthInfoSize {
                    // Read the NALU length
                    var naluDataLength:UInt32 = 0
                    memcpy(&naluDataLength, dataPointer! + UnsafeMutablePointer<Int8>.Stride(offset), lengthInfoSize)
                    // Big-endian to host byte order (little-endian on iOS)
                    naluDataLength = CFSwapInt32BigToHost(naluDataLength)
                    // Build the output: start code + encoded data
                    var data = Data(capacity: Int(naluDataLength) + lengthInfoSize)
                    data.append(buffer, count: 4) // start code first
                    // Convert the pointer: UnsafeMutablePointer<Int8> -> UnsafePointer<UInt8>
                    let naluUnsafePoint = unsafeBitCast(dataPointer, to: UnsafePointer<UInt8>.self)
                    // Then the encoded video data
                    data.append(naluUnsafePoint + UnsafePointer<UInt8>.Stride(offset + UInt32(lengthInfoSize)) , count: Int(naluDataLength))
                    
                    encoder.callBackQueue.async {
                        encoder.videoEncodeCallback?(data)
                    }
                    offset += (naluDataLength + UInt32(lengthInfoSize))
                    
                }
            }
        }
    }

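Why check for a keyframe?
The VideoToolbox encoder makes the SPS/PPS available with every keyframe, in the format description of the sample buffer, and a decoder needs them before it can decode that keyframe.
So when the current frame is a keyframe, the matching SPS/PPS can be extracted and sent ahead of it.

Why prepend the start code? That was covered in the previous article.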
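
The length-prefix-to-start-code rewrite inside the callback can also be tried in isolation. The sketch below is a pure-Foundation version of that loop, assuming 4-byte big-endian length fields; `annexBUnits(fromAVCC:lengthSize:)` is a hypothetical helper name, not a system API.

```swift
import Foundation

// Splits a length-prefixed buffer into NAL units and prefixes each one with
// the start code 00 00 00 01. Returns nil if a length field overruns the buffer.
func annexBUnits(fromAVCC buffer: Data, lengthSize: Int = 4) -> [Data]? {
    let startCode: [UInt8] = [0x00, 0x00, 0x00, 0x01]
    let bytes = [UInt8](buffer)
    var units: [Data] = []
    var offset = 0
    while offset + lengthSize <= bytes.count {
        // Read the big-endian length field one byte at a time.
        var naluLength = 0
        for i in 0..<lengthSize {
            naluLength = (naluLength << 8) | Int(bytes[offset + i])
        }
        let payloadStart = offset + lengthSize
        let payloadEnd = payloadStart + naluLength
        guard payloadEnd <= bytes.count else { return nil }
        units.append(Data(startCode + bytes[payloadStart..<payloadEnd]))
        offset = payloadEnd
    }
    return units
}

// Two NAL units: a 2-byte one and a 1-byte one (placeholder payloads).
let avcc = Data([0, 0, 0, 2, 0x65, 0xAA, 0, 0, 0, 1, 0x41])
let units = annexBUnits(fromAVCC: avcc) ?? []
print(units.map { [UInt8]($0) })
// [[0, 0, 0, 1, 101, 170], [0, 0, 0, 1, 65]]
```

Reading the length byte by byte avoids any dependence on the host byte order, which is the job `CFSwapInt32BigToHost` does in the callback.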

Finally, don't forget to release the session.

    deinit {
        if encodeSession != nil {
            VTCompressionSessionCompleteFrames(encodeSession, untilPresentationTimeStamp: .invalid)
            VTCompressionSessionInvalidate(encodeSession);
            encodeSession = nil;
        }
    }

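What can we do with the data/sps/pps we now have?
1. Write them to a file to get an .h264 file
2. Decode them to recover the uncompressed frames
3. Send them over the network

Next, let's look at H264 decoding...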
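
For the first use, a minimal sketch of assembling the stream and writing it to disk is shown below. `makeElementaryStream` is a hypothetical helper, the file name is arbitrary, and the payload bytes are placeholders rather than a decodable stream; in the encoder from this article the inputs would come from the two callbacks, already prefixed with the start code.

```swift
import Foundation

// Concatenates SPS, PPS and NAL units (each already prefixed with a start code)
// into one raw H264 elementary stream.
func makeElementaryStream(sps: Data, pps: Data, nalus: [Data]) -> Data {
    var stream = Data()
    stream.append(sps)
    stream.append(pps)
    for nalu in nalus { stream.append(nalu) }
    return stream
}

let startCode = Data([0x00, 0x00, 0x00, 0x01])
// Placeholder payloads: only the header bytes (0x67, 0x68, 0x65) are meaningful here.
let sps = startCode + Data([0x67, 0x01])
let pps = startCode + Data([0x68, 0x02])
let idr = startCode + Data([0x65, 0x03])

let stream = makeElementaryStream(sps: sps, pps: pps, nalus: [idr])
let url = FileManager.default.temporaryDirectory.appendingPathComponent("sample.h264")
do {
    try stream.write(to: url)
    print("wrote \(stream.count) bytes")
} catch {
    print("write failed: \(error)")
}
```

For a live capture the data arrives piece by piece, so appending to an open file handle as each callback fires would be the more natural shape; the order (SPS, PPS, then frames) stays the same.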

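III. H264 decoding

The three core decoding functions:
Create the session: VTDecompressionSessionCreate
Decode one frame: VTDecompressionSessionDecodeFrame
Destroy the decoding session: VTDecompressionSessionInvalidate

If an I-frame in the H264 stream is corrupted or lost, the error propagates: P/B-frames cannot be decoded on their own, and the picture breaks up.
When VideoToolbox hardware-encodes H264, the SPS/PPS have to be added manually in front of the I-frame. When decoding, the SPS/PPS data is needed to initialize the decoder.

Because NAL units arrive one after another, they have to be decoded as they come in:
1. Analyze the NALU data: the first 4 bytes are the start code, which marks the beginning of a NALU.
The NALU itself starts at the 5th byte, and that byte carries the NALU type.

2. Read the 5th byte, take its low 5 bits as a number, and look the type up in the NALU type table (for example 5 = IDR slice, 7 = SPS, 8 = PPS).

3. Only once the type is known can the NALU be sent to the decoder. (SPS/PPS are not decoded as frames.)

Decoding flow:
1. Parse the data (NAL Unit): I/P/B...
2. Initialize the decoder
3. Feed the parsed H264 NAL Unit to the decoder
4. Receive the decoded data in the completion callback

Before parsing, look at how the decoder class is set up: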
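
The type lookup in steps 1 to 3 can be sketched as follows. The `NALUnitType` enum and the `nalUnitType(ofAnnexB:)` function are hypothetical names; the numeric values are the nal_unit_type codes from the H.264 specification, and only a handful of them are listed.

```swift
// A few nal_unit_type values from the H.264 specification.
enum NALUnitType: UInt8 {
    case nonIDRSlice = 1   // P/B slice
    case idrSlice = 5      // keyframe
    case sei = 6           // supplemental enhancement information
    case sps = 7           // sequence parameter set
    case pps = 8           // picture parameter set
}

// Reads the type from the NAL header byte that follows a 4-byte start code.
// Returns nil for buffers that are too short or for types not listed above.
func nalUnitType(ofAnnexB nalu: [UInt8]) -> NALUnitType? {
    guard nalu.count > 4 else { return nil }
    // Header layout: 1 bit forbidden_zero_bit, 2 bits nal_ref_idc, 5 bits nal_unit_type.
    return NALUnitType(rawValue: nalu[4] & 0x1F)
}

print(nalUnitType(ofAnnexB: [0, 0, 0, 1, 0x67, 0x42]) == .sps)     // true
print(nalUnitType(ofAnnexB: [0, 0, 0, 1, 0x65]) == .idrSlice)      // true
print(nalUnitType(ofAnnexB: [0, 0, 0, 1, 0x41]) == .nonIDRSlice)   // true
```

Masking with 0x1F is what "take the low 5 bits" means in practice: 0x67 is 0110 0111 in binary, and its low five bits are 00111, which is 7.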

import UIKit
import VideoToolbox

class VideoH264Decoder: NSObject {

    private var width: Int32 // capture width in pixels
    private var height:Int32 // capture height in pixels
    
    private var spsData:Data? // sps
    private var ppsData:Data? // pps
    
    private var decompressionSession : VTDecompressionSession? // session
    private var callback :VTDecompressionOutputCallback? // called with the decoded data
    
    private var decodeDesc : CMVideoFormatDescription? // format description for the decoder
    private var decodeQueue = DispatchQueue(label: "decode") // decoding queue
    private var callBackQueue = DispatchQueue(label: "decodeCallBack") // queue on which decoded frames are handed out
    
    // Callback that delivers the decoded raw frame
    var videoDecodeCallback:((CVImageBuffer?) -> Void)?
    func SetVideoDecodeCallback(block:((CVImageBuffer?) -> Void)?)  {
        videoDecodeCallback = block
    }
    
    init(width:Int32 = 480,height:Int32 = 640) {
        self.width = width
        self.height = height
    }
}

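1. Parse the data (NAL Unit)

As the previous section noted, once we have the sps/pps/data we can decode them to turn the encoded data back into raw frames.

Parse the incoming NALU stream: skip the first 4 bytes (the start code), read the 5th byte, take its low 5 bits, and use the type table to determine what kind of data it is: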

    func decode(data:Data) {
        decodeQueue.async {
            let length:UInt32 = UInt32(data.count)
            self.decodeByte(data: data, size: length)
        }
    }
    private func decodeByte(data:Data, size:UInt32) {
        // The first 4 bytes of each frame are the NALU start code, 00 00 00 01
        guard size > 4 else { return } // nothing to decode without a NALU header
        // Replace the start code with the NALU length as a 4-byte big-endian value
        let naluSize = size - 4
        let length : [UInt8] = [
            UInt8(truncatingIfNeeded: naluSize >> 24),
            UInt8(truncatingIfNeeded: naluSize >> 16),
            UInt8(truncatingIfNeeded: naluSize >> 8),
            UInt8(truncatingIfNeeded: naluSize)
            ]
        var frameByte :[UInt8] = length
        [UInt8](data).suffix(from: 4).forEach { (bb) in
            frameByte.append(bb)
        }
        let bytes = frameByte
        // The low 5 bits of the 5th byte give the type: 7 is sps, 8 is pps, 5 is IDR (I-frame)
        let type :Int  = Int(bytes[4] & 0x1f)
        switch type{
        case 0x05: // I-frame / keyframe
            if initDecoder() {
                decode(frame: bytes, size: size)
            }
        case 0x06:
            // SEI (supplemental enhancement information), ignored
            break
        case 0x07: // sps
            spsData = data
        case 0x08: // pps
            ppsData = data
        default:  // P/B-frames...
            if initDecoder() {
                decode(frame: bytes, size: size)
            }
        }
    }

When an I-frame arrives, initialize the decoder with initDecoder, then decode with decode(frame: bytes, size: size).

Note that the decoder only needs to be initialized once.
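
The start-code-to-length rewrite at the top of decodeByte can be checked on its own. The sketch below does the same conversion on a single NALU using only Foundation; `avccFrame(fromAnnexB:)` is a hypothetical helper name, and it assumes a 4-byte start code as everywhere else in this article.

```swift
import Foundation

// Replaces a 4-byte start code with a 4-byte big-endian length field.
// Returns nil if the input is too short or does not begin with 00 00 00 01.
func avccFrame(fromAnnexB nalu: Data) -> Data? {
    let startCode: [UInt8] = [0x00, 0x00, 0x00, 0x01]
    guard nalu.count > 4, [UInt8](nalu.prefix(4)) == startCode else { return nil }
    let payload = nalu.dropFirst(4)
    // bigEndian lays the bytes out most significant first, whatever the host order is.
    var length = UInt32(payload.count).bigEndian
    var frame = Data(bytes: &length, count: MemoryLayout<UInt32>.size)
    frame.append(payload)
    return frame
}

let annexB = Data([0, 0, 0, 1, 0x65, 0x88, 0x84])
if let frame = avccFrame(fromAnnexB: annexB) {
    print([UInt8](frame)) // [0, 0, 0, 3, 101, 136, 132]
}
```

The output has the same size as the input, because a 4-byte prefix is swapped for another 4-byte prefix.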

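2. Initialize the decoder

CMVideoFormatDescriptionCreateFromH264ParameterSets: builds the format description from the sps/pps
VTDecompressionOutputCallbackRecord: the decode callback setup (a simple struct)
VTDecompressionSessionCreate: creates the session
VTSessionSetProperty: sets a property on the session

1. Take the sps/pps data and build the format description
2. Set up the decode callback record
3. Set the output image buffer attributes
4. Use the results of the first 3 steps to create the decoding session
5. Set properties on the decoding session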

    private func initDecoder() -> Bool {
        if decompressionSession != nil { return true }
        guard let spsData = spsData, let ppsData = ppsData else { return false }
        // Strip the 4-byte start code from the sps/pps
        let sps = [UInt8](spsData.dropFirst(4))
        let pps = [UInt8](ppsData.dropFirst(4))
        guard !sps.isEmpty, !pps.isEmpty else { return false }

        /**
        Build the format description from the sps/pps
         allocator: kCFAllocatorDefault
         parameterSetCount: 2, the number of parameter sets
         parameterSetPointers: pointers to the parameter sets
         parameterSetSizes: sizes of the parameter sets
         nalUnitHeaderLength: size in bytes of the length field in front of each NALU, 4
         formatDescriptionOut: receives the format description for the decoder
         returns a status code
        */
        // The buffer pointers are only valid inside the closures, so the call is made there
        let descriptionState = sps.withUnsafeBufferPointer { spsPointer -> OSStatus in
            pps.withUnsafeBufferPointer { ppsPointer -> OSStatus in
                let spsAndpps = [spsPointer.baseAddress!, ppsPointer.baseAddress!]
                let sizes = [sps.count, pps.count]
                return CMVideoFormatDescriptionCreateFromH264ParameterSets(allocator: kCFAllocatorDefault,
                                                                           parameterSetCount: 2,
                                                                           parameterSetPointers: spsAndpps,
                                                                           parameterSetSizes: sizes,
                                                                           nalUnitHeaderLength: 4,
                                                                           formatDescriptionOut: &self.decodeDesc)
            }
        }
        if descriptionState != noErr {
            print("failed to create the format description" )
            return false
        }
        // Decode callback setup
        /*
         VTDecompressionOutputCallbackRecord is a simple struct holding a pointer (decompressionOutputCallback) to the function called when a frame has been decompressed, plus the instance through which that function finds its context (decompressionOutputRefCon). The VTDecompressionOutputCallback function takes seven parameters:
            - decompressionOutputRefCon: the reference value from the callback record
            - sourceFrameRefCon: the reference value of the frame
            - status: a status code (may contain undocumented values)
            - infoFlags: flags indicating synchronous/asynchronous decoding, or whether the decoder dropped the frame
            - imageBuffer: the buffer holding the decoded image
            - presentationTimeStamp: the presentation timestamp
            - presentationDuration: the presentation duration
         */
        setCallBack() // set the callback that receives the decoded raw frame
        var callbackRecord = VTDecompressionOutputCallbackRecord(decompressionOutputCallback: callback,
                                                                 decompressionOutputRefCon: unsafeBitCast(self, to: UnsafeMutableRawPointer.self))
        /*
         Output attributes:
        * kCVPixelBufferPixelFormatTypeKey: pixel format of the output, matching the camera formats
           - kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange, i.e. 420v
           - kCVPixelFormatType_420YpCbCr8BiPlanarFullRange, i.e. 420f
           - kCVPixelFormatType_32BGRA, where iOS converts YUV to BGRA internally
         Both YUV options are 4:2:0; under the same conditions 4:2:0 costs less computation and bandwidth than 4:2:2.
         
        * kCVPixelBufferWidthKey/kCVPixelBufferHeightKey: resolution of the video, width*height
        * kCVPixelBufferOpenGLCompatibilityKey: lets the decoded image be drawn directly in an OpenGL context instead of being copied between the bus and the CPU. This is sometimes called a zero-copy path, because the decoded image is not copied while drawing.
         
         */
        let imageBufferAttributes = [
            kCVPixelBufferPixelFormatTypeKey:kCVPixelFormatType_420YpCbCr8BiPlanarFullRange,
            kCVPixelBufferWidthKey:width,
            kCVPixelBufferHeightKey:height,
//            kCVPixelBufferOpenGLCompatibilityKey:true
            ] as [CFString : Any]
        
        // Create the session
        /*!
         VTDecompressionSessionCreate creates a session for decompressing video frames. Decompressed frames are emitted through calls to the output callback
            - allocator: allocator for the session; kCFAllocatorDefault uses the default allocator
            - videoFormatDescription: describes the source video frames
            - videoDecoderSpecification: specifies a particular video decoder that must be used, or nil
            - destinationImageBufferAttributes: requirements for the output pixel buffers, or nil
            - outputCallback: the callback invoked with decompressed frames
            - decompressionSessionOut: points to a variable that receives the new session
         */
        let state = VTDecompressionSessionCreate(allocator: kCFAllocatorDefault,
                                                 formatDescription: decodeDesc!,
                                                 decoderSpecification: nil,
                                                 imageBufferAttributes: imageBufferAttributes as CFDictionary,
                                                 outputCallback: &callbackRecord,
                                                 decompressionSessionOut: &decompressionSession)
        if state != noErr || decompressionSession == nil {
            print("failed to create the decode session")
            return false
        }
        VTSessionSetProperty(self.decompressionSession!, key: kVTDecompressionPropertyKey_RealTime, value: kCFBooleanTrue)
        
        return true
    }

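callback is the closure that receives the decoded data; step 4 shows what setCallBack() does.

3. Feed the parsed H264 NAL Unit to the decoder

CMBlockBufferCreateWithMemoryBlock: creates a CMBlockBuffer
CMSampleBufferCreateReady: creates a sampleBuffer
VTDecompressionSessionDecodeFrame: decodes the data

1. Put the NALU data into a CMBlockBuffer
2. Wrap the CMBlockBuffer in a CMSampleBuffer
3. Decode the data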

    private func decode(frame:[UInt8],size:UInt32) {
        var blockBuffer :CMBlockBuffer?
        
        /* Create the CMBlockBuffer
         allocator: kCFAllocatorDefault
         memoryBlock: nil, so that the block buffer allocates and owns its memory
         blockLength: length of the frame
         blockAllocator: allocator for the memory block
         customBlockSource: pass nil
         offsetToData: data offset, 0 for none
         dataLength: length of the data
         flags: feature and control flags
         blockBufferOut: receives the blockBuffer, must not be nil
         */
        let blockState = CMBlockBufferCreateWithMemoryBlock(allocator: kCFAllocatorDefault,
                                                            memoryBlock: nil,
                                                            blockLength: frame.count,
                                                            blockAllocator: kCFAllocatorDefault,
                                                            customBlockSource: nil,
                                                            offsetToData:0,
                                                            dataLength: frame.count,
                                                            flags: kCMBlockBufferAssureMemoryNowFlag,
                                                            blockBufferOut: &blockBuffer)
        guard blockState == noErr, let block = blockBuffer else { print("failed to create blockBuffer"); return }
        // Copy the frame in: a pointer to a local array is not valid after the call returns,
        // and the decoder may still read the buffer asynchronously
        let copyState = CMBlockBufferReplaceDataBytes(with: frame, blockBuffer: block, offsetIntoDestination: 0, dataLength: frame.count)
        if copyState != noErr { print("failed to copy the frame data"); return }

        var sampleSizeArray :[Int] = [frame.count]
        var sampleBuffer :CMSampleBuffer?
        // Create the sampleBuffer
        /*
         allocator: the allocator, kCFAllocatorDefault for the default
         dataBuffer: the blockBuffer holding the data to decode, must not be nil
         formatDescription: the video format description
         sampleCount: number of samples in the CMSampleBuffer
         sampleTimingEntryCount: must be 0, 1, or sampleCount
         sampleTimingArray: array of timing entries, nil here
         sampleSizeEntryCount: 1 here
         sampleSizeArray: size of each sample
         sampleBufferOut: receives the sampleBuffer
         */
        let readyState = CMSampleBufferCreateReady(allocator: kCFAllocatorDefault,
                                                   dataBuffer: block,
                                                   formatDescription: decodeDesc,
                                                   sampleCount: CMItemCount(1),
                                                   sampleTimingEntryCount: CMItemCount(),
                                                   sampleTimingArray: nil,
                                                   sampleSizeEntryCount: CMItemCount(1),
                                                   sampleSizeArray: &sampleSizeArray,
                                                   sampleBufferOut: &sampleBuffer)
        guard readyState == noErr, let sample = sampleBuffer, let session = decompressionSession else {
            print("Sample Buffer Create Ready failed")
            return
        }
        // Decode the data
        /*
         session: the decode session
         sampleBuffer: the source data, a CMSampleBuffer holding one or more video frames
         flags: decode flags
         frameRefcon: a per-frame reference value that is passed through to the output callback
         infoFlagsOut: receives flags about the decode (synchronous/asynchronous)
         */
        let sourceFrame: UnsafeMutableRawPointer? = nil
        var infoFlags = VTDecodeInfoFlags.asynchronous
        let decodeState = VTDecompressionSessionDecodeFrame(session,
                                                            sampleBuffer: sample,
                                                            flags:VTDecodeFrameFlags._EnableAsynchronousDecompression,
                                                            frameRefcon: sourceFrame,
                                                            infoFlagsOut: &infoFlags)
        if decodeState != noErr { print("decode failed") }
    }

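4. Receive the decoded data in the completion callback

The callback that fires when decoding finishes was already set when the session was created.

    // Callback for a successfully decoded frame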
    private func setCallBack()  {
        callback = { decompressionOutputRefCon,sourceFrameRefCon,status,inforFlags,imageBuffer,presentationTimeStamp,presentationDuration in
            let decoder :VideoH264Decoder = unsafeBitCast(decompressionOutputRefCon, to: VideoH264Decoder.self)
            guard imageBuffer != nil else { return }
            if let block = decoder.videoDecodeCallback  {
                decoder.callBackQueue.async {
                    block(imageBuffer)
                }
            }
        }
    }
