iOS H.264 Stream Hardware Decoding


Author: copy_farmer | Published 2020-09-03 20:39

1. Introduction to H.264

H.264 is a standard format for the video coding layer; the point of video encoding is, of course, to compress the data.

2. The H.264 Stream Format

In most H.264 streams, you will receive an initial set of SPS and PPS parameters followed by an I-frame (a.k.a. IDR frame or keyframe) NALU. You will then receive several P-frame NALUs (perhaps a few dozen or so), then another set of parameters (which may be the same as the initial ones) and another I-frame, more P-frames, and so on. I-frames are much larger than P-frames. Conceptually, you can think of an I-frame as the entire image of the video, while the P-frames are just the changes made to that I-frame until you receive the next one.

Diagram: [the original figure illustrating this stream layout is not reproduced here]

Steps:

1. Extract individual NALUs from your H.264 stream.

2. Use CMVideoFormatDescriptionCreateFromH264ParameterSets() to create a CMVideoFormatDescriptionRef from your SPS and PPS NALUs.

3. Repackage the IDR and non-IDR frame NALUs according to the "AVCC" format.

4. Package the IDR and non-IDR NALU frames into a CMBlockBuffer.

5. Package the CMBlockBuffer into a CMSampleBuffer.

6. Create a VTDecompressionSessionRef and feed the sample buffers to VTDecompressionSessionDecodeFrame().

Other notes:

H.264 streams vary a lot. As far as I understand, the NALU start code header is sometimes 3 bytes long (0x00 00 01) and sometimes 4 (0x00 00 00 01). My code works with 4-byte start codes; if you are working with 3-byte ones, you will need to make a few changes.
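
If your stream mixes 3-byte and 4-byte start codes, a small helper along the lines of the sketch below can locate the next Annex B start code and report its length, so the hard-coded `+ 4` offsets in the code further down can be replaced. This is a rough sketch, not part of the original code; the function name findNextStartCode is made up here.

    // Hypothetical helper: scan buf[from..size) for the next Annex B start code.
    // On success it returns the index of the first byte of the start code and
    // writes 3 or 4 into *startCodeLength; it returns -1 if no start code is found.
    static long findNextStartCode(const uint8_t *buf, long size, long from, int *startCodeLength)
    {
        for (long i = from; i + 2 < size; i++)
        {
            if (buf[i] != 0x00 || buf[i+1] != 0x00) continue;

            if (buf[i+2] == 0x01)
            {
                *startCodeLength = 3;
                return i;
            }
            if (i + 3 < size && buf[i+2] == 0x00 && buf[i+3] == 0x01)
            {
                *startCodeLength = 4;
                return i;
            }
        }
        return -1;
    }

For example, calling this twice starting from index 0 gives you the start of the SPS and the start of the PPS, and the difference between the two indices is the SPS size including its header.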

The code:

1. Import the VideoToolbox framework and declare the properties the decoder needs

    #import <VideoToolbox/VideoToolbox.h>
    #import <AVFoundation/AVFoundation.h>   // needed for AVSampleBufferDisplayLayer
    
    @property (nonatomic, assign) CMVideoFormatDescriptionRef formatDesc;
    @property (nonatomic, assign) VTDecompressionSessionRef decompressionSession;
    @property (nonatomic, retain) AVSampleBufferDisplayLayer *videoLayer;
    @property (nonatomic, assign) int spsSize;
    @property (nonatomic, assign) int ppsSize;
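
The NSLog calls in the decoding code below look up names in a naluTypesStrings table that is never defined in this article. A minimal sketch of such a table, with names taken from the H.264 NAL unit type list (and the RTP payload names for types 24-29), could be:

    // Lookup table used only for logging; indexed by the 5-bit NAL unit type (0-31).
    static NSString * const naluTypesStrings[] =
    {
        @"0: Unspecified",
        @"1: Coded slice of a non-IDR picture",
        @"2: Coded slice data partition A",
        @"3: Coded slice data partition B",
        @"4: Coded slice data partition C",
        @"5: Coded slice of an IDR picture",
        @"6: Supplemental enhancement information (SEI)",
        @"7: Sequence parameter set (SPS)",
        @"8: Picture parameter set (PPS)",
        @"9: Access unit delimiter",
        @"10: End of sequence",
        @"11: End of stream",
        @"12: Filler data",
        @"13: Sequence parameter set extension",
        @"14: Prefix NAL unit",
        @"15: Subset sequence parameter set",
        @"16: Reserved",
        @"17: Reserved",
        @"18: Reserved",
        @"19: Coded slice of an auxiliary coded picture",
        @"20: Coded slice extension",
        @"21: Coded slice extension for depth view components",
        @"22: Reserved",
        @"23: Reserved",
        @"24: STAP-A Single-time aggregation packet",
        @"25: STAP-B Single-time aggregation packet",
        @"26: MTAP16 Multi-time aggregation packet",
        @"27: MTAP24 Multi-time aggregation packet",
        @"28: FU-A Fragmentation unit",
        @"29: FU-B Fragmentation unit",
        @"30: Unspecified",
        @"31: Unspecified",
    };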
    

2. The decoding routine

    -(void) receivedRawVideoFrame:(uint8_t *)frame withSize:(uint32_t)frameSize isIFrame:(int)isIFrame
    {
        OSStatus status = noErr;   // initialize so the later (status == noErr) checks are valid for P-frames too
    
        uint8_t *data = NULL;
        uint8_t *pps = NULL;
        uint8_t *sps = NULL;
    
        // I know what my H.264 data source's NALUs look like so I know start code index is always 0.
        // if you don't know where it starts, you can use a for loop similar to how i find the 2nd and 3rd start codes
        int startCodeIndex = 0;
        int secondStartCodeIndex = 0;
        int thirdStartCodeIndex = 0;
    
        long blockLength = 0;
    
        CMSampleBufferRef sampleBuffer = NULL;
        CMBlockBufferRef blockBuffer = NULL;
    
        int nalu_type = (frame[startCodeIndex + 4] & 0x1F);
        NSLog(@"~~~~~~~ Received NALU Type \"%@\" ~~~~~~~~", naluTypesStrings[nalu_type]);
    
        // if we haven't already set up our format description with our SPS PPS parameters, we
        // can't process any frames except type 7 that has our parameters
        if (nalu_type != 7 && _formatDesc == NULL)
        {
            NSLog(@"Video error: Frame is not an I Frame and format description is null");
            return;
        }
    
        // NALU type 7 is the SPS parameter NALU
        if (nalu_type == 7)
        {
            // find where the second PPS start code begins, (the 0x00 00 00 01 code)
            // from which we also get the length of the first SPS code
            for (int i = startCodeIndex + 4; i < startCodeIndex + 40; i++)
            {
                if (frame[i] == 0x00 && frame[i+1] == 0x00 && frame[i+2] == 0x00 && frame[i+3] == 0x01)
                {
                    secondStartCodeIndex = i;
                    _spsSize = secondStartCodeIndex;   // includes the header in the size
                    break;
                }
            }
    
            // find what the second NALU type is
            nalu_type = (frame[secondStartCodeIndex + 4] & 0x1F);
            NSLog(@"~~~~~~~ Received NALU Type \"%@\" ~~~~~~~~", naluTypesStrings[nalu_type]);
        }
    
        // type 8 is the PPS parameter NALU
        if(nalu_type == 8)
        {
            // find where the NALU after this one starts so we know how long the PPS parameter is
            for (int i = _spsSize + 4; i < _spsSize + 30; i++)
            {
                if (frame[i] == 0x00 && frame[i+1] == 0x00 && frame[i+2] == 0x00 && frame[i+3] == 0x01)
                {
                    thirdStartCodeIndex = i;
                    _ppsSize = thirdStartCodeIndex - _spsSize;
                    break;
                }
            }
    
            // allocate enough data to fit the SPS and PPS parameters into our data objects.
            // VTD doesn't want you to include the start code header (4 bytes long) so we add the - 4 here
            sps = malloc(_spsSize - 4);
            pps = malloc(_ppsSize - 4);
    
            // copy in the actual sps and pps values, again ignoring the 4 byte header
            memcpy (sps, &frame[4], _spsSize-4);
            memcpy (pps, &frame[_spsSize+4], _ppsSize-4);
    
            // now we set our H264 parameters
            uint8_t*  parameterSetPointers[2] = {sps, pps};
            size_t parameterSetSizes[2] = {_spsSize-4, _ppsSize-4};
    
            // suggestion from @Kris Dude's answer below
            if (_formatDesc) 
            {
                CFRelease(_formatDesc);
                _formatDesc = NULL;
            }
    
            status = CMVideoFormatDescriptionCreateFromH264ParameterSets(kCFAllocatorDefault, 2, 
                                                    (const uint8_t *const*)parameterSetPointers, 
                                                    parameterSetSizes, 4, 
                                                    &_formatDesc);
    
            NSLog(@"\t\t Creation of CMVideoFormatDescription: %@", (status == noErr) ? @"successful!" : @"failed...");
            if(status != noErr) NSLog(@"\t\t Format Description ERROR type: %d", (int)status);
    
            // See if decomp session can convert from previous format description 
            // to the new one, if not we need to remake the decomp session.
            // This snippet was not necessary for my applications but it could be for yours
            /*BOOL needNewDecompSession = (VTDecompressionSessionCanAcceptFormatDescription(_decompressionSession, _formatDesc) == NO);
             if(needNewDecompSession)
             {
                 [self createDecompSession];
             }*/
    
            // now lets handle the IDR frame that (should) come after the parameter sets
            // I say "should" because that's how I expect my H264 stream to work, YMMV
            nalu_type = (frame[thirdStartCodeIndex + 4] & 0x1F);
            NSLog(@"~~~~~~~ Received NALU Type \"%@\" ~~~~~~~~", naluTypesStrings[nalu_type]);
        }
    
        // create our VTDecompressionSession.  This isn't necessary if you choose to use AVSampleBufferDisplayLayer
        if((status == noErr) && (_decompressionSession == NULL))
        {
            [self createDecompSession];
        }
    
        // type 5 is an IDR frame NALU.  The SPS and PPS NALUs should always be followed by an IDR (or IFrame) NALU, as far as I know
        if(nalu_type == 5)
        {
            // find the offset, or where the SPS and PPS NALUs end and the IDR frame NALU begins
            int offset = _spsSize + _ppsSize;
            blockLength = frameSize - offset;
            data = malloc(blockLength);
            data = memcpy(data, &frame[offset], blockLength);
    
            // replace the start code header on this NALU with its size.
            // AVCC format requires that you do this.  
            // htonl converts the unsigned int from host to network byte order
            uint32_t dataLength32 = htonl (blockLength - 4);
            memcpy (data, &dataLength32, sizeof (uint32_t));
    
            // create a block buffer from the IDR NALU
            status = CMBlockBufferCreateWithMemoryBlock(NULL, data,  // memoryBlock to hold buffered data
                                                        blockLength,  // block length of the mem block in bytes.
                                                        kCFAllocatorNull, NULL,
                                                        0, // offsetToData
                                                        blockLength,   // dataLength of relevant bytes, starting at offsetToData
                                                        0, &blockBuffer);
    
            NSLog(@"\t\t BlockBufferCreation: \t %@", (status == kCMBlockBufferNoErr) ? @"successful!" : @"failed...");
        }
    
        // NALU type 1 is non-IDR (or PFrame) picture
        if (nalu_type == 1)
        {
            // non-IDR frames do not have an offset due to SPS and PPS, so the approach
            // is similar to the IDR frames just without the offset
            blockLength = frameSize;
            data = malloc(blockLength);
            data = memcpy(data, &frame[0], blockLength);
    
            // again, replace the start header with the size of the NALU
            uint32_t dataLength32 = htonl (blockLength - 4);
            memcpy (data, &dataLength32, sizeof (uint32_t));
    
            status = CMBlockBufferCreateWithMemoryBlock(NULL, data,  // memoryBlock to hold data. If NULL, block will be alloc when needed
                                                        blockLength,  // overall length of the mem block in bytes
                                                        kCFAllocatorNull, NULL,
                                                        0,     // offsetToData
                                                        blockLength,  // dataLength of relevant data bytes, starting at offsetToData
                                                        0, &blockBuffer);
    
            NSLog(@"\t\t BlockBufferCreation: \t %@", (status == kCMBlockBufferNoErr) ? @"successful!" : @"failed...");
        }
    
        // now create our sample buffer from the block buffer,
        if(status == noErr)
        {
            // here I'm not bothering with any timing specifics since in my case we displayed all frames immediately
            const size_t sampleSize = blockLength;
            status = CMSampleBufferCreate(kCFAllocatorDefault,
                                          blockBuffer, true, NULL, NULL,
                                          _formatDesc, 1, 0, NULL, 1,
                                          &sampleSize, &sampleBuffer);
    
            NSLog(@"\t\t SampleBufferCreate: \t %@", (status == noErr) ? @"successful!" : @"failed...");
        }
    
        if(status == noErr)
        {
            // set some values of the sample buffer's attachments
            CFArrayRef attachments = CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, YES);
            CFMutableDictionaryRef dict = (CFMutableDictionaryRef)CFArrayGetValueAtIndex(attachments, 0);
            CFDictionarySetValue(dict, kCMSampleAttachmentKey_DisplayImmediately, kCFBooleanTrue);
    
            // either send the samplebuffer to a VTDecompressionSession or to an AVSampleBufferDisplayLayer
            [self render:sampleBuffer];
        }
    
        // free memory to avoid a memory leak; do the same for sps, pps and blockBuffer
        if (NULL != data)
        {
            free (data);
            data = NULL;
        }
        if (NULL != sps)  { free (sps);  sps = NULL; }
        if (NULL != pps)  { free (pps);  pps = NULL; }
        if (NULL != blockBuffer) { CFRelease(blockBuffer); blockBuffer = NULL; }
    }
    
    

3. Create the VTD session the first time you receive the SPS and PPS.

    -(void) createDecompSession
    {
        // make sure to destroy the old VTD session before creating a new one
        if (_decompressionSession != NULL)
        {
            VTDecompressionSessionInvalidate(_decompressionSession);
            CFRelease(_decompressionSession);
        }
        _decompressionSession = NULL;
        VTDecompressionOutputCallbackRecord callBackRecord;
        callBackRecord.decompressionOutputCallback = decompressionSessionDecodeFrameCallback;
    
        // this is necessary if you need to make calls to Objective C "self" from within in the callback method.
        callBackRecord.decompressionOutputRefCon = (__bridge void *)self;
    
        // you can set some desired attributes for the destination pixel buffer.  I didn't use this but you may
        // if you need to set some attributes, be sure to uncomment the dictionary in VTDecompressionSessionCreate
        NSDictionary *destinationImageBufferAttributes = [NSDictionary dictionaryWithObjectsAndKeys:
                                                          [NSNumber numberWithBool:YES],
                                                          (id)kCVPixelBufferOpenGLESCompatibilityKey,
                                                          nil];
    
        OSStatus status =  VTDecompressionSessionCreate(NULL, _formatDesc, NULL,
                                                        NULL, // (__bridge CFDictionaryRef)(destinationImageBufferAttributes)
                                                        &callBackRecord, &_decompressionSession);
        NSLog(@"Video Decompression Session Create: \t %@", (status == noErr) ? @"successful!" : @"failed...");
        if(status != noErr) NSLog(@"\t\t VTD ERROR type: %d", (int)status);
    }
    

This method is called every time VTD finishes decompressing a frame you sent to it. It is called even if there was an error or the frame was dropped.

    void decompressionSessionDecodeFrameCallback(void *decompressionOutputRefCon,
                                                 void *sourceFrameRefCon,
                                                 OSStatus status,
                                                 VTDecodeInfoFlags infoFlags,
                                                 CVImageBufferRef imageBuffer,
                                                 CMTime presentationTimeStamp,
                                                 CMTime presentationDuration)
    {
        THISCLASSNAME *streamManager = (__bridge THISCLASSNAME *)decompressionOutputRefCon;
    
        if (status != noErr)
        {
            NSError *error = [NSError errorWithDomain:NSOSStatusErrorDomain code:status userInfo:nil];
            NSLog(@"Decompressed error: %@", error);
        }
        else
        {
            NSLog(@"Decompressed sucessfully");
    
            // do something with your resulting CVImageBufferRef that is your decompressed frame
            [streamManager displayDecodedFrame:imageBuffer];
        }
    }
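
The callback above calls displayDecodedFrame:, which is not shown in the article. If you are not rendering through AVSampleBufferDisplayLayer, a minimal sketch of that method could look like the following; the UIImageView property named imageView is a hypothetical placeholder, not part of the original code.

    -(void) displayDecodedFrame:(CVImageBufferRef)imageBuffer
    {
        if (imageBuffer == NULL) return;

        // wrap the decoded pixel buffer in a CIImage and turn it into a UIImage
        CIImage *ciImage = [CIImage imageWithCVPixelBuffer:imageBuffer];
        UIImage *uiImage = [UIImage imageWithCIImage:ciImage];

        // UIKit must be touched on the main thread; the decode callback may arrive on another queue
        dispatch_async(dispatch_get_main_queue(), ^{
            // assumes a UIImageView property named imageView (not in the original code)
            self.imageView.image = uiImage;
        });
    }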
    

This is where we actually send the sampleBuffer to VTD to be decoded.

    - (void) render:(CMSampleBufferRef)sampleBuffer
    {
        VTDecodeFrameFlags flags = kVTDecodeFrame_EnableAsynchronousDecompression;
        VTDecodeInfoFlags flagOut;
        NSDate* currentTime = [NSDate date];
        VTDecompressionSessionDecodeFrame(_decompressionSession, sampleBuffer, flags,
                                          (void*)CFBridgingRetain(currentTime), &flagOut);
    
        CFRelease(sampleBuffer);
    
        // if you're using AVSampleBufferDisplayLayer, you only need to use this line of code
        // [videoLayer enqueueSampleBuffer:sampleBuffer];
    }
    

If you are using an AVSampleBufferDisplayLayer, be sure to initialize the layer like this in viewDidLoad or inside some other init method.

    -(void) viewDidLoad
    {
        [super viewDidLoad];

        // create our AVSampleBufferDisplayLayer and add it to the view
        self.videoLayer = [[AVSampleBufferDisplayLayer alloc] init];
        self.videoLayer.frame = self.view.frame;
        self.videoLayer.bounds = self.view.bounds;
        self.videoLayer.videoGravity = AVLayerVideoGravityResizeAspect;
    
        // set Timebase, you may need this if you need to display frames at specific times
        // I didn't need it so I haven't verified that the timebase is working
        CMTimebaseRef controlTimebase;
        CMTimebaseCreateWithMasterClock(CFAllocatorGetDefault(), CMClockGetHostTimeClock(), &controlTimebase);
    
        //videoLayer.controlTimebase = controlTimebase;
        CMTimebaseSetTime(self.videoLayer.controlTimebase, kCMTimeZero);
        CMTimebaseSetRate(self.videoLayer.controlTimebase, 1.0);
    
        [[self.view layer] addSublayer:self.videoLayer];
    }
    
