AAC Audio Encoding: Principles and Settings

Author: wo不懂 | Published: 2017-09-11 16:48

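    AAC (Advanced Audio Coding) appeared in 1997 as an audio coding technology based on MPEG-2. It was developed jointly by Fraunhofer IIS, Dolby Laboratories, AT&T, Sony and other companies, with the goal of replacing the MP3 format. After the MPEG-4 standard appeared in 2000, AAC was reworked to incorporate its features and gained SBR and PS; to distinguish it from the traditional MPEG-2 AAC, this version is also called MPEG-4 AAC.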

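    iOS supports AAC encoding, mainly through the AudioConverter API in AudioToolbox. I built an AAC encoder because I was working on an HLS feature: the TS files that HLS requires must carry H.264 video and AAC audio. H.264 can be encoded in hardware or in software, as covered earlier. AAC can likewise be encoded in hardware or in software, and iOS supports both.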

    First, create a converter, which in this case is the AAC encoder, using the following function:

    extern OSStatus
    AudioConverterNew(const AudioStreamBasicDescription *inSourceFormat,
                      const AudioStreamBasicDescription *inDestinationFormat,
                      AudioConverterRef                 *outAudioConverter)
        __OSX_AVAILABLE_STARTING(__MAC_10_1, __IPHONE_2_0);

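    The input parameters are the source and destination data formats.

    For AAC encoding, the source format is the captured PCM data and the destination format is AAC.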

    // Source format: 16-bit mono linear PCM at 44100 Hz.
    AudioStreamBasicDescription inAudioStreamBasicDescription = {0};
    // The same fields could be filled in with FillOutASBDForLPCM()
    inAudioStreamBasicDescription.mFormatID = kAudioFormatLinearPCM;
    inAudioStreamBasicDescription.mSampleRate = 44100;
    inAudioStreamBasicDescription.mBitsPerChannel = 16;
    inAudioStreamBasicDescription.mFramesPerPacket = 1;   // uncompressed PCM: one frame per packet
    inAudioStreamBasicDescription.mBytesPerFrame = 2;     // 1 channel * 16 bits
    inAudioStreamBasicDescription.mBytesPerPacket = inAudioStreamBasicDescription.mBytesPerFrame * inAudioStreamBasicDescription.mFramesPerPacket;
    inAudioStreamBasicDescription.mChannelsPerFrame = 1;
    inAudioStreamBasicDescription.mFormatFlags = kLinearPCMFormatFlagIsPacked | kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsNonInterleaved;
    inAudioStreamBasicDescription.mReserved = 0;

    // Destination format: MPEG-4 AAC, mono.
    AudioStreamBasicDescription outAudioStreamBasicDescription = {0}; // Always initialize the fields of a new audio stream basic description structure to zero, as shown here: ...
    outAudioStreamBasicDescription.mChannelsPerFrame = 1;
    outAudioStreamBasicDescription.mFormatID = kAudioFormatMPEG4AAC;

    // Let Core Audio fill in the remaining fields of the AAC description.
    UInt32 size = sizeof(outAudioStreamBasicDescription);
    AudioFormatGetProperty(kAudioFormatProperty_FormatInfo, 0, NULL, &size, &outAudioStreamBasicDescription);

    OSStatus status = AudioConverterNew(&inAudioStreamBasicDescription, &outAudioStreamBasicDescription, &_audioConverter);
    if (status != noErr) {
        NSLog(@"setup converter failed: %d", (int)status);
    }
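
    The size fields of the PCM description follow directly from the channel count and the bit depth. A minimal sketch of that arithmetic in plain C, assuming packed, interleaved samples; `PcmLayout` and `pcm_layout` are hypothetical names used only for illustration and are not AudioToolbox types:

```c
#include <stdint.h>

/* Hypothetical helper struct: mirrors the size-related fields of
   AudioStreamBasicDescription, but is not an AudioToolbox type. */
typedef struct {
    uint32_t bytes_per_frame;
    uint32_t frames_per_packet;
    uint32_t bytes_per_packet;
} PcmLayout;

/* Derive the sizes for packed, interleaved linear PCM. */
static PcmLayout pcm_layout(uint32_t channels, uint32_t bits_per_channel)
{
    PcmLayout l;
    l.bytes_per_frame   = channels * (bits_per_channel / 8); /* one sample per channel */
    l.frames_per_packet = 1;                                 /* uncompressed PCM */
    l.bytes_per_packet  = l.bytes_per_frame * l.frames_per_packet;
    return l;
}
```

    For the mono 16-bit input used here, `pcm_layout(1, 16)` gives 2 bytes per frame and 2 bytes per packet, matching the values assigned by hand above.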

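    This creates the AAC encoder. By default Apple creates a hardware encoder and falls back to a software encoder if the hardware is unavailable.

    In my tests the hardware AAC encoder had very high latency: it needed about 2 seconds of buffered data before it started encoding. The software encoder's latency was normal: it starts encoding as soon as it has been fed 1024 samples.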
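
    To put those two figures on the same scale, the duration of one 1024-sample frame can be computed from the sample rate. A small sketch in plain C (the function name is made up for illustration):

```c
/* Duration in milliseconds of one frame of `samples_per_frame` samples
   at the given sample rate. */
static double frame_duration_ms(unsigned samples_per_frame, unsigned sample_rate_hz)
{
    return 1000.0 * samples_per_frame / sample_rate_hz;
}
```

    At 44100 Hz a 1024-sample frame lasts a little over 23 ms, so a 2-second buffer corresponds to roughly 86 frames of delay.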

    So how do you request the software encoder when creating the converter? The following helper is needed:

    - (AudioClassDescription *)getAudioClassDescriptionWithType:(UInt32)type
                                               fromManufacturer:(UInt32)manufacturer
    {
        static AudioClassDescription desc;
        UInt32 encoderSpecifier = type;
        OSStatus st;
        UInt32 size;

        // Ask how many bytes the list of available encoders occupies.
        st = AudioFormatGetPropertyInfo(kAudioFormatProperty_Encoders,
                                        sizeof(encoderSpecifier),
                                        &encoderSpecifier,
                                        &size);
        if (st) {
            NSLog(@"error getting audio format property info: %d", (int)(st));
            return NULL;
        }

        unsigned int count = size / sizeof(AudioClassDescription);
        AudioClassDescription descriptions[count];

        // Fetch the encoder descriptions themselves.
        st = AudioFormatGetProperty(kAudioFormatProperty_Encoders,
                                    sizeof(encoderSpecifier),
                                    &encoderSpecifier,
                                    &size,
                                    descriptions);
        if (st) {
            NSLog(@"error getting audio format property: %d", (int)(st));
            return NULL;
        }

        // Pick the encoder that matches both the format and the manufacturer.
        for (unsigned int i = 0; i < count; i++) {
            if ((type == descriptions[i].mSubType) &&
                (manufacturer == descriptions[i].mManufacturer)) {
                memcpy(&desc, &(descriptions[i]), sizeof(desc));
                return &desc;
            }
        }

        return NULL;
    }

    AudioClassDescription *desc = [self getAudioClassDescriptionWithType:kAudioFormatMPEG4AAC
                                                        fromManufacturer:kAppleSoftwareAudioCodecManufacturer];
    // desc is NULL if no matching software encoder was found; check it before use.
    OSStatus status = AudioConverterNewSpecific(&inAudioStreamBasicDescription, &outAudioStreamBasicDescription, 1, desc, &_audioConverter);

    For encoding to work correctly, the encoding bit rate must be set. Otherwise the encode call returns error code 560226676 ('!dat').

    UInt32 ulBitRate = 64000;
    UInt32 ulSize = sizeof(ulBitRate);
    status = AudioConverterSetProperty(_audioConverter, kAudioConverterEncodeBitRate, ulSize, &ulBitRate);
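
    The number 560226676 is easier to recognize once its four bytes are read as ASCII characters. A small sketch in plain C; the helper name `fourcc_to_string` is made up for illustration and is not part of AudioToolbox:

```c
#include <stdint.h>

/* Write the four bytes of an OSStatus-style code, most significant byte
   first, into out[5] as a NUL-terminated string.
   Non-printable bytes are replaced with '?'. */
static void fourcc_to_string(int32_t code, char out[5])
{
    uint32_t u = (uint32_t)code;
    for (int i = 0; i < 4; i++) {
        unsigned char c = (unsigned char)(u >> (24 - 8 * i));
        out[i] = (c >= 32 && c < 127) ? (char)c : '?';
    }
    out[4] = '\0';
}
```

    Passing 560226676 (0x21646174) yields the string `!dat`, which is the code quoted in parentheses above.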

    Note that AAC does not support arbitrary bit rates. For example, with a PCM sample rate of 44100 Hz the bit rate can be set to 64000 bps; at 16 kHz it can be set to 32000 bps.

    Once the converter has been created and the bit rate set, you can query the maximum size of an encoded output packet, which is needed later.

    UInt32 value = 0;
    size = sizeof(value);
    AudioConverterGetProperty(_audioConverter, kAudioConverterPropertyMaximumOutputPacketSize, &size, &value);

    The returned value is the maximum size, in bytes, of a packet the encoder can output.

    Then call AudioConverterFillComplexBuffer to encode:

    AudioBufferList outAudioBufferList = {0};
    outAudioBufferList.mNumberBuffers = 1;
    outAudioBufferList.mBuffers[0].mNumberChannels = 1;
    outAudioBufferList.mBuffers[0].mDataByteSize = value; // value is the maximum packet size queried above
    outAudioBufferList.mBuffers[0].mData = malloc(value); // free() it once the encoded packet has been consumed
    UInt32 ioOutputDataPacketSize = 1;                    // ask for one output packet per call
    status = AudioConverterFillComplexBuffer(_audioConverter, inInputDataProc, (__bridge void *)(self), &ioOutputDataPacketSize, &outAudioBufferList, NULL);

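    In this call, inInputDataProc is the input-data callback that feeds PCM data to the converter. Setting ioOutputDataPacketSize to 1 means the call returns as soon as one encoded packet (one AAC frame) has been produced. outAudioBufferList receives the encoded data.

    inInputDataProc does the following: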

    static OSStatus inInputDataProc(AudioConverterRef inAudioConverter, UInt32 *ioNumberDataPackets, AudioBufferList *ioData, AudioStreamPacketDescription **outDataPacketDescription, void *inUserData)
    {
        AACEncoder *encoder = (__bridge AACEncoder *)(inUserData);
        UInt32 requestedPackets = *ioNumberDataPackets;

        uint8_t *buffer;
        uint32_t bufferLength = requestedPackets * 2; // 2 bytes per packet: 16-bit mono PCM
        uint32_t bufferRead;

        // Pull the requested amount of PCM out of the ring buffer.
        bufferRead = [encoder.pcmPool readBuffer:&buffer withLength:bufferLength];
        if (bufferRead == 0) {
            // No data available: report zero packets and a non-zero status.
            *ioNumberDataPackets = 0;
            return -1;
        }

        ioData->mBuffers[0].mData = buffer;
        ioData->mBuffers[0].mDataByteSize = bufferRead;
        ioData->mNumberBuffers = 1;
        ioData->mBuffers[0].mNumberChannels = 1;
        *ioNumberDataPackets = bufferRead >> 1;       // bytes -> 16-bit packets actually supplied

        return noErr;
    }

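    pcmPool is a ring buffer that holds PCM data.

    Because the capture input does not necessarily deliver 1024 samples at a time, the data can be buffered and the encoder called only once 1024 samples are available.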
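
    The buffering idea can be sketched in plain C. This is not the author's pcmPool, whose implementation is not shown; `PcmPool`, `pool_write`, `pool_read_frame` and the 4096-sample capacity are all assumptions made for illustration:

```c
#include <stdint.h>
#include <stddef.h>

#define AAC_FRAME_SAMPLES 1024   /* samples consumed per AAC frame */
#define POOL_CAPACITY     4096   /* assumed capacity, in samples */

/* Hypothetical ring buffer of 16-bit mono samples. */
typedef struct {
    int16_t data[POOL_CAPACITY];
    size_t  head;    /* index of the oldest sample */
    size_t  count;   /* number of samples currently stored */
} PcmPool;

/* Append up to n samples; returns how many were actually stored. */
static size_t pool_write(PcmPool *p, const int16_t *src, size_t n)
{
    size_t written = 0;
    while (written < n && p->count < POOL_CAPACITY) {
        p->data[(p->head + p->count) % POOL_CAPACITY] = src[written++];
        p->count++;
    }
    return written;
}

/* Pop exactly one frame (1024 samples) into dst.
   Returns 1 on success, 0 if fewer than 1024 samples are buffered. */
static int pool_read_frame(PcmPool *p, int16_t *dst)
{
    if (p->count < AAC_FRAME_SAMPLES) {
        return 0;
    }
    for (size_t i = 0; i < AAC_FRAME_SAMPLES; i++) {
        dst[i] = p->data[(p->head + i) % POOL_CAPACITY];
    }
    p->head = (p->head + AAC_FRAME_SAMPLES) % POOL_CAPACITY;
    p->count -= AAC_FRAME_SAMPLES;
    return 1;
}
```

    The capture callback would call `pool_write` with whatever it received, and the encoding loop would run only while `pool_read_frame` succeeds, so the encoder is never driven with a partial frame.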

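    In addition, for a TS file each AAC packet must be prefixed with an ADTS header. The ADTS header is 7 bytes long (when no CRC is present); it carries the encoding parameters of the AAC data so that a decoder can decode it.

    The ADTS header is computed as follows: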

    - (NSData*) adtsDataForPacketLength:(NSUInteger)packetLength {
        int adtsLength = 7;
        char *packet = (char *)malloc(sizeof(char) * adtsLength);

        // Variables Recycled by addADTStoPacket
        int profile = 2;  // AAC LC
        // 39=MediaCodecInfo.CodecProfileLevel.AACObjectELD;
        int freqIdx = 8;  // 16 kHz; use 4 for the 44100 Hz input configured earlier
        int chanCfg = 1;  // MPEG-4 Audio Channel Configuration. 1 Channel front-center
        NSUInteger fullLength = adtsLength + packetLength; // header + AAC payload

        // fill in ADTS data
        packet[0] = (char)0xFF; // 11111111 = syncword (high 8 bits)
        packet[1] = (char)0xF9; // 1111 1 00 1 = syncword (low 4 bits), MPEG-2 ID, layer 00, no CRC
        packet[2] = (char)(((profile-1)<<6) + (freqIdx<<2) + (chanCfg>>2));
        packet[3] = (char)(((chanCfg&3)<<6) + (fullLength>>11));
        packet[4] = (char)((fullLength&0x7FF) >> 3);
        packet[5] = (char)(((fullLength&7)<<5) + 0x1F);
        packet[6] = (char)0xFC;

        // NSData takes ownership of the malloc'ed bytes and frees them.
        NSData *data = [NSData dataWithBytesNoCopy:packet length:adtsLength freeWhenDone:YES];
        return data;
    }
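
    The method above hard-codes the sampling frequency index. As a sketch only, the same 7-byte layout can be written in plain C with the index looked up from the sample rate using the standard MPEG-4 sampling frequency table. The function names are hypothetical, and the second byte keeps the 0xF9 value used above:

```c
#include <stdint.h>
#include <stddef.h>

/* MPEG-4 sampling frequency index table (indices 0..12). */
static const uint32_t kAdtsSampleRates[] = {
    96000, 88200, 64000, 48000, 44100, 32000, 24000,
    22050, 16000, 12000, 11025, 8000, 7350
};

/* Return the sampling frequency index for a rate, or -1 if it is not in the table. */
static int adts_freq_index(uint32_t sample_rate)
{
    int n = (int)(sizeof kAdtsSampleRates / sizeof kAdtsSampleRates[0]);
    for (int i = 0; i < n; i++) {
        if (kAdtsSampleRates[i] == sample_rate) return i;
    }
    return -1;
}

/* Fill a 7-byte ADTS header (no CRC) for one raw AAC packet.
   object_type is the MPEG-4 audio object type (2 = AAC LC).
   Returns 0 on success, -1 if an argument does not fit the header fields. */
static int adts_write_header(uint8_t out[7], int object_type, uint32_t sample_rate,
                             int channel_config, size_t payload_len)
{
    int freq_idx = adts_freq_index(sample_rate);
    size_t full_len = payload_len + 7;   /* header + payload, 13-bit field */
    if (freq_idx < 0 || object_type < 1 || object_type > 4 ||
        channel_config < 0 || channel_config > 7 || full_len > 0x1FFF) {
        return -1;
    }
    out[0] = 0xFF;   /* syncword, high 8 bits */
    out[1] = 0xF9;   /* syncword low 4 bits, MPEG-2 ID, layer 00, no CRC */
    out[2] = (uint8_t)(((object_type - 1) << 6) | (freq_idx << 2) | (channel_config >> 2));
    out[3] = (uint8_t)(((channel_config & 3) << 6) | (full_len >> 11));
    out[4] = (uint8_t)((full_len & 0x7FF) >> 3);
    out[5] = (uint8_t)(((full_len & 7) << 5) | 0x1F);
    out[6] = 0xFC;
    return 0;
}
```

    For an AAC LC mono packet of 193 bytes at 44100 Hz, the frame length field holds 200 and the third byte is 0x50 (profile bits 01, frequency index 4). Deriving the index from the actual sample rate avoids the header and the encoder configuration drifting apart.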
