Related articles:
iOS Audio/Video (1) - Fundamentals
iOS Audio/Video (2) - Video Encoding: H.264 Concepts and Principles
iOS Audio/Video (3) - Video Encoding: Implementing an H.264 Codec
iOS Audio/Video (4) - AAC Audio Encoding and Decoding
What this article covers:
1. Audio fundamentals
2. Audio encoding principles
3. Audio compression formats
4. Implementing an AAC encoder
5. Implementing an AAC decoder
I. Audio Fundamentals
1. Sound
What is sound?
Sound is a wave, produced by the vibration of objects.
The three elements of a sound wave are frequency, amplitude, and waveform: frequency determines pitch, amplitude determines loudness, and the waveform determines timbre.
- Frequency: the higher the frequency, the shorter the wavelength; low-frequency sounds have longer wavelengths. Longer wavelengths bend around obstacles more easily and lose less energy, so such sounds travel farther.
- Loudness: a reflection of the wave's energy. Strike a table with different force and the volume changes accordingly. In everyday life we describe loudness in decibels.
- Timbre: at the same frequency and loudness, different objects produce different sounds; a piano and a guzheng, for example, sound completely different. The shape of the waveform determines timbre: different vibrating bodies produce different waveforms, and hence different timbres.
How sound propagates
Sound can travel through air, liquids, and solids; the medium affects the speed of propagation.
- Sound absorption: reflections off hard surfaces create a noisy, muddy feel. Absorptive materials attenuate the reflected energy of incident sound, preserving the fidelity of the original source; recording-studio walls, for example, are lined with acoustic foam.
- Sound insulation: this addresses sound passing through into (or out of) a space. Insulating materials attenuate the transmitted energy of incident sound, keeping the space quiet; KTV walls, for example, are fitted with insulation panels.
2. Digitizing an Analog Signal: PCM
The data obtained by converting an analog signal into a digital signal is PCM (Pulse-Code Modulation) data.
The conversion involves three steps: sampling, quantization, and encoding.
- Sampling
Sampling digitizes the signal along the time axis.
According to the Nyquist (sampling) theorem, the signal must be sampled at more than twice its highest frequency; this process is called A/D conversion.
Humans hear frequencies from 20 Hz to 20 kHz, so a sample rate of 44.1 kHz is common: it guarantees that components up to 20 kHz are captured, so digitization does not audibly degrade the sound. (44.1 kHz means 44,100 samples per second.)
- Quantization
Quantization digitizes the signal along the amplitude axis: each sample of the waveform is stored as a binary number of a fixed width, measured in bits.
For example, a 16-bit sample can take values in [-32768, 32767], i.e. 65,536 distinct levels.
Common bit depths are 16-bit and 24-bit; 16-bit quantization records each sample as a 16-bit binary number.
Bit depth is therefore a key indicator of digital audio quality. Quality is usually described as, say, 24-bit (bit depth) at 48 kHz sampling; standard CD audio, for instance, is 16-bit at 44.1 kHz.
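The mapping from a continuous amplitude to one of those 65,536 levels can be sketched in plain C (the helper name `quantize16` is hypothetical, not part of any audio API):

```c
#include <stdint.h>

/* Sketch: quantize a normalized sample in [-1.0, 1.0] to 16 bits,
   illustrating the [-32768, 32767] range described above. */
static int16_t quantize16(double sample) {
    if (sample >= 1.0)  return INT16_MAX;  /* clamp to  32767 */
    if (sample <= -1.0) return INT16_MIN;  /* clamp to -32768 */
    /* round to the nearest of the 65,536 levels */
    return (int16_t)(sample * 32767.0 + (sample >= 0.0 ? 0.5 : -0.5));
}
```

A full-scale input maps to 32767, silence to 0; everything in between is rounded to one of the 65,536 levels, which is exactly the information loss quantization introduces.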
- Encoding
Encoding records the sampled, quantized data in a defined format.
Raw audio data is PCM data. Describing a piece of PCM data requires three parameters:
1. Sample format (sampleFormat)
2. Sample rate (sampleRate)
3. Channel count (channel)
Example: CD quality
Sample format 16-bit, sample rate 44100 Hz, 2 channels.
Given these parameters, what is the bitrate of CD-quality data?
44100 × 16 × 2 = 1,411,200 bit/s ≈ 1378.125 Kbit/s (dividing by 1024)
And how much storage does one minute of such data occupy?
1378.125 × 60 / 8 / 1024 ≈ 10.09 MB
A more precise sampleFormat or a denser sampleRate takes more storage, but captures the sound in correspondingly finer detail.
Storing these binary values completes the conversion from analog to digital; once digitized, the data can be stored, played back, copied, or processed in any other way.
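The two calculations above can be verified with a small C sketch (the function names are mine, introduced only for illustration):

```c
/* Uncompressed PCM bitrate in Kbit/s (dividing by 1024, as the article does). */
static double pcm_bitrate_kbps(double sampleRate, double bitsPerSample, double channels) {
    return sampleRate * bitsPerSample * channels / 1024.0;
}

/* Storage for one minute of audio, in MB: Kbit/s -> Kbit -> KB -> MB. */
static double pcm_mb_per_minute(double kbps) {
    return kbps * 60.0 / 8.0 / 1024.0;
}
```

For CD quality this gives 44100 × 16 × 2 / 1024 = 1378.125 Kbit/s and about 10.09 MB per minute, matching the figures above.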
II. Audio Encoding Principles
Why encode audio at all?
The example above needs roughly 10.1 MB per minute; for storage, and even more so for real-time network transmission, that is far too much data.
What redundancy does audio encoding remove?
Compression works by discarding redundant signal: signal the human ear cannot perceive, including audio outside the audible range and audio that is masked by other sounds.
Components of a digital audio signal whose effect on perception is negligible are called redundancy:
1. Frequency-domain redundancy
2. Time-domain redundancy
3. Auditory redundancy: audio outside the audible range of 20 Hz - 20 kHz
(simply drop what humans cannot hear)
The ear's masking effect appears mainly as frequency-domain masking and time-domain masking.
Masking means the ear is sensitive only to the most prominent sounds and much less sensitive to the rest: if one band of the frequency spectrum is strong, the ear becomes insensitive to sounds in other bands.
1. Frequency-domain masking
Suppose the masker is a pure tone of a single frequency. As its level rises, the range of frequencies it masks widens: a 1 kHz signal at around 70 dB will mask several neighbouring groups of signals. The stronger the masker's energy, the wider the range it can mask.
Once a signal rises above the masking threshold, however, it becomes audible again; a 0.5 kHz signal reaching 48 dB, for example, would re-emerge. In general, the closer a weak pure tone lies in frequency to a strong one, the more easily it is masked. Because the brain masks hearing this way, and most noise is not a single frequency, whole ranges of frequencies end up masked.
2. Time-domain masking
Masking also occurs between sounds that are adjacent in time; it is divided into pre-masking, simultaneous masking, and post-masking.
The main cause is that the brain needs time to process information: pre-masking lasts only about 50 ms, while post-masking can persist for up to 200 ms.
If a loud sound is followed within 200 ms by a weak one, the weak sound is hard to hear; conversely, if a weak sound is followed within 50 ms by a strong one, the weak sound is also hard to hear. The level difference between the two sounds affects how strong the masking is.
III. Audio Compression Formats
1. WAV
WAV is one way of storing audio (it actually has many variants, but none of them compress the data). It simply prepends 44 bytes to the raw PCM data describing the sample rate, channel count, sample format, and so on.
- Strengths: excellent quality; playback is supported by a huge range of software.
- Best for: intermediate files in multimedia development; storing music and sound-effect assets.
2. MP3
MP3 offers a good compression ratio and sounds close to the WAV original; parameters should be tuned for each environment to get the best results.
- Strengths: good quality at 128 Kbit/s and above with a fairly high compression ratio; broad software and hardware support, so compatibility is excellent.
- Best for: music listening at higher bitrates where compatibility matters.
3. AAC
AAC is currently a popular lossy codec, with three main derived profiles: LC-AAC, HE-AAC, and HE-AAC v2.
- LC-AAC: the traditional profile, used mainly at medium-to-high bitrates (>= 80 Kbit/s).
- HE-AAC: used mainly at low bitrates (<= 48 Kbit/s).
- Strengths: excellent at bitrates below 128 Kbit/s; widely used for the audio track in video.
- Best for: audio below 128 Kbit/s, especially the audio track of video content.
4. Ogg
Ogg is a very promising codec that performs well at every bitrate, especially low ones. Besides good quality, its encoding algorithm is excellent, achieving better quality at a smaller bitrate: 128 Kbit/s Ogg sounds better than 192 Kbit/s (or even higher-bitrate) MP3. Due to limited software and hardware support, however, it has never matched MP3's reach.
- Strengths: better quality than MP3 at a smaller bitrate; performs well at high, medium, and low bitrates; but compatibility is weak and streaming is not supported.
- Best for: audio messages in voice-chat scenarios.
AAC container types:
- ADIF (Audio Data Interchange Format): typically used when writing audio to a file on disk. It does not support random access; decoding cannot start from the middle of the file.
- ADTS (Audio Data Transport Stream): each AAC frame is delimited with a sync word, so a client can start decoding at any point in the stream, which makes this format suitable for network transmission.
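The ADTS layout just described (and used by the encoder later in this article) can be sketched in C: the 7-byte header carries a sync word plus a 13-bit frame length that includes the header itself. The helper names below are hypothetical:

```c
#include <stdint.h>
#include <stddef.h>

/* Pack a 7-byte ADTS header for AAC LC, 44.1 kHz, mono
   (profile 2, sampling-frequency index 4, channel configuration 1). */
static void adts_pack(uint8_t header[7], size_t payloadLength) {
    const int profile = 2;  /* AAC LC */
    const int freqIdx = 4;  /* 44.1 kHz */
    const int chanCfg = 1;  /* mono, front-center */
    size_t frameLength = payloadLength + 7; /* length includes the header */
    header[0] = 0xFF;  /* syncword, high 8 bits */
    header[1] = 0xF9;  /* syncword cont., MPEG-2 ID, layer 00, no CRC */
    header[2] = (uint8_t)(((profile - 1) << 6) + (freqIdx << 2) + (chanCfg >> 2));
    header[3] = (uint8_t)(((chanCfg & 3) << 6) + (frameLength >> 11));
    header[4] = (uint8_t)((frameLength & 0x7FF) >> 3);
    header[5] = (uint8_t)(((frameLength & 7) << 5) + 0x1F);
    header[6] = 0xFC;
}

/* Recover the 13-bit frame length: 2 bits from byte 3, 8 from byte 4, 3 from byte 5. */
static size_t adts_frame_length(const uint8_t header[7]) {
    return ((size_t)(header[3] & 0x03) << 11)
         | ((size_t)header[4] << 3)
         | ((size_t)header[5] >> 5);
}
```

A decoder scans for the 0xFFF sync word, reads the frame length, and can therefore resynchronize and start decoding anywhere in the stream, which is exactly why ADTS suits network transmission.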
Converting PCM to an AAC stream:
- Configure the encoder (codec).
- Capture audio; AVFoundation has already converted the analog signal to digital, yielding PCM data.
- Collect the PCM data and feed it to the encoder.
- When encoding completes, the callback fires; write the result to a file or send it over the network.
IV. Implementing the AAC Encoder
Encoder configuration:
#import <Foundation/Foundation.h>
@interface AudioAccConfig : NSObject
// Bitrate (96000)
@property (nonatomic, assign) NSInteger bitrate;
// Channel count (1 or 2)
@property (nonatomic, assign) NSInteger channelCount;
// Sample rate (44100)
@property (nonatomic, assign) NSInteger sampleRate;
// Bits per sample (16)
@property (nonatomic, assign) NSInteger sampleSize;
+ (instancetype)defaultConfig;
@end
#import "AudioAccConfig.h"
@implementation AudioAccConfig
+ (instancetype)defaultConfig {
return [[AudioAccConfig alloc] init];
}
- (instancetype)init {
self = [super init];
if (self) {
self.bitrate = 96000;
self.channelCount = 1;
self.sampleRate = 44100;
self.sampleSize = 16;
}
return self;
}
@end
AAC encoder implementation:
#import <Foundation/Foundation.h>
#import <AVFoundation/AVFoundation.h>
@class AudioAccConfig;
@interface AACEncoder: NSObject
// Initializers
- (instancetype)init;
- (instancetype)initWithConfig: (AudioAccConfig *)config;
// Feed audio data continuously; the encoded AAC data is returned via the block
- (void)encodeSampleBuffer:(CMSampleBufferRef)sampleBuffer completionBlock:(void (^)(NSData *encodedData, NSError* error))completionBlock;
@end
#import "AACEncoder.h"
#import <AudioToolbox/AudioToolbox.h>
#import "AudioAccConfig.h"
@interface AACEncoder()
@property (nonatomic) dispatch_queue_t encoderQueue;
@property (nonatomic) dispatch_queue_t callbackQueue;
@property (nonatomic, strong) AudioAccConfig *config;
// Audio converter object
@property (nonatomic, unsafe_unretained) AudioConverterRef audioConverter;
@property (nonatomic) uint8_t *aacBuffer;
@property (nonatomic) NSUInteger aacBufferSize;
@property (nonatomic) char *pcmBuffer;
@property (nonatomic) size_t pcmBufferSize;
@end
@implementation AACEncoder
- (void)dealloc {
AudioConverterDispose(_audioConverter);
free(_aacBuffer);
}
- (instancetype)init {
return [self initWithConfig:nil];
}
- (instancetype)initWithConfig:(AudioAccConfig *)config {
self = [super init];
if (self) {
// Encoder configuration
_config = config;
if (config == nil) {
_config = AudioAccConfig.defaultConfig;
}
// Create the serial queues
_encoderQueue = dispatch_queue_create("AAC Encoder Queue", DISPATCH_QUEUE_SERIAL);
_callbackQueue = dispatch_queue_create("AAC Encoder Callback Queue", DISPATCH_QUEUE_SERIAL);
// Converter / PCM buffer / AAC buffer
_audioConverter = NULL;
_pcmBufferSize = 0;
_pcmBuffer = NULL;
_aacBufferSize = 1024;
_aacBuffer = malloc(_aacBufferSize * sizeof(uint8_t));
memset(_aacBuffer, 0, _aacBufferSize);
}
return self;
}
- (void)encodeSampleBuffer:(CMSampleBufferRef)sampleBuffer completionBlock:(void (^)(NSData * encodedData, NSError* error))completionBlock {
CFRetain(sampleBuffer);
dispatch_async(_encoderQueue, ^{
if (!self->_audioConverter) {
[self setupEncoderFromSampleBuffer:sampleBuffer];
}
// Get the CMBlockBuffer
CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
CFRetain(blockBuffer);
// Get the address and size of the PCM data
OSStatus status = CMBlockBufferGetDataPointer(blockBuffer, 0, NULL, &self->_pcmBufferSize, &self->_pcmBuffer);
NSError *error = nil;
if (status != kCMBlockBufferNoErr) {
error = [NSError errorWithDomain:NSOSStatusErrorDomain code:status userInfo:nil];
}
//NSLog(@"PCM Buffer Size: %zu", _pcmBufferSize);
// Point the AudioBufferList at _aacBuffer
memset(self->_aacBuffer, 0, self->_aacBufferSize);
AudioBufferList outAudioBufferList = {0};
outAudioBufferList.mNumberBuffers = 1;
outAudioBufferList.mBuffers[0].mNumberChannels = (uint32_t)self->_config.channelCount;
outAudioBufferList.mBuffers[0].mDataByteSize = (UInt32)self->_aacBufferSize;
outAudioBufferList.mBuffers[0].mData = self->_aacBuffer;
AudioStreamPacketDescription *outPacketDescription = NULL;
UInt32 ioOutputDataPacketSize = 1;
/**
AudioConverterFillComplexBuffer drives the converter:
_audioConverter: the audio converter
inInputDataProc: input callback, called whenever the converter needs more PCM
param 3: self, passed through to the callback
ioOutputDataPacketSize: on input, the output packet capacity; on output, packets produced
outAudioBufferList: buffer list that receives the encoded data
outPacketDescription: descriptions of the output AAC packets
*/
status = AudioConverterFillComplexBuffer(self->_audioConverter,
inInputDataProc,
(__bridge void *)(self),
&ioOutputDataPacketSize,
&outAudioBufferList,
outPacketDescription);
//NSLog(@"ioOutputDataPacketSize: %d", (unsigned int)ioOutputDataPacketSize);
NSData *data = nil;
if (status == noErr) {
// Grab the encoded AAC data
NSData *rawAAC = [NSData dataWithBytes:outAudioBufferList.mBuffers[0].mData length:outAudioBufferList.mBuffers[0].mDataByteSize];
/**
If you only need the raw AAC stream (e.g. to decode it directly), no ADTS header is required;
to write the stream to a file, an ADTS header must be prepended to every packet.
*/
// Prepend the ADTS header
NSData *adtsHeader = [self adtsDataForPacketLength:rawAAC.length];
NSMutableData *fullData = [NSMutableData dataWithData:adtsHeader];
[fullData appendData:rawAAC];
data = fullData;
} else {
error = [NSError errorWithDomain:NSOSStatusErrorDomain code:status userInfo:nil];
}
if (completionBlock) {
dispatch_async(self->_callbackQueue, ^{
completionBlock(data, error);
});
}
CFRelease(sampleBuffer);
CFRelease(blockBuffer);
});
}
- (void)setupEncoderFromSampleBuffer:(CMSampleBufferRef)sampleBuffer {
// Read the input format from the sample buffer
AudioStreamBasicDescription inAudioStreamBasicDescription = *CMAudioFormatDescriptionGetStreamBasicDescription((CMAudioFormatDescriptionRef)CMSampleBufferGetFormatDescription(sampleBuffer));
// Configure the output (encoded) format
AudioStreamBasicDescription outAudioStreamBasicDescription = {0}; // Always initialize the fields of a new audio stream basic description structure to zero, as shown here: ...
outAudioStreamBasicDescription.mSampleRate = (Float64)_config.sampleRate;
//outAudioStreamBasicDescription.mSampleRate = inAudioStreamBasicDescription.mSampleRate; // The number of frames per second of the data in the stream, when the stream is played at normal speed. For compressed formats, this field indicates the number of frames per second of equivalent decompressed data. The mSampleRate field must be nonzero, except when this structure is used in a listing of supported formats (see “kAudioStreamAnyRate”).
outAudioStreamBasicDescription.mFormatID = kAudioFormatMPEG4AAC; // kAudioFormatMPEG4AAC_HE does not work. Can't find `AudioClassDescription`. `mFormatFlags` is set to 0.
outAudioStreamBasicDescription.mFormatFlags = kMPEG4Object_AAC_LC; // Format-specific flags to specify details of the format. Set to 0 to indicate no format flags. See “Audio Data Format Identifiers” for the flags that apply to each format.
outAudioStreamBasicDescription.mBytesPerPacket = 0; // The number of bytes in a packet of audio data. To indicate variable packet size, set this field to 0. For a format that uses variable packet size, specify the size of each packet using an AudioStreamPacketDescription structure.
outAudioStreamBasicDescription.mFramesPerPacket = 1024; // The number of frames in a packet of audio data. For uncompressed audio, the value is 1. For variable bit-rate formats, the value is a larger fixed number, such as 1024 for AAC. For formats with a variable number of frames per packet, such as Ogg Vorbis, set this field to 0.
outAudioStreamBasicDescription.mBytesPerFrame = 0; // The number of bytes from the start of one frame to the start of the next frame in an audio buffer. Set this field to 0 for compressed formats. ...
outAudioStreamBasicDescription.mChannelsPerFrame = (uint32_t)_config.channelCount;
//outAudioStreamBasicDescription.mChannelsPerFrame = 1; // The number of channels in each frame of audio data. This value must be nonzero.
outAudioStreamBasicDescription.mBitsPerChannel = 0; // ... Set this field to 0 for compressed formats.
outAudioStreamBasicDescription.mReserved = 0; // Pads the structure out to force an even 8-byte alignment. Must be set to 0.
AudioClassDescription *description = [self
getAudioClassDescriptionWithType:outAudioStreamBasicDescription.mFormatID
fromManufacturer:kAppleSoftwareAudioCodecManufacturer];
/**
Create the converter:
param 1: input format description
param 2: output format description
param 3: number of class descriptions
param 4: class descriptions
param 5: out: the converter created
*/
OSStatus status = AudioConverterNewSpecific(&inAudioStreamBasicDescription,
&outAudioStreamBasicDescription,
1,
description,
&_audioConverter);
if (status != noErr) {
NSLog(@"setup converter: %d", (int)status);
}
// Set the codec quality
/**
kAudioConverterQuality_Max = 0x7F,
kAudioConverterQuality_High = 0x60,
kAudioConverterQuality_Medium = 0x40,
kAudioConverterQuality_Low = 0x20,
kAudioConverterQuality_Min = 0
*/
UInt32 temp = kAudioConverterQuality_High;
// Codec render quality
status = AudioConverterSetProperty(_audioConverter, kAudioConverterCodecQuality, sizeof(temp), &temp);
if (status != noErr) {
NSLog(@"ConverterSetProperty error: %d", (int)status);
}
// Set the bitrate
uint32_t audioBitrate = (uint32_t)_config.bitrate;
status = AudioConverterSetProperty(_audioConverter, kAudioConverterEncodeBitRate, sizeof(audioBitrate), &audioBitrate);
if (status != noErr) {
NSLog(@"ConverterSetProperty error: %d", (int)status);
}
}
// Look up the encoder class description
- (AudioClassDescription *)getAudioClassDescriptionWithType:(UInt32)type
fromManufacturer:(UInt32)manufacturer
{
static AudioClassDescription desc;
UInt32 encoderSpecifier = type;
OSStatus st;
UInt32 size;
/**
param 1: the property (available encoders)
param 2: size of the specifier
param 3: the specifier (format ID)
param 4: out: size of the property data
*/
st = AudioFormatGetPropertyInfo(kAudioFormatProperty_Encoders,
sizeof(encoderSpecifier),
&encoderSpecifier,
&size);
if (st) {
NSLog(@"error getting audio format propery info: %d", (int)(st));
return nil;
}
// Number of matching AAC encoders
unsigned int count = size / sizeof(AudioClassDescription);
// Array to hold count descriptions
AudioClassDescription descriptions[count];
// Fetch the descriptions of all matching encoders
st = AudioFormatGetProperty(kAudioFormatProperty_Encoders,
sizeof(encoderSpecifier),
&encoderSpecifier,
&size,
descriptions);
if (st) {
NSLog(@"error getting audio format propery: %d", (int)(st));
return nil;
}
for (unsigned int i = 0; i < count; i++) {
if ((type == descriptions[i].mSubType) &&
(manufacturer == descriptions[i].mManufacturer)) {
memcpy(&desc, &(descriptions[i]), sizeof(desc));
return &desc;
}
}
return nil;
}
// Called repeatedly to fill the AudioBufferList with PCM data
static OSStatus inInputDataProc(AudioConverterRef inAudioConverter, UInt32 *ioNumberDataPackets, AudioBufferList *ioData, AudioStreamPacketDescription **outDataPacketDescription, void *inUserData)
{
AACEncoder *encoder = (__bridge AACEncoder *)(inUserData);
UInt32 requestedPackets = *ioNumberDataPackets;
//NSLog(@"Number of packets requested: %d", (unsigned int)requestedPackets);
size_t copiedSamples = [encoder copyPCMSamplesIntoBuffer:ioData];
if (copiedSamples < requestedPackets) {
//NSLog(@"PCM buffer isn't full enough!");
*ioNumberDataPackets = 0;
return -1;
}
*ioNumberDataPackets = 1;
//NSLog(@"Copied %zu samples into ioData", copiedSamples);
return noErr;
}
- (size_t)copyPCMSamplesIntoBuffer:(AudioBufferList*)ioData {
size_t originalBufferSize = _pcmBufferSize;
if (!originalBufferSize) {
return 0;
}
ioData->mBuffers[0].mData = _pcmBuffer;
ioData->mBuffers[0].mDataByteSize = (UInt32)_pcmBufferSize;
ioData->mBuffers[0].mNumberChannels = (uint32_t)_config.channelCount;
_pcmBuffer = NULL;
_pcmBufferSize = 0;
return originalBufferSize;
}
/**
* Add an ADTS header at the beginning of each and every AAC packet,
* since the converter produces raw AAC data.
*
* Note the packetLen must count in the ADTS header itself.
* See: http://wiki.multimedia.cx/index.php?title=ADTS
* Also: http://wiki.multimedia.cx/index.php?title=MPEG-4_Audio#Channel_Configurations
**/
- (NSData *)adtsDataForPacketLength:(NSUInteger)packetLength {
int adtsLength = 7;
char *packet = malloc(sizeof(char) * adtsLength);
// ADTS header fields
int profile = 2; // AAC LC
int freqIdx = 4; // 44.1 kHz
int chanCfg = 1; // MPEG-4 channel configuration: 1 = mono, front-center
NSUInteger fullLength = adtsLength + packetLength;
// fill in ADTS data
packet[0] = (char)0xFF; // 11111111 = syncword
packet[1] = (char)0xF9; // 1111 1 00 1 = syncword MPEG-2 Layer CRC
packet[2] = (char)(((profile-1)<<6) + (freqIdx<<2) +(chanCfg>>2));
packet[3] = (char)(((chanCfg&3)<<6) + (fullLength>>11));
packet[4] = (char)((fullLength&0x7FF) >> 3);
packet[5] = (char)(((fullLength&7)<<5) + 0x1F);
packet[6] = (char)0xFC;
NSData *data = [NSData dataWithBytesNoCopy:packet length:adtsLength freeWhenDone:YES];
return data;
}
@end
V. Implementing the AAC Decoder
#import <Foundation/Foundation.h>
#import <AVFoundation/AVFoundation.h>
@class AudioAccConfig;
@interface AACDecoder : NSObject
// Initializers
- (instancetype)init;
- (instancetype)initWithConfig:(AudioAccConfig *)config;
// Feed AAC data continuously; the decoded PCM data is returned via the block
- (void)decodeAACData:(NSData *)data completionBlock:(void (^)(NSData *decodedData, NSError* error))completionBlock;
@end
#import "AACDecoder.h"
#import <AVFoundation/AVFoundation.h>
#import <AudioToolbox/AudioToolbox.h>
#import "AudioAccConfig.h"
typedef struct {
char *data;
UInt32 size;
UInt32 channelCount;
AudioStreamPacketDescription packetDesc;
}AudioUserData;
@interface AACDecoder()
@property (nonatomic, strong) NSCondition *converterCond;
@property (nonatomic) dispatch_queue_t decoderQueue;
@property (nonatomic) dispatch_queue_t callbackQueue;
@property (nonatomic, strong) AudioAccConfig *config;
// Audio converter object
@property (nonatomic, unsafe_unretained) AudioConverterRef audioConverter;
@property (nonatomic) char *aacBuffer;
@property (nonatomic) UInt32 aacBufferSize;
@property (nonatomic) AudioStreamPacketDescription *packetDesc;
@end
@implementation AACDecoder
- (instancetype)init {
return [self initWithConfig:nil];
}
- (instancetype)initWithConfig:(AudioAccConfig *)config {
self = [super init];
if (self) {
// Decoder configuration
_config = config;
if (config == nil) {
_config = AudioAccConfig.defaultConfig;
}
// Create the serial queues
_decoderQueue = dispatch_queue_create("AAC Decoder Queue", DISPATCH_QUEUE_SERIAL);
_callbackQueue = dispatch_queue_create("AAC Decoder Callback Queue", DISPATCH_QUEUE_SERIAL);
// Converter / AAC buffer
_audioConverter = NULL;
_aacBufferSize = 0;
_aacBuffer = NULL;
// Do not keep a pointer to a stack variable here; the packet description
// actually used during decoding lives in AudioUserData.
_packetDesc = NULL;
[self setupDecoder];
}
return self;
}
- (void)decodeAACData:(NSData *)data completionBlock:(void (^)(NSData *, NSError *))completionBlock {
if (!_audioConverter) {return;}
dispatch_async(_decoderQueue, ^{
// Wrap the AAC packet; passed to the decode callback as user data
AudioUserData userData = {0};
userData.channelCount = (UInt32)self->_config.channelCount;
userData.data = (char *)[data bytes];
userData.size = (UInt32)data.length;
userData.packetDesc.mDataByteSize = (UInt32)data.length;
userData.packetDesc.mStartOffset = 0;
userData.packetDesc.mVariableFramesInPacket = 0;
// Output capacity and packet count
UInt32 pcmBufferSize = (UInt32)(2048 * self->_config.channelCount);
UInt32 pcmDataPacketSize = 1024;
// Temporary PCM output buffer
uint8_t *pcmBuffer = malloc(pcmBufferSize);
memset(pcmBuffer, 0, pcmBufferSize);
// Output buffer list
AudioBufferList outAudioBufferList = {0};
outAudioBufferList.mNumberBuffers = 1;
outAudioBufferList.mBuffers[0].mNumberChannels = (uint32_t)self->_config.channelCount;
outAudioBufferList.mBuffers[0].mDataByteSize = (UInt32)pcmBufferSize;
outAudioBufferList.mBuffers[0].mData = pcmBuffer;
// Output packet description
AudioStreamPacketDescription outputPacketDesc = {0};
// Run the converter to pull decoded output
NSError *error = nil;
OSStatus status = AudioConverterFillComplexBuffer(self->_audioConverter, &AudioDecoderConverterComplexInputDataProc, &userData, &pcmDataPacketSize, &outAudioBufferList, &outputPacketDesc);
if (status != noErr) {
error = [NSError errorWithDomain:NSOSStatusErrorDomain code:status userInfo:nil];
// Free the temporary buffer and report the error instead of returning silently
free(pcmBuffer);
dispatch_async(self->_callbackQueue, ^{
completionBlock(nil, error);
});
return;
}
// If we got decoded data back
if (outAudioBufferList.mBuffers[0].mDataByteSize > 0) {
NSData *rawData = [NSData dataWithBytes:outAudioBufferList.mBuffers[0].mData length:outAudioBufferList.mBuffers[0].mDataByteSize];
dispatch_async(self->_callbackQueue, ^{
completionBlock(rawData, error);
});
}
free(pcmBuffer);
});
}
- (void)setupDecoder {
// Output format: PCM
AudioStreamBasicDescription outputAudioDes = {0};
outputAudioDes.mSampleRate = (Float64)_config.sampleRate; // Sample rate
outputAudioDes.mChannelsPerFrame = (UInt32)_config.channelCount; // Output channel count
outputAudioDes.mFormatID = kAudioFormatLinearPCM; // Output format
outputAudioDes.mFormatFlags = (kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked); // Signed integers, packed
outputAudioDes.mFramesPerPacket = 1; // Frames per packet
outputAudioDes.mBitsPerChannel = 16; // Bits per channel in each frame
outputAudioDes.mBytesPerFrame = outputAudioDes.mBitsPerChannel / 8 * outputAudioDes.mChannelsPerFrame; // Bytes per frame (bits / 8 * channels)
outputAudioDes.mBytesPerPacket = outputAudioDes.mBytesPerFrame * outputAudioDes.mFramesPerPacket; // Bytes per packet (frame size * frames)
outputAudioDes.mReserved = 0; // Padding for 8-byte alignment; must be 0
// Input format: AAC
AudioStreamBasicDescription inputAudioDes = {0};
inputAudioDes.mSampleRate = (Float64)_config.sampleRate;
inputAudioDes.mFormatID = kAudioFormatMPEG4AAC;
inputAudioDes.mFormatFlags = kMPEG4Object_AAC_LC;
inputAudioDes.mFramesPerPacket = 1024;
inputAudioDes.mChannelsPerFrame = (UInt32)_config.channelCount;
// Let Core Audio fill in the remaining fields of the input format
UInt32 inDesSize = sizeof(inputAudioDes);
AudioFormatGetProperty(kAudioFormatProperty_FormatInfo, 0, NULL, &inDesSize, &inputAudioDes);
// Look up the decoder class description (software codec only).
// Note: query with the input (AAC) format ID, not the PCM output format.
AudioClassDescription *audioClassDesc = [self getAudioClassDescriptionWithType:inputAudioDes.mFormatID
fromManufacturer:kAppleSoftwareAudioCodecManufacturer];
/** Create the decoder
param 1: input format description
param 2: output format description
param 3: number of class descriptions
param 4: class descriptions
param 5: out: the decoder created
*/
OSStatus status = AudioConverterNewSpecific(&inputAudioDes, &outputAudioDes, 1, audioClassDesc, &_audioConverter);
if (status != noErr) {
NSLog(@"error: failed to create AAC decoder, status=%d", (int)status);
return;
}
}
// Look up the decoder class description
- (AudioClassDescription *)getAudioClassDescriptionWithType:(UInt32)type
fromManufacturer:(UInt32)manufacturer
{
static AudioClassDescription desc;
UInt32 encoderSpecifier = type;
OSStatus st;
UInt32 size;
/**
param 1: the property (available decoders)
param 2: size of the specifier
param 3: the specifier (format ID)
param 4: out: size of the property data
*/
st = AudioFormatGetPropertyInfo(kAudioFormatProperty_Decoders,
sizeof(encoderSpecifier),
&encoderSpecifier,
&size);
if (st) {
NSLog(@"error getting audio format propery info: %d", (int)(st));
return nil;
}
// Number of matching AAC decoders
unsigned int count = size / sizeof(AudioClassDescription);
// Array to hold count descriptions
AudioClassDescription descriptions[count];
// Fetch the descriptions of all matching decoders
st = AudioFormatGetProperty(kAudioFormatProperty_Decoders,
sizeof(encoderSpecifier),
&encoderSpecifier,
&size,
descriptions);
if (st) {
NSLog(@"error getting audio format propery: %d", (int)(st));
return nil;
}
for (unsigned int i = 0; i < count; i++) {
if ((type == descriptions[i].mSubType) &&
(manufacturer == descriptions[i].mManufacturer)) {
memcpy(&desc, &(descriptions[i]), sizeof(desc));
return &desc;
}
}
return nil;
}
// Decoder input callback
static OSStatus AudioDecoderConverterComplexInputDataProc(AudioConverterRef inAudioConverter,
UInt32 *ioNumberDataPackets,
AudioBufferList *ioData,
AudioStreamPacketDescription **outDataPacketDescription,
void *inUserData) {
AudioUserData *audioDecoder = (AudioUserData *)inUserData;
if (audioDecoder->size <= 0) {
*ioNumberDataPackets = 0;
return -1;
}
// Hand the AAC packet and its description to the converter
*outDataPacketDescription = &audioDecoder->packetDesc;
(*outDataPacketDescription)[0].mStartOffset = 0;
(*outDataPacketDescription)[0].mDataByteSize = audioDecoder->size;
(*outDataPacketDescription)[0].mVariableFramesInPacket = 0;
ioData->mBuffers[0].mData = audioDecoder->data;
ioData->mBuffers[0].mDataByteSize = audioDecoder->size;
ioData->mBuffers[0].mNumberChannels = audioDecoder->channelCount;
return noErr;
}
@end