2019-01-16 Music Feature Extraction

Author: snow_14b5 | Published 2019-01-16 14:57


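Extracting features from music feels somewhat more troublesome than extracting them from text or images, because music involves both the time domain and the frequency domain, which is roughly one more dimension than text or images have. Fortunately there is now Librosa, an open-source Python module for analyzing audio signals in general, though it is geared toward music. It includes the nuts and bolts for building MIR (music information retrieval) systems.
Examples and tutorials: https://librosa.github.io/librosa/.
Installing Librosa: pip install librosa
Librosa spectral representations: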
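stft(y[, n_fft, hop_length, win_length, …]) Short-time Fourier transform (STFT)
istft(stft_matrix[, hop_length, win_length, …]) Inverse short-time Fourier transform (ISTFT)
ifgram(y[, sr, n_fft, hop_length, …]) Compute the instantaneous frequency (as a proportion of the sampling rate), obtained as the time derivative of the phase of the complex spectrum
cqt(y[, sr, hop_length, fmin, n_bins, …]) Compute the constant-Q transform of an audio signal
hybrid_cqt(y[, sr, hop_length, fmin, …]) Compute the hybrid constant-Q transform of an audio signal
pseudo_cqt(y[, sr, hop_length, fmin, …]) Compute the pseudo constant-Q transform
fmt(y[, t_min, n_fmt, kind, beta, …]) Fast Mellin transform (FMT) of a uniformly sampled signal y
interp_harmonics(x, freqs, h_range[, kind, …]) Compute the energy at harmonics of a time-frequency representation
salience(S, freqs, h_range[, weights, …]) Harmonic salience function
phase_vocoder(D, rate[, hop_length]) Phase vocoder
magphase(D) Separate a complex-valued spectrogram D into its magnitude (S) and phase (P) components, so that D = S * P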
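To make the stft and magphase entries above more concrete, here is a minimal NumPy-only sketch of the idea. It is not Librosa's implementation: the helper name simple_stft, the 440 Hz test tone, and the 22050 Hz sample rate are assumptions chosen for illustration, and unlike librosa.stft this version does no padding or centering.

```python
import numpy as np

def simple_stft(y, n_fft=2048, hop_length=512):
    """Hann-windowed STFT without padding; shape is (1 + n_fft // 2, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(y) - n_fft) // hop_length
    # Slice the signal into overlapping frames and apply the window to each one
    frames = np.stack(
        [y[i * hop_length : i * hop_length + n_fft] * window for i in range(n_frames)],
        axis=1,
    )
    # One real FFT per frame (column)
    return np.fft.rfft(frames, axis=0)

sr = 22050                          # assumed sample rate
t = np.arange(sr) / sr              # one second of sample times
y = np.sin(2 * np.pi * 440.0 * t)   # assumed test signal: a 440 Hz sine

D = simple_stft(y)
S = np.abs(D)                       # magnitude
P = np.exp(1j * np.angle(D))        # unit-modulus phase, so that D = S * P

peak_bin = int(S.mean(axis=1).argmax())
peak_hz = peak_bin * sr / 2048      # frequency of the strongest bin
print(D.shape, peak_hz)
```

The strongest bin lands within one bin width (sr / n_fft) of 440 Hz, and multiplying S by P recovers D, which is the relationship the magphase entry describes.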

The code below extracts several music features with librosa:

import urllib.request
import json
from pydub import AudioSegment
import wave
import io
import matplotlib.pyplot as plt
import librosa
import librosa.display
import numpy as np

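# Only part of the audio file is needed, so cut out a time range

def get_minute_part_wav(main_wav_path, start_time, end_time, part_wav_path):
    # Times are "minutes:seconds" strings; pydub slices in milliseconds
    start_time = (int(start_time.split(':')[0]) * 60 + int(start_time.split(':')[1])) * 1000
    end_time = (int(end_time.split(':')[0]) * 60 + int(end_time.split(':')[1])) * 1000
    # The input is a WAV file, so read it with from_wav rather than from_mp3
    sound = AudioSegment.from_wav(main_wav_path)
    word = sound[start_time:end_time]
    word.export(part_wav_path, format="wav")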

# Convert the MP3 file to WAV

sound = AudioSegment.from_mp3("d:/music_dev/606149060.mp3")
sound.export("d:/music_dev/606149060.wav", format='wav')

# Print the WAV parameters (channels, sample width, frame rate, ...)
with wave.open("d:/music_dev/606149060.wav", "rb") as f:
    print(f.getparams())
get_minute_part_wav("d:/music_dev/606149060.wav", "0:60", "1:30", "d:/music_dev/606149060_130.wav")
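
The time strings passed above are in minutes:seconds form, so "0:60" is simply another way of writing one minute. A small sketch of that arithmetic, using a hypothetical helper name (to_milliseconds) that is not part of the original script:

```python
def to_milliseconds(timestamp):
    """Convert a "minutes:seconds" string to milliseconds."""
    minutes, seconds = timestamp.split(":")
    return (int(minutes) * 60 + int(seconds)) * 1000

start_ms = to_milliseconds("0:60")   # 60000
end_ms = to_milliseconds("1:30")     # 90000
print(start_ms, end_ms, (end_ms - start_ms) / 1000)
```

With these arguments the exported clip covers the 30 seconds between 1:00 and 1:30 of the track.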

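# Display a simple waveform
x, sr = librosa.load("d:/music_dev/606149060_130.wav", sr=None)
plt.figure(figsize=(14, 5))
# waveplot exists in older librosa releases; newer ones provide waveshow instead
librosa.display.waveplot(x, sr=sr)
plt.savefig("d:/music_dev/606149060_130.png")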

# Display the chromagram

hop_length = 512
chromagram = librosa.feature.chroma_stft(y=x, sr=sr, hop_length=hop_length)
plt.figure(figsize=(15, 5))
librosa.display.specshow(chromagram, x_axis='time', y_axis='chroma', hop_length=hop_length, cmap='coolwarm')
plt.savefig("d:/music_dev/606149060_130_1.png")
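
A chromagram folds spectral energy into the 12 pitch classes of the octave, so notes an octave apart end up in the same row. The sketch below shows only that folding step for single frequencies. It is a simplified illustration rather than Librosa's filter bank, and the helper name pitch_class and the A4 = 440 Hz tuning are assumptions.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pitch_class(freq_hz, a4_hz=440.0):
    """Map a frequency to its nearest pitch class (assumes equal temperament)."""
    # MIDI note number: 69 is A4, and each semitone is one step
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    # Notes an octave apart differ by 12, so they share a pitch class
    return PITCH_CLASSES[midi % 12]

print(pitch_class(440.0), pitch_class(880.0), pitch_class(261.63), pitch_class(329.63))
```

Both 440 Hz and 880 Hz map to A, which is why the y axis of the chroma plot has only 12 rows.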

# Display the zero-crossing rate

plt.figure(figsize=(14, 5))
librosa.display.waveplot(x, sr=sr)

n0 = 9000
n1 = 9100
plt.figure(figsize=(14, 5))
plt.plot(x[n0:n1])
plt.grid()
plt.savefig("d:/music_dev/606149060_130_2.png")
zero_crossings = librosa.zero_crossings(x[n0:n1], pad=False)
print(sum(zero_crossings))
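
The zero-crossing count above is the number of times the signal changes sign inside the selected window. A minimal NumPy sketch of that idea follows. It is a simplification, not Librosa's implementation (which also offers options such as a threshold), and the helper name count_zero_crossings and the 5 Hz test signal are assumptions.

```python
import numpy as np

def count_zero_crossings(signal):
    """Count sign changes between consecutive samples."""
    negative = np.signbit(signal)
    # A crossing happens wherever the sign bit differs from the previous sample
    return int(np.sum(negative[1:] != negative[:-1]))

# Assumed test signal: one second of a 5 Hz sine sampled at 1000 Hz,
# with a small phase offset so that no sample lands exactly on zero
t = np.arange(1000) / 1000.0
signal = np.sin(2 * np.pi * 5 * t + 0.1)
count = count_zero_crossings(signal)
print(count)
```

A sine crosses zero twice per cycle, so 5 cycles give 10 crossings. Noisy or percussive audio tends to produce far more crossings than a clean pitched tone over the same window.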

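# Display the result of the CQT

# Take the magnitude first: librosa.cqt returns complex values
CQT = librosa.amplitude_to_db(np.abs(librosa.cqt(x, sr=sr)), ref=np.max)
plt.figure(figsize=(14, 5))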
librosa.display.specshow(CQT, y_axis='cqt_note')
plt.colorbar(format='%+2.0f dB')
plt.savefig("d:/music_dev/606149060_130_cqt.png")
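
What distinguishes the constant-Q transform from the STFT is that its bins are spaced geometrically, so each bin lines up with a musical note. The sketch below computes the bin center frequencies by hand, assuming 84 bins, 12 bins per octave, and a lowest bin at C1 (about 32.70 Hz), which to my knowledge match the defaults of librosa.cqt. It illustrates the bin layout only and does not compute the transform itself.

```python
import numpy as np

n_bins = 84             # assumed: 7 octaves
bins_per_octave = 12    # assumed: one bin per semitone

# C1 is MIDI note 24; A4 (MIDI 69) is 440 Hz
fmin = 440.0 * 2.0 ** ((24 - 69) / 12)

# Center frequency of bin k: fmin * 2^(k / bins_per_octave)
freqs = fmin * 2.0 ** (np.arange(n_bins) / bins_per_octave)

print(round(freqs[0], 2), round(freqs[12], 2), round(freqs[-1], 2))
```

Each bin is one semitone above the previous one, and every 12 bins the frequency doubles. This is why the plot can label its y axis with note names via y_axis='cqt_note'.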