2020-02-18 Speech conversion with Python + IBM Cloud

Author: 菜菜笛 | Published: 2020-02-18 23:32


Quick demo page where you can upload audio and test transcription: https://speech-to-text-demo (the rest of the URL is a long string, omitted here)

Use this button on that page to create the application:

image.png

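If you have not registered for IBM Cloud yet, you need to create an account before you can create resources.
I ran into a small problem during registration: a 163 email address could not be used to register, but a QQ email address worked.
After logging in, click "Create resource"; the resource name is "Speech to Text".
The free plan allows 500 minutes of use per month.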

image.png
Obtain the service credentials:

image.png

Install the required module:

pip install ibm-watson
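
Before running the main script, it can help to confirm that the installation made the modules importable. The following is a minimal sketch; the helper name `is_installed` is hypothetical, and the two module names are simply the ones imported by the script in this post.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Module names taken from the import statements used in this post.
for name in ("ibm_watson", "ibm_cloud_sdk_core"):
    print(name, "OK" if is_installed(name) else "missing")
```

If either line prints "missing", re-run the pip command in the same Python environment that will run the script.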

Python code:

# -*- coding: utf-8 -*-
import json
from os.path import join, dirname
from ibm_watson import SpeechToTextV1
from ibm_watson.websocket import RecognizeCallback, AudioSource
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

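API_KEY = 'your_api_key'      # placeholder: API key from the service credentials
API_URL = 'your_service_url'  # placeholder: URL from the service credentials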

authenticator = IAMAuthenticator(API_KEY)
speech_to_text = SpeechToTextV1(
    authenticator=authenticator
)

speech_to_text.set_service_url(API_URL)

class MyRecognizeCallback(RecognizeCallback):
    def __init__(self):
        RecognizeCallback.__init__(self)

    def on_data(self, data):
        print(json.dumps(data, indent=2))

    def on_error(self, error):
        print('Error received: {}'.format(error))

    def on_inactivity_timeout(self, error):
        print('Inactivity timeout: {}'.format(error))

myRecognizeCallback = MyRecognizeCallback()

with open(join(dirname(__file__), 'audio-file.flac'), 'rb') as audio_file:
    audio_source = AudioSource(audio_file)
    speech_to_text.recognize_using_websocket(
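        # AudioSource object wrapping the audio to transcribe. Required.
        audio=audio_source,
        # Audio format (MIME type) of the file. Required.
        content_type='audio/flac',
        # Callback object; data returned by the service is handled in its on_data method. Required.
        recognize_callback=myRecognizeCallback,
        # Language model to use; defaults to en-US_BroadbandModel.
        # For Chinese: zh-CN_BroadbandModel, zh-CN_NarrowbandModel
        # model='en-US_BroadbandModel',
        # Spot the listed keywords when their confidence meets the threshold
        # keywords=['colorado', 'tornado', 'tornadoes'],
        # keywords_threshold=0.5,
        # Return up to 3 alternative transcripts
        # max_alternatives=3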
        )

# Equivalent curl command for the code above (Windows cmd syntax, ^ continues the line):
# curl -X POST -u "apikey:your_api_key" ^
# --header "Content-Type: audio/flac" ^
# --data-binary @C:\Users\wuxd\Desktop\audio-file.flac ^
# "your_service_url/v1/recognize"
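
The `on_data` callback above prints the whole JSON response. If only the recognized text is needed, it can be pulled out of the `results` / `alternatives` fields of that response. The following is a minimal sketch; the helper name `extract_transcripts` and the `sample` dictionary are hypothetical and only mimic the shape of a response, they are not real service output.

```python
def extract_transcripts(data):
    """Return (transcript, confidence) pairs, one per result in the response.

    Only the first (top-ranked) alternative of each result is used.
    """
    pairs = []
    for result in data.get("results", []):
        if not result.get("final", True):
            continue  # skip results explicitly marked as interim
        alternatives = result.get("alternatives", [])
        if alternatives:
            best = alternatives[0]
            pairs.append((best.get("transcript", "").strip(),
                          best.get("confidence")))
    return pairs


# Hypothetical sample shaped like a recognition response (not real output).
sample = {
    "result_index": 0,
    "results": [
        {
            "final": True,
            "alternatives": [
                {"transcript": "hello world ", "confidence": 0.94}
            ],
        }
    ],
}

for text, confidence in extract_transcripts(sample):
    print(text, confidence)
```

Inside `MyRecognizeCallback.on_data`, the same helper could be called as `extract_transcripts(data)` in place of `json.dumps`.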

The code, further details, and parameter descriptions are all here. Click here.

About audio formats: to transcribe mp3 or wav files with the code above, change the content type to audio/mp3 or audio/wav.
Detailed description of audio formats
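
To avoid editing `content_type` by hand for each file, the value can be derived from the file extension. The following is a minimal sketch covering only the three formats mentioned in this post; the helper name `guess_content_type` and the `CONTENT_TYPES` table are hypothetical, and other formats would need entries taken from the audio format documentation.

```python
from os.path import splitext

# Extension-to-content-type table for the formats mentioned in this post.
CONTENT_TYPES = {
    ".flac": "audio/flac",
    ".mp3": "audio/mp3",
    ".wav": "audio/wav",
}


def guess_content_type(path):
    """Map an audio file name to its content type, case-insensitively."""
    ext = splitext(path)[1].lower()
    if ext not in CONTENT_TYPES:
        raise ValueError("Unsupported audio extension: {!r}".format(ext))
    return CONTENT_TYPES[ext]


print(guess_content_type("audio-file.flac"))
```

The returned string could then be passed as `content_type=guess_content_type(file_name)` in the call to `recognize_using_websocket`.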

Audio file size: