A Hands-On Look at iFLYTEK's AIUI Intelligent Voice Service

Author: 代码坊 | Published: 2017-09-25 11:14

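    I recently evaluated AIUI, iFLYTEK's intelligent voice service, and built a fairly simple demo on top of the official one (Android -> AIUI -> server-side post-processing) to try out its features.

    The official Android demo offers dictation, grammar recognition, semantic understanding, speech synthesis, voiceprint password, and other features. I mainly used semantic understanding and speech synthesis.

    An earlier article introduced Amazon's Alexa, and AIUI appears to be aiming at something similar. For now, though, AIUI's capabilities are limited and its developer documentation is mediocre: much of it walks through the web console, while many key points are poorly explained. For the features developers care about most, such as custom skills and custom entities, the documentation only briefly shows how to add one and never says what a custom entity actually is. The official demos are also very basic. Material on post-processing, which third-party application developers need, is especially scarce: the one official example could hardly be simpler. It only shows AIUI forwarding the semantic analysis result to the third party's post-processing service, and it stops as soon as the service receives the data. How to handle the data, what format to return, and how the returned data is used are not mentioned at all, which was frustrating to discover.

    Scenarios AIUI currently suits

    Voice guidance: for example, general-purpose features such as "weather" among the open skills, where the query does not need to be tied to user information and contains a predictable keyword such as "weather" (the so-called "custom entity").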


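    What AIUI cannot handle

    Cases where the corpus cannot be set up in advance, such as account-related information or voice data entry (for example, a medical consultation app recording the user's name).

    What I wanted to build before the evaluation

    1. Voice guidance: as with an ATM or similar self-service terminal, the user interacts with the terminal by voice. For example, the terminal asks "Hello, how can I help you?", the user answers "take a photo", and the terminal enters its photo function.

    AIUI can currently meet this need; how to implement it is covered later in this article.

    2. Voice data entry: information is entered through the user's speech, for example the user's name, age, and medical conditions.

    AIUI is currently not suited to this scenario. Too many names sound the same but are written differently: wangxiaoer, for instance, could be "王小二" or "王晓二". In such cases voice entry costs more than it gains.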

    Custom skills


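    When the open skills lack the feature you need, you implement a custom skill yourself: semantic understanding extracts the keywords, and the "post-processing" server performs the operations.

    For example, set the corpus phrase "我尚" in a custom skill. When the user says "我尚", AIUI triggers that custom skill with "我尚" as the query. Once the post-processing service receives this semantic result, it can perform some operations and return data: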

    {
        "category": "ISHANG.mHealth_demo:11.0",
        "intentType": "custom",
        "query": "我尚",
        "query_ws": "我/NP//  尚/ADD//",
        "rc": 0,
        "nlis": "true",
        "service": "ISHANG.mHealth_demo",
        "uuid": "atn000167dc@ch60f10d1d86686f2601",
        "vendor": "ISHANG",
        "version": "11.0",
        "semantic": [
            {
                "intent": "init",
                "score": 1,
                "slots": []
            }
        ],
        "sid": "atn000167dc@ch60f10d1d86686f2601",
        "text": "我尚"
    }
    

    Note the rc field: 0 means semantic understanding succeeded. When it fails, the result looks like this (rc is 4):

    {
        "rc": 4,
        "uuid": "atn00016bf3@ch60f10d1d86f86f2601",
        "sid": "atn00016bf3@ch60f10d1d86f86f2601",
        "text": "大王"
    }
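
The two results above differ mainly in rc and in whether semantic is present. A minimal Python sketch of that check follows. The helper name summarize_result is hypothetical, and treating every non-zero rc as a failure is an assumption, since only the values 0 and 4 appear in this article.

```python
import json


def summarize_result(result: dict) -> str:
    """Return a one-line summary of an AIUI semantic result."""
    # Assumption: any rc other than 0 is handled as "not understood".
    if result.get("rc") != 0:
        return f"not understood: {result.get('text', '')}"
    first = result["semantic"][0]
    return f"{result['service']} -> {first['intent']}"


# Trimmed copies of the two samples shown above.
ok = json.loads(
    '{"rc": 0, "service": "ISHANG.mHealth_demo", '
    '"semantic": [{"intent": "init", "score": 1, "slots": []}], "text": "我尚"}'
)
failed = json.loads('{"rc": 4, "text": "大王"}')

print(summarize_result(ok))
print(summarize_result(failed))
```

Checking rc before touching semantic matters because the failed result carries no semantic field at all.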
    

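    Custom entities

    If a custom skill uses a phrase such as "我是{name}" ("I am {name}"), {name} is a custom entity (think of it as a corpus). The open entities include provinces, cities, song titles, and so on. If none fits, define your own and add an entry such as "张三" (Zhang San); during semantic understanding, the matched value then appears in a semantic slot.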

    {
        "category": "ISHANG.mHealth_demo:11.0",
        "intentType": "custom",
        "query": "我是张三",
        "query_ws": "我/NP//  是/V_SHI//  张三/NPP//",
        "rc": 0,
        "nlis": "true",
        "service": "ISHANG.mHealth_demo",
        "uuid": "atn00016f72@ch0de90d1d87956f2a01",
        "vendor": "ISHANG",
        "version": "11.0",
        "semantic": [
            {
                "intent": "input_name",
                "score": 1,
                "slots": [
                    {
                        "begin": 2,
                        "end": 4,
                        "name": "name",
                        "normValue": "张三",
                        "value": "张三"
                    }
                ]
            }
        ],
        "sid": "atn00016f72@ch0de90d1d87956f2a01",
        "text": "我是张三"
    }
    

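    Note that "张三" had already been added to the custom entity, so it shows up in the JSON at semantic[0].slots[0]. This is the essence of semantic understanding: you add entries to the corpus in advance, and in the semantic result the matched entry is extracted on its own, ready to be used as a business-logic parameter in post-processing.

    However, the corpus must be entered in advance, or semantic understanding fails. In some cases, such as recording a name, understanding will fail unless you turn the name of every person in China into a corpus. What then?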
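
Pulling the slot out for use as a business-logic parameter could look like the Python sketch below. The function name extract_slots is hypothetical, and falling back from normValue to value is an assumption, made because some slots in the weather sample later in this article carry no normValue.

```python
def extract_slots(result: dict) -> dict:
    """Map slot name -> normalized value for the first semantic entry."""
    if result.get("rc") != 0 or not result.get("semantic"):
        return {}
    slots = result["semantic"][0].get("slots", [])
    # Prefer normValue; fall back to the raw value when it is missing.
    return {s["name"]: s.get("normValue", s.get("value")) for s in slots}


# Trimmed copy of the "我是张三" result shown above.
sample = {
    "rc": 0,
    "semantic": [
        {
            "intent": "input_name",
            "score": 1,
            "slots": [
                {"begin": 2, "end": 4, "name": "name",
                 "normValue": "张三", "value": "张三"}
            ],
        }
    ],
    "text": "我是张三",
}

print(extract_slots(sample))
```

Returning an empty dict for a failed result lets the caller treat "no slots" and "not understood" the same way when that is convenient.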


    For an entity that has not been added, the result is as follows: rc is 4, meaning the semantics could not be understood.

    {
        "rc": 4,
        "uuid": "atn000176af@ch0de90d1d88736f2a01",
        "sid": "atn000176af@ch0de90d1d88736f2a01",
        "text": "我是例子"
    }
    

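    Post-processing

    If post-processing is configured, the AIUI server forwards the semantic understanding result to the post-processing server. In the function (or method) on that server that receives AIUI's forwarded request via POST, we can extract the semantic result, run queries or other operations, and then return a response.

    POST request data

    The key data is in the Msg.Content field. Note that SessionParams and Msg.Content are Base64-encoded and must be decoded before use. The complete request data after decoding is as follows: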

    {"MsgId":"cid6f1c2494@ch00270d1c09d20100101","CreateTime":1505803732,"AppId":"59bf6334","UserId":"d3146084944","SessionParams":{"dsrc":"sdk","dts":"1","dtype":"audio","msc.lat":"39.895252","msc.lng":"116.343834","scene":"main","scity":"ch","sid":"cid6f1c2494@ch00270d1c09d2010010","stmid":"audio-16","ver_type":"mobile_phone","wake_id":"15058037304161d1c41c87ab7cd3c"},"UserParams":"","FromSub":"kc","Msg":{"ContentType":"json","Type":"text","Content":{"intent":{"data":{"result":[{"airData":40,"airQuality":"优","city":"北京","date":"2017-09-19","dateLong":1505750400,"exp":{"ct":{"expName":"穿衣指数","level":"热","prompt":"天气热,建议着短裙、短裤、短薄外套、T恤等夏季服装。"}},"humidity":"20%","lastUpdateTime":"2017-09-19 11:39:20","pm25":"13","temp":29,"tempRange":"14℃~30℃","weather":"晴","weatherType":0,"wind":"西北风3-4级","windLevel":1},{"city":"北京","date":"2017-09-20","dateLong":1505836800,"lastUpdateTime":"2017-09-19 11:39:20","tempRange":"14℃~27℃","weather":"晴","weatherType":0,"wind":"南风微风","windLevel":0},{"city":"北京","date":"2017-09-21","dateLong":1505923200,"lastUpdateTime":"2017-09-19 11:39:20","tempRange":"17℃~28℃","weather":"多云","weatherType":1,"wind":"南风微风","windLevel":0},{"city":"北京","date":"2017-09-22","dateLong":1506009600,"lastUpdateTime":"2017-09-19 11:39:20","tempRange":"15℃~28℃","weather":"晴","weatherType":0,"wind":"西北风微风","windLevel":0},{"city":"北京","date":"2017-09-23","dateLong":1506096000,"lastUpdateTime":"2017-09-19 11:39:20","tempRange":"18℃~29℃","weather":"晴转多云","weatherType":0,"wind":"南风微风","windLevel":0},{"city":"北京","date":"2017-09-24","dateLong":1506182400,"lastUpdateTime":"2017-09-19 11:39:20","tempRange":"19℃~28℃","weather":"阴","weatherType":2,"wind":"东风微风","windLevel":0},{"city":"北京","date":"2017-09-25","dateLong":1506268800,"lastUpdateTime":"2017-09-19 11:39:20","tempRange":"19℃~28℃","weather":"多云转阴","weatherType":1,"wind":"东南风微风","windLevel":0}]},"rc":0,"semantic":[{"intent":"QUERY","slots":[{"name":"location.city","value":"CURRENT_CITY","normValue":"CURRENT_CITY"},{"name":"location.poi","value":"CURRENT_POI","normValue":"CURRENT_POI"},{"name":"location.type","value":"LOC_POI","normValue":"LOC_POI"},{"name":"queryType","value":"内容"},{"name":"subfocus","value":"天气状态"}]}],"service":"weather","text":"天气","uuid":"atn00913a37@ch46b50d1c09d46f2a01","used_state":{"state_key":"fg::weather::default::default","state":"default"},"answer":{"text":"\"北京\"今天\"晴\",\"14℃~30℃\",\"西北风3-4级\""},"dialog_stat":"DataValid","sid":"cid6f1c2494@ch00270d1c09d2010010"}}}}
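
The decoding step can be sketched in Python as follows (the demo's own server is written in Node.js; Python is used here only for illustration). The helper names are hypothetical, and the sketch assumes that each encoded field holds UTF-8 JSON text once the Base64 layer is removed, which matches the decoded request shown above but is not spelled out in the official documentation.

```python
import base64
import json


def decode_field(value: str) -> dict:
    """Base64-decode a field and parse it as UTF-8 JSON (assumed encoding)."""
    return json.loads(base64.b64decode(value).decode("utf-8"))


def decode_request(body: dict) -> dict:
    """Return a copy of the POST body with the two encoded fields decoded."""
    decoded = dict(body)
    decoded["SessionParams"] = decode_field(body["SessionParams"])
    msg = dict(body["Msg"])
    msg["Content"] = decode_field(msg["Content"])
    decoded["Msg"] = msg
    return decoded


def _encode(obj: dict) -> str:
    """Test helper: build a Base64 field the way the sketch expects it."""
    text = json.dumps(obj, ensure_ascii=False)
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


# A tiny stand-in for the real request, with only a few fields filled in.
raw = {
    "AppId": "59bf6334",
    "SessionParams": _encode({"scene": "main"}),
    "Msg": {
        "ContentType": "json",
        "Type": "text",
        "Content": _encode({"intent": {"rc": 0, "text": "天气"}}),
    },
}

request = decode_request(raw)
print(request["Msg"]["Content"]["intent"]["text"])
```

The function copies the body instead of decoding in place, so the raw request stays available for logging.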
    

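    Response format

    Model the response on the JSON returned by open skills such as "weather": put the result in data or in the intent's answer, and reuse the data from the POST request for the other fields. The data for "weather" after semantic understanding is as follows: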

    {
      "data": {
        "result": [
          {
            "airData": 44,
            "airQuality": "优",
            "city": "北京",
            "date": "2017-09-20",
            "dateLong": 1505836800,
            "exp": {
              "ct": {
                "expName": "穿衣指数",
                "level": "热",
                "prompt": "天气热,建议着短裙、短裤、短薄外套、T恤等夏季服装。"
              }
            },
            "humidity": "25%",
            "lastUpdateTime": "2017-09-20 11:07:03",
            "pm25": "10",
            "temp": 24,
            "tempRange": "14℃~27℃",
            "weather": "晴",
            "weatherType": 0,
            "wind": "北风微风",
            "windLevel": 0
          },
          {
            "city": "北京",
            "date": "2017-09-21",
            "dateLong": 1505923200,
            "lastUpdateTime": "2017-09-20 11:07:03",
            "tempRange": "17℃~29℃",
            "weather": "多云",
            "weatherType": 1,
            "wind": "南风微风",
            "windLevel": 0
          },
          {
            "city": "北京",
            "date": "2017-09-22",
            "dateLong": 1506009600,
            "lastUpdateTime": "2017-09-20 11:07:03",
            "tempRange": "13℃~27℃",
            "weather": "晴",
            "weatherType": 0,
            "wind": "西北风微风",
            "windLevel": 0
          },
          {
            "city": "北京",
            "date": "2017-09-23",
            "dateLong": 1506096000,
            "lastUpdateTime": "2017-09-20 11:07:03",
            "tempRange": "18℃~29℃",
            "weather": "晴转多云",
            "weatherType": 0,
            "wind": "南风微风",
            "windLevel": 0
          },
          {
            "city": "北京",
            "date": "2017-09-24",
            "dateLong": 1506182400,
            "lastUpdateTime": "2017-09-20 11:07:03",
            "tempRange": "19℃~28℃",
            "weather": "阴",
            "weatherType": 2,
            "wind": "东风微风",
            "windLevel": 0
          },
          {
            "city": "北京",
            "date": "2017-09-25",
            "dateLong": 1506268800,
            "lastUpdateTime": "2017-09-20 11:07:03",
            "tempRange": "19℃~28℃",
            "weather": "多云转阴",
            "weatherType": 1,
            "wind": "东南风微风",
            "windLevel": 0
          },
          {
            "city": "北京",
            "date": "2017-09-26",
            "dateLong": 1506355200,
            "lastUpdateTime": "2017-09-20 11:07:03",
            "tempRange": "13℃~25℃",
            "weather": "晴",
            "weatherType": 0,
            "wind": "西北风3-4级",
            "windLevel": 1
          }
        ]
      },
      "rc": 0,
      "semantic": [
        {
          "intent": "QUERY",
          "slots": [
            {
              "name": "location.city",
              "value": "CURRENT_CITY",
              "normValue": "CURRENT_CITY"
            },
            {
              "name": "location.poi",
              "value": "CURRENT_POI",
              "normValue": "CURRENT_POI"
            },
            {
              "name": "location.type",
              "value": "LOC_POI",
              "normValue": "LOC_POI"
            },
            {
              "name": "queryType",
              "value": "内容"
            },
            {
              "name": "subfocus",
              "value": "天气状态"
            }
          ]
        }
      ],
      "service": "weather",
      "text": "天气",
      "uuid": "atn00018593@ch60f10d1d8a556f2601",
      "used_state": {
        "state_key": "fg::weather::default::default",
        "state": "default"
      },
      "answer": {
        "text": "\"北京\"今天\"晴\",\"14℃~27℃\",\"北风微风\""
      },
      "dialog_stat": "DataValid",
      "sid": "atn00018593@ch60f10d1d8a556f2601"
    }
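
Following that rule, a post-processing reply can be assembled by copying the received intent and filling in answer (and optionally data). The Python sketch below is written under assumptions: build_reply is a hypothetical helper, and whether AIUI expects the bare intent object or some outer envelope around it is not stated in the material available here.

```python
import copy


def build_reply(intent: dict, answer_text: str, result=None) -> dict:
    """Copy the received intent and attach the post-processing result."""
    reply = copy.deepcopy(intent)  # keep rc, service, text, sid, ... as received
    reply["answer"] = {"text": answer_text}
    if result is not None:
        # Same nesting as the weather sample: data.result is a list.
        reply["data"] = {"result": result}
    return reply


# Trimmed copy of the "我尚" semantic result from earlier in the article.
received = {
    "rc": 0,
    "service": "ISHANG.mHealth_demo",
    "semantic": [{"intent": "init", "score": 1, "slots": []}],
    "text": "我尚",
    "sid": "atn000167dc@ch60f10d1d86686f2601",
}

reply = build_reply(received, "Welcome. Please say your name.")
print(reply["answer"]["text"])
```

The deep copy keeps the received intent unchanged, which helps when the same request is also logged or inspected later.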
    

    DEMO

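    Demo address. The demo includes the post-processing server and an Android app.

    Post-processing server

    Implemented in Node.js

    • The GET handler mainly implements AIUI's verification of the post-processing server.
    • The POST handler implements a very simple state machine: it combines the semantic result sent by AIUI with a code variable to decide what data to return. The response format follows the open skill "weather".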

    Android App

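    Built on the official demo (after you register an Android application on AIUI, the project code can be downloaded from the application's settings page; regrettably, it is an Eclipse project).

    • In the semantic understanding demo, added spoken playback of the result, built on the speech synthesis feature.
    • Say "我尚" and it replies "Welcome to ... please say your name"; then say "张三" and it replies "Your name is 张三, please say your age"; then say "28" and it replies "Your age is 28, thank you for using the service, goodbye".

    AIUI configuration for this demo

    (Screenshots: application configuration; custom skill init; custom skill input_name; custom entity)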
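
The three-turn dialogue above, driven by the semantic result plus a code variable, can be sketched as a small state machine. This is a Python illustration, not the demo's Node.js source. The class name Dialog, the meaning of each code value, the English reply strings, and the fallback to the raw text when no name slot is present are all assumptions.

```python
class Dialog:
    """Tiny state machine: code 0 = idle, 1 = expecting name, 2 = expecting age."""

    def __init__(self) -> None:
        self.code = 0

    def handle(self, intent: dict) -> str:
        text = intent.get("text", "")
        semantic = intent.get("semantic") or []
        understood = intent.get("rc") == 0 and bool(semantic)
        intent_name = semantic[0].get("intent") if understood else None

        if intent_name == "init":
            self.code = 1
            return "Welcome. Please say your name."
        if self.code == 1:
            slots = semantic[0].get("slots", []) if intent_name == "input_name" else []
            # Assumption: use the raw text when AIUI did not fill a name slot.
            name = next((s["value"] for s in slots if s["name"] == "name"), text)
            self.code = 2
            return f"Your name is {name}. Please say your age."
        if self.code == 2:
            self.code = 0
            return f"Your age is {text}. Thank you, goodbye."
        return "Sorry, I did not understand."


dialog = Dialog()
turns = [
    {"rc": 0, "semantic": [{"intent": "init", "slots": []}], "text": "我尚"},
    {"rc": 0, "semantic": [{"intent": "input_name",
                            "slots": [{"name": "name", "value": "张三"}]}],
     "text": "我是张三"},
    {"rc": 4, "text": "28"},
]
for turn in turns:
    print(dialog.handle(turn))
```

A single instance holds one conversation; a real server would need one code value per user or session, for example keyed by the UserId field of the POST request.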

    Article link: https://www.haomeiwen.com/subject/mpmosxtx.html