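Building on the natural language understanding (NLU) model trained in the previous post, this part shows how to train the dialogue policy with Rasa Core's online learning. Before training the dialogue module, two files need to be prepared:
- domain.yml # defines all intents, slots, entities, and the actions the system may take
- story.md # dialogue training corpus; used at first to pretrain the dialogue model, then extended and refined with stories generated during online learning
Preparing the domain file
Writing the domain file requires thinking about the actual use case. For task-oriented dialogue, template-based response generation (NLG) is simple and practical. Rasa Core supports templates well, and they can be defined in the domain file too. The domain file for this project is: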
mobile_domain.yml:
slots:
  item:
    type: text
  time:
    type: text
  phone_number:
    type: text
  price:
    type: text

intents:
  - greet
  - confirm
  - goodbye
  - thanks
  - inform_item
  - inform_package
  - inform_time
  - request_management
  - request_search
  - deny
  - inform_current_phone
  - inform_other_phone

entities:
  - item
  - time
  - phone_number
  - price

templates:
  utter_greet:
    - "您好!,我是机器人小热,很高兴为您服务。"
    - "你好!,我是小热,可以帮您办理流量套餐,话费查询等业务。"
    - "hi!,人家是小热,有什么可以帮您吗。"
  utter_goodbye:
    - "再见,为您服务很开心"
    - "Bye, 下次再见"
  utter_default:
    - "您说什么"
    - "您能再说一遍吗,我没听清"
  utter_thanks:
    - "不用谢"
    - "我应该做的"
    - "您开心我就开心"
  utter_ask_morehelp:
    - "还有什么能帮您吗"
    - "您还想干什么"
  utter_ask_time:
    - "你想查哪个时间段的"
    - "你想查几月份的"
  utter_ask_package:
    - "我们现在支持办理流量套餐:套餐一:二十元包月三十兆;套餐二:四十元包月八十兆,请问您需要哪个?"
    - "我们有如下套餐供您选择:套餐一:二十元包月三十兆;套餐二:四十元包月八十兆,请问您需要哪个?"
  utter_ack_management:
    - "已经为您办理好了{item}"

actions:
  - utter_greet
  - utter_goodbye
  - utter_default
  - utter_thanks
  - utter_ask_morehelp
  - utter_ask_time
  - utter_ask_package
  - utter_ack_management
  - bot.ActionSearchConsume
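To make the template mechanism concrete, here is a minimal sketch of what template-based NLG amounts to: pick one variant at random and fill any {slot} placeholder from the current slot values. This is only an illustration and not Rasa Core's actual implementation; the names TEMPLATES and render are hypothetical, and the dictionary copies just two entries from the domain file.

```python
import random

# A small subset of the templates from mobile_domain.yml (copied for illustration).
TEMPLATES = {
    "utter_goodbye": ["再见,为您服务很开心", "Bye, 下次再见"],
    "utter_ack_management": ["已经为您办理好了{item}"],
}


def render(template_name, slots, rng=random):
    """Pick one variant at random and fill {slot} placeholders from `slots`.

    Hypothetical helper: Rasa Core does the equivalent internally.
    """
    variant = rng.choice(TEMPLATES[template_name])
    return variant.format(**slots)


# utter_ack_management has a single variant, so the result is deterministic.
print(render("utter_ack_management", {"item": "套餐一"}))
```

Listing several variants under one template name, as the domain file does for utter_greet, is what makes the bot's replies feel less repetitive.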
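Rasa Core also supports custom actions, which can query a database or call external APIs. One example is bot.ActionSearchConsume, implemented in bot.py, which simulates a database lookup: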
from rasa_core.actions import Action


class ActionSearchConsume(Action):
    def name(self):
        # Name under which the action appears in stories and in the policy output.
        return 'action_search_consume'

    def run(self, dispatcher, tracker, domain):
        # extract_item is a helper defined elsewhere in bot.py (not shown here).
        item = tracker.get_slot("item")
        item = extract_item(item)
        if item is None:
            dispatcher.utter_message("您好,我现在只会查话费和流量")
            dispatcher.utter_message("你可以这样问我:“帮我查话费”")
            return []

        time = tracker.get_slot("time")
        if time is None:
            dispatcher.utter_message("您想查询哪个月的消费?")
            return []

        # Query the database here, using item and time as the key.
        # You may want to normalize the time format first.
        dispatcher.utter_message("好,请稍等")
        if item == "流量":
            dispatcher.utter_message("您好,您{}共使用{}二百八十兆,剩余三十兆。".format(time, item))
        else:
            dispatcher.utter_message("您好,您{}共消费二十八元。".format(time))
        return []
Some of these utter_message calls could instead be defined as template actions in the domain file. The demo does it this way to cover more of the framework's usage.
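The comment in the action mentions normalizing the time before using it as a lookup key. The project does not show how, so the following is only a sketch under assumptions: the function name normalize_month is hypothetical, it handles just a few expressions ("上个月", "这个月", and month names such as "四月" or "4月"), and it assumes that a month later than the current one refers to the previous year.

```python
import datetime
import re

# Chinese numerals for the twelve months.
CN_MONTHS = {"一": 1, "二": 2, "三": 3, "四": 4, "五": 5, "六": 6,
             "七": 7, "八": 8, "九": 9, "十": 10, "十一": 11, "十二": 12}


def normalize_month(text, today):
    """Map a time expression to a (year, month) key, or None if not understood."""
    if "上个月" in text or "上月" in text:
        # The day before the first of this month always falls in the previous month.
        prev = today.replace(day=1) - datetime.timedelta(days=1)
        return (prev.year, prev.month)
    if "这个月" in text or "本月" in text:
        return (today.year, today.month)
    match = re.search(r"(\d{1,2}|十[一二]?|[一二三四五六七八九])月", text)
    if match is None:
        return None
    token = match.group(1)
    month = int(token) if token.isdigit() else CN_MONTHS[token]
    if not 1 <= month <= 12:
        return None
    # Assumption: a month later than the current one means last year's.
    year = today.year if month <= today.month else today.year - 1
    return (year, month)


print(normalize_month("四月的", datetime.date(2018, 6, 1)))
```

With a key like this, "上个月" and an explicit month name resolve to the same database lookup regardless of how the user phrased the request.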
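Preparing the story file
The story file contains a variety of dialogue flows and is used to train the dialogue management model. Because real dialogue data is usually unavailable at the start, data is often collected by other means at first (for example, from the logs of a system built with traditional dialogue management methods). Here the stories are generated by talking to the bot and teaching it what to say when. Preparing (annotating) and refining data is, unavoidably, labor-intensive work.
An example story: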
## Story 12132
* greet
    - utter_greet
* request_search{"time": "上个月", "item": "消费"}
    - action_search_consume
    - utter_ask_morehelp
* deny
    - utter_goodbye
    - export
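For the exact story format, see the official documentation. Next, starting from the initial stories, we try out online learning (interactive training with human feedback) and use it to generate more dialogue data.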
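To make the structure of a story explicit, here is a small parsing sketch. It is not Rasa Core's own story reader, and the function name parse_story is hypothetical. It only relies on the conventions visible in the example: a "## " line names the story, a "* " line is a user intent optionally followed by slot values as JSON, and "- " lines are the bot actions that follow.

```python
import json

STORY = """## Story 12132
* greet
    - utter_greet
* request_search{"time": "上个月", "item": "消费"}
    - action_search_consume
    - utter_ask_morehelp
* deny
    - utter_goodbye
    - export
"""


def parse_story(text):
    """Parse one story into (name, turns).

    Each turn is a dict holding the user intent, its slot values,
    and the list of bot actions that follow it.
    """
    name, turns = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            name = line[3:].strip()
        elif line.startswith("* "):
            # Split "intent{json}" at the first brace; the JSON part is optional.
            intent, brace, rest = line[2:].partition("{")
            slots = json.loads(brace + rest) if brace else {}
            turns.append({"intent": intent.strip(), "slots": slots, "actions": []})
        elif line.startswith("- ") and turns:
            turns[-1]["actions"].append(line[2:].strip())
    return name, turns


name, turns = parse_story(STORY)
for turn in turns:
    print(turn["intent"], turn["slots"], turn["actions"])
```

Seen this way, each story is a sequence of (user intent, bot actions) pairs, which is the supervision signal the dialogue policy is trained on.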
online learning
Run the bot.py script:
python bot.py online_train
This again calls Rasa Core's Python API; the details are in the project source. The output shows the Keras network it uses (a single-layer LSTM):
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
masking_1 (Masking)          (None, 2, 31)             0
_________________________________________________________________
lstm_1 (LSTM)                (None, 32)                8192
_________________________________________________________________
dense_1 (Dense)              (None, 11)                363
_________________________________________________________________
activation_1 (Activation)    (None, 11)                0
=================================================================
Total params: 8,555
Trainable params: 8,555
Non-trainable params: 0
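The parameter counts in this summary can be checked by hand. The sketch below uses the standard formulas for an LSTM layer (four gates, each with input weights, recurrent weights, and a bias) and a dense layer. The shapes come from the summary itself: 31 input features per step, 32 LSTM units, and 11 outputs, one per action. The helper names are made up for this illustration.

```python
def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights, and a bias vector."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    """One weight per input/output pair plus one bias per output."""
    return input_dim * units + units


lstm = lstm_params(31, 32)
dense = dense_params(32, 11)
print(lstm, dense, lstm + dense)  # 8192 363 8555
```

The Masking and Activation layers have no weights, so the total is just the LSTM plus the dense layer.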
Then the teaching session begins. Here is part of it:
Bot loaded. Type a message and press enter :
您好
------
Chat history:
bot did: None
bot did: action_listen
user said: 您好
whose intent is: greet
we currently have slots: item: None, phone_number: None, price: None, time: None
------
The bot wants to [utter_greet] due to the intent. Is this correct?
1. Yes
2. No, intent is right but the action is wrong
3. The intent is wrong
0. Export current conversations as stories and quit
1
你好!,我是小热,可以帮您办理流量套餐,话费查询等业务。
------
Chat history:
bot did: action_listen
user did: greet
bot did: utter_greet
we currently have slots: item: None, phone_number: None, price: None, time: None
------
The bot wants to [action_listen]. Is this correct?
1. Yes.
2. No, the action is wrong.
0. Export current conversations as stories and quit
1
Next user input:
帮我看看上个月的话费
------
Chat history:
bot did: utter_greet
bot did: action_listen
user said: 帮我看看上个月的话费
whose intent is: request_search
with time: 上个月
with item: 消费
we currently have slots: item: 消费, phone_number: None, price: None, time: 上个月
------
The bot wants to [utter_ask_time] due to the intent. Is this correct?
1. Yes
2. No, intent is right but the action is wrong
3. The intent is wrong
0. Export current conversations as stories and quit
2
------
Chat history:
bot did: utter_greet
bot did: action_listen
user said: 帮我看看上个月的话费
whose intent is: request_search
with time: 上个月
with item: 消费
we currently have slots: item: 消费, phone_number: None, price: None, time: 上个月
------
what is the next action for the bot?
0 action_listen 0.05
1 action_restart 0.00
2 utter_greet 0.01
3 utter_goodbye 0.02
4 utter_default 0.00
5 utter_thanks 0.02
6 utter_ask_morehelp 0.04
7 utter_ask_time 0.68
8 utter_ask_package 0.04
9 utter_ack_management 0.01
10 action_search_consume 0.12
10
thanks! The bot will now [action_search_consume]
-----------
好,请稍等
您好,您上个月共消费二十八元。
------
...
...
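The correction step in the transcript can be summarized in a few lines. This is a toy sketch of the idea, not Rasa Core's online trainer: the policy proposes the most probable action, the human either accepts it or picks another, and the accepted action is stored as a new training example. The action names and probabilities are copied from the transcript; propose and record_feedback are hypothetical helpers.

```python
ACTIONS = ["action_listen", "action_restart", "utter_greet", "utter_goodbye",
           "utter_default", "utter_thanks", "utter_ask_morehelp", "utter_ask_time",
           "utter_ask_package", "utter_ack_management", "action_search_consume"]
PROBS = [0.05, 0.00, 0.01, 0.02, 0.00, 0.02, 0.04, 0.68, 0.04, 0.01, 0.12]


def propose(probs, actions):
    """Return the action the policy currently considers most likely."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return actions[best]


def record_feedback(examples, state, proposed, corrected=None):
    """Store the action the human accepted (or chose instead) for this state."""
    chosen = proposed if corrected is None else corrected
    examples.append((state, chosen))
    return chosen


examples = []
state = "request_search, item=消费, time=上个月"  # simplified stand-in for the tracker state
proposed = propose(PROBS, ACTIONS)  # the policy suggests utter_ask_time
chosen = record_feedback(examples, state, proposed, corrected=ACTIONS[10])
print(proposed, "->", chosen)
```

In the real session the model is retrained on such corrected examples as the conversation goes on, which is why the bot's next suggestions improve within the same session.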
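Because training happens online, the model is updated in real time. When the conversation ends, you can choose the export option to write the conversation to a file of your choice.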
The bot wants to [action_listen]. Is this correct?
1. Yes.
2. No, the action is wrong.
0. Export current conversations as stories and quit
0
File to export to (if file exists, this will append the stories) [stories.md]:
INFO:rasa_core.policies.online_policy_trainer:Stories got exported to '/home/zqh/mygit/rasa_chatbot/stories.md'.
You can selectively copy stories from the newly generated stories.md into data/mobile_story.md, then retrain the model on the updated mobile_story.md:
python bot.py train-dialogue
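The selective merge before retraining is a manual step in this project. If you wanted to script it, one possible sketch looks like this: split both files into stories at their "## " headers and append only the exported stories that pass a filter and whose body is not already present. The function names and the keep parameter are hypothetical, and reading or writing the actual files is left out.

```python
def split_stories(text):
    """Split story markdown into blocks (lists of lines), one per '## ' header."""
    blocks, current = [], []
    for line in text.splitlines():
        if line.startswith("## ") and current:
            blocks.append(current)
            current = []
        if line.strip():
            current.append(line.rstrip())
    if current:
        blocks.append(current)
    return blocks


def body_key(block):
    """Identify a story by its steps, ignoring its name and indentation."""
    return tuple(line.strip() for line in block[1:])


def merge_stories(existing_text, exported_text, keep=lambda block: True):
    """Return the existing stories plus the new exported ones accepted by `keep`."""
    parts = [existing_text.strip("\n")] if existing_text.strip() else []
    seen = {body_key(b) for b in split_stories(existing_text)}
    for block in split_stories(exported_text):
        key = body_key(block)
        if key in seen or not keep(block):
            continue
        seen.add(key)
        parts.append("\n".join(block))
    return "\n\n".join(parts) + "\n"


existing = "## Story 1\n* greet\n    - utter_greet\n"
exported = ("## Generated Story A\n* greet\n - utter_greet\n\n"
            "## Generated Story B\n* deny\n - utter_goodbye\n")
print(merge_stories(existing, exported))
```

The keep callback is where a human judgment would go, for example rejecting stories in which the bot was corrected many times and the flow is not one you want to reinforce.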
Training produces the dialogue management model under the models directory:
models/
├── dialogue
│ ├── domain.json
│ ├── domain.yml
│ ├── featurizer.json
│ ├── keras_arch.json
│ ├── keras_policy.json
│ ├── keras_weights.h5
│ ├── memorized_turns.json
│ └── policy_metadata.json
└── ivr
    └── demo
        ├── entity_extractor.dat
        ├── entity_synonyms.json
        ├── intent_classifier.pkl
        ├── metadata.json
        └── training_data.json
Testing dialogue management
Although the dialogue data is limited and the model is not yet stable, we can still test the trained model:
python bot.py run
One test run looks like this:
Bot loaded. Type a message and press enter :
我想查一下我的话费
您想查询哪个月的消费?
四月的
好,请稍等
您好,您四月共消费二十八元。
您还想干什么
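More complex dialogues would of course need more training stories, but this completes the walkthrough of the whole process.
Summary
Building a dialogue system with Rasa is fairly convenient, and it is clear that the quality of the system depends heavily on the quality and quantity of the data. The dialogue management part of this project still has much room for improvement, for example resetting the state after a task completes, and distinguishing simple chitchat from tasks to improve the user experience.
Original article. Please credit the source when reposting.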
Reader comments
Traceback (most recent call last):
File "C:\Users\Cf\AppData\Local\Programs\Python\Python36\lib\site-packages\rasa_nlu\model.py", line 66, in load
data = utils.read_json_file(metadata_file)
File "C:\Users\Cf\AppData\Local\Programs\Python\Python36\lib\site-packages\rasa_nlu\utils\__init__.py", line 208, in read_json_file
content = read_file(filename)
File "C:\Users\Cf\AppData\Local\Programs\Python\Python36\lib\site-packages\rasa_nlu\utils\__init__.py", line 202, in read_file
with io.open(filename, encoding=encoding) as f:
FileNotFoundError: [Errno 2] No such file or directory: 'models/ivr/demo\\metadata.json'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "bot.py", line 120, in <module>
interpreter=RasaNLUInterpreter("models/ivr/demo"),
File "C:\Users\Cf\AppData\Local\Programs\Python\Python36\lib\site-packages\rasa_core\interpreter.py", line 221, in __init__
self._load_interpreter()
File "C:\Users\Cf\AppData\Local\Programs\Python\Python36\lib\site-packages\rasa_core\interpreter.py", line 237, in _load_interpreter
self.interpreter = Interpreter.load(self.model_directory)
File "C:\Users\Cf\AppData\Local\Programs\Python\Python36\lib\site-packages\rasa_nlu\model.py", line 270, in load
model_metadata = Metadata.load(model_dir)
File "C:\Users\Cf\AppData\Local\Programs\Python\Python36\lib\site-packages\rasa_nlu\model.py", line 71, in load
"from '{}'. {}".format(abspath, e))
rasa_nlu.model.InvalidProjectError: Failed to load model metadata from 'D:\基于rasa的对话系统4\_rasa_chatbot-master\models\ivr\demo\metadata.json'. [Errno 2] No such file or directory: 'models/ivr/demo\\metadata.json'
Using TensorFlow backend.
Traceback (most recent call last):
File "C:\Users\Cf\AppData\Local\Programs\Python\Python36\lib\site-packages\pykwalify\core.py", line 76, in __init__
self.source = yaml.load(stream)
File "C:\Users\Cf\AppData\Local\Programs\Python\Python36\lib\site-packages\ruamel\yaml\main.py", line 637, in load
loader = Loader(stream, version, preserve_quotes=preserve_quotes)
File "C:\Users\Cf\AppData\Local\Programs\Python\Python36\lib\site-packages\ruamel\yaml\loader.py", line 46, in __init__
Reader.__init__(self, stream, loader=self)
File "C:\Users\Cf\AppData\Local\Programs\Python\Python36\lib\site-packages\ruamel\yaml\reader.py", line 80, in __init__
self.stream = stream # type: Any # as .read is called
File "C:\Users\Cf\AppData\Local\Programs\Python\Python36\lib\site-packages\ruamel\yaml\reader.py", line 125, in stream
self.determine_encoding()
File "C:\Users\Cf\AppData\Local\Programs\Python\Python36\lib\site-packages\ruamel\yaml\reader.py", line 169, in determine_encoding
self.update_raw()
File "C:\Users\Cf\AppData\Local\Programs\Python\Python36\lib\site-packages\ruamel\yaml\reader.py", line 278, in update_raw
data = self.stream.read(size)
UnicodeDecodeError: 'gbk' codec can't decode byte 0xaf in position 418: illegal multibyte sequence
During handling of the above exception, another exception occurred:
python -m rasa_nlu.train -c nlu_config.yml --data data/nlu_data.md -o models
--fixed_model_name nlu --project current --verbose
and then change it like this:
python -m rasa_nlu.train --config mobile_nlu_model_config.json --data data/mobile_nlu_data.json --path models
The dialogue model command is:
python -m rasa_core.train -d domain.yml -s data/stories.md -o models/current/dialogue --epochs 200
How should this one be changed?