training = numpy.array(training)
output = numpy.array(output)
Here training and output are converted to numpy arrays (matrices) so TensorFlow can compute on them. Note that we are not using TensorFlow directly; instead we use tflearn, a library that provides a higher-level abstraction over the TensorFlow API.
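As a toy illustration (hypothetical data, not from the article), the arrays tflearn consumes are plain 2-D matrices: one bag-of-words row per sample in training, and one one-hot tag row per sample in output:

import numpy

training = numpy.array([[0, 1, 0, 1],    # 2 samples over a 4-word vocabulary
                        [1, 0, 0, 1]])
output = numpy.array([[1, 0],            # one-hot rows over 2 tags
                      [0, 1]])
print(training.shape, output.shape)      # (2, 4) (2, 2)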
tensorflow.reset_default_graph()
print(len(training[0]))
net = tflearn.input_data(shape=[None, len(training[0])])
net = tflearn.fully_connected(net, 8)
- tensorflow.reset_default_graph() resets the default graph, clearing any previously defined operations.
- First comes the input layer: its shape matches the feature dimension of a training sample, and None means the number of samples is not fixed in advance.
- fully_connected adds a hidden layer that is fully connected to the previous layer.
net = tflearn.input_data(shape=[None, len(training[0])])
net = tflearn.fully_connected(net, 8)
net = tflearn.fully_connected(net, 8)
net = tflearn.fully_connected(net, len(output[0]), activation='softmax')
net = tflearn.regression(net)
model = tflearn.DNN(net)
For each sample we have 41 input values, which feed into the next layer; that hidden layer has 8 neurons, and each input is fully connected to every neuron in it. The output layer has one neuron per tag, and the softmax activation turns its activations into a probability for each tag.
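As a quick sanity check (not in the original article), the trainable parameters of the network above can be counted by hand; the 41 input features come from the text, while the 6-tag output size is a made-up example:

n_in, n_hidden, n_out = 41, 8, 6            # n_out = 6 is a hypothetical tag count
hidden1 = n_in * n_hidden + n_hidden        # 41*8 weights + 8 biases = 336
hidden2 = n_hidden * n_hidden + n_hidden    # 8*8 weights + 8 biases = 72
out_layer = n_hidden * n_out + n_out        # 8*6 weights + 6 biases = 54
print(hidden1 + hidden2 + out_layer)        # 462 trainable parameters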
model.fit(training, output, n_epoch=1000, batch_size=8, show_metric=True)
Training Step: 2979 | total loss: 0.32779 | time: 0.011s
| Adam | epoch: 993 | loss: 0.32779 - acc: 0.9299 -- iter: 23/23
--
Training Step: 2982 | total loss: 0.25613 | time: 0.006s
| Adam | epoch: 994 | loss: 0.25613 - acc: 0.9489 -- iter: 23/23
--
Training Step: 2985 | total loss: 0.44631 | time: 0.008s
| Adam | epoch: 995 | loss: 0.44631 - acc: 0.9140 -- iter: 23/23
--
Training Step: 2988 | total loss: 0.35128 | time: 0.007s
| Adam | epoch: 996 | loss: 0.35128 - acc: 0.9261 -- iter: 23/23
--
Training Step: 2991 | total loss: 0.28417 | time: 0.008s
| Adam | epoch: 997 | loss: 0.28417 - acc: 0.9345 -- iter: 23/23
--
Training Step: 2994 | total loss: 0.24740 | time: 0.011s
| Adam | epoch: 998 | loss: 0.24740 - acc: 0.9279 -- iter: 23/23
--
Training Step: 2997 | total loss: 0.58711 | time: 0.007s
| Adam | epoch: 999 | loss: 0.58711 - acc: 0.8560 -- iter: 23/23
--
Training Step: 3000 | total loss: 0.45620 | time: 0.007s
| Adam | epoch: 1000 | loss: 0.45620 - acc: 0.8807 -- iter: 23/23
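A quick check on these numbers: each epoch covers all 23 training samples (iter: 23/23), and with batch_size=8 that takes ceil(23/8) = 3 steps, so 1000 epochs gives exactly the 3000 training steps reported in the last line.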
import json
import pickle

import nltk
import numpy
import tensorflow
import tflearn
from nltk.stem.lancaster import LancasterStemmer

# Assumed setup not shown in the original listing: a stemmer and the intents
# data; "intents.json" and LancasterStemmer are guesses at the author's choices.
stemmer = LancasterStemmer()
with open("intents.json") as file:
    data = json.load(file)

try:
    # Reuse the preprocessed arrays if a previous run already pickled them.
    with open("data.pickle", "rb") as f:
        words, labels, training, output = pickle.load(f)
except:
    words = []
    labels = []
    docs_x = []
    docs_y = []
    for intent in data["intents"]:
        for pattern in intent["patterns"]:
            # Tokenize each pattern and remember which tag it belongs to.
            wrds = nltk.word_tokenize(pattern)
            words.extend(wrds)
            docs_x.append(wrds)
            docs_y.append(intent["tag"])
        if intent["tag"] not in labels:
            labels.append(intent["tag"])
    # Stem and deduplicate the vocabulary; drop bare question marks.
    words = [stemmer.stem(w.lower()) for w in words if w != "?"]
    words = sorted(set(words))
    labels = sorted(labels)
    training = []
    output = []
    out_empty = [0 for _ in range(len(labels))]
    for x, doc in enumerate(docs_x):
        # One bag-of-words row per pattern: 1 if the vocabulary word occurs.
        bag = []
        wrds = [stemmer.stem(w) for w in doc]
        for w in words:
            if w in wrds:
                bag.append(1)
            else:
                bag.append(0)
        # One-hot row marking the pattern's tag.
        output_row = out_empty[:]
        output_row[labels.index(docs_y[x])] = 1
        training.append(bag)
        output.append(output_row)
    training = numpy.array(training)
    output = numpy.array(output)
    # Cache the arrays so the next run can skip preprocessing.
    with open("data.pickle", "wb") as f:
        pickle.dump((words, labels, training, output), f)

tensorflow.reset_default_graph()
print(len(training[0]))  # number of input features per sample (41 here)

net = tflearn.input_data(shape=[None, len(training[0])])
net = tflearn.fully_connected(net, 8)
net = tflearn.fully_connected(net, 8)
net = tflearn.fully_connected(net, len(output[0]), activation='softmax')
net = tflearn.regression(net)
model = tflearn.DNN(net)

try:
    # Reuse a previously trained model if one exists on disk.
    model.load("model.tflearn")
except:
    model.fit(training, output, n_epoch=1000, batch_size=8, show_metric=True)
    model.save("model.tflearn")
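The listing stops after saving the model. As a minimal inference sketch, assuming the same words, labels, stemmer, and model objects are in scope (the bag_of_words helper below is hypothetical, not part of the original code), a new sentence can be encoded with the same bag-of-words scheme and passed to model.predict:

def bag_of_words(sentence, words):
    # Encode a sentence exactly like the training rows: 1 per vocabulary hit.
    bag = [0] * len(words)
    for w in nltk.word_tokenize(sentence):
        w = stemmer.stem(w.lower())
        if w in words:
            bag[words.index(w)] = 1
    return numpy.array(bag)

result = model.predict([bag_of_words("Hello there", words)])[0]
tag = labels[numpy.argmax(result)]  # most probable intent tag
print(tag, result[numpy.argmax(result)])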