Text in Dialogue Systems

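Related paper: A Knowledge-Grounded Neural Conversation Model
Ghazvininejad, Marjan et al., published at AAAI in 2018
The authors use multi-task learning to extend the standard Seq2Seq conversation model so that a response can be grounded in external facts as well as in the conversation history. The main idea is: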

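For the input text, use keyword matching or more advanced techniques (such as entity linking or named entity recognition) to obtain the query terms for facts.
From a pre-collected set of facts, select the subset relevant to the current conversation.
First, train on the conversation as in a conventional Seq2Seq encoder-decoder model.
Feed the relevant facts into a second encoder (similar to a Memory Network), then pass the states produced by both encoders to the decoder to generate the final response.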
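
The steps above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the authors' implementation: `retrieve_facts`, `fuse_with_facts`, and `fact_store` are hypothetical names, facts are assumed to be already embedded as key and value vectors, and the memory read uses a plain dot product with no learned projections.

```python
import math

def retrieve_facts(user_input, fact_store):
    """Keyword matching: keep the facts filed under any key that occurs in the input."""
    tokens = set(user_input.lower().split())
    hits = []
    for key, facts in fact_store.items():
        if key in tokens:
            hits.extend(facts)
    return hits

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_with_facts(u, fact_keys, fact_values):
    """Memory-network-style read (assumed form): attend over facts with the
    dialogue encoder summary u, then add the read-out vector to u."""
    scores = [sum(a * b for a, b in zip(u, key)) for key in fact_keys]
    weights = softmax(scores)
    read_out = [
        sum(w * value[d] for w, value in zip(weights, fact_values))
        for d in range(len(u))
    ]
    fused = [ui + oi for ui, oi in zip(u, read_out)]
    return fused, weights

# Toy data, invented for illustration only.
fact_store = {"paris": ["fact 1 about paris", "fact 2 about paris"],
              "tokyo": ["fact 1 about tokyo"]}
print(retrieve_facts("Any tips for Paris ?", fact_store))
```

The fused vector would then initialize the decoder, so the reply can draw on both the conversation and the retrieved facts.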

There is plenty of room to improve this model, for example:

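Add attention where appropriate. In the facts encoder, for instance, all the relevant fact information ends up summarized in a single hidden vector u. In my view, the relevant facts should differ at each time step of the sentence.
For selecting candidate facts, the authors simply rely on brute-force keyword matching; I suspect there are better approaches.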
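
The first suggestion can be illustrated with a small sketch. It shows my proposed variation, not anything from the paper: instead of reading the facts once into a fixed vector, each decoder state queries the fact vectors separately. The names `attend` and `per_step_fact_contexts` are hypothetical, and dot-product attention is assumed for simplicity.

```python
import math

def attend(query, memory):
    """Dot-product attention: return (context vector, weights) for one query."""
    scores = [sum(q * m for q, m in zip(query, row)) for row in memory]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    context = [
        sum(w * row[d] for w, row in zip(weights, memory))
        for d in range(len(query))
    ]
    return context, weights

def per_step_fact_contexts(decoder_states, fact_vectors):
    """One fact context per decoding step, instead of a single fixed vector."""
    return [attend(state, fact_vectors) for state in decoder_states]

# Toy vectors: step 0 points at fact 0, step 1 points at fact 1.
facts = [[1.0, 0.0], [0.0, 1.0]]
states = [[3.0, 0.0], [0.0, 3.0]]
for step, (context, weights) in enumerate(per_step_fact_contexts(states, facts)):
    print(step, [round(w, 3) for w in weights])
```

With a single vector u, both steps would see the same blend of facts; here the weights shift from the first fact to the second as the decoder state changes.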

Knowledge Bases in Dialogue Systems

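Related paper: Flexible End-to-End Dialogue System for Knowledge Grounded Conversation
Zhu, Wenya et al., published on CoRR (arXiv) in 2017
The authors propose GenDS, a fully data-driven generative dialogue model that produces a response from the input message and a related knowledge base (KB). GenDS has three components (Candidate Facts Retriever, Message Encoder, Reply Decoder):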

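Candidate Facts Retriever: extracts the entities (E) from the input, queries the KB with them, and stores the objects and subjects reached through relations as the set of candidate facts.
Message Encoder: the usual Seq2Seq encoder, which turns the input into a representation H.
Reply Decoder: generates the response from H and the candidate facts. A knowledge gate z ∈ {0, 1} decides whether the next word to generate is a knowledge word or a common word.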

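The authors split all words into knowledge words (entities contained in the KB) and common words (everything else). Knowledge enters the reply in a way that resembles copying words out of KB triples. In other words, the hidden vectors in the model do not really understand the knowledge.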
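
A minimal sketch of the hard gate makes the copying behaviour concrete. It is only an illustration under assumptions: `next_word` and `decode` are hypothetical names, the gate values and both distributions are given by hand rather than predicted by a network, and greedy selection stands in for whatever decoding strategy the paper uses.

```python
def next_word(z, common_probs, knowledge_probs):
    """Hard gate: z == 1 picks among candidate KB entities,
    z == 0 picks from the common-word vocabulary."""
    dist = knowledge_probs if z == 1 else common_probs
    return max(dist, key=dist.get)

def decode(steps):
    """steps: list of (z, common_probs, knowledge_probs), one entry per output position."""
    return [next_word(z, common, knowledge) for z, common, knowledge in steps]

# Toy distributions, invented for illustration.
entities = {"James Cameron": 0.8, "Titanic": 0.2}
steps = [
    (0, {"it": 0.7, "was": 0.3}, entities),
    (0, {"was": 0.6, "it": 0.4}, entities),
    (1, {"it": 0.5, "was": 0.5}, entities),
]
print(decode(steps))
```

When z is 1 the word comes straight from the candidate entities, which is why the mechanism behaves like copying from triples rather than generating from an internal understanding of them.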
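The Fact Retriever described in the paper is quite interesting. I think looking facts up by entity in this way is better than building an index over every word.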
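
The retrieval step can be sketched as follows. This is a simplified stand-in, not the paper's code: `candidate_facts` is a hypothetical name, the KB is assumed to be a plain list of (subject, relation, object) triples, and entity extraction is reduced to case-insensitive substring matching.

```python
def candidate_facts(message, triples):
    """Return the triples whose subject or object is an entity mentioned in the message."""
    entities = {e for s, _, o in triples for e in (s, o)}
    lowered = message.lower()
    mentioned = {e for e in entities if e.lower() in lowered}
    return [(s, r, o) for s, r, o in triples if s in mentioned or o in mentioned]

# Toy KB for illustration.
kb = [
    ("Titanic", "directed_by", "James Cameron"),
    ("Avatar", "directed_by", "James Cameron"),
    ("Inception", "directed_by", "Christopher Nolan"),
]
print(candidate_facts("Who directed Titanic?", kb))
```

Only triples that touch a mentioned entity are kept, so the decoder sees a small candidate set instead of everything that shares a word with the message.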

Knowledge Graphs in Dialogue Systems

Related paper: Commonsense Knowledge Aware Conversation Generation with Graph Attention
Zhou, Hao et al., published at IJCAI in 2018
The authors propose the commonsense knowledge aware conversational model (CCM), which uses static graph attention in the encoder and dynamic graph attention in the decoder:

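static graph attention: used in the Knowledge Interpreter on the encoder side to strengthen the commonsense understanding of the post. For each word x in the post, the model retrieves its corresponding graph K(g) from the knowledge graph, computes an attention weight for each triple in K(g) from the triple's head (h), relation, and tail embeddings, and combines the triples into a graph vector g.
dynamic graph attention: used in the Knowledge Aware Generator on the decoder side. It produces two vectors, one measuring the importance of each graph at the current step and the other measuring the importance of each triple.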
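
The static graph attention step can be sketched as below. This follows my reading of the paper's formulation, with simplifications that are my own assumptions: the learned projection matrices are replaced by the identity, embeddings are short Python lists, and `static_graph_attention` is a hypothetical name.

```python
import math

def static_graph_attention(triples):
    """triples: list of (head, relation, tail) embeddings, all lists of equal length.
    Returns (graph vector g, attention weights over the triples)."""
    scores = []
    for head, relation, tail in triples:
        # Score = relation . tanh(head + tail), with identity projections assumed.
        mixed = [math.tanh(h + t) for h, t in zip(head, tail)]
        scores.append(sum(r * m for r, m in zip(relation, mixed)))
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # g is the weighted sum of the concatenated [head; tail] vectors.
    g = [0.0] * (2 * len(triples[0][0]))
    for w, (head, _, tail) in zip(weights, triples):
        for d, value in enumerate(head + tail):
            g[d] += w * value
    return g, weights

# Toy embeddings, invented for illustration.
graph = [
    ([1.0, 0.0], [1.0, 1.0], [0.0, 1.0]),
    ([0.0, 0.0], [1.0, 1.0], [0.0, 0.0]),
]
g, weights = static_graph_attention(graph)
print(len(g), [round(w, 3) for w in weights])
```

The attention is called static because the weights depend only on the graph itself, so g can be computed once per word before decoding starts.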

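Finally, all of these vectors are concatenated and fed into the decoder to update its hidden state s. When predicting the output word, the final probability combines the probability of generating a word from the vocabulary with the probability of generating a word from the knowledge graph.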
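
The combination of the two probabilities can be illustrated with a soft gate. This is a sketch under assumptions: `mix_output` and `gamma` are hypothetical names, the gate value is supplied by hand rather than computed from the decoder state, and both distributions are toy dictionaries.

```python
def mix_output(gamma, vocab_probs, entity_probs):
    """Blend a vocabulary distribution with a knowledge-graph entity distribution.
    gamma in [0, 1] is the share of probability mass given to entity words."""
    mixed = {word: (1.0 - gamma) * p for word, p in vocab_probs.items()}
    for word, p in entity_probs.items():
        mixed[word] = mixed.get(word, 0.0) + gamma * p
    return mixed

# Toy distributions, invented for illustration.
vocab = {"the": 0.5, "a": 0.5}
entities = {"sun": 1.0}
mixed = mix_output(0.25, vocab, entities)
print(mixed, sum(mixed.values()))
```

Because each input distribution sums to one and the gate splits the mass between them, the blended distribution also sums to one.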