Paper Writing - Related Works

Author: 魏鹏飞 | Published 2020-05-01 08:41

1. 《Joint Slot Filling and Intent Detection via Capsule Neural Networks》

Intent Detection
With recent developments in deep neural networks, user intent detection models (Hu et al., 2009; Xu and Sarikaya, 2013; Zhang et al., 2016; Liu and Lane, 2016; Zhang et al., 2017; Chen et al., 2016; Xia et al., 2018) have been proposed to classify user intents given their diversely expressed utterances in natural language. As a text classification task, decent performance on utterance-level intent detection usually relies on hidden representations that are learned in the intermediate layers via multiple non-linear transformations.

Recently, various capsule-based text classification models have been proposed that aggregate word-level features for utterance-level classification via dynamic routing-by-agreement (Gong et al., 2018; Zhao et al., 2018; Xia et al., 2018). Among them, Xia et al. (2018) adopt self-attention to extract intermediate semantic features and use a capsule-based neural network for intent detection. However, existing works do not study word-level supervision for the slot filling task. In this work, we explicitly model the hierarchical relationship between words and slots at the word level, as well as intents at the utterance level, via dynamic routing-by-agreement.

Slot Filling
Slot filling annotates the utterance with finer granularity: it associates certain parts of the utterance, usually named entities, with pre-defined slot tags. Currently, slot filling is usually treated as a sequence labeling task. A recurrent neural network such as a Gated Recurrent Unit (GRU) or Long Short-Term Memory (LSTM) network is used to learn context-aware word representations, and Conditional Random Fields (CRF) are used to annotate each word based on its slot type. Recently, Shen et al. (2017) and Tan et al. (2017) introduced the self-attention mechanism for CRF-free sequential labeling.
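
A minimal sketch of this common setup (not any single cited model) is shown below, assuming a PyTorch implementation; the class name, dimensions, and the plain per-token emission layer are illustrative, and in practice a CRF layer would typically be stacked on the emission scores to enforce valid tag transitions.

```python
# Illustrative BiLSTM slot tagger: a bidirectional LSTM produces context-aware
# word representations and a linear layer emits per-token slot-tag scores.
import torch
import torch.nn as nn

class BiLSTMSlotTagger(nn.Module):
    def __init__(self, vocab_size, num_slot_tags, emb_dim=100, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.emission = nn.Linear(2 * hidden_dim, num_slot_tags)

    def forward(self, token_ids):                    # token_ids: (batch, seq_len)
        h, _ = self.lstm(self.embed(token_ids))      # (batch, seq_len, 2 * hidden_dim)
        return self.emission(h)                      # per-token slot-tag scores

# Example: scores for two 10-token utterances over 72 hypothetical slot tags.
scores = BiLSTMSlotTagger(vocab_size=5000, num_slot_tags=72)(torch.randint(0, 5000, (2, 10)))
```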

Joint Modeling via Sequence Labeling
To overcome the error propagation between the word-level slot filling task and the utterance-level intent detection task in a pipeline, joint models have been proposed to solve the two tasks simultaneously in a unified framework. Xu and Sarikaya (2013) propose a Convolutional Neural Network (CNN) based sequential labeling model for slot filling. The hidden states corresponding to each word are summed up in a classification module to predict the utterance intent, and a Conditional Random Field module selects the best slot tag sequence of the utterance from all possible tag sequences. Hakkani-Tür et al. (2016) adopt a Recurrent Neural Network (RNN) for slot filling, and the last hidden state of the RNN is used to predict the utterance intent. Liu and Lane (2016) further introduce an RNN-based encoder-decoder model for joint slot filling and intent detection, where an attention-weighted sum of all encoded hidden states is used to predict the utterance intent. Some specific mechanisms are designed for RNNs to explicitly encode the slots from the utterance. For example, Goo et al. (2018) utilize a slot-gated mechanism as a special gate function in a Long Short-Term Memory (LSTM) network to improve slot filling with the learned intent context vector. However, as the sequence becomes longer, it is risky to rely solely on the gate function to sequentially summarize and compress all slot and context information into a single vector (Cheng et al., 2016).
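
These joint architectures share a common skeleton: one encoder over the utterance, a token-level slot output, and an utterance-level intent output computed from a summary of the encoder states (the last hidden state, a sum, or an attention-weighted sum). The sketch below illustrates that skeleton with an attention-weighted summary; it is a rough composite under assumed names and dimensions, not any single paper's exact model.

```python
# Illustrative joint slot-filling / intent-detection skeleton in PyTorch.
import torch
import torch.nn as nn

class JointSlotIntentRNN(nn.Module):
    def __init__(self, vocab_size, num_slots, num_intents, emb_dim=100, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.slot_out = nn.Linear(2 * hidden_dim, num_slots)      # token-level head
        self.attn = nn.Linear(2 * hidden_dim, 1)                  # attention scores
        self.intent_out = nn.Linear(2 * hidden_dim, num_intents)  # utterance-level head

    def forward(self, token_ids):
        h, _ = self.encoder(self.embed(token_ids))      # (batch, seq_len, 2 * hidden_dim)
        slot_logits = self.slot_out(h)                  # one slot-tag distribution per token
        weights = torch.softmax(self.attn(h), dim=1)    # attention over token positions
        context = (weights * h).sum(dim=1)              # single utterance summary vector
        intent_logits = self.intent_out(context)        # utterance-level intent scores
        return slot_logits, intent_logits
```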

In this paper, we harness the capsule neural network to learn a hierarchy of feature detectors and explicitly model the hierarchical relationships among word-level slots and the utterance-level intent. Moreover, instead of doing sequence labeling for slot filling, we use a dynamic routing-by-agreement scheme between capsule layers to route each word in the utterance to its most appropriate slot type, and we further route slot representations, which are learned dynamically from words, to the most appropriate intent capsule for intent detection.
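
As a rough illustration of the dynamic routing-by-agreement used between capsule layers, the sketch below shows a generic routing loop between lower-level capsules (e.g. word capsules) and higher-level capsules (e.g. slot or intent capsules); the function names, shapes, and iteration count are assumptions, and the paper's own routing variant is not reproduced here.

```python
# Generic dynamic routing-by-agreement between two capsule layers.
import torch

def squash(s, dim=-1, eps=1e-8):
    # Capsule non-linearity: preserves direction, maps vector length into (0, 1).
    sq_norm = (s ** 2).sum(dim=dim, keepdim=True)
    return (sq_norm / (1.0 + sq_norm)) * s / torch.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, num_iters=3):
    # u_hat: (num_in, num_out, dim_out) prediction vectors ("votes") from each
    # lower-level capsule for each higher-level capsule.
    b = torch.zeros(u_hat.size(0), u_hat.size(1))      # routing logits, start uniform
    for _ in range(num_iters):
        c = torch.softmax(b, dim=1)                    # each input distributes its vote
        s = (c.unsqueeze(-1) * u_hat).sum(dim=0)       # weighted sum of votes per output
        v = squash(s)                                  # (num_out, dim_out) output capsules
        b = b + (u_hat * v.unsqueeze(0)).sum(dim=-1)   # strengthen routes whose votes agree
    return v, c                                        # output capsules and routing weights

# Example: route 10 word capsules to 5 slot capsules of dimension 16.
slots, routing = dynamic_routing(torch.randn(10, 5, 16))
```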

2. 《Text Level Graph Neural Network for Text Classification》

In this section, we introduce related work on GNNs and text classification in detail.

Graph Neural Networks
Graph Neural Networks (GNNs) have received extensive attention recently (Zhou et al., 2018; Zhang et al., 2018b; Wu et al., 2019). GNNs can model non-Euclidean data, while traditional neural networks can only model regular grid data. However, many real-world tasks, such as knowledge graphs (Hamaguchi et al., 2017), social networks (Hamilton et al., 2017), and many other research areas (Khalil et al., 2017), involve data in the form of trees or graphs. GNNs were therefore proposed (Scarselli et al., 2009) to apply deep learning techniques to data in the graph domain.
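
As a concrete illustration of what such a network computes, the sketch below performs one round of neighbourhood aggregation followed by a shared linear transform, which is the basic operation most GNN variants build on; the class name and the simple row-normalization are illustrative assumptions rather than any specific cited model.

```python
# Illustrative single message-passing layer over node features and an adjacency matrix.
import torch
import torch.nn as nn

class SimpleGraphLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # x: (num_nodes, in_dim) node features; adj: (num_nodes, num_nodes) 0/1 edges
        adj = adj + torch.eye(adj.size(0))          # add self-loops
        adj = adj / adj.sum(dim=1, keepdim=True)    # average over each node's neighbourhood
        return torch.relu(self.linear(adj @ x))     # aggregate neighbours, then transform
```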

Text Classification
Text classification is a classic problem in natural language processing and has a wide range of real-world applications. Traditional text classification methods such as bag-of-words (Zhang et al., 2010), n-grams (Wang and Manning, 2012), and topic models (Wallach, 2006) mainly focus on feature engineering and algorithms. With the development of deep learning techniques, more and more deep learning models have been applied to text classification. Kim (2014) and Liu et al. (2016) applied CNNs and RNNs to text classification and achieved results that are much better than those of traditional models.

With the development of GNNs, graph-based classification models have gradually emerged (Hamilton et al., 2017; Veličković et al., 2017; Peng et al., 2018). Yao et al. (2019) proposed Text-GCN and achieved state-of-the-art results on several mainstream datasets. However, Text-GCN has the disadvantages of high memory consumption and lack of support for online training. The model presented in this paper solves these problems in Text-GCN and achieves better results.

3. 《A Stack-Propagation Framework with Token-Level Intent Detection for Spoken Language Understanding》

Slot filling can be treated as a sequence labeling task, and the popular approaches are conditional random fields (CRF) (Raymond and Riccardi, 2007) and recurrent neural networks (RNN) (Xu and Sarikaya, 2013; Yao et al., 2014). Intent detection is formulated as an utterance classification problem, and different classification methods, such as support vector machines (SVM) and RNNs (Haffner et al., 2003; Sarikaya et al., 2011), have been proposed to solve it.

Recently, some joint models have been proposed to overcome the error propagation caused by pipelined approaches. Zhang and Wang (2016) first proposed a joint model using RNNs for learning the correlation between intent and slots. Hakkani-Tür et al. (2016) proposed a single recurrent neural network for modeling slot filling and intent detection jointly. Liu and Lane (2016) proposed an attention-based neural network for modeling the two tasks jointly. All these models outperform pipeline models via mutual enhancement between the two tasks. However, these joint models did not model intent information for slots explicitly and only captured the correlation between the two tasks by sharing parameters.

Recently, some joint models have explored incorporating intent information into slot filling. Goo et al. (2018) utilize a slot-gated mechanism as a special gate function to model the relationship between intent detection and slot filling. Li et al. (2018) proposed an intent-augmented gate mechanism to utilize the semantic correlation between slots and intent. Our framework differs significantly from their models in two ways: (1) both of their approaches utilize a gate mechanism to model the relationship between intent and slots, while in our model, to directly leverage the intent information in the joint model, we feed the predicted intent information directly into slot filling with the Stack-Propagation framework; (2) they apply sentence-level intent information to each word, while we adopt token-level intent information for slot filling, which further eases error propagation. Wang et al. (2018) propose a Bi-model to consider the cross-impact between intent and slots and achieve the state-of-the-art result. Zhang et al. (2019) propose a hierarchical capsule neural network to model the hierarchical relationship among words, slots, and intent in an utterance. E et al. (2019) introduce an SF-ID network to establish an interrelated mechanism for the slot filling and intent detection tasks. Compared with their works, our model can directly incorporate the intent information for slot filling explicitly with Stack-Propagation, which makes the interaction procedure more interpretable, whereas their models only interact implicitly through hidden states between the two tasks.
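
The key mechanism discussed above, feeding the token-level intent prediction directly into the slot-filling decoder rather than only sharing parameters, can be sketched as follows; this is a simplified illustration under assumed names and dimensions, not the paper's full Stack-Propagation architecture.

```python
# Illustrative stack-propagation-style sketch: token-level intent predictions are
# concatenated to the encoder states that the slot decoder consumes.
import torch
import torch.nn as nn

class StackPropagationSketch(nn.Module):
    def __init__(self, vocab_size, num_intents, num_slots, emb_dim=100, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.intent_out = nn.Linear(2 * hidden_dim, num_intents)
        self.slot_decoder = nn.LSTM(2 * hidden_dim + num_intents, hidden_dim, batch_first=True)
        self.slot_out = nn.Linear(hidden_dim, num_slots)

    def forward(self, token_ids):
        h, _ = self.encoder(self.embed(token_ids))          # (batch, seq_len, 2 * hidden_dim)
        intent_logits = self.intent_out(h)                  # token-level intent scores
        intent_feat = torch.softmax(intent_logits, dim=-1)  # explicit intent signal fed downstream
        d, _ = self.slot_decoder(torch.cat([h, intent_feat], dim=-1))
        slot_logits = self.slot_out(d)                      # token-level slot scores
        # The utterance-level intent can be obtained by voting over token-level predictions.
        return intent_logits, slot_logits
```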


Link: https://www.haomeiwen.com/subject/qvjfghtx.html