Peinelt N, Nguyen D, Liakata M. tBERT: Topic models and BERT joining forces for semantic similarity detection[C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020: 7047-7055.
If the proposed method does not cover the whole research area well, a statement like the following can be used: We find that the addition of < topics to BERT > helps particularly with resolving domain-specific cases.
We, therefore, introduce a novel architecture for semantic similarity detection which incorporates topic models and BERT. More specifically, we make the following contributions: We propose tBERT — a simple architecture combining topics with BERT for semantic similarity prediction (section 3). We show in our error analysis that tBERT’s gains are prominent on domain-specific cases, such as those encountered in < CQA > (section 5).
tBERT (topic-informed BERT-based model) architecture
- Overall architecture
about BERT
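Given a sentence pair $(S_1, S_2)$, where $S_1$ has length $N$ and $S_2$ has length $M$, the two sentences are fed to BERT as text_a and text_b. The final-layer output for the CLS token is taken as the representation of the sentence pair, formally:
$$C = \mathrm{BERT}(S_1, S_2) \in \mathbb{R}^{d}$$
where $d$ is the internal hidden size of the BERT model.
about Topic Model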
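For the word topics $W_1$ and $W_2$, a topic distribution $w_i$ is inferred for every token $T_i$ in the sentence (compare LDA, where every word has its own entry in the topic-word distribution matrix):
$$w_i = \mathrm{TopicModel}(T_i) \in \mathbb{R}^{t}$$
These per-token distributions are then averaged to obtain a fixed-length topic representation at the sentence level:
$$W_1 = \frac{1}{N}\sum_{i=1}^{N} w_i \in \mathbb{R}^{t}, \qquad W_2 = \frac{1}{M}\sum_{i=1}^{M} w'_i \in \mathbb{R}^{t}$$
where $t$ is the number of topics, $N$ and $M$ are the lengths of the two sentences, and $w'_i$ denotes the token-level distributions of the second sentence.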
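The averaging step can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `word_topic_representation`, the toy `topic_word` matrix, and the way a per-token topic distribution is read off (normalising each word's column of the topic-word matrix over topics) are assumptions made for illustration only.

```python
import numpy as np

def word_topic_representation(token_ids, topic_word):
    """Average per-token topic distributions into one sentence-level vector.

    topic_word: array of shape (t, V); row k is topic k's distribution
                over the vocabulary, as in LDA.
    token_ids:  vocabulary indices of the tokens in one sentence.
    """
    # One column per token: shape (t, N).
    cols = topic_word[:, token_ids]
    # Assumption: turn each column into a distribution over topics by
    # normalising it (this corresponds to a uniform prior over topics).
    w = cols / cols.sum(axis=0, keepdims=True)
    # Average over the N tokens to get a fixed-length vector of shape (t,).
    return w.mean(axis=1)

rng = np.random.default_rng(0)
t, V = 4, 10                                     # toy sizes
topic_word = rng.dirichlet(np.ones(V), size=t)   # stand-in for a trained topic model
W1 = word_topic_representation([1, 3, 5], topic_word)      # sentence of length N = 3
W2 = word_topic_representation([2, 3, 7, 8], topic_word)   # sentence of length M = 4
```

Because every per-token vector sums to one, the averaged vectors `W1` and `W2` are themselves distributions over the `t` topics, and their length does not depend on the sentence length.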
This gives two ways of combining the sentence-pair vector with the sentence-level topic representation:
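(2) for word topics
$$F_W = [C; W_1; W_2] \in \mathbb{R}^{d+2t}$$
Here $C$ is the BERT sentence-pair vector of size $d$, $W_1$ and $W_2$ are the averaged word-topic vectors of the two sentences (each of size $t$), and $[\cdot\,;\cdot]$ denotes concatenation.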
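The concatenated representation is passed through a hidden layer, which adjusts the corresponding weights, and then through a softmax layer for classification. The loss function is simply cross-entropy.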
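The classification head described above (concatenate, hidden layer, softmax, cross-entropy) can be sketched as a NumPy forward pass. This is only an illustration under stated assumptions: the vectors are random stand-ins for real BERT and topic-model outputs, the sizes `d`, `t`, `hidden` and `n_classes` are arbitrary, the `tanh` activation is a guess (the notes do not name one), and `tbert_head` and `params` are hypothetical names.

```python
import numpy as np

def softmax(z):
    z = z - z.max()              # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def tbert_head(C, W1, W2, params):
    """Concatenate the three vectors, apply one hidden layer, classify."""
    F = np.concatenate([C, W1, W2])                    # shape (d + 2t,)
    # Hidden layer; tanh is an assumption, not taken from the notes.
    h = np.tanh(params["W_h"] @ F + params["b_h"])
    probs = softmax(params["W_o"] @ h + params["b_o"])
    return F, probs

def cross_entropy(probs, label):
    # Negative log-probability of the gold class.
    return -np.log(probs[label])

rng = np.random.default_rng(0)
d, t, hidden, n_classes = 768, 80, 32, 2               # illustrative sizes
C = rng.normal(size=d)                                 # stand-in for the CLS vector
W1 = rng.dirichlet(np.ones(t))                         # stand-in topic vectors
W2 = rng.dirichlet(np.ones(t))
params = {
    "W_h": rng.normal(scale=0.01, size=(hidden, d + 2 * t)),
    "b_h": np.zeros(hidden),
    "W_o": rng.normal(scale=0.01, size=(n_classes, hidden)),
    "b_o": np.zeros(n_classes),
}
F, probs = tbert_head(C, W1, W2, params)
loss = cross_entropy(probs, label=1)
```

In training, the gradient of `loss` would be propagated back through the head and, when BERT is fine-tuned, into BERT itself; the sketch covers only the forward computation.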
- Choice of topic model
about conclusion
In this work, we proposed a flexible framework for combining topic models with BERT. We demonstrated that adding LDA topics to BERT consistently improved performance across a range of semantic similarity prediction datasets. In our qualitative analysis, we showed that these improvements were mainly achieved on examples involving domain-specific words. Future work may focus on how to directly induce topic information into BERT without corrupting pretrained information and whether combining topics with other pretrained contextual models can lead to similar gains.