Paper title
Discovering New Intents with Deep Aligned Clustering
Authors
Abstract
Discovering new intents is a crucial task in dialogue systems. Most existing methods are limited in their ability to transfer prior knowledge from known intents to new intents. They also struggle to provide high-quality supervision signals for learning clustering-friendly features to group unlabeled intents. In this work, we propose an effective method, Deep Aligned Clustering, to discover new intents with the aid of limited known intent data. First, we leverage a few labeled known intent samples as prior knowledge to pre-train the model. Then, we perform k-means to produce cluster assignments as pseudo-labels. Moreover, we propose an alignment strategy to tackle the label inconsistency problem that arises across clustering assignments. Finally, we learn the intent representations under the supervision of the aligned pseudo-labels. When the number of new intents is unknown, we predict the number of intent categories by eliminating low-confidence intent-wise clusters. Extensive experiments on two benchmark datasets show that our method is more robust and achieves substantial improvements over state-of-the-art methods. The code is released at https://github.com/thuiar/DeepAligned-Clustering.
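The label inconsistency problem mentioned in the abstract can be illustrated with a small sketch. Each k-means run numbers its clusters arbitrarily, so pseudo-label 0 in one round may refer to a different group in the next. The code below is not the released implementation. It assumes that alignment means a one-to-one matching between old and new centroids that minimizes total squared distance, and it finds that matching by brute force over permutations, which is only practical for small K (an assignment solver such as the Hungarian algorithm would be the usual choice at scale). The second function assumes that cluster size stands in for "confidence" when estimating the number of intents. The function names and the `min_size` parameter are hypothetical.

```python
from collections import Counter
from itertools import permutations


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def align_labels(old_centroids, new_centroids, new_labels):
    """Relabel new clusters with the index of the old centroid they match.

    Assumption: the matching is one-to-one and minimizes the total
    squared distance between paired centroids. Brute force is used
    for clarity and is only feasible for small K.
    """
    k = len(old_centroids)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(k)):
        # perm[j] is the old index proposed for new cluster j
        cost = sum(sq_dist(new_centroids[j], old_centroids[perm[j]])
                   for j in range(k))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return [best_perm[label] for label in new_labels]


def estimate_num_clusters(labels, min_size):
    """Count the clusters that contain at least `min_size` samples.

    Assumption: small clusters are treated as low-confidence and
    dropped; `min_size` is a hypothetical threshold parameter.
    """
    sizes = Counter(labels)
    return sum(1 for size in sizes.values() if size >= min_size)


# Two rounds of clustering that found the same groups in swapped order.
old = [[0.0, 0.0], [5.0, 5.0]]
new = [[5.1, 4.9], [0.1, -0.1]]
aligned = align_labels(old, new, [0, 0, 1, 1])
print(aligned)  # [1, 1, 0, 0]
```

After alignment, a sample that stayed in the same group keeps the same pseudo-label from one round to the next, so a classifier trained on those labels does not see its targets shuffled between rounds.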
