Paper Title
Text Classification of Manifestos and COVID-19 Press Briefings using BERT and Convolutional Neural Networks
Paper Authors
Abstract
We build a sentence-level political discourse classifier using existing human-expert-annotated corpora of political manifestos from the Manifestos Project (Volkens et al., 2020a) and apply it to a corpus of COVID-19 press briefings (Chatsiou, 2020). We use the manually annotated political manifestos as training data to train a local topic Convolutional Neural Network (CNN) classifier, then apply it to the COVID-19 Press Briefings Corpus to automatically classify sentences in the test corpus. We report on a series of experiments with CNNs trained on top of pre-trained embeddings for sentence-level classification tasks. We show that a CNN combined with transformers like BERT outperforms a CNN combined with other embeddings (Word2Vec, GloVe, ELMo), and that it is possible to use a pre-trained classifier to conduct automatic classification on different political texts without additional training.
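
To make the idea of a CNN "trained on top of pre-trained embeddings" concrete, the following is a minimal NumPy sketch of the forward pass of a generic convolutional sentence classifier: filters of several widths slide over the token embeddings, each feature map is max-pooled over time, and the pooled features feed a linear layer. It is an illustration under stated assumptions, not the authors' implementation. The function name `cnn_sentence_logits`, the filter widths, the number of feature maps, and the number of classes are all hypothetical, and random vectors stand in for real BERT, Word2Vec, GloVe, or ELMo embeddings.

```python
import numpy as np


def cnn_sentence_logits(embeddings, filters, biases, W_out, b_out):
    """Forward pass of a simple CNN sentence classifier (illustrative only).

    embeddings: (seq_len, emb_dim) token vectors from any pre-trained encoder.
    filters:    list of arrays, each of shape (width, emb_dim, n_maps).
    biases:     list of arrays, each of shape (n_maps,).
    W_out:      (total_maps, n_classes) output weights.
    b_out:      (n_classes,) output bias.
    Assumes seq_len is at least as large as the widest filter.
    """
    pooled = []
    for F, b in zip(filters, biases):
        width = F.shape[0]
        n_windows = embeddings.shape[0] - width + 1
        # Collect every window of `width` consecutive token vectors.
        windows = np.stack([embeddings[i:i + width] for i in range(n_windows)])
        # Contract the (width, emb_dim) axes: result is (n_windows, n_maps).
        feature_maps = np.tensordot(windows, F, axes=([1, 2], [0, 1])) + b
        feature_maps = np.maximum(feature_maps, 0.0)  # ReLU non-linearity
        pooled.append(feature_maps.max(axis=0))       # max-over-time pooling
    features = np.concatenate(pooled)
    return features @ W_out + b_out


# Hypothetical sizes: 768 matches the BERT-base hidden size; the rest are arbitrary.
rng = np.random.default_rng(0)
seq_len, emb_dim, n_maps, n_classes = 12, 768, 4, 7
widths = (2, 3, 4)

sentence = rng.normal(size=(seq_len, emb_dim))  # stand-in for real embeddings
filters = [rng.normal(scale=0.01, size=(w, emb_dim, n_maps)) for w in widths]
biases = [np.zeros(n_maps) for _ in widths]
W_out = rng.normal(scale=0.1, size=(len(widths) * n_maps, n_classes))
b_out = np.zeros(n_classes)

logits = cnn_sentence_logits(sentence, filters, biases, W_out, b_out)
predicted_class = int(np.argmax(logits))
```

In this sketch the weights are random, so the prediction is meaningless. In a setup like the one the abstract describes, they would be learned from the annotated manifesto sentences and then applied unchanged to the press-briefing sentences.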
