Paper title
CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models
Paper authors
Paper abstract
Pretrained language models, especially masked language models (MLMs), have seen success across many NLP tasks. However, there is ample evidence that they use the cultural biases that are undoubtedly present in the corpora they are trained on, implicitly creating harm with biased representations. To measure some forms of social bias in language models against protected demographic groups in the US, we introduce the Crowdsourced Stereotype Pairs benchmark (CrowS-Pairs). CrowS-Pairs has 1508 examples that cover stereotypes dealing with nine types of bias, such as race, religion, and age. In CrowS-Pairs, a model is presented with two sentences: one that is more stereotyping and another that is less stereotyping. The data focuses on stereotypes about historically disadvantaged groups and contrasts them with advantaged groups. We find that all three of the widely used MLMs we evaluate substantially favor sentences that express stereotypes in every category in CrowS-Pairs. As work on building less biased models advances, this dataset can be used as a benchmark to evaluate progress.
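The paired setup described in the abstract can be illustrated with a minimal sketch. The abstract does not say how a model's preference between the two sentences is scored, so everything here is an assumption for illustration: `stereotype_preference_rate` and `toy_score` are hypothetical names, the scorer is a stand-in for whatever sentence-level score a real MLM would provide, and the pairs are placeholder strings, not dataset examples.

```python
from typing import Callable, Iterable, Tuple

# Each pair is (more_stereotyping_sentence, less_stereotyping_sentence).
SentencePair = Tuple[str, str]


def stereotype_preference_rate(
    pairs: Iterable[SentencePair],
    score: Callable[[str], float],
) -> float:
    """Return the percentage of pairs where `score` ranks the more
    stereotyping sentence strictly above the less stereotyping one.

    `score` is an assumed interface: any function mapping a sentence to a
    number where higher means "the model prefers this sentence more".
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one sentence pair")
    favored = sum(1 for more, less in pairs if score(more) > score(less))
    return 100.0 * favored / len(pairs)


def toy_score(sentence: str) -> float:
    # Placeholder scorer (NOT a language model): shorter strings score higher.
    return -float(len(sentence))


# Placeholder pairs standing in for (more stereotyping, less stereotyping).
toy_pairs = [
    ("aaaa", "aaaaaa"),
    ("bbbbbb", "bbb"),
    ("cc", "cccc"),
    ("d", "ddd"),
]

rate = stereotype_preference_rate(toy_pairs, toy_score)
print(f"{rate:.1f}% of pairs favor the first (more stereotyping) sentence")
```

Under this sketch, a scorer with no systematic preference between the two sentences of a pair would land near 50%, so values well above that would correspond to the kind of preference for stereotyping sentences that the abstract reports.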
