Paper Title
Reformulating Unsupervised Style Transfer as Paraphrase Generation
Paper Authors
Paper Abstract
Modern NLP defines the task of style transfer as modifying the style of a given sentence without appreciably changing its semantics, which implies that the outputs of style transfer systems should be paraphrases of their inputs. However, many existing systems purportedly designed for style transfer inherently warp the input's meaning through attribute transfer, which changes semantic properties such as sentiment. In this paper, we reformulate unsupervised style transfer as a paraphrase generation problem, and present a simple methodology based on fine-tuning pretrained language models on automatically generated paraphrase data. Despite its simplicity, our method significantly outperforms state-of-the-art style transfer systems on both human and automatic evaluations. We also survey 23 style transfer papers and discover that existing automatic metrics can be easily gamed and propose fixed variants. Finally, we pivot to a more real-world style transfer setting by collecting a large dataset of 15M sentences in 11 diverse styles, which we use for an in-depth analysis of our system.
