Paper Title
Joint Progressive Knowledge Distillation and Unsupervised Domain Adaptation
Authors
Abstract
Two factors currently limit the adoption of CNNs in real-world applications: the divergence between the distributions of design and operational data, and high computational complexity. For instance, person re-identification systems typically rely on a distributed set of cameras, each with different capture conditions. This can translate into a considerable shift between the source (e.g., lab setting) and target (e.g., operational camera) domains. Given the cost of annotating image data captured for fine-tuning in each target domain, unsupervised domain adaptation (UDA) has become a popular approach for adapting CNNs. Moreover, state-of-the-art deep learning models that provide a high level of accuracy often rely on architectures that are too complex for real-time applications. Although several compression and UDA approaches have recently been proposed to overcome these limitations, they do not allow a CNN to be optimized to address both simultaneously. In this paper, we propose an unexplored direction: the joint optimization of CNNs to provide a compressed model that is adapted to perform well on a given target domain. In particular, the proposed approach performs unsupervised knowledge distillation (KD) from a complex teacher model to a compact student model by leveraging both source and target data. It also improves upon existing UDA techniques by progressively teaching the student about domain-invariant features, instead of directly adapting a compact model on target domain data. Our method is compared against state-of-the-art compression and UDA techniques on two popular classification datasets for UDA: Office31 and ImageClef-DA. On both datasets, results indicate that our method achieves the highest level of accuracy while requiring comparable or lower time complexity.
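To make the idea of combining label-free distillation with a domain adaptation objective more concrete, the following is a minimal sketch and not the paper's actual formulation, which the abstract does not give. It assumes a standard temperature-softened KL-divergence distillation term computed on both source and target samples, a scalar UDA loss supplied from elsewhere, and a hypothetical linear schedule (`distill_weight`) that gradually shifts the balance between the two. All function names, the schedule direction, and the default values are illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened outputs.

    Uses only model outputs, so it can be computed on unlabeled target data.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distill_weight(epoch, num_epochs, start=0.1, end=0.9):
    """Hypothetical linear schedule for the distillation weight (an assumption)."""
    if num_epochs <= 1:
        return end
    frac = epoch / (num_epochs - 1)
    return start + (end - start) * frac


def joint_loss(kd_source, kd_target, uda, beta):
    """Weighted sum of distillation terms (source + target) and a UDA term."""
    return beta * (kd_source + kd_target) + (1.0 - beta) * uda


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]
    student = [1.0, 0.8, -0.2]
    beta = distill_weight(epoch=3, num_epochs=10)
    # The UDA loss value (0.7) is a placeholder standing in for any domain-alignment loss.
    loss = joint_loss(kd_loss(teacher, student), kd_loss(teacher, student), 0.7, beta)
    print(f"beta={beta:.3f} joint_loss={loss:.4f}")
```

In a real training loop the logits would come from the teacher and student networks on mini-batches from each domain, and the UDA term would be a domain-alignment loss of the practitioner's choice. The sketch only illustrates how the terms could be combined and reweighted over time.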
