Paper title
The Plant Pathology 2020 challenge dataset to classify foliar disease of apples
Authors
Abstract
Apple orchards in the U.S. are under constant threat from a large number of pathogens and insects. Appropriate and timely deployment of disease management depends on early disease detection. Incorrect and delayed diagnosis can result in either excessive or inadequate use of chemicals, with increased production costs and greater environmental and health impacts. We have manually captured 3,651 high-quality, real-life symptom images of multiple apple foliar diseases, with variable illumination, angles, surfaces, and noise. A subset, expert-annotated to create a pilot dataset for apple scab, cedar apple rust, and healthy leaves, was made available to the Kaggle community for the 'Plant Pathology Challenge', part of the Fine-Grained Visual Categorization (FGVC) workshop at CVPR 2020 (Computer Vision and Pattern Recognition). We also trained an off-the-shelf convolutional neural network (CNN) on this data for disease classification and achieved 97% accuracy on a held-out test set. This dataset will contribute towards the development and deployment of machine learning-based automated plant disease classification algorithms to ultimately realize fast and accurate disease detection. We will continue to add images to the pilot dataset to build a larger, more comprehensive expert-annotated dataset for future Kaggle competitions and to explore more advanced methods for disease classification and quantification.
