Paper title
Classification with 2-D Convolutional Neural Networks for breast cancer diagnosis
Authors
Abstract
Breast cancer is the most common cancer in women. Classifying patients as cancer or non-cancer from clinical records requires high sensitivity and specificity for the diagnostic test to be acceptable. However, the state-of-the-art classification model, the Convolutional Neural Network (CNN), cannot be used directly with clinical data represented in 1-D format. CNNs are designed to work on sets of 2-D matrices whose elements are correlated with their neighboring elements, as in image data. Data examples represented as 1-D vectors, apart from time series data, therefore cannot be used with a CNN and are instead handled by other classification models such as Artificial Neural Networks or Random Forest. We propose novel data-wrangling preprocessing methods that transform a 1-D data vector into a 2-D graphical image with appropriate correlations among the fields, so that it can be processed by a CNN. We tested our methods on the Wisconsin Original Breast Cancer (WBC) and Wisconsin Diagnostic Breast Cancer (WDBC) datasets. To our knowledge, this work is novel in applying non-image-to-image data transformation to non-time-series data. The transformed data, processed with a CNN using VGGnet-16, show competitive results on the WBC dataset and outperform other known methods on the WDBC dataset.
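The abstract does not spell out the transformations themselves, so the following is only a minimal sketch of the general idea of turning a 1-D record into a 2-D matrix whose entries are related to their neighbors. It assumes a pairwise absolute-difference encoding chosen purely for illustration; it is not presented as the authors' method. The function names `min_max_scale` and `vector_to_image` and the toy records are hypothetical.

```python
from typing import List, Sequence


def min_max_scale(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scale each feature (column) to [0, 1] across all records."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        scaled.append([
            # A constant column carries no information; map it to 0.0.
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, lows, highs)
        ])
    return scaled


def vector_to_image(vector: Sequence[float]) -> List[List[float]]:
    """Map an n-feature vector to an n x n matrix of pairwise absolute differences.

    This encoding is an assumption made for illustration only.
    """
    return [[abs(a - b) for b in vector] for a in vector]


# Toy records with three made-up features each (not taken from WBC or WDBC).
records = [[1.0, 10.0, 5.0], [3.0, 20.0, 5.0], [2.0, 40.0, 5.0]]
images = [vector_to_image(v) for v in min_max_scale(records)]
```

Each resulting matrix is symmetric with a zero diagonal, and would still need resizing and channel replication before it could be fed to an image network such as VGGnet-16.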
