Hyperspectral image detection of wheat seed purity based on SMOTE-UVE-SVM
Abstract: Hyperspectral detection of wheat seed purity suffers from two problems that degrade model performance: imbalanced samples and redundant band information. To address them, a hyperspectral seed purity detection model is proposed that combines the synthetic minority oversampling technique (SMOTE), uninformative variable elimination (UVE), and a support vector machine (SVM). SMOTE expands the minority-class (impurity) samples of the wheat seeds to reduce the sample imbalance. UVE then selects features from the high-dimensional hyperspectral data, and an SVM is built as the classifier to further improve classification performance and reduce the risk of overfitting caused by feature redundancy. Averaged over the five wheat varieties, the accuracy, precision, and negative sample detection rate reach 95.98%, 94.94%, and 89.32%, respectively, which are 3.89, 7.18, and 12.42 percentage points higher than the traditional methods. The proposed method shows good application prospects for hyperspectral detection of wheat seed purity.
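The oversampling step described in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic sample is placed on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is not the authors' implementation; the function name `smote`, its parameters, and the three-value toy "spectra" are assumptions made only for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples by interpolating between
    minority-class samples and their k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [s for j, s in enumerate(minority) if j != i]
        others.sort(key=lambda s: math.dist(base, s))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical impurity samples: three reflectance values per seed.
impurity = [
    [0.20, 0.35, 0.50],
    [0.22, 0.33, 0.52],
    [0.19, 0.36, 0.49],
    [0.21, 0.34, 0.51],
]
new_samples = smote(impurity, n_new=6, k=3)
balanced_minority = impurity + new_samples
print(len(balanced_minority), "minority samples after oversampling")
```

Because every synthetic sample is a convex combination of two real minority samples, the expanded set stays inside the range already covered by the measured impurity spectra. In the model described above, feature selection (UVE) and the SVM classifier would then be applied to the rebalanced data.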
Table 1 Details of the 5 wheat varieties

| Variety | Place of origin | Parentage (kinship) | Gluten type |
|---|---|---|---|
| JM22 | Shandong | 935024/935106 | medium gluten |
| XM26 | Henan | Xinmai9408/Jinan17 | strong gluten |
| JM44 | Shandong | 954072/Jinan17 | strong gluten |
| BN4199 | Henan | Bainonggaoguang3709F2/BainongAK58 | medium gluten |
| ZM33 | Henan | Zhengmai366/BainongAK58 | strong gluten |
Table 2 Distribution range and influencing factors of absorption peak in wheat seeds
Table 3 Wheat seed purity detection results with unbalanced samples

| Positive samples | SVDD A/% | SVDD P/% | SVDD tND/% | KNN A/% | KNN P/% | KNN tND/% | SVM A/% | SVM P/% | SVM tND/% |
|---|---|---|---|---|---|---|---|---|---|
| JM22 | 64.00 | 79.11 | 67.00 | 84.03 | 81.35 | 52.40 | 93.07 | 90.68 | 79.20 |
| XM26 | 64.33 | 83.45 | 77.00 | 79.20 | 76.74 | 37.70 | 89.80 | 87.01 | 70.20 |
| JM44 | 56.33 | 74.13 | 63.00 | 81.40 | 78.83 | 45.10 | 92.23 | 89.65 | 76.90 |
| BN4199 | 62.33 | 87.18 | 85.00 | 82.70 | 80.01 | 49.80 | 95.50 | 84.09 | 87.50 |
| ZM33 | 60.00 | 76.32 | 64.00 | 82.33 | 80.13 | 51.60 | 89.87 | 87.38 | 70.70 |
| average | 61.40 | 80.04 | 71.20 | 81.93 | 79.41 | 47.32 | 92.09 | 87.76 | 76.90 |

A: accuracy; P: precision; tND: negative sample detection rate.
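To make the three reported quantities concrete, the sketch below computes them from confusion-matrix counts. The source does not give the formulas, so the definitions here are assumptions: the target variety is treated as the positive class, impurity seeds as the negative class, and tND is taken as the share of negative samples that are correctly detected. The counts are hypothetical and are not taken from the paper.

```python
def purity_metrics(tp, fp, tn, fn):
    """Return (accuracy, precision, negative detection rate) in percent.

    Assumed definitions (not stated in the source):
      accuracy  = (TP + TN) / all samples
      precision = TP / (TP + FP)
      tND       = TN / (TN + FP), share of impurity seeds that are caught
    """
    total = tp + fp + tn + fn
    accuracy = 100.0 * (tp + tn) / total
    precision = 100.0 * tp / (tp + fp)
    negative_detection = 100.0 * tn / (tn + fp)
    return accuracy, precision, negative_detection


# Hypothetical unbalanced test set: 200 target-variety seeds, 100 impurity seeds.
a, p, tnd = purity_metrics(tp=190, fp=20, tn=80, fn=10)
print(f"A = {a:.2f}%, P = {p:.2f}%, tND = {tnd:.2f}%")
```

Under these assumed definitions, a classifier can score well on accuracy while still missing a sizeable share of the minority (impurity) class, which is the kind of gap between A and tND seen for the unbalanced samples.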
Table 4 Wheat seed purity detection results after SMOTE expansion of the negative samples

| Positive samples | SMOTE-KNN A/% | SMOTE-KNN P/% | SMOTE-KNN tND/% | SMOTE-SVM A/% | SMOTE-SVM P/% | SMOTE-SVM tND/% |
|---|---|---|---|---|---|---|
| JM22 | 88.87 | 89.26 | 77.20 | 94.80 | 93.17 | 85.30 |
| XM26 | 82.63 | 85.28 | 69.10 | 94.80 | 93.78 | 86.70 |
| JM44 | 94.33 | 92.56 | 84.00 | 94.40 | 92.57 | 83.69 |
| BN4199 | 88.57 | 89.74 | 78.60 | 96.53 | 95.49 | 90.60 |
| ZM33 | 84.67 | 86.86 | 72.60 | 95.47 | 94.60 | 88.60 |
| average | 87.81 | 88.74 | 76.30 | 95.20 | 93.92 | 86.98 |
Table 5 Average detection results of the 5 wheat varieties in 4 models

| Model | A/% | P/% | tND/% | Number |
|---|---|---|---|---|
| SMOTE-UVE-KNN | 91.12 | 90.09 | 78.50 | 71.00 |
| SMOTE-SPA-KNN | 90.50 | 89.62 | 77.61 | 74.00 |
| SMOTE-UVE-SVM | 95.98 | 94.94 | 89.32 | 71.00 |
| SMOTE-SPA-SVM | 95.30 | 94.12 | 87.52 | 74.00 |
Table 6 Detection results of the 5 wheat varieties in the SMOTE-UVE-SVM model

| Positive samples | A/% | P/% | tND/% | Number |
|---|---|---|---|---|
| JM22 | 96.13 | 94.92 | 89.30 | 75.00 |
| XM26 | 95.30 | 94.49 | 88.30 | 68.00 |
| JM44 | 95.07 | 93.22 | 85.40 | 81.00 |
| BN4199 | 97.23 | 96.46 | 92.70 | 72.00 |
| ZM33 | 96.17 | 95.60 | 90.90 | 57.00 |
| average | 95.98 | 94.94 | 89.32 | 71.00 |
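The improvements quoted in the abstract can be checked against the tables. The short calculation below assumes that the "traditional methods" baseline is the SVM average row of Table 3 (an inference from the numbers, not a statement in the source) and subtracts it from the SMOTE-UVE-SVM average row of Table 6. The variable names are illustrative only.

```python
# Average rows copied from the tables (values in percent).
svm_baseline = {"A": 92.09, "P": 87.76, "tND": 76.90}   # Table 3, SVM, unbalanced samples
smote_uve_svm = {"A": 95.98, "P": 94.94, "tND": 89.32}  # Table 6, SMOTE-UVE-SVM

# Differences in percentage points, rounded to the two decimals used in the tables.
gains = {m: round(smote_uve_svm[m] - svm_baseline[m], 2) for m in svm_baseline}
print(gains)
```

With this baseline the differences come out to 3.89, 7.18, and 12.42, matching the abstract. They are absolute differences in percentage points, not relative increases.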
