Research on a fruit recognition and detection algorithm based on improved YOLOv4

PEI Ruijing; WANG Shuo; WANG Huaying

doi:10.7510/jgjs.issn.1001-3806.2023.03.018


College of Mathematical Science and Engineering, Hebei University of Engineering, Handan 056107, China

About the author: PEI Ruijing (1997-), male, master's student; main research interests: machine vision and deep learning.

Corresponding author: WANG Huaying, pbxsyingzi@126.com

Funding: Hebei Province Innovation Capability Enhancement Program, grant No. 20540302D

CLC number: TP751;TP183

Abstract: To address the low efficiency, high false detection rate, poor generality, and poor real-time performance of current fruit recognition and detection methods, a fruit recognition and detection method based on an improved YOLOv4 algorithm, a member of the you only look once (YOLO) family of unified real-time object detectors, is proposed. First, an efficient channel attention (ECA) mechanism is added to the backbone network to strengthen its ability to extract semantic information from images. Second, the convolutional layers at the junctions of the cross stage partial (CSP) blocks in the backbone are replaced with involution operators, which reduces the model size and improves prediction performance. Finally, residual modules are added to the path aggregation network (PANet) used for feature fusion, which speeds up convergence and prevents gradient explosion. The dataset covers 10 kinds of common fruit, including dragon fruit, oranges, grapes, and green mangoes, with 6670 photographed images in total. The experiments show that the proposed method reaches a mean average precision (MAP) of 99.10% and a precision of 95.62% at a frame rate of 41.67 frames per second; MAP is 15.3 percentage points higher than that of YOLOv4. The method meets the requirements for both detection accuracy and detection speed and provides a useful reference for improving the accuracy of fruit recognition.


| model | MAP/% | size/Mbit | frame rate/s^-1 |
| --- | --- | --- | --- |
| YOLOv3-SPP | 68.00 | 323 | 35.13 |
| YOLOv4 | 84.10 | 488 | 43.47 |
| YOLOv5 | 89.20 | 55.8 | 45.23 |
| the algorithm of this paper | 99.40 | 373 | 41.67 |
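The size and speed columns of the comparison table above can be put on a common footing with a little arithmetic. The following is a minimal sketch, not part of the original work: the dictionary `results` only restates the table, and the names `frame_time_ms` and `summary` are hypothetical helpers introduced here for illustration.

```python
# (MAP in %, model size in Mbit, frames per second), restated from the table above.
results = {
    "YOLOv3-SPP": (68.00, 323, 35.13),
    "YOLOv4": (84.10, 488, 43.47),
    "YOLOv5": (89.20, 55.8, 45.23),
    "the algorithm of this paper": (99.40, 373, 41.67),
}


def frame_time_ms(fps: float) -> float:
    """Convert a frame rate (frames per second) into time per frame in milliseconds."""
    return 1000.0 / fps


# YOLOv4 is used as the reference model, as in the paper's own comparison.
_, base_size, _ = results["YOLOv4"]

summary = {}
for name, (_map, size, fps) in results.items():
    summary[name] = {
        "frame_time_ms": round(frame_time_ms(fps), 2),
        "size_vs_yolov4_mbit": round(size - base_size, 1),
    }
    print(
        f"{name:<28} {summary[name]['frame_time_ms']:6.2f} ms/frame, "
        f"size change vs YOLOv4: {summary[name]['size_vs_yolov4_mbit']:+.1f} Mbit"
    )
```

Read this way, the proposed model is 115 Mbit smaller than YOLOv4 and needs roughly one extra millisecond per frame.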

| model | MAP/% | frame rate/s^-1 |
| --- | --- | --- |
| YOLOv4 | 84.10 | 43.47 |
| YOLOv4+involution | 90.70 | 47.52 |
| YOLOv4+ECA | 89.42 | 40.30 |
| YOLOv4+Res-PANet | 85.64 | 44.56 |
| YOLOv4+involution+ECA | 93.12 | 42.14 |
| the algorithm of this paper | 99.40 | 41.67 |
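To see what each modification contributes, the ablation results above can be expressed as differences from the YOLOv4 baseline. This is an illustrative sketch under the assumption that the rows are directly comparable; `ablation` restates the table, and `deltas_vs_baseline` is a hypothetical helper name, not something defined in the original work.

```python
# (MAP in %, frames per second), restated from the ablation table above.
ablation = {
    "YOLOv4": (84.10, 43.47),
    "YOLOv4+involution": (90.70, 47.52),
    "YOLOv4+ECA": (89.42, 40.30),
    "YOLOv4+Res-PANet": (85.64, 44.56),
    "YOLOv4+involution+ECA": (93.12, 42.14),
    "the algorithm of this paper": (99.40, 41.67),
}


def deltas_vs_baseline(table, baseline="YOLOv4"):
    """Return {model: (MAP change in percentage points, frame-rate change)} relative to the baseline row."""
    base_map, base_fps = table[baseline]
    return {
        name: (round(m - base_map, 2), round(fps - base_fps, 2))
        for name, (m, fps) in table.items()
        if name != baseline
    }


deltas = deltas_vs_baseline(ablation)
for name, (d_map, d_fps) in deltas.items():
    print(f"{name:<28} MAP {d_map:+.2f} points, frame rate {d_fps:+.2f} /s")
```

In these numbers, involution is the only single change that raises both MAP and frame rate, while ECA trades some speed for accuracy.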

Received: 2022-04-11

Accepted: 2022-04-28

Published online: 2023-05-25

Introduction

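Fruit picking in China consumes a great deal of labor and material resources every year. The rapid development of artificial intelligence has made robotic fruit picking feasible^[1-4]. Fruit image recognition plays a key role in robotic picking, because recognition accuracy determines whether fruit can be picked precisely. Research on fruit image recognition is therefore of great significance for automated fruit picking.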

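Traditional fruit image recognition techniques mostly rely on the color, texture, and shape features of fruit to locate fruit regions. For example, SUN et al.^[5] used an attention-based information maximization (AIM) algorithm to coarsely locate green apple regions, and then located them precisely by fusing the illumination-invariant image with the R component of the cropped image, reaching an accuracy of 86.91%. YU et al.^[6] captured litchi images with a red-green-blue depth (RGB-D) camera and introduced multi-scale detection and non-maximum suppression into a random forest binary classification model, recognizing litchi with an accuracy of 89.92%. WU et al.^[7] proposed a fruit segmentation method that combines color and geometric features: fruit pixel features are used for coarse segmentation, and the viewpoint feature histogram (VFH) of point cloud clusters is then used to finely segment fruit regions, with a precision of 80.09%.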

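These traditional techniques are affected by the complexity of the environment and can only account for the characteristics of a single kind of fruit; none of them yields a widely applicable feature extraction model, and none meets the real-time requirements of fruit recognition. In recent years, deep learning, with its strong nonlinear representation ability and good generalization, has attracted researchers' attention^[8]. DONG^[9] designed a fruit-grasping robot using the regions with convolutional neural network features (RCNN) model. ZHAI et al.^[10] modified a classical convolutional model by replacing the fully connected layer with the normalized exponential function (softmax) to classify fruit. Both models, however, have only a few feature extraction layers, which severely limits their ability to extract fruit image features, so their recognition rate is high only against simple backgrounds. HUANG et al.^[11] proposed an improved residual network for fruit detection, to which multi-scale sampling layers were added for feature extraction; its drawback is that the dataset images have simple backgrounds, and fruit against complex backgrounds was not tested. PENG et al.^[12] proposed an improved single shot multibox detector (SSD) model for fruit recognition, replacing the visual geometry group (VGG) VGG-16 structure with the residual network (ResNet) ResNet-101 structure and optimizing the model with stochastic gradient descent (SGD); detection is fast and the average precision rises to 88.6%, but the SSD model lacks feature fusion, extracts few fruit features, and has a poor recognition rate. WANG et al.^[13] proposed a fruit recognition model based on an improved YOLOv3, a you only look once (YOLO) unified real-time object detector, replacing batch normalization (BN) layers with group normalization (GN) layers to optimize the computed parameters; however, feature fusion in the feature pyramid network (FPNet) of YOLOv3 is weak, fruit features are not fully extracted, and the average recognition rate is only 85.91%.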

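The analysis above shows that each of these fruit detection methods has shortcomings: either the recognition rate is low or the detection speed falls short of real-time requirements. To address both problems, an improved YOLOv4 model is proposed in this paper. The backbone network structure is modified and involution operators are introduced to improve the accuracy and robustness of fruit recognition, offering a possible direction for fruit recognition technology.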

4. Conclusion

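Fruit recognition algorithms are expected to become both more accurate and faster. In this paper, an ECA attention mechanism is added to CSPDarknet53; by assigning a different weight to each channel of the feature map, it improves feature extraction in the backbone network and strengthens the model's ability to distinguish fruit categories. The convolutional layers at the CSPBlock junctions are replaced with involution operators, which raises the accuracy of the fruit detection model while reducing memory consumption. Residual blocks are added to PANet, which avoids possible vanishing gradients and stabilizes network performance. The experimental data show that, compared with YOLOv4, the improved model raises MAP by 15.30 percentage points to 99.10% and reduces the model size by 115 Mbit; the detection results improve markedly, and the frame rate of 41.67 frames per second meets industrial real-time requirements. Future work will focus on improving the feature fusion network, trying different feature fusion structures in order to further reduce the model size and the number of parameters while raising MAP, and on deploying the model in engineering practice.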
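The channel weighting performed by ECA can be illustrated with a small dependency-free sketch. It follows the general ECA recipe (global average pooling, a 1D convolution across neighboring channels, a sigmoid, then channel-wise rescaling) and is not the authors' implementation. The function names, the toy feature map, and the fixed kernel weights are assumptions made for illustration; in a trained network the kernel weights are learned, and the kernel size would normally come from the adaptive rule in `eca_kernel_size` rather than being fixed at 3 as in the toy call below.

```python
import math


def eca_kernel_size(channels: int, gamma: int = 2, b: int = 1) -> int:
    """Adaptive 1D kernel size used by ECA: the odd integer derived from (log2(C) + b) / gamma."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def eca(feature_map, kernel):
    """Apply channel attention to a feature map stored as [channel][row][col] nested lists.

    kernel is an odd-length list of 1D convolution weights shared by all channels.
    Returns the rescaled feature map and the per-channel attention weights.
    """
    k = len(kernel)
    pad = k // 2
    # 1) Global average pooling: one scalar descriptor per channel.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_map]
    # 2) Local cross-channel interaction: zero-padded 1D convolution over the channel axis.
    padded = [0.0] * pad + pooled + [0.0] * pad
    weights = [
        sigmoid(sum(kernel[j] * padded[i + j] for j in range(k)))
        for i in range(len(pooled))
    ]
    # 3) Rescale every channel by its attention weight.
    scaled = [[[v * w for v in row] for row in ch] for ch, w in zip(feature_map, weights)]
    return scaled, weights


# Toy input: 4 channels of size 2x2 with constant values 1, 2, 0, and 4.
x = [[[c, c], [c, c]] for c in (1.0, 2.0, 0.0, 4.0)]
kernel = [0.5, 1.0, 0.5]  # illustrative fixed weights; learned during training in practice
y, w = eca(x, kernel)
print([round(v, 4) for v in w])
```

Because the 1D kernel is shared across channels, the module adds only as many parameters as the kernel has weights, which is why it is regarded as a lightweight form of attention.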

References

[1]	SEYED I S, HOSSEIN K. A deep neural network approach towards real-time on-branch fruit recognition for precision horticulture[J]. Expert Systems with Applications, 2020, 159(30): 113594.
[2]	KANG H W, ZHOU H Y, WANG X, et al. Real-time fruit recognition and grasping estimation for robotic apple harvesting[J]. Sensors, 2020, 20(19): 5670. doi: 10.3390/s20195670
[3]	LI Q W, JIA W K, SUN M L, et al. A novel green apple segmentation algorithm based on ensemble U-Net under complex orchard environment[J]. Computers and Electronics in Agriculture, 2021, 180(6): 105900.
[4]	ALTAHERI H, ALSULAIMAN M, MUHAMMAD G. Date fruit classification for robotic harvesting in a natural environment using deep learning[J]. IEEE Access, 2019, 7: 117115-117133. doi: 10.1109/ACCESS.2019.2936536
[5]	SUN S Sh, JIANG M, LIANG N, et al. Combining an information-maximization-based attention mechanism and illumination invariance theory for the recognition of green apples in natural scenes[J]. Multimedia Tools and Applications, 2020, 79(37/38): 1-27.
[6]	YU L Y, XIONG J T, FANG X Q, et al. A litchi fruit recognition method in a natural environment using RGB-D images[J]. Biosystems Engineering, 2021, 204(1): 50-63.
[7]	WU G, LI B, ZHU Q B, et al. Using color and 3D geometry features to segment fruit point cloud and improve fruit recognition accuracy[J]. Computers and Electronics in Agriculture, 2020, 174(6): 105475.
[8]	CHEN Ch, QI F. Review on development of convolution neural network and its application in computer vision[J]. Computer Science, 2019, 46(3): 69-79 (in Chinese).
[9]	DONG G. Fruit harvesting robot handling system based on deep learning and image processing[J]. Journal of Agricultural Mechanization Research, 2021, 43(3): 260-264 (in Chinese). doi: 10.3969/j.issn.1003-188X.2021.03.046
[10]	ZHAI Ch F, MA Y L, ZHAO D J. Convolutional neural network fruit recognition[J]. China Southern Agricultural Machinery, 2021, 52(10): 59-60 (in Chinese). doi: 10.3969/j.issn.1672-3872.2021.10.021
[11]	HUANG Y F, PIAO Y, ZHANG H H. Research on fruit image recognition algorithm based on multi-scale feature fusion[J]. Journal of Changchun University of Science and Technology (Natural Science Edition), 2021, 44(1): 87-94 (in Chinese). doi: 10.3969/j.issn.1672-9870.2021.01.013
[12]	PENG H X, HUANG B, SHAO Y Y, et al. A general improved SSD model for target recognition of multiple types of fruit picking in natural environments[J]. Transactions of the Chinese Society of Agricultural Engineering, 2018, 34(16): 155-162 (in Chinese). doi: 10.11975/j.issn.1002-6819.2018.16.020
[13]	WANG H, ZHANG F, LIU X F. Fruit image recognition based on DarkNet-53 and YOLOv3[J]. Journal of Northeast Normal University (Natural Science Edition), 2020, 52(4): 60-65 (in Chinese).
[14]	ZHONG Zh F, XIA Y F, ZHOU D P, et al. Lightweight object detection algorithm based on improved YOLOv4[J]. Journal of Computer Applications, 2021, 40(10): 32-39 (in Chinese).
[15]	FU L Sh, FENG Y L, ELKAMIL T, et al. Image recognition method of multi-cluster kiwifruit in the field based on convolutional neural network[J]. Transactions of the Chinese Society of Agricultural Engineering, 2018, 34(2): 205-211 (in Chinese).
[16]	LI D, HU J, WANG C H, et al. Involution: Inverting the inherence of convolution for visual recognition[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021. New York, USA: IEEE, 2021: 12321-12330.
[17]	TAN M X, PANG R M, LE Q V. Efficientdet: Scalable and efficient object detection[C]//CVPR 2020: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. New York, USA: IEEE, 2020: 10781-10790.
[18]	XIONG J T, LIU Zh, TANG L Y, et al. Research on visual inspection technology of green citrus in natural environment[J]. Transactions of the Chinese Society for Agricultural Machinery, 2018, 49(4): 45-52 (in Chinese).
[19]	FANG W, WANG L, REN P M. Tinier-YOLO: A real-time object detection method for constrained environments[J]. IEEE Access, 2019, 8: 1935-1944.
[20]	LIU Ch M, GAO H M, WANG X T, et al. Fruit image recognition system based on deep learning[J]. Journal of Microwaves, 2020, 36(s1): 427-430 (in Chinese).
[21]	LIU Ch A, FENG X L, SUN Ch H, et al. Maximum 2-D entropy image segmentation method based on improved sparrow algorithm[J]. Laser Technology, 2022, 46(2): 274-282 (in Chinese).
[22]	HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]//CVPR 2018: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. New York, USA: IEEE, 2018: 7132-7141.
[23]	WANG W H, XIE E Z, SONG X G, et al. Efficient and accurate arbitrary-shaped text detection with pixel aggregation network[C]//ICCV 2019: Proceedings of the IEEE/CVF International Conference on Computer Vision. New York, USA: IEEE, 2019: 8440-8449.
