An improved infrared object detection algorithm based on YOLOv5

LIU Haojiao; LIU Lishuang; ZHANG Mingchun

doi:10.7510/jgjs.issn.1001-3806.2024.04.011

School of Instrument Science and Opto-electronics Engineering, Beijing Information Science and Technology University, Beijing 100192, China

Corresponding author: LIU Lishuang, Liulishaung@bistu.edu.cn

Foundation item: Fund of the Key Laboratory of Optoelectronic Information Control and Security Technology (202105509)
CLC number: TN219;TP391

Abstract: Infrared images contain few features and have poor contrast, which lowers object detection accuracy. To address this, several improvements to the original YOLOv5 model are proposed. First, an additional prediction feature layer is added to improve recognition in infrared images, particularly for small objects. Second, a coordinate attention mechanism is added to strengthen the extraction of salient features from infrared targets and improve detection accuracy. Third, a bidirectional feature pyramid network is used to optimize feature fusion, which increases the model's expressive power and reduces redundant computation. Finally, to address poor localization and sample imbalance in bounding box regression, focal-EIOU is adopted as the bounding box loss function, which speeds up convergence and focuses regression on high-quality anchor boxes. The results show that the improved YOLOv5 reaches a mean average precision (mAP) of 85.3% on the FLIR dataset, 4.2 percentage points higher than the original network model. This result supports the feasibility of deploying the model on embedded devices.

Keywords: image processing; deep learning; infrared object detection; convolutional neural networks; feature fusion

Figure 1. YOLOv5s structure diagram

Figure 2. Improved YOLOv5s structure diagram

Figure 3. CA mechanism

Figure 4. Schematic diagram of PAN and BiFPN

Figure 5. Comparison of YOLOv5s and improved YOLOv5s detection results

Table 1. Training platform configuration

name	configuration
CPU (central processing unit)	Intel(R) Core i9-10900X
GPU (graphics processing unit)	NVIDIA RTX 3090 ×2
framework	PyTorch 1.12.1
environment	CUDA 11.6, cuDNN 8.3.2

Table 2. Ablation experiment data for the improved YOLOv5

model	extra head	BiFPN	CA	EIOU	mAP/%
YOLOv5s	-	-	-	-	81.1
A	√	-	-	-	83.9
B	√	√	-	-	84.5
C	√	√	√	-	84.8
D	√	√	√	√	85.4

Table 3. Comparison of detection performance of different models

model	P/%	R/%	mAP/%	parameters/10⁶	size/MB	speed/(frame·s^-1)	BFLOPs
faster R-CNN	63.9	53.7	80.4	99.2	330.6	33	440.3
SSD	71.8	34.7	71.8	91.7	182.2	64	190.7
YOLOv3-tiny	72.1	52.4	58.9	8.6	17.4	205	12.9
YOLOv4	79.3	66.5	74.9	9.1	18.7	101	20.6
YOLOv5s	82.6	71.0	81.1	7.0	14.4	116	15.8
YOLOv5s-p2	85.3	72.8	81.9	7.1	15.5	113	18.6
ours	86.9	74.4	85.3	7.2	15.8	106	19.0
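As a quick arithmetic check on the mAP column of Table 3, the short Python sketch below computes how many percentage points the proposed model gains over each baseline. The values are copied from the table; the dictionary and function names are illustrative only and do not come from the paper.

```python
# mAP values in percent, copied from the mAP column of Table 3.
MAP_PERCENT = {
    "faster R-CNN": 80.4,
    "SSD": 71.8,
    "YOLOv3-tiny": 58.9,
    "YOLOv4": 74.9,
    "YOLOv5s": 81.1,
    "YOLOv5s-p2": 81.9,
    "ours": 85.3,
}


def map_gain(baseline: str, improved: str = "ours") -> float:
    """Return the mAP gain of `improved` over `baseline` in percentage points."""
    # Round to one decimal place, the precision reported in the table.
    return round(MAP_PERCENT[improved] - MAP_PERCENT[baseline], 1)


if __name__ == "__main__":
    for name in MAP_PERCENT:
        if name != "ours":
            print(f"{name}: +{map_gain(name)} percentage points")
```

These differences are absolute percentage points of mAP, not relative improvements.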

Table 4. Comparison of detection indicators for targets of different sizes

model	mAP (small)/%	mAP (medium)/%	mAP (large)/%
YOLOv5s	71.3	95.2	94.4
our method	79.8	96.3	95.2

[1]	LI Q Ch, LI B W, WANG H Ch. Development trends and military applications of uncooled infrared imaging technology[J]. Dual Use Technologies & Products, 2016, 42(21): 54-57(in Chinese). doi: 10.3969/j.issn.1009-8119.2016.21.029
[2]	HOU C P, ZHANG Q W, WANG X Y, et al. Object detection algorithm in complex background based on contour matching[J]. Journal of Harbin Institute of Technology, 2020, 52(5): 121-128(in Chinese).
[3]	BILAL M, HANIF M S. Benchmark revision for HOG-SVM pedestrian detector through reinvigorated training and evaluation methodologies[J]. IEEE Transactions on Intelligent Transportation Systems, 2021, 16(52): 1277-1287.
[4]	GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//Conference on Computer Vision and Pattern Recognition. Columbus, USA: IEEE Press, 2014: 580-587.
[5]	LI Y, PANG Y, CAO J, et al. Improving single shot object detection with feature scale unmixing[J]. IEEE Transactions on Image Processing, 2021, 30: 2708-2721. doi: 10.1109/TIP.2020.3048630
[6]	CHENG G, YUAN X, YAO X W, et al. Towards large-scale small object detection: Survey and benchmarks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 23(76): 34-46.
[7]	REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: Unified, real-time object detection[C]//Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE Press, 2016: 779-788.
[8]	ZHANG M Ch, NIU Ch H, LIU L Sh, et al. Infrared small target detection algorithm for unmanned aerial vehicle detection system[J]. Laser Technology, 2024, 48(1): 114-120(in Chinese). doi: 10.7510/jgjs.issn.1001-3806.2024.01.018
[9]	WANG Y J, WANG Y L, XIA R Q, et al. High precision azimuth extraction of targets in a large field of view infrared warning system[J]. Laser Technology, 2023, 47(2): 200-204(in Chinese). doi: 10.7510/jgjs.issn.1001-3806.2023.02.007
[10]	JIANG P, DAJI E, LIU F, et al. A review of YOLO algorithm developments[J]. Procedia Computer Science, 2022, 199: 1066-1073. doi: 10.1016/j.procs.2022.01.135
[11]	TERVEN R, CORDOVA-ESPARZA D M. A comprehensive review of YOLO: From YOLOv1 to YOLOv8 and beyond[J]. arXiv Computer Science, 2023, 4: 2304.00501.
[12]	BOCHKOVSKIY A, WANG C Y, LIAO H Y M. Yolov4: Optimal speed and accuracy of object detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 75(23): 2004-10934.
[13]	ZHANG Y, GUO Zh Y, WU J Q, et al. Real-time vehicle detection based on improved YOLOv5[J]. Sustainability, 2022, 19: 12274-15427.
[14]	FANGBO Z, ZHAO H L, NIE Z. Safety helmet detection based on YOLOv5[J]. IEEE International Conference on Power Electronics, Computer Applications, 2021, 34(56): 6-11.
[15]	ZHU X K, LYU Sh Ch, WANG X, et al. TPH-YOLOv5: Improved YOLOv5 based on transformer prediction head for object detection on drone-captured scenarios[C]//International Conference on Computer Vision. Québec, Canada: IEEE Press, 2021: 11539.
[16]	HOU Q B, ZHOU D Q, FENG J S. Coordinate attention for efficient mobile network design[C]//Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE Press, 2021: 13713-13722.
[17]	WOO S H, PARK J C, LEE J Y, et al. CBAM: Convolutional block attention module[C]//European Conference on Computer Vision. Munich, Germany: Springer Science Press, 2018: 3-19.
[18]	HU J, LI S, SUN G. Squeeze-and-excitation networks[C]//Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE Press, 2018: 7132-7141.
[19]	TAN M X, PANG R M, LE Q V. Efficientdet: Scalable and efficient object detection[C]//Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE Press, 2020: 10781-10790.
[20]	ZHANG Y F, REN W Q, ZHANG Z, et al. Focal and efficient IOU loss for accurate bounding box regression[J]. Neurocomputing, 2022, 506: 146-157.
[21]	CHEN X, PENG D L, GU Y. Real-time object detection for UAV images based on improved YOLOv5s[J]. Opto-Electronic Engineering, 2022, 49(3): 210372(in Chinese).


0. Introduction

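Object recognition plays a key role in computer vision. It has improved the performance of intelligent surveillance systems, driven innovation in autonomous vehicles, and found practical use in industrial and smart-home applications. Because infrared imaging can penetrate fog, smoke, and low-light conditions, it is valuable for military and security reconnaissance^[1] and for monitoring. Infrared radiation has a longer wavelength than visible light, and under the same imaging conditions infrared images usually have lower spatial resolution. Unlike visible-light images, infrared images are formed by a detector that captures the thermal radiation emitted from object surfaces. This radiation is affected by many factors, such as the external environment and the weather, which makes target detection and recognition more challenging.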

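Object detection algorithms fall into two broad categories: traditional methods and machine-learning methods. Mainstream traditional methods for infrared object detection rely on techniques such as edge detection, template matching, and the Hough transform. Some algorithms detect objects from edges, contours^[2], and texture, for example pedestrian detection with histogram of oriented gradient (HOG)^[3] features. However, these traditional methods require hand-crafted image features, depend on prior knowledge, and have limited representational capacity, all of which limit their accuracy.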

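In recent years, applying deep learning to object detection has produced remarkable results. In 2014, GIRSHICK et al. proposed the region-based convolutional neural network (R-CNN)^[4], the first object detection algorithm based on deep learning. It extracts multiple regions of interest (ROIs) from an image, runs a convolutional neural network on each ROI, and classifies the results with a support vector machine (SVM). To improve speed, GIRSHICK et al. proposed the fast R-CNN algorithm in 2015^[5], which runs the convolutional neural network once over the whole image and then applies ROI pooling to each region of interest. Compared with these two-stage detectors, one-stage methods such as the single shot multibox detector (SSD) and RetinaNet are considerably faster^[6]. In 2016, REDMON et al. proposed the you only look once (YOLO) algorithm^[7], which uses a single convolutional neural network to predict bounding boxes and classes simultaneously, processes the whole image at once, and runs in real time. Although the initial version of YOLO fell short in accuracy and in detecting small objects^[8], subsequent improvements have led to its wide adoption in industry^[9].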

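Compared with region-proposal-based methods, YOLOv1 predicts bounding boxes and classes from global image information^[10], which removes the need for region proposals. YOLOv1 also uses a single network that learns detection and classification jointly through end-to-end training. YOLOv2 introduced batch normalization to reduce covariate shift^[11] and used anchor boxes to handle objects of different sizes; it detects on high-resolution feature maps to minimize localization error. YOLOv3 uses a feature pyramid network (FPN) for multi-scale object detection. The FPN fuses features top-down, from the top layer toward the bottom layers, to build a pyramid of feature maps at different scales. On this pyramid, YOLOv3 applies fully convolutional prediction heads of uniform size to detect objects of different sizes. This design fuses features from different semantic levels and improves the detection of small objects. In addition, YOLOv3 augments the training data with operations such as random rotation and cropping to adapt to different scenes. YOLOv4 uses spatial pyramid pooling and a path aggregation network for multi-scale feature fusion^[12]. It replaces the original activation function with the Mish activation function to strengthen the model's nonlinearity and stability, and it introduces a series of further improvements, including Mosaic data augmentation at the input and DropBlock regularization.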
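The Mish activation mentioned for YOLOv4 is commonly defined as mish(x) = x · tanh(softplus(x)), where softplus(x) = ln(1 + e^x). The following is a minimal pure-Python sketch of that definition for scalar inputs. The function names are illustrative, and the code is not taken from any YOLO implementation.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable softplus, ln(1 + exp(x))."""
    # max(x, 0) + log1p(exp(-|x|)) avoids overflow for large positive x.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


if __name__ == "__main__":
    for value in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(f"mish({value:+.1f}) = {mish(value):+.4f}")
```

Unlike a hard-thresholded activation, Mish is smooth and lets small negative values through, while approaching the identity for large positive inputs.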

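The aim of this work is to achieve high-performance infrared object detection with deep learning when infrared images have low contrast, are blurred, and contain small targets, while keeping the number of model parameters under control so that the model can be deployed on embedded devices. To this end, the model is built on YOLOv5s and trained with transfer learning. A coordinate attention (CA) module and an additional prediction layer are introduced to improve small-object detection, and the feature fusion network and the loss function are modified.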
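The abstract identifies the modified feature fusion network as a bidirectional feature pyramid network (BiFPN). One ingredient of BiFPN as described by TAN et al.^[19] is a learnable weighted fusion of its inputs, often called fast normalized fusion: each input gets a non-negative weight, and the weights are divided by their sum plus a small constant. The sketch below illustrates only that weighting step. It is not the authors' implementation: the flat lists standing in for feature maps, the function name, and the value eps = 1e-4 are assumptions made for illustration.

```python
from typing import List, Sequence


def fast_normalized_fusion(
    features: Sequence[Sequence[float]],
    raw_weights: Sequence[float],
    eps: float = 1e-4,  # assumed small constant that keeps the division stable
) -> List[float]:
    """Fuse same-sized feature vectors with normalized non-negative weights."""
    if len(features) == 0 or len(features) != len(raw_weights):
        raise ValueError("need one weight per input feature")
    length = len(features[0])
    if any(len(feature) != length for feature in features):
        raise ValueError("all input features must have the same size")

    # ReLU keeps every learnable weight non-negative.
    weights = [max(w, 0.0) for w in raw_weights]
    total = eps + sum(weights)
    normalized = [w / total for w in weights]

    # Weighted sum of the inputs, element by element.
    return [
        sum(n * feature[k] for n, feature in zip(normalized, features))
        for k in range(length)
    ]


if __name__ == "__main__":
    shallow = [1.0, 2.0, 3.0]  # stand-in for a resized shallow feature map
    deep = [3.0, 4.0, 5.0]     # stand-in for a resized deep feature map
    print(fast_normalized_fusion([shallow, deep], [1.0, 1.0]))
```

In a real network the inputs would be tensors resized to a common resolution, and the raw weights would be trained together with the rest of the model.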

4. Conclusion

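To address the sparse features and large scale variation of infrared images^[21], this paper proposes an improved YOLOv5s network. First, an extra detection layer for small objects is added to improve the model's ability to detect small infrared targets. Second, a CA module is added to the backbone so that the model attends to important features and suppresses irrelevant channels. Third, BiFPN replaces the feature fusion network of the original YOLOv5s, which reduces redundant computation and improves computational efficiency. Finally, focal-EIOU is adopted as the loss function. Tested on the infrared image dataset, the improved model raises mAP by 4.9, 13.5, 4.2, and 3.4 percentage points over faster R-CNN, SSD, YOLOv5s, and YOLOv5s-p2, respectively. The proposed infrared object detection algorithm has strong practical value and robustness.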
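For readers unfamiliar with the focal-EIOU loss, the sketch below follows the formulation in ZHANG et al.^[20]: the EIOU term adds to 1 − IoU three penalties (center distance, width difference, and height difference, each normalized by the smallest enclosing box), and the focal form multiplies the result by IoU^γ. This is a plain-Python illustration for a single pair of boxes, not the training code used in the paper. The (x1, y1, x2, y2) box format, the function names, γ = 0.5, and the eps guard are assumptions.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def eiou_terms(pred: Box, target: Box, eps: float = 1e-9) -> Tuple[float, float]:
    """Return (IoU, EIOU loss) for one predicted box and one target box."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between the two box centers.
    dx = (px1 + px2) / 2.0 - (tx1 + tx2) / 2.0
    dy = (py1 + py2) / 2.0 - (ty1 + ty2) / 2.0

    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)
    return iou, 1.0 - iou + center_term + width_term + height_term


def focal_eiou_loss(pred: Box, target: Box, gamma: float = 0.5) -> float:
    """Focal-EIOU: the EIOU loss re-weighted by IoU ** gamma."""
    iou, eiou = eiou_terms(pred, target)
    return (iou ** gamma) * eiou


if __name__ == "__main__":
    print(focal_eiou_loss((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)))
```

Because the weight IoU^γ shrinks toward zero for poorly overlapping boxes, low-quality boxes contribute less to the loss, which matches the stated goal of focusing regression on high-quality anchor boxes. In this literal form, non-overlapping boxes receive zero loss, so practical implementations may treat that case differently.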
