Continuous pedestrian detection by means of regional convolutional neural network based on historical information

LU Baohong; SONG Xuehua

doi:10.7510/jgjs.issn.1001-3806.2019.05.014

Department of Electronics and Communication Engineering, School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013, China

About the author: LU Baohong (1991-), female, master's student, engaged in research on image processing and its applications.

Corresponding author: SONG Xuehua, songxh@ujs.edu.cn

Funding: National Key Research and Development Program of China, 2017YFC1600800

CLC number: TN911.73

Abstract: Convolutional neural networks detect pedestrians too slowly to meet real-time requirements in continuous pedestrian detection. To address this, a pedestrian detection algorithm based on a history-based region convolutional neural network was adopted, in which the detection result of the previous image is used to optimize detection on the current image. The detection result of the previous frame serves as reference information for extracting region proposals from the current frame, and the gray-value difference map between the current and previous frames is used to filter the convolutional features of the current image, reducing the search area of sliding-window detection. Detection experiments were carried out on the Caltech pedestrian detection data set. The results show that, compared with an advanced algorithm, the algorithm combining historical information is 2.5 times faster and improves detection accuracy by 1.5%. The algorithm achieves real-time pedestrian detection, and the designed network can effectively detect small pedestrians.

Key words: image processing; continuous pedestrian detection; historical information; regions with convolutional neural network; region proposal

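Table: recall by recommendation algorithm
recommendation algorithm: selective search | edge-boxes | RPN | HRPN
recall 50%: 300 | 100
recall 75%: 1400 | 800 | 250

Table: detection time and MAP by method
method: R-CNN | fast R-CNN | faster R-CNN | adapted faster R-CNN | faster R-CNN with HRPN | HR-CNN
time: 47 s | 198 ms | 99 ms | 64 ms | 39 ms
MAP/%: 53.3 | 65.7 | 73.2 | 88.7 | 90.2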
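As a quick consistency check, the time and MAP values listed above can be compared with the ratios quoted in the text (a 2.5-fold speed-up, a 1.5% accuracy gain, and a more than 200-fold speed-up from R-CNN to faster R-CNN). Which method each value belongs to is an assumption here, because the listing has fewer values than methods; the variable names are hypothetical.

```python
# Values copied from the time and MAP rows above. The assignment of each
# value to a method is an assumption made only for this check.
rcnn_ms = 47_000        # 47 s, assumed to be R-CNN
faster_rcnn_ms = 198    # assumed to be faster R-CNN
baseline_ms = 99        # assumed to be the baseline HR-CNN is compared with
hrcnn_ms = 39           # assumed to be HR-CNN

speedup_rcnn_to_faster = rcnn_ms / faster_rcnn_ms  # ratio of per-image times
speedup_hrcnn = baseline_ms / hrcnn_ms             # ratio of per-image times
map_gain = 90.2 - 88.7                             # difference in MAP points

print(round(speedup_rcnn_to_faster), round(speedup_hrcnn, 1), round(map_gain, 1))
```

Under these assumptions the three figures agree with the "more than 200 times", "2.5 times" and "1.5%" statements in the text.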

Received: 2018-11-05

Accepted: 2019-02-25

Published online: 2019-09-25

Introduction

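Pedestrian detection has important practical value and has therefore attracted many researchers in industry and academia, becoming a research hotspot in object detection. KRIZHEVSKY et al.^[1] won the ImageNet image classification competition with a convolutional neural network; since then, object detection algorithms based on convolutional neural networks have become the mainstream approach^[2] and have long led the rankings on object detection benchmarks such as the COCO detection dataset^[3], the PASCAL VOC detection dataset^[4], and ILSVRC^[5]. Hand-crafted features nevertheless still play an important role in pedestrian detection^[6-7]. Using hand-crafted features alone, or combining them with features learned by deep networks, remains an important research direction. For example, reference [8] performs sliding-window pedestrian detection on LUV color-space and histogram of oriented gradients (HOG) channels; reference [9] uses a pedestrian detector based on rectangular channel features to generate region proposals and classifies them with a region-based convolutional neural network^[10] (regions with convolutional neural network, R-CNN).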

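faster R-CNN^[11] is one of the more successful current object detection methods, and references [12] to [14] study its application to pedestrian detection. The experiments of ZHANG et al.^[12] show that the region proposal network (RPN) in faster R-CNN generates proposals on a par with mainstream pedestrian detectors, but that the low resolution of the classifier's convolutional feature maps reduces detection accuracy. ZHANG et al.^[13] trained faster R-CNN with an additional training set, CityPersons, bringing its results on the Caltech pedestrian detection dataset to a level comparable with the state of the art, with more accurate target localization as well.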

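These studies show that region-based convolutional neural networks, which succeeded in general object detection, are equally applicable to pedestrian detection. From R-CNN through fast R-CNN^[15] to faster R-CNN, more and more computation is shared during detection, and detection speed has improved more than 200-fold, yet it still falls short of the real-time requirements of practical applications. Unlike general object detection, pedestrian detection in autonomous driving and video surveillance is a continuous process: if detection on the current frame makes full use of the detection results of the previous frame, the amount of computation can be reduced and detection speed increased. Following this idea, the authors propose a pedestrian detection algorithm, the history-based region with convolutional neural network (HR-CNN). HR-CNN optimizes the RPN of faster R-CNN by introducing a history-based region proposal network (HRPN). In addition, the authors replace the VGG network (named after the Visual Geometry Group laboratory) in faster R-CNN with Inception V4^[16], which effectively improves the detection of small pedestrians in the image.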
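The filtering step mentioned in the abstract, in which a gray-value difference map between consecutive frames restricts where the sliding window has to search, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the feature-map stride of 16, and the threshold of 15 gray levels are all assumptions chosen for illustration.

```python
import numpy as np

def motion_mask(prev_gray, curr_gray, stride=16, threshold=15):
    """Return a boolean grid with one cell per feature-map position.

    A cell is True when some pixel in its stride x stride image patch
    changed by more than `threshold` gray levels between the two frames.
    `stride` and `threshold` are hypothetical values.
    """
    # Cast to a signed type first so uint8 subtraction cannot wrap around.
    diff = np.abs(curr_gray.astype(np.int16) - prev_gray.astype(np.int16))
    h, w = diff.shape
    gh, gw = h // stride, w // stride
    # Crop to a multiple of the stride, then take the max per block.
    blocks = diff[:gh * stride, :gw * stride].reshape(gh, stride, gw, stride)
    return blocks.max(axis=(1, 3)) > threshold

def filter_features(features, mask):
    """Zero the feature vectors at positions the mask marks as unchanged.

    features: array of shape (grid_h, grid_w, channels).
    """
    return features * mask[..., None]

# Usage: a synthetic 64 x 96 frame pair with one changed patch.
prev = np.zeros((64, 96), dtype=np.uint8)
curr = prev.copy()
curr[20:30, 40:50] = 200            # simulated moving pedestrian
mask = motion_mask(prev, curr)      # grid of shape (4, 6)
feats = np.ones((4, 6, 8), dtype=np.float32)
kept = filter_features(feats, mask)
```

Only the grid cells that overlap the changed patch stay active, so a sliding-window detector run on `kept` has far fewer positions to examine. Note that a detector filtered this way would miss pedestrians who stand still, which is presumably why the previous frame's detections are used as an additional source of proposals.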

3. Conclusion

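Because pedestrian detection must run continuously in most application scenarios, the HR-CNN pedestrian detection network was designed on the basis of faster R-CNN, and the HRPN region proposal network was proposed. HRPN uses the detection results of the previous frame as reference information for generating the current proposals. This effectively improves proposal accuracy and achieves a high recall with relatively few proposals, which in turn lowers the computational cost of the classifier and speeds up pedestrian detection across the whole network. Experiments on the Caltech dataset show that HR-CNN substantially improves detection speed, while its detection accuracy is also slightly better than that of adapted faster R-CNN and other schemes.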
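The idea of using the previous frame's detections as reference information for the current proposals can be sketched as follows. The text does not specify how HRPN does this, so the code is only one plausible reading: each previous box is enlarged and shifted slightly to cover possible motion. The function name and the `margin` and `shifts` parameters are hypothetical.

```python
import numpy as np

def proposals_from_history(prev_boxes, image_size, margin=0.2,
                           shifts=(-0.1, 0.0, 0.1)):
    """Generate candidate boxes around the previous frame's detections.

    prev_boxes: rows of [x1, y1, x2, y2] detected in the previous frame.
    image_size: (width, height), used to clip candidates to the image.
    margin, shifts: hypothetical tuning values, as fractions of box size.
    """
    prev_boxes = np.asarray(prev_boxes, dtype=np.float32).reshape(-1, 4)
    width, height = image_size
    candidates = []
    for x1, y1, x2, y2 in prev_boxes:
        bw, bh = x2 - x1, y2 - y1
        for dx in shifts:
            for dy in shifts:
                # Enlarge by the margin, then shift to allow for motion.
                candidates.append([
                    x1 - margin * bw + dx * bw,
                    y1 - margin * bh + dy * bh,
                    x2 + margin * bw + dx * bw,
                    y2 + margin * bh + dy * bh,
                ])
    boxes = np.array(candidates, dtype=np.float32).reshape(-1, 4)
    # Clip every candidate to the image bounds.
    boxes[:, [0, 2]] = boxes[:, [0, 2]].clip(0, width)
    boxes[:, [1, 3]] = boxes[:, [1, 3]].clip(0, height)
    return boxes

# Usage: one pedestrian detected in the previous 640 x 480 frame.
proposals = proposals_from_history([[100, 50, 140, 150]], (640, 480))
```

Each previous detection yields nine candidates here, far fewer than a dense proposal stage would produce, which is consistent with the stated goal of reaching a high recall with relatively few proposals.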

References (21)

[1]	KRIZHEVSKY A, SUTSKEVER I, HINTON G E. Imagenet classification with deep convolutional neural networks[C]//Advances in neural information processing systems. Lake Tahoe, Nevada, USA: Curran Associates Inc, 2012: 1097-1105.
[2]	FENG W, WANG Y D, ZHANG L. Weighted joint dimensionality extraction and classification recognition algorithm [J]. Laser Technology, 2018, 42(5): 666-672(in Chinese).
[3]	LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft coco: Common objects in context[C]//European Conference on Computer Vision. Zurich, Switzerland: Springer, 2014: 740-755.
[4]	EVERINGHAM M, Van GOOL L, WILLIAMS C K I, et al. The pascal visual object classes (voc) challenge[J]. International Journal of Computer Vision, 2010, 88(2): 303-338. doi: 10.1007/s11263-009-0275-4
[5]	RUSSAKOVSKY O, DENG J, SU H, et al. Imagenet large scale visual recognition challenge[J]. International Journal of Computer Vision, 2015, 115(3): 211-252. doi: 10.1007/s11263-015-0816-y
[6]	BENENSON R, OMRAN M, HOSANG J, et al. Ten years of pedestrian detection, what have we learned[C]//European Conference on Computer Vision. Zurich, Switzerland: Springer, 2014: 613-627.
[7]	ZHANG S, BENENSON R, OMRAN M, et al. How far are we from solving pedestrian detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. New York, USA: IEEE, 2016: 1259-1267.
[8]	COSTEA A D, NEDEVSCHI S. Semantic channels for fast pedestrian detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. New York, USA: IEEE, 2016: 2360-2368.
[9]	HOSANG J, OMRAN M, BENENSON R, et al. Taking a deeper look at pedestrians[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. New York, USA: IEEE, 2015: 4073-4082.
[10]	GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. New York, USA: IEEE, 2014: 580-587.
[11]	REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2017, 39(6): 1137-1149.
[12]	ZHANG L, LIN L, LIANG X, et al. Is faster R-CNN doing well for pedestrian detection[C]//European Conference on Computer Vision. Amsterdam, Netherlands: Springer, 2016: 443-457.
[13]	ZHANG Sh Sh, BENENSON R, SCHIELE B. Citypersons: A diverse dataset for pedestrian detection[C]//The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). New York, USA: IEEE, 2017: 3.
[14]	LI J, LIANG X, SHEN S M, et al. Scale-aware fast R-CNN for pedestrian detection[J]. IEEE Transactions on Multimedia, 2018, 20(4): 985-996.
[15]	GIRSHICK R. Fast R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision. New York, USA: IEEE, 2015: 1440-1448.
[16]	SZEGEDY C, IOFFE S, VANHOUCKE V, et al. Inception-V4, inception-resnet and the impact of residual connections on learning[J]. Computer Vision and Pattern Recognition, 2016, 23(2): 07261.
[17]	DOLLÁR P, WOJEK C, SCHIELE B, et al. Pedestrian detection: A benchmark[C]//IEEE Conference on Computer Vision and Pattern Recognition, 2009. New York, USA: IEEE, 2009: 304-311.
[18]	GUADARRAMA S. Tensorflow-slim image classification model library[OL].(2018-07-05)[2018-12-31].https://github.com/tensorflow/models/tree/master/research/slim.
[19]	UIJLINGS J R R, van de SANDE K E A, GEVERS T, et al. Selective search for object recognition[J]. International Journal of Computer Vision, 2013, 104(2): 154-171. doi: 10.1007/s11263-013-0620-5
[20]	ZITNICK C L, DOLLÁR P. Edge boxes: Locating object proposals from edges[C]//European Conference on Computer Vision. Zurich, Switzerland: Springer, 2014: 391-405.
[21]	KONG T, YAO A, CHEN Y, et al. Hypernet: Towards accurate region proposal generation and joint object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. New York, USA: IEEE, 2016: 845-853.
