Frontiers in Signal Processing

Road Object Detection of YOLO Algorithm with Attention Mechanism

PP. 9-16 Pub. Date: January 31, 2021

DOI: 10.22606/fsp.2021.51002

Author(s)

Jiacheng Li
College of Electrical Engineering, Southwest Minzu University, Chengdu, China
Huazhang Wang^*
College of Electrical Engineering, Southwest Minzu University, Chengdu, China
Yuan Xu
College of Electrical Engineering, Southwest Minzu University, Chengdu, China
Fan Liu
College of Electrical Engineering, Southwest Minzu University, Chengdu, China

Abstract

In self-driving cars, incorrect object detection can lead to serious accidents, so high-precision object detection is key to autonomous driving. This paper improves the YOLOv3 object detection algorithm by introducing a channel attention mechanism and a spatial attention mechanism into the feature extraction network. These mechanisms autonomously learn the weight of each channel, enhancing key features and suppressing redundant ones. Experimental results show that the detection performance of the improved network is significantly better than that of the original YOLOv3 algorithm.
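
The attention idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, and the learned layers that a real attention module would contain (a small shared MLP for channel attention, a convolution for spatial attention) are replaced here by fixed sums, purely to show how per-channel and per-location weights rescale a feature map.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap):
    """Return one weight in (0, 1) per channel.

    fmap is a list of C channels, each an H x W list of lists.
    Assumption: the learned MLP is replaced by a plain sum of the
    average-pooled and max-pooled descriptors.
    """
    weights = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        avg_pooled = sum(values) / len(values)
        max_pooled = max(values)
        weights.append(sigmoid(avg_pooled + max_pooled))
    return weights


def spatial_attention(fmap):
    """Return an H x W map of weights in (0, 1).

    Assumption: the learned convolution is replaced by a plain sum of
    the channel-wise average and channel-wise maximum at each location.
    """
    num_channels = len(fmap)
    height, width = len(fmap[0]), len(fmap[0][0])
    weights = []
    for i in range(height):
        row = []
        for j in range(width):
            column = [fmap[c][i][j] for c in range(num_channels)]
            row.append(sigmoid(sum(column) / num_channels + max(column)))
        weights.append(row)
    return weights


def apply_attention(fmap):
    """Rescale the feature map by channel weights, then spatial weights."""
    cw = channel_attention(fmap)
    scaled = [[[v * w for v in row] for row in ch] for ch, w in zip(fmap, cw)]
    sw = spatial_attention(scaled)
    return [
        [[v * sw[i][j] for j, v in enumerate(row)] for i, row in enumerate(ch)]
        for ch in scaled
    ]


# Toy input: one strongly activated channel and one silent channel.
feature_map = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
channel_weights = channel_attention(feature_map)
refined = apply_attention(feature_map)
```

In this toy input the strongly activated channel receives a larger weight than the silent one, which is the sense in which attention enhances key features and suppresses redundant ones. In a trained network the weights come from learned parameters rather than the fixed sums used here.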

Keywords

autonomous driving, object detection, YOLOv3, channel attention mechanism, spatial attention mechanism
