Vehicle Detection based on Spatial Saliency and Local Image Features in H.265 (HEVC) 4K Video and Evaluation Model for Quality of Detection Accuracy

AKTAR MOST. SHELINA, University of Toyama

2020.03.24

Abstract

Intelligent transportation systems (ITS) are widely studied as a way to realize safe and secure road transportation. Vehicle detection is a key technology for optimizing and managing traffic, and detecting objects using information obtained from images, still images, and sensors has been researched extensively. One of the main challenges addressed in this study is vehicle detection. Most existing visual saliency models assume that the input images in which salient objects are to be detected are free from complex backgrounds and overlapping areas. They are also very sensitive to complex scenes and changes in illumination, and they cannot detect their objects of interest in the input video. This study develops a vehicle detection method that uses spatial saliency and local image features. Scale-Invariant Feature Transform (SIFT) and Harris features, combined with a spatial saliency model, play an important role in detecting vehicles in the scene. A one-to-one symmetric search is performed on the descriptors to select a set of matched interest point pairs for vehicle detection; this search is useful for detecting the object of interest in the context of saliency detection. We use 4K video of a road scene containing different types of vehicles. The proposed method can detect the desired overlapping objects in the road scene without the heavy computation required by training-based methods. Second, the detection performance is compared with that of other saliency-based methods, and our method performs better than these conventional methods.
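The one-to-one symmetric search mentioned above can be illustrated with a minimal sketch. It assumes the search means mutual nearest-neighbour matching under Euclidean distance, which the abstract does not state explicitly; the function names `nearest_index` and `symmetric_matches` and the toy 2-D descriptors are hypothetical and are not taken from the thesis.

```python
import math


def nearest_index(query, candidates):
    """Index of the candidate descriptor closest to `query` (Euclidean distance)."""
    return min(range(len(candidates)), key=lambda j: math.dist(query, candidates[j]))


def symmetric_matches(desc_a, desc_b):
    """Keep (i, j) only if j is the nearest neighbour of i AND i is the nearest neighbour of j.

    Assumption: this mutual check is what "one-to-one symmetric search" refers to.
    """
    if not desc_a or not desc_b:
        return []
    a_to_b = [nearest_index(d, desc_b) for d in desc_a]
    b_to_a = [nearest_index(d, desc_a) for d in desc_b]
    return [(i, j) for i, j in enumerate(a_to_b) if b_to_a[j] == i]


# Toy 2-D "descriptors" for illustration only (real SIFT descriptors are 128-dimensional).
desc_a = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (1.2, 1.0)]
desc_b = [(0.9, 1.1), (0.1, -0.1), (6.0, 6.0)]

matches = symmetric_matches(desc_a, desc_b)
print(matches)
```

In this toy data the fourth descriptor in `desc_a` is closest to the first descriptor in `desc_b`, but that descriptor prefers a different partner, so the pair is discarded. This is the property that makes the resulting matches one-to-one.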

Images and videos used by Internet applications are typically stored in a compressed domain such as MPEG-2, H.264, or MPEG-4, since compression reduces storage space and greatly increases delivery speed for Internet users. Most systems must transmit data to a central server and have to deal with issues such as limited bandwidth and quality. Consequently, they need to transmit video of reasonably high quality in the compressed domain for further processing by vision-based systems, such as person identification, fraud detection, and vehicle detection for road monitoring. Furthermore, existing saliency detection models are implemented in the uncompressed domain, and analysis of their performance is lacking. Therefore, detecting objects of interest with conventional saliency-based methods and determining a reasonably high video quality in the compressed domain remain challenging research issues. In this context, we analyze the proposed detection method in the compressed domain and show that it gives better results than conventional methods and single-feature-based detection.

To evaluate a vehicle detection method, it is necessary to know the correct vehicle position, regarded as the “ground truth”. For this reason, many detection models define the area of the target object as the area that people consider to belong to the object. In many studies, the ground truth is represented by a rectangle. We examine the relationship between Intersection over Union (IoU) and subjective evaluation of vehicle detection when the detected position is shifted from the ground-truth position. Subjective evaluation experiments were carried out on misalignment from the ground truth in vehicle detection. We also investigate a subjective evaluation model for far and near views in vehicle detection. The experimental results show a significant difference between left and right misalignment even when the IoU value is the same. Finally, we propose indices for vehicle detection that use IoU and take the subjective evaluation model into account.
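As a minimal sketch of why IoU alone cannot separate left and right misalignment, the example below computes IoU for two detections shifted by the same amount in opposite directions. The box coordinates, the shift size, and the `(x_min, y_min, x_max, y_max)` box format are illustrative assumptions, not values from the thesis.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical ground-truth rectangle and two detections shifted by the same amount.
ground_truth = (100, 100, 200, 160)
shift = 20
shifted_left = (100 - shift, 100, 200 - shift, 160)
shifted_right = (100 + shift, 100, 200 + shift, 160)

iou_left = iou(ground_truth, shifted_left)
iou_right = iou(ground_truth, shifted_right)
print(iou_left, iou_right)
```

Both shifted boxes yield the same IoU, so any difference observers report between them has to be captured by something beyond IoU, which is the motivation for the indices proposed in the abstract.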
