This study proposes a method for achieving stable object detection and classification performance in the recognition system of autonomous vehicles through deep neural networks (DNNs) and sensor fusion. The recognition system in autonomous driving consists of environment-perception sensors: RADAR, camera, LiDAR, and ultrasonic sensors. Sensor fusion can overcome the limitations of the individual sensors while reducing uncertainty.
Fusion between sensors of the same type is generally used to secure data by expanding the measurement area, and it is straightforward because the data share the same characteristics. Fusion between different types of sensors, in contrast, must combine data with different characteristics and therefore requires a connection point between them. For a camera and LiDAR, the three-dimensional LiDAR data are fused with the two-dimensional camera data. Because the sensor data have different dimensions, the fusion results contain ambiguity and errors. Research is therefore required to solve this dimension-reduction problem in fusion between different types of sensors.
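To make the dimension-reduction problem concrete, the following minimal sketch shows the standard pinhole projection of LiDAR points onto the image plane; the extrinsic matrix T_cam_lidar and intrinsic matrix K are hypothetical placeholders, not values from this study. The perspective division at the end discards the depth dimension, which is the source of the ambiguity discussed above.

import numpy as np

def project_lidar_to_image(points_lidar, T_cam_lidar, K):
    """Project Nx3 LiDAR points into pixel coordinates (u, v)."""
    n = points_lidar.shape[0]
    # Homogeneous coordinates for the rigid-body transform.
    pts_h = np.hstack([points_lidar, np.ones((n, 1))])
    # Extrinsic: LiDAR frame -> camera frame (4x4 matrix).
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]
    # Keep only points in front of the camera.
    pts_cam = pts_cam[pts_cam[:, 2] > 0]
    # Intrinsic: camera frame -> pixels (3x3 matrix), then perspective
    # division, which collapses 3-D points onto the 2-D image plane.
    uv_h = (K @ pts_cam.T).T
    uv = uv_h[:, :2] / uv_h[:, 2:3]
    return uv, pts_cam[:, 2]  # pixel coordinates and the depths lost by projection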
To solve the above-mentioned problem, this study constructed a system that applies an independent late-fusion method to the detection results of each sensor. A method of object fusion through dimension matching, using the semantic segmentation information of a camera and LiDAR, was developed. The camera system consists of two networks. A depth-estimation network, based on the stereo method, was used to generate Pseudo-LiDAR: the estimated depth was combined with the extrinsic parameters, which can be determined from the geometric relationship between the LiDAR and the camera. Object classification with a camera can be performed by object detection or by semantic segmentation; in this study, objects were classified using DeepLabV3+, a semantic segmentation network. Three-dimensional object detection and classification information is thus provided through the two networks. The LiDAR-based system uses a single network for both object detection and classification: a DeepLabV3+ network from a previous study was modified to suit LiDAR data, and the system provides object detection and classification results for the LiDAR data. From the semantic segmentation results of the LiDAR and the depth estimated from the camera images, instance segmentation was inferred to access individual object instances, and a method to demarcate object boundaries is proposed. The sensor fusion stage uses a late-fusion method that fuses the independent results of each system: detections are matched by comparing the areas of the detected objects in a Bird's Eye View representation.
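The Pseudo-LiDAR step can be sketched as a back-projection of the estimated depth map into 3-D, followed by a transform into the LiDAR frame. The intrinsic matrix K and extrinsic matrix T_lidar_cam below are hypothetical placeholders, assuming a pinhole camera model; the study's actual calibration parameters are not reproduced here.

import numpy as np

def depth_to_pseudo_lidar(depth, K, T_lidar_cam):
    """Convert an HxW estimated depth map into an Nx3 Pseudo-LiDAR cloud."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.ravel()
    # Invert the pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    x = (u.ravel() - K[0, 2]) * z / K[0, 0]
    y = (v.ravel() - K[1, 2]) * z / K[1, 1]
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=1)
    # Extrinsic transform: camera frame -> LiDAR frame (4x4 matrix).
    return (T_lidar_cam @ pts_cam.T).T[:, :3]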
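The late-fusion stage can likewise be illustrated with a minimal sketch, assuming axis-aligned Bird's Eye View boxes (x_min, y_min, x_max, y_max) and a hypothetical IoU threshold of 0.5; the study's actual matching criterion for comparing object areas may differ.

def bev_iou(a, b):
    """Intersection-over-union of two axis-aligned BEV boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def fuse_detections(cam_boxes, lidar_boxes, iou_thresh=0.5):
    """Greedily pair camera and LiDAR detections whose BEV areas overlap."""
    fused, used = [], set()
    for cb in cam_boxes:
        best_j, best_iou = None, iou_thresh
        for j, lb in enumerate(lidar_boxes):
            iou = bev_iou(cb, lb)
            if j not in used and iou >= best_iou:
                best_j, best_iou = j, iou
        if best_j is not None:
            used.add(best_j)
            fused.append((cb, lidar_boxes[best_j]))  # matched camera-LiDAR pair
    return fused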
In summary, this paper proposes a method of integrating data collected from sensors of different dimensions, and the system was verified using KITTI, an open dataset.