In recent years, autonomous vehicles have become a major focus in both commercial and scientific domains and have brought new challenges to the machine vision field. Detecting and recognizing objects in complex real-world road environments is one of the most important problems facing autonomous vehicles and their ability to make decisions on the road. This requires the vehicle to process 3D visual information quickly and to recognize objects even in bad weather or under occlusion.
With the development of deep learning technology and significant improvements in computing performance, software and hardware support now exists for advanced 3D object detection and recognition tasks. In addition, LiDAR scanners can acquire high-quality data under different lighting conditions and provide long-range, high-precision spatial information. Designing a detector that can process data collected by a LiDAR scanner will bring new developments to the field of autonomous driving.
In this thesis, a 3D object detector is proposed, with focal loss and Euler angle regression to optimize the model's performance, using bird's-eye view (BEV) maps generated from LiDAR point clouds together with RGB images as input data. During point cloud preprocessing, height information is extracted by height thresholding and rescaled to enhance detail within the range of interest, while intensity information is encoded on an expanded BEV map that preserves density information without requiring an extra channel. Results show that the proposed 3D object detector runs at over 46 FPS on a TESLA V100-PCIE GPU with an average precision of over 90%.
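For illustration, the sketch below shows one common way a BEV encoding of this kind can be built in Python with NumPy. The grid ranges, resolution, and density normalization constant are assumed values chosen for the example, not the settings used in this thesis, and the sketch places height, intensity, and density in separate channels rather than on the expanded map described above.

```python
import numpy as np

def pointcloud_to_bev(points, x_range=(0.0, 70.0), y_range=(-40.0, 40.0),
                      z_range=(-2.0, 1.25), resolution=0.1):
    """Encode a LiDAR point cloud (N x 4: x, y, z, intensity) as a BEV map.

    Channels: rescaled height, intensity, and log-normalized point density.
    All ranges and constants here are illustrative assumptions.
    """
    # Keep only points inside the region and height range of interest
    mask = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
            (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]) &
            (points[:, 2] >= z_range[0]) & (points[:, 2] < z_range[1]))
    pts = points[mask]

    # Discretize x/y coordinates into grid cells
    h = int((x_range[1] - x_range[0]) / resolution)
    w = int((y_range[1] - y_range[0]) / resolution)
    xi = ((pts[:, 0] - x_range[0]) / resolution).astype(np.int32)
    yi = ((pts[:, 1] - y_range[0]) / resolution).astype(np.int32)

    height_map = np.zeros((h, w), dtype=np.float32)
    intensity_map = np.zeros((h, w), dtype=np.float32)
    density_map = np.zeros((h, w), dtype=np.float32)

    # Rescale height to [0, 1] over the thresholded range to spread detail
    z_norm = (pts[:, 2] - z_range[0]) / (z_range[1] - z_range[0])

    # Per-cell maxima for height/intensity, per-cell counts for density
    np.maximum.at(height_map, (xi, yi), z_norm)
    np.maximum.at(intensity_map, (xi, yi), pts[:, 3])
    np.add.at(density_map, (xi, yi), 1.0)
    # Log-normalize density so crowded cells do not saturate the channel
    density_map = np.minimum(1.0, np.log1p(density_map) / np.log(64.0))

    return np.stack([height_map, intensity_map, density_map], axis=0)
```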
In addition to the full-scale detector, a 3D mini detector is also proposed. The mini detector can process the same input data as the full-scale detector three times faster, with only a slight drop in precision.