Networks Utilized
The project uses the following networks:
- Object Detection (cars, traffic signs, traffic lights, speed limit signs): A TensorFlow-based object detection model from xiaogangLi/tensorflow-MobilenetV1-SSD detects objects in the scene. The midpoint of each bounding box is used as the position of the detected car, traffic sign, or traffic light. An instance or panoptic segmentation network is planned for future iterations to improve depth estimation.
- Depth Estimation: A transformer-based MiDaS model from jankais3r/Video-Depthify estimates the depth of the objects in the scene. Each object's depth is the average depth within its bounding box from object detection. Instance segmentation is planned for future iterations to improve these results.
- Lane Detection: YOLOPv2 from CAIC-AD/YOLOPv2 detects lanes in the scene.
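
The way the detection and depth outputs are combined (bounding-box midpoint for position, mean depth inside the box for distance) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `box_midpoint` and `mean_depth_in_box`, the `(x_min, y_min, x_max, y_max)` pixel box layout, and the use of plain nested lists for the depth map are all assumptions made for the example. In practice the depth map would most likely be a NumPy array or tensor produced by the depth model.

```python
from statistics import fmean
from typing import Sequence, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels, max edges exclusive.
Box = Tuple[int, int, int, int]


def box_midpoint(box: Box) -> Tuple[float, float]:
    """Return the (x, y) centre of a bounding box."""
    x_min, y_min, x_max, y_max = box
    return (x_min + x_max) / 2.0, (y_min + y_max) / 2.0


def mean_depth_in_box(depth_map: Sequence[Sequence[float]], box: Box) -> float:
    """Average the depth values inside the box, clipped to the map bounds."""
    height = len(depth_map)
    width = len(depth_map[0]) if height else 0
    x_min, y_min, x_max, y_max = box

    # Clip the box to the depth map so partially visible objects still work.
    x0, x1 = max(0, x_min), min(width, x_max)
    y0, y1 = max(0, y_min), min(height, y_max)
    if x0 >= x1 or y0 >= y1:
        raise ValueError("box does not overlap the depth map")

    # Mean over every depth value that falls inside the clipped box.
    return fmean(v for row in depth_map[y0:y1] for v in row[x0:x1])


if __name__ == "__main__":
    # Toy 4x4 depth map standing in for the depth network's output.
    toy_depth = [
        [1.0, 1.0, 1.0, 1.0],
        [1.0, 2.0, 4.0, 1.0],
        [1.0, 6.0, 8.0, 1.0],
        [1.0, 1.0, 1.0, 1.0],
    ]
    detection = (1, 1, 3, 3)  # hypothetical detector output
    print(box_midpoint(detection))
    print(mean_depth_in_box(toy_depth, detection))
```

Because a rectangular box also covers background pixels around the object, the mean can be pulled toward the background depth. That is the motivation for the planned move to instance segmentation, where only pixels belonging to the object's mask would be averaged.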