The goal of this project is to implement a robust pipeline capable of detecting moving vehicles in real time. Although the project was designed around classic Computer Vision techniques, namely HOG features and an SVM classifier, I decided, in agreement with the course organizers and like a few other students, to go for a deep learning approach.
Several important papers on object detection using deep convolutional networks have been published in the last few years. More specifically, Faster R-CNN, YOLO and Single Shot MultiBox Detector are the present state of the art in using CNNs for real-time object detection.
Even though there are a few differences between these three approaches, they share the same general pipeline. Namely, the detection network is designed based on the following rules:
- Use a deep convolutional network trained on ImageNet as a multi-scale source of features, typically VGG, ResNet or Inception;
- Provide a collection of pre-defined anchor boxes tiling the