Object Detection
Introduction
- Definition: Object detection is a computer vision technique that allows us to identify and locate objects in an image or video.
- Applications: Crowd counting, Self-driving cars, Video surveillance, Face detection, Anomaly detection
- Scope: Detect objects in images and videos, 2-dimensional bounding boxes, Real-time
- Tools: Detectron2, TF Object Detection API, OpenCV, TFHub, TorchVision
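The 2-dimensional bounding boxes mentioned in the scope are usually stored either as two corners or as a top-left corner plus width and height. Below is a minimal sketch of one detection record and the conversion between the two layouts, assuming pixel coordinates. The `Detection` class and its field names are hypothetical and are not taken from any of the tools listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One detected object: class label, confidence, and a corner-format box in pixels."""
    label: str
    score: float
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def to_xywh(self):
        """Return the box as (x, y, width, height), where (x, y) is the top-left corner."""
        return (self.x_min, self.y_min, self.x_max - self.x_min, self.y_max - self.y_min)

    @classmethod
    def from_xywh(cls, label, score, x, y, w, h):
        """Build a detection from a top-left corner plus width and height."""
        return cls(label, score, x, y, x + w, y + h)


det = Detection("car", 0.92, 10, 20, 110, 70)
print(det.to_xywh())  # (10, 20, 100, 50)
```

Different tools expect different layouts, so it helps to convert at the boundary and keep one layout internally.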
Models
Faster R-CNN
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. arXiv, 2016.
SSD (Single Shot Detector)
SSD: Single Shot MultiBox Detector. ECCV, 2016.
YOLO (You Only Look Once)
YOLOv3: An Incremental Improvement. arXiv, 2018.
EfficientDet
EfficientDet: Scalable and Efficient Object Detection. CVPR, 2020.
The largest model in the paper reaches 55.1 AP on COCO test-dev with 77M parameters.
Process flow
Step 1: Collect Images
Capture images with a camera, scrape them from the internet, or use public datasets
Step 2: Create Labels
This step is required only if the object category is not available in any pre-trained model or labels are not freely available on the web. Create the labels (bounding boxes) using an open-source tool such as Labelme or a professional annotation tool.
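Labelme saves one JSON file per image, with the drawn shapes under a `shapes` key. The sketch below shows how rectangle annotations might be turned into bounding-box tuples. It assumes the usual Labelme fields (`label`, `points`, `shape_type`). The function name and the sample annotation are illustrative only.

```python
import json


def labelme_to_boxes(annotation):
    """Convert a parsed Labelme annotation into (label, x_min, y_min, x_max, y_max) tuples."""
    boxes = []
    for shape in annotation.get("shapes", []):
        if shape.get("shape_type") != "rectangle":
            continue  # polygons, points, etc. are skipped in this sketch
        (x1, y1), (x2, y2) = shape["points"]
        # The two corners can be stored in either order, so normalize them.
        boxes.append((shape["label"], min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2)))
    return boxes


# Hand-written sample in the shape of a Labelme file (values are illustrative).
sample = json.loads("""
{
  "shapes": [
    {"label": "logo", "points": [[120.0, 40.0], [30.0, 10.0]], "shape_type": "rectangle"},
    {"label": "outline", "points": [[0, 0], [5, 0], [5, 5]], "shape_type": "polygon"}
  ],
  "imagePath": "example.jpg",
  "imageHeight": 480,
  "imageWidth": 640
}
""")
print(labelme_to_boxes(sample))  # [('logo', 30.0, 10.0, 120.0, 40.0)]
```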
Step 3: Data Acquisition
Set up the database connection and fetch the data into the Python environment
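As a rough illustration of this step, the sketch below fetches annotation rows into Python dictionaries. It uses an in-memory SQLite database as a stand-in for the project database. The `annotations` table and its columns are assumptions made for the example, not a schema from any real project.

```python
import sqlite3


def fetch_annotations(conn):
    """Fetch all annotation rows as a list of dicts, ordered by image path."""
    conn.row_factory = sqlite3.Row
    rows = conn.execute(
        "SELECT image_path, label, x_min, y_min, x_max, y_max "
        "FROM annotations ORDER BY image_path"
    ).fetchall()
    return [dict(row) for row in rows]


# In-memory database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE annotations (image_path TEXT, label TEXT, "
    "x_min REAL, y_min REAL, x_max REAL, y_max REAL)"
)
conn.executemany(
    "INSERT INTO annotations VALUES (?, ?, ?, ?, ?, ?)",
    [("img_001.jpg", "cup", 10, 20, 110, 70), ("img_002.jpg", "bottle", 5, 5, 50, 90)],
)
records = fetch_annotations(conn)
print(len(records), records[0]["label"])  # 2 cup
```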
Step 4: Data Exploration
Explore the data, validate it, and create a preprocessing strategy
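Two checks that are often useful at this stage are the class distribution and whether every box lies inside its image. A minimal sketch follows, assuming corner-format pixel boxes. The record layout and the `explore` helper are hypothetical.

```python
from collections import Counter


def explore(records, image_sizes):
    """Return per-class counts and the records whose boxes are malformed or out of bounds."""
    class_counts = Counter(r["label"] for r in records)
    invalid = []
    for r in records:
        width, height = image_sizes[r["image_path"]]
        x_min, y_min, x_max, y_max = r["box"]
        inside = 0 <= x_min < x_max <= width and 0 <= y_min < y_max <= height
        if not inside:
            invalid.append(r)
    return class_counts, invalid


records = [
    {"image_path": "img_001.jpg", "label": "cup", "box": (10, 20, 110, 70)},
    {"image_path": "img_001.jpg", "label": "cup", "box": (600, 400, 700, 470)},  # past the right edge
    {"image_path": "img_002.jpg", "label": "bottle", "box": (50, 90, 5, 5)},  # corners swapped
]
image_sizes = {"img_001.jpg": (640, 480), "img_002.jpg": (640, 480)}
class_counts, invalid = explore(records, image_sizes)
print(class_counts["cup"], len(invalid))  # 2 2
```

Findings like these feed the preprocessing strategy, for example clipping boxes to the image or dropping malformed records.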
Step 5: Data Preparation
Clean the data and make it ready for modeling
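What "ready for modeling" means depends on the framework. As one example, YOLO-style trainers commonly expect one text line per box with a class id and a center/size box normalized by the image dimensions. The sketch below shows that conversion together with a deterministic train/validation split. Both helper names are hypothetical, and the 80/20 split is an assumption.

```python
import random


def to_yolo_line(class_id, box, image_width, image_height):
    """Format a corner-format pixel box as a YOLO-style label line."""
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def train_val_split(items, val_fraction=0.2, seed=42):
    """Shuffle a copy of items with a fixed seed and split it into train and validation lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


print(to_yolo_line(0, (100, 50, 300, 250), 640, 480))
# 0 0.312500 0.312500 0.312500 0.416667
train, val = train_val_split(range(10))
print(len(train), len(val))  # 8 2
```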
Step 6: Model Building
Create the model architecture in Python and perform a sanity check
Step 7: Model Training
Start the training process and track the progress and experiments
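Dedicated experiment trackers exist, but the idea can be sketched with the standard library alone: append one JSON line per epoch along with the run configuration, then compare runs afterwards. The `RunLogger` class, the file name, and the loss values below are placeholders, not output from a real training run.

```python
import json
import tempfile
from pathlib import Path


class RunLogger:
    """Append one JSON line per epoch so runs can be compared later."""

    def __init__(self, path, config):
        self.path = Path(path)
        self.config = config

    def log(self, epoch, **metrics):
        record = {"epoch": epoch, "config": self.config, **metrics}
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")

    def history(self):
        with self.path.open(encoding="utf-8") as f:
            return [json.loads(line) for line in f]


log_path = Path(tempfile.mkdtemp()) / "run.jsonl"
logger = RunLogger(log_path, config={"model": "example", "lr": 0.001})
# Placeholder loss values; a real training loop would log the values it computes.
for epoch, loss in enumerate([1.20, 0.85, 0.61], start=1):
    logger.log(epoch, train_loss=loss)
best = min(logger.history(), key=lambda r: r["train_loss"])
print(best["epoch"])  # 3
```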
Step 8: Model Validation
Validate the final set of models and select/assemble the final model
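Detectors are commonly compared by matching predicted boxes to ground-truth boxes using intersection over union (IoU). The sketch below computes IoU and a simple precision/recall for a single image and class. It assumes predictions are already sorted by confidence and uses a 0.5 IoU threshold. It is a simplified stand-in for a full COCO-style AP evaluation, and the function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two corner-format boxes (x_min, y_min, x_max, y_max)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predictions, ground_truth, threshold=0.5):
    """Greedily match predictions to ground-truth boxes for one image and one class."""
    unmatched = list(ground_truth)
    true_positives = 0
    for pred in predictions:
        best = max(unmatched, key=lambda gt: iou(pred, gt), default=None)
        if best is not None and iou(pred, best) >= threshold:
            true_positives += 1
            unmatched.remove(best)  # each ground-truth box can be matched only once
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


ground_truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
predictions = [(1, 1, 10, 10), (50, 50, 60, 60)]
print(precision_recall(predictions, ground_truth))  # (0.5, 0.5)
```

Running the same function over each candidate model's predictions gives a like-for-like basis for choosing the final model.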
Step 9: UAT (User Acceptance Testing)
Wrap the model inference engine in an API for client testing
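A minimal sketch of such a wrapper, using only Python's built-in `http.server`: the client POSTs raw image bytes and receives detections as JSON. The `detect` function is a stub standing in for the real inference engine, and the port and payload shape are assumptions. A production service would more likely use a web framework.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def detect(image_bytes):
    """Stub standing in for the real model; returns a fixed detection (assumption)."""
    return [{"label": "example", "score": 0.9, "box": [10, 20, 110, 70]}]


def build_response(image_bytes):
    """Run inference and return an HTTP status code plus a JSON-encoded body."""
    if not image_bytes:
        return 400, json.dumps({"error": "empty request body"}).encode("utf-8")
    return 200, json.dumps({"detections": detect(image_bytes)}).encode("utf-8")


class DetectionHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status, body = build_response(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Start the server; blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), DetectionHandler).serve_forever()


status, body = build_response(b"raw image bytes")
print(status, json.loads(body)["detections"][0]["label"])  # 200 example
```

Calling `serve()` starts the endpoint. Keeping `build_response` separate from the handler lets the inference path be tested without opening a socket.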
Step 10: Deployment
Deploy the model to the cloud or to edge devices, as required
Step 11: Documentation
Prepare the documentation and transfer all assets to the client
Use Cases
Automatic License Plate Recognition
Recognizes vehicle license plate numbers using several methods, including the YOLOv4 object detector and Tesseract OCR. Check out the Notion page here.
Object Detection App
This is available as a Streamlit app that detects common objects. Three models are available for this task: Caffe MobileNet-SSD, Darknet YOLOv3-tiny, and Darknet YOLOv3. Along with common objects, the app also detects human faces and fire. Check out the Notion page here.
Logo Detector
Build a REST API to detect logos in images. The API receives two zip files: 1) a set of images in which to find the logo and 2) an image of the logo. The model was deployed on AWS Elastic Beanstalk. Check out the Notion page here.
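One way the API could unpack the uploaded archives is sketched below, entirely in memory and using only the standard library. The helper name and the list of accepted extensions are assumptions, and the demo archive holds placeholder bytes rather than real images.

```python
import io
import zipfile

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def read_images_from_zip(zip_bytes):
    """Return {filename: raw bytes} for every image file inside an uploaded zip."""
    images = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir() or not info.filename.lower().endswith(IMAGE_EXTENSIONS):
                continue  # skip folders and non-image files
            images[info.filename] = archive.read(info)
    return images


# Build a small archive in memory to stand in for an upload.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("scene_1.jpg", b"placeholder-bytes")
    archive.writestr("notes.txt", b"ignored")
print(sorted(read_images_from_zip(buffer.getvalue())))  # ['scene_1.jpg']
```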
TF Object Detection API Experiments
The TensorFlow Object Detection API is an open-source framework built on top of TensorFlow that makes it easy to construct, train, and deploy object detection models. We ran inference on pre-trained models, few-shot training on a single class, few-shot training on multiple classes, and conversion to a TFLite model. Check out the Notion page here.
Pre-trained Inference Experiments
Inference on six pre-trained models: Inception-ResNet (TFHub), SSD-MobileNet (TFHub), PyTorch YOLOv3, PyTorch SSD, PyTorch Mask R-CNN, and EfficientDet. Check out the Notion pages here and here.
Object Detection App
A Gradio app built on the TorchVision Mask R-CNN model. Check out the Notion page here.
Real-time Object Detector in OpenCV
Build a model to detect common objects such as scissors, cups, and bottles using the MobileNet SSD model in the OpenCV toolkit. It takes input from the camera and detects objects in real time. Check out the Notion page here. It is also available as a Streamlit app (that app is not real-time).
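With OpenCV's DNN module, a Caffe SSD model is typically loaded with `cv2.dnn.readNetFromCaffe`, and the forward pass is commonly described as returning an array of shape (1, 1, N, 7). The sketch below covers only the post-processing step, so it runs without OpenCV or model files. It assumes each row is `[image_id, class_id, confidence, x1, y1, x2, y2]` with coordinates normalized to [0, 1]. The function name, class list, and sample rows are illustrative.

```python
def decode_ssd_detections(rows, frame_width, frame_height, class_names, min_confidence=0.5):
    """Turn SSD output rows into (label, confidence, pixel box) tuples.

    Each row is assumed to be [image_id, class_id, confidence, x1, y1, x2, y2]
    with box coordinates normalized to [0, 1].
    """
    results = []
    for _, class_id, confidence, x1, y1, x2, y2 in rows:
        if confidence < min_confidence:
            continue  # drop weak detections
        box = (
            int(x1 * frame_width), int(y1 * frame_height),
            int(x2 * frame_width), int(y2 * frame_height),
        )
        results.append((class_names[int(class_id)], float(confidence), box))
    return results


class_names = ["background", "bottle", "cup"]  # illustrative subset
rows = [
    [0, 1, 0.9, 0.125, 0.25, 0.5, 0.75],
    [0, 2, 0.3, 0.0, 0.0, 1.0, 1.0],  # below the confidence threshold
]
print(decode_ssd_detections(rows, 640, 480, class_names))
# [('bottle', 0.9, (80, 120, 320, 360))]
```

In a camera loop, `rows` would be taken from the network output for each frame (for example `output[0, 0]`), and the resulting boxes drawn onto the frame.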
EfficientDet Fine-tuning
Fine-tune the EfficientDet model on new classes. Check out the Notion page here.
YOLO4 Fine-tuning
Fine-tune the YOLOv4 model on new classes. Check out the Notion page here.
Detectron2 Fine-tuning
Fine-tune the Detectron2 Mask R-CNN (with PointRend) model on new classes. Check out the Notion page here.