
Object Detection

/img/content-concepts-raw-computer-vision-object-detection-slide29.png

Introduction

  • Definition: Object detection is a computer vision technique that allows us to identify and locate objects in an image or video.
  • Applications: Crowd counting, Self-driving cars, Video surveillance, Face detection, Anomaly detection
  • Scope: Detect objects in images and videos, 2-dimensional bounding boxes, Real-time
  • Tools: Detectron2, TF Object Detection API, OpenCV, TFHub, TorchVision

Models

Faster R-CNN

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. arXiv, 2016.

SSD (Single Shot Detector)

SSD: Single Shot MultiBox Detector. ECCV, 2016.

YOLO (You Only Look Once)

YOLOv3: An Incremental Improvement. arXiv, 2018.

EfficientDet

EfficientDet: Scalable and Efficient Object Detection. CVPR, 2020.

The paper reports 55.1 AP on COCO test-dev with 77M parameters.

Process flow

Step 1: Collect Images

Capture images with a camera, scrape them from the internet, or use public datasets.

Step 2: Create Labels

This step is required only if the object category is not covered by any pre-trained model and labels are not freely available on the web. Create the labels (bounding boxes) using either an open-source tool like Labelme or a professional annotation tool.
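As a rough illustration of what such labels look like, the sketch below reads rectangle annotations from a Labelme-style JSON document and turns them into pixel bounding boxes. It assumes the usual Labelme layout (a `shapes` list whose entries carry `label`, `shape_type`, and `points`); the function name `labelme_to_boxes` and the sample annotation are made up for this example.

```python
import json


def labelme_to_boxes(annotation_json: str):
    """Extract rectangle labels from a Labelme-style JSON string.

    Returns a list of (label, x_min, y_min, x_max, y_max) tuples in pixels.
    """
    data = json.loads(annotation_json)
    boxes = []
    for shape in data.get("shapes", []):
        if shape.get("shape_type") != "rectangle":
            continue  # skip polygons, points, and other shape types
        (x1, y1), (x2, y2) = shape["points"]
        # The two corners may be stored in either order, so normalise them.
        boxes.append(
            (shape["label"], min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))
        )
    return boxes


# Hypothetical annotation, written by hand for this example.
sample = json.dumps({
    "imagePath": "street.jpg",
    "imageWidth": 640,
    "imageHeight": 480,
    "shapes": [
        {"label": "car", "shape_type": "rectangle",
         "points": [[300, 200], [100, 120]]},
        {"label": "road", "shape_type": "polygon",
         "points": [[0, 400], [640, 400], [320, 300]]},
    ],
})

boxes = labelme_to_boxes(sample)
print(boxes)
```

Only the rectangle survives the filter; the polygon would need a separate conversion if it should count as a detection label.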

Step 3: Data Acquisition

Set up the database connection and fetch the data into the Python environment.

Step 4: Data Exploration

Explore the data, validate it, and create a preprocessing strategy.

Step 5: Data Preparation

Clean the data and make it ready for modeling.
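One possible form this step can take is sketched below: clip each box to the image bounds, drop boxes that end up empty, and write the rest in the normalized text format commonly used by YOLO-style trainers (`class_id x_center y_center width height`, all relative to image size). The choice of target format and the helper names `clip_box` and `to_yolo_line` are assumptions for illustration, not something the workflow above prescribes.

```python
def clip_box(box, image_width, image_height):
    """Clip a pixel box to the image; return None if nothing is left."""
    x_min, y_min, x_max, y_max = box
    x_min, x_max = max(0, x_min), min(image_width, x_max)
    y_min, y_max = max(0, y_min), min(image_height, y_max)
    if x_max <= x_min or y_max <= y_min:
        return None  # degenerate box: zero or negative area after clipping
    return (x_min, y_min, x_max, y_max)


def to_yolo_line(class_id, box, image_width, image_height):
    """Format one pixel box as a normalized YOLO-style label line."""
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical raw labels for a 640x480 image: (class_id, pixel box).
raw_labels = [
    (0, (100, 120, 300, 200)),   # valid box
    (1, (-10, 50, 700, 100)),    # spills over both sides, gets clipped
    (0, (650, 10, 700, 40)),     # entirely outside the image, gets dropped
]

lines = []
for class_id, box in raw_labels:
    clipped = clip_box(box, 640, 480)
    if clipped is not None:
        lines.append(to_yolo_line(class_id, clipped, 640, 480))

print("\n".join(lines))
```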

Step 6: Model Building

Create the model architecture in Python and perform a sanity check.

Step 7: Model Training

Start the training process and track the progress and experiments

Step 8: Model Validation

Validate the final set of models and select/assemble the final model
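Validation for detectors usually rests on intersection over union (IoU) between predicted and ground-truth boxes. The sketch below computes IoU and a simplified precision/recall at a single IoU threshold using greedy matching in descending score order. It is a deliberately reduced version for a single image and a single class, not the full COCO AP protocol; the function names and sample boxes are illustrative.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predictions, ground_truth, iou_threshold=0.5):
    """Greedy matching of predictions to ground truth for one class.

    predictions: list of (score, box); ground_truth: list of boxes.
    Each ground-truth box can be matched by at most one prediction.
    """
    matched = set()
    true_positives = 0
    for _, box in sorted(predictions, key=lambda p: p[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt_box in enumerate(ground_truth):
            if idx in matched:
                continue
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Made-up boxes: two ground-truth objects, three predictions.
ground_truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
predictions = [
    (0.9, (0, 0, 10, 10)),      # exact match
    (0.8, (50, 50, 60, 60)),    # false positive
    (0.6, (21, 21, 31, 31)),    # overlaps the second object
]

precision, recall = precision_recall(predictions, ground_truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Comparing candidate models on the same held-out images with the same threshold keeps the selection in this step consistent.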

Step 9: User Acceptance Testing (UAT)

Wrap the model inference engine in an API for client testing.
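A minimal sketch of such a wrapper, using only the Python standard library, might look like the following. The `detect` function is a placeholder that returns a fixed, made-up detection where the real model call would go; the endpoint behaviour, port, and response shape are assumptions chosen for illustration, and a production service would more likely use a dedicated web framework.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def detect(image_bytes: bytes):
    """Placeholder for the real model; returns one fixed, made-up detection."""
    return [{"label": "example", "score": 0.99, "box": [0, 0, 10, 10]}]


def handle_request(body: bytes):
    """Turn a raw request body into (status_code, JSON response bytes)."""
    if not body:
        return 400, json.dumps({"error": "empty request body"}).encode()
    return 200, json.dumps({"detections": detect(body)}).encode()


class DetectionHandler(BaseHTTPRequestHandler):
    """Accepts image bytes via POST and replies with detections as JSON."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_request(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def make_server(host="127.0.0.1", port=8000):
    """Build (but do not start) the server; call .serve_forever() to run it."""
    return HTTPServer((host, port), DetectionHandler)
```

Keeping the request logic in `handle_request` separates it from the HTTP plumbing, so it can be exercised without opening a socket; `make_server().serve_forever()` would start the service for client testing.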

Step 10: Deployment

Deploy the model to the cloud or to edge devices, as required.

Step 11: Documentation

Prepare the documentation and transfer all assets to the client.

Use Cases

Automatic License Plate Recognition

Recognize vehicle license plate numbers using various methods, including the YOLOv4 object detector and Tesseract OCR. Check out the Notion page here.

Object Detection App

This is available as a Streamlit app that detects common objects. Three models are available for this task: Caffe MobileNet-SSD, Darknet YOLOv3-tiny, and Darknet YOLOv3. Along with common objects, the app also detects human faces and fire. Check out the Notion page here.

Logo Detector

Build a REST API to detect logos in images. The API receives two zip files: 1) a set of images in which to find the logo and 2) an image of the logo. The model is deployed on AWS Elastic Beanstalk. Check out the Notion page here.

TF Object Detection API Experiments

The TensorFlow Object Detection API is an open-source framework built on top of TensorFlow that makes it easy to construct, train, and deploy object detection models. We ran inference on pre-trained models, few-shot training on a single class, few-shot training on multiple classes, and conversion to a TFLite model. Check out the Notion page here.

Pre-trained Inference Experiments

Inference on six pre-trained models: Inception-ResNet (TFHub), SSD-MobileNet (TFHub), PyTorch YOLOv3, PyTorch SSD, PyTorch Mask R-CNN, and EfficientDet. Check out the Notion pages here and here.

Object Detection App

A Gradio app built on the TorchVision Mask R-CNN model. Check out the Notion page here.

Real-time Object Detector in OpenCV

Build a model to detect common objects like scissors, cups, and bottles using the MobileNet SSD model in the OpenCV toolkit. It takes input from the camera and detects objects in real time. Check out the Notion page here. It is also available as a Streamlit app (that app is not real-time).
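The decoding step of such a detector can be sketched as follows. With OpenCV's DNN module, an SSD network's forward pass typically yields an array of shape (1, 1, N, 7), where each row is `[image_id, class_id, confidence, x_min, y_min, x_max, y_max]` with coordinates normalized to [0, 1]. The function below is a pure-Python sketch of turning those rows into pixel boxes; the OpenCV calls appear only as comments, and the file paths, preprocessing constants, and class list shown there are placeholders that depend on the specific model used.

```python
def decode_ssd_output(rows, frame_width, frame_height, class_names,
                      min_confidence=0.5):
    """Convert SSD output rows into labelled pixel boxes.

    rows: iterable of [image_id, class_id, confidence, x1, y1, x2, y2]
          with box coordinates normalized to the range [0, 1].
    """
    detections = []
    for _, class_id, confidence, x1, y1, x2, y2 in rows:
        if confidence < min_confidence:
            continue  # drop low-confidence predictions
        detections.append({
            "label": class_names[int(class_id)],
            "confidence": float(confidence),
            "box": (
                round(x1 * frame_width), round(y1 * frame_height),
                round(x2 * frame_width), round(y2 * frame_height),
            ),
        })
    return detections


# In a real pipeline the rows would come from OpenCV, roughly like this
# (paths and preprocessing values are placeholders for a specific model):
#   net = cv2.dnn.readNetFromCaffe(prototxt_path, caffemodel_path)
#   blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)),
#                                0.007843, (300, 300), 127.5)
#   net.setInput(blob)
#   rows = net.forward()[0, 0]

# Hand-written rows and a hypothetical class list stand in for model output.
class_names = ["background", "bottle", "cup"]
rows = [
    [0, 1, 0.9, 0.25, 0.25, 0.5, 0.75],  # confident "bottle"
    [0, 2, 0.3, 0.0, 0.0, 1.0, 1.0],     # weak "cup", filtered out
]

detections = decode_ssd_output(rows, 640, 480, class_names)
print(detections)
```

For camera input, the same function would be called once per captured frame with that frame's width and height.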

EfficientDet Fine-tuning

Fine-tune the EfficientDet model on new classes. Check out the Notion page here.

YOLOv4 Fine-tuning

Fine-tune the YOLOv4 model on new classes. Check out the Notion page here.

Detectron2 Fine-tuning

Fine-tune the Detectron2 Mask R-CNN (with PointRend) model on new classes. Check out the Notion page here.