- 🔭 I’m currently working on Computer Vision models for Automatic Defect Recognition (ADR).
- 📫 How to reach me: girin.iitm@gmail.com
Ranked 6th globally in the BMW SORDI.ai Hackathon 2022, one of the largest industrial AI competitions, in which over 2,100 participants competed for four months. Organizers included Microsoft, NVIDIA, BMW Group, and QUT Design Academy.
In this project, a deep learning-based multiclass abnormality classification model was developed for Video Capsule Endoscopy (VCE) data. Trained on 37,607 VCE frames and validated on 16,132 frames, the model achieved a mean AUC of 0.98 and a balanced accuracy of 0.83. Through hyperparameter tuning, data augmentation, and sampling techniques, the solution secured 8th place in the Capsule Vision Challenge 2024.
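For readers unfamiliar with the metric, balanced accuracy is the mean of the per-class recalls, so a rare class counts as much as a common one. The snippet below is a minimal sketch on made-up toy labels (`class_a` and `class_b` are hypothetical, not the challenge classes) and is not the evaluation code used in the challenge.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean of per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy data: 8 samples of class_a, 2 of class_b.
y_true = ["class_a"] * 8 + ["class_b"] * 2
# The classifier gets every class_a right but misses one of the two class_b samples.
y_pred = ["class_a"] * 8 + ["class_a", "class_b"]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = balanced_accuracy(y_true, y_pred)
print(f"accuracy={plain_accuracy:.2f}, balanced accuracy={score:.2f}")
```

On imbalanced data such as this, plain accuracy looks better than balanced accuracy because the majority class dominates the count.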
Summer Challenge on Writer Verification, National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics 2023 (NCVPRIPG'23) - 2nd Position
The Writer Verification task originated as a method to detect potential fraud in the banking sector by verifying signatures. This is a challenging problem due to the natural variation in handwriting styles. The complexity increases in offline settings, where dynamic writing process data is unavailable. The challenge involved determining, from a pair of handwritten text images, whether they were written by the same person or by different writers.
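One common way to frame this kind of pairwise decision is to map each image to a feature vector and compare the two vectors against a threshold. The sketch below assumes the embeddings already exist (the vectors, the `same_writer` helper, and the 0.8 threshold are all hypothetical) and is not the method used in the challenge submission.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def same_writer(emb_a, emb_b, threshold=0.8):
    """Decide 'same writer' when the embeddings are similar enough.

    The threshold is an illustrative value; in practice it would be
    tuned on a validation set.
    """
    return cosine_similarity(emb_a, emb_b) >= threshold


# Hypothetical embeddings standing in for the output of a feature extractor.
sample_1 = [0.9, 0.1, 0.4]
sample_2 = [0.8, 0.2, 0.5]   # close to sample_1
sample_3 = [0.1, 0.9, -0.3]  # far from sample_1

print(same_writer(sample_1, sample_2))
print(same_writer(sample_1, sample_3))
```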
ODII is a simple Python package that provides a unified, streamlined interface for running inference with multiple object detection models. It supports a range of popular models, including YOLOX, YOLOv3, YOLOv4, YOLOv6, and YOLOv7, without the need to manage multiple codebases or installation processes.
This repository provides a user-friendly solution for training a Faster R-CNN model on any custom dataset in COCO format. Faster R-CNN is a widely used object detection framework known for its efficiency and accuracy in localizing and classifying objects within images. The primary focus of the repository is to streamline training so that the model can be tailored to your specific needs.
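As background on the data format: COCO annotations store each box as `[x, y, width, height]`, whereas some Faster R-CNN implementations (torchvision's, for example) expect corner coordinates `[x1, y1, x2, y2]`. The sketch below shows that conversion on a tiny in-memory annotation dictionary. The file name, category name, and helper functions are hypothetical, and the repository's own loading code may differ.

```python
def coco_to_xyxy(bbox):
    """Convert a COCO box [x, y, width, height] to corners [x1, y1, x2, y2]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


def boxes_for_image(coco, image_id):
    """Collect corner-format boxes for one image from a COCO-style dict."""
    return [
        coco_to_xyxy(ann["bbox"])
        for ann in coco["annotations"]
        if ann["image_id"] == image_id
    ]


# Minimal COCO-style structure with hypothetical values.
coco = {
    "images": [
        {"id": 1, "file_name": "example.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            "bbox": [100, 50, 200, 120],  # x, y, width, height
            "area": 24000,
            "iscrowd": 0,
        },
    ],
    "categories": [{"id": 1, "name": "example_class"}],
}

boxes = boxes_for_image(coco, image_id=1)
print(boxes)
```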
Implemented CANet (Chained Context Aggregation Network) for semantic segmentation of cells in light microscopy images. The model was trained on the LIVECell dataset to achieve accurate and robust segmentation results.
SAM-ONNX 🥭
A pip package for using SAM (Segment Anything Model) with minimal dependencies.
In this project, I use a CNN to recognize currency notes of different denominations. I have also implemented a Streamlit app for easy inference with the trained model. Some practical use cases of the model include:
- A currency recognition system for visually impaired or blind people
- A starting point for a currency verification system
LW-μDCNN: A Lightweight CNN Model for Human Activity Classification using Radar micro-Doppler Signatures
Abstract:
Recognition of human activities plays a pivotal role in surveillance and security. Convolutional neural network (CNN) based models are increasingly used to classify human activities from micro-Doppler (μD) signatures. However, the large number of parameters in CNN models increases both their computation cost and their size. The present work introduces a novel lightweight model, “LW-μDCNN,” to classify human activities. The architecture of LW-μDCNN has 438,998 parameters with 7 layers. A total of six human activities are recorded in the FM CWR dataset, which is in the form of μD signatures. These μD signatures are converted into spectrogram images and are used as input for the experiments. The size of the LW-μDCNN model is only 5.2 MB. It is further optimized with quantization-aware training; the resulting model, “QAT-LW-μDCNN,” has a size of 0.43 MB with minimal loss of accuracy. Extensive analysis shows that the LW-μDCNN model achieves 97% classification accuracy with a higher F1-score for every class than the other state-of-the-art models. The paper also proposes two transfer learning approaches, based on InceptionV3 and MobileNetV1, for the experimental studies on classifying human activities.