Project about developing a model that can detect multiple cards and identify their suit and rank. Includes multiple models, own created dataset, utility functions, presentations, research paper and live demo application. Course Deep Learning, FMI, 2024.

TeogopK/Playing-Cards-Object-Detection

Playing-Cards-Object-Detection

This repository covers the whole process of training and evaluating several YOLOv8 models for playing card object detection. It includes the datasets, the training code, utility functions, presentation and research paper materials, the models themselves, and a live demo application that uses the best models.

Datasets

All playing card datasets used to train the models are available in the data directory. All datasets are in YOLOv8 object detection format, split into train, valid and test directories, each with labels and images subdirectories. All of them fall under the CC0: Public Domain license.
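
For orientation, each file in a labels subdirectory holds one row per object in the form `class x_center y_center width height`, with coordinates normalized to the image size. The following is a minimal sketch of reading such a row and converting it to pixel corners; the helper names and the sample row are illustrative and not part of this repository.

```python
from typing import NamedTuple


class YoloBox(NamedTuple):
    class_id: int
    x_center: float  # normalized to [0, 1] by image width
    y_center: float  # normalized to [0, 1] by image height
    width: float
    height: float


def parse_label_line(line: str) -> YoloBox:
    """Parse one 'class x_center y_center width height' label row."""
    class_id, x, y, w, h = line.split()
    return YoloBox(int(class_id), float(x), float(y), float(w), float(h))


def to_pixel_corners(box: YoloBox, img_w: int, img_h: int) -> tuple[int, int, int, int]:
    """Convert a normalized box to (left, top, right, bottom) pixel coordinates."""
    left = (box.x_center - box.width / 2) * img_w
    top = (box.y_center - box.height / 2) * img_h
    right = (box.x_center + box.width / 2) * img_w
    bottom = (box.y_center + box.height / 2) * img_h
    return round(left), round(top), round(right), round(bottom)


# Illustrative row: class 3, centered box covering a quarter of the width.
box = parse_label_line("3 0.5 0.5 0.25 0.5")
corners = to_pixel_corners(box, img_w=640, img_h=480)
print(corners)
```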

The Synthetic dataset

The "Real" dataset

  • Created by the author of the project - Teodor Kostadinov
  • Includes 100 images shot and labelled by the author using Label Studio with 13 classes.
  • Used to train the YOLOv8m_real and YOLOv8m_tuned models.

The "Real" Augmented dataset

  • Created using imgaug with the script augment_dataset.ipynb
  • Introduces 10 augmented images for each image in the "Real" dataset using different transformations.
  • Used to train the YOLOv8m_aug model.
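
When an augmentation is geometric, the bounding boxes have to move together with the pixels. The sketch below is not taken from augment_dataset.ipynb and does not use imgaug. It only illustrates the idea in plain Python for a horizontal flip of a YOLO-format label row, using a hypothetical helper name.

```python
def flip_label_horizontally(line: str) -> str:
    """Mirror one YOLO label row ('class x_center y_center width height') left to right."""
    class_id, x, y, w, h = line.split()
    flipped_x = 1.0 - float(x)  # only the x center changes; y and the box size stay the same
    return f"{class_id} {flipped_x:.6f} {float(y):.6f} {float(w):.6f} {float(h):.6f}"


# Illustrative label row, not taken from the dataset.
original = "7 0.250000 0.400000 0.100000 0.200000"
flipped = flip_label_horizontally(original)
print(flipped)
```

Flipping twice returns the original row, which is a cheap sanity check for this kind of label transformation.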

The Combined dataset

Notes

  • To use the datasets, one may need to replace the relative paths provided in the data.yaml file.
  • The provided test.yaml files have the same structure as the data.yaml ones but are used to execute the model on the test set. This is done by replacing the path to the validation set with the path to the test set.
  • The args configuration file of every model run still contains dataset paths from the old project structure.
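
The relationship between data.yaml and test.yaml can be sketched as follows. This is an illustration only: it works on a plain dict rather than reading the files, the key names (`path`, `train`, `val`, `test`) follow the usual Ultralytics dataset layout, and the paths are assumed values to be adjusted to your checkout.

```python
import copy


def make_test_config(data_cfg: dict) -> dict:
    """Return a copy of a data.yaml-style mapping whose validation split points at the test set."""
    test_cfg = copy.deepcopy(data_cfg)
    test_cfg["val"] = data_cfg["test"]  # validation now runs on the test images
    return test_cfg


# Hypothetical contents of a data.yaml, already loaded into a dict.
data_cfg = {
    "path": "data/real",      # assumed dataset root, not confirmed by the repository
    "train": "train/images",
    "val": "valid/images",
    "test": "test/images",
}
test_cfg = make_test_config(data_cfg)
print(test_cfg["val"])
```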

Models

Within the runs directory, you'll find all the data extracted from the training process of each model. The whole process of training, validating and testing the models is executed using the scripts in model_utils.

The table summarizes the models, their datasets, the number of epochs, and the training times on an NVIDIA RTX A2000 8GB:

| Model | Dataset | Epochs | Training Time |
| --- | --- | --- | --- |
| YOLOv8m_synthetic | 20,000 synthetic images | 10 | 2 hours |
| YOLOv8m_real | 100 real images | 100 | 20 minutes |
| YOLOv8m_aug | 1,000 augmented images | 100 | 40 minutes |
| YOLOv8m_comb | 100 real + 1,000 synthetic | 100 | 50 minutes |
| YOLOv8m_tuned | 100 real images (fine-tuned) | 100 | 10 minutes |

The best models are provided as pretrained weight files in the final_models directory. They are copied from each model's train/weights/best.pt so that the live demo application can use them.
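
A minimal sketch of such an extraction step is shown below. It assumes a layout of `<runs_dir>/<run>/train/weights/best.pt` and names each copy `<run>.pt`. Both the helper and the naming scheme are assumptions for illustration and may differ from how the files in final_models were actually produced.

```python
import shutil
from pathlib import Path


def collect_best_weights(runs_dir: Path, out_dir: Path) -> list[Path]:
    """Copy every <run>/train/weights/best.pt into out_dir as <run>.pt."""
    out_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for best in sorted(runs_dir.glob("*/train/weights/best.pt")):
        run_name = best.parents[2].name  # .../<run>/train/weights/best.pt
        target = out_dir / f"{run_name}.pt"
        shutil.copy2(best, target)  # copy2 also preserves file timestamps
        copied.append(target)
    return copied
```

It could be called as `collect_best_weights(Path("runs"), Path("final_models"))` from the repository root.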

Live demo application

The live demo application uses the best-performing models to detect cards through the machine's webcam.

To run the application, create a virtual environment using:

python -m venv env

Activate the environment, then install all requirements specified in the requirements.txt file using:

pip install -r requirements.txt

Some of these packages are needed by the utility functions, not just by the application.

To use CUDA, install it first and then check your version using:

nvcc --version

Then install the PyTorch build that matches your CUDA version, e.g. for CUDA 12.4:

pip uninstall torch
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124

This should enable CUDA. To verify, run the following in Python:

import torch
print(torch.cuda.is_available())  # True when PyTorch can use the GPU
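
Building on that check, a script can fall back to the CPU when CUDA is not available. The helper below is a hypothetical sketch and not code from this repository. It also tolerates PyTorch being absent so that it runs in any environment.

```python
def pick_device() -> str:
    """Return 'cuda' when PyTorch can see a GPU, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed, so there is no GPU backend to use
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Running on: {device}")
```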

There are two models that the application can use - YOLOv8m_synthetic and YOLOv8m_tuned. Run the program specifying the model using:

python demo_application/visualization.py <synthetic_or_tuned>

Alternatively, use your IDE's GUI to start the application. In that case the app uses a default value for the model parameter.
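
One common way to get this behaviour (an optional positional argument with a fallback) is sketched below with argparse. It is not the code from visualization.py. The default chosen here is an assumption, since this README does not say which model the app falls back to.

```python
import argparse

MODEL_CHOICES = ("synthetic", "tuned")
DEFAULT_MODEL = "tuned"  # assumption: the actual default is not stated in this README


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Live playing card detection demo")
    parser.add_argument(
        "model",
        nargs="?",  # optional positional, so starting without arguments also works
        choices=MODEL_CHOICES,
        default=DEFAULT_MODEL,
        help="which trained model to load",
    )
    return parser


# Explicit argument lists keep the example independent of the real command line.
args = build_parser().parse_args(["synthetic"])
print(args.model)
```

With an empty argument list, `build_parser().parse_args([])` yields the default instead.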

To quit the program press q on your keyboard.

Presentations and research paper

The presentations folder contains all materials used to create and manage the presentations and the research paper, including Markdown, LaTeX and media files.

The final presentation includes detailed information, extracted metrics and test images.

The main points are available in the research paper as well.
