The project plans to analyze several vision transformer models under three training approaches: training from scratch, fine-tuning pretrained models on the chosen dataset, and feature extraction. The chosen dataset is Butterfly & Moths Image Classification 100 species from Kaggle [1].
The dataset contains 12,594 training images, 500 test images, and 500 validation images. Each image is 224 x 224 pixels, and the dataset covers 100 classes. The models selected for analysis are:
- Vision Transformer (ViT)
- Swin Transformer
- Data-efficient Image Transformers (DeiT)
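Because every image is 224 x 224, the number of patch tokens each model receives follows directly from its patch size. The sketch below works through that arithmetic. The patch sizes are assumptions based on commonly used variants (16 for ViT and DeiT, 4 for the first stage of Swin); the text above does not specify which variants are used, and `num_patches` is a hypothetical helper introduced only for illustration.

```python
# Image side length in pixels, as stated for the dataset.
IMAGE_SIZE = 224

# ASSUMPTION: patch sizes of commonly used variants; the project text
# does not say which model variants are actually used.
ASSUMED_PATCH_SIZE = {
    "ViT": 16,
    "Swin Transformer": 4,
    "DeiT": 16,
}


def num_patches(image_size: int, patch_size: int) -> int:
    """Return the number of non-overlapping square patches in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


# Patch-token count per model under the assumed patch sizes.
patch_counts = {
    name: num_patches(IMAGE_SIZE, patch)
    for name, patch in ASSUMED_PATCH_SIZE.items()
}

for name, count in patch_counts.items():
    print(f"{name}: {count} patches")
```

Under these assumptions, ViT and DeiT each see a 14 x 14 grid of patches, while Swin starts from a finer 56 x 56 grid that it merges in later stages.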
Three different methods are applied to train the models:
- Training from scratch
- Fine-tuning
- Feature Extraction
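The three methods differ in two ways: whether training starts from pretrained weights, and whether the backbone is kept frozen so that only the classification head is trained. A minimal sketch of the resulting experiment grid is shown below. It assumes that every model is paired with every method, which the text does not state explicitly. `TrainingMode`, `build_experiment_grid`, and the dictionary keys are hypothetical names used only for illustration, not part of any library.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class TrainingMode:
    name: str
    pretrained: bool       # start from pretrained weights?
    freeze_backbone: bool  # train only the classifier head?


# The three methods listed above, expressed as two switches.
MODES = [
    TrainingMode("from_scratch", pretrained=False, freeze_backbone=False),
    TrainingMode("fine_tuning", pretrained=True, freeze_backbone=False),
    TrainingMode("feature_extraction", pretrained=True, freeze_backbone=True),
]

MODELS = ["ViT", "Swin Transformer", "DeiT"]
NUM_CLASSES = 100  # number of classes in the dataset


def build_experiment_grid(models, modes):
    """Pair every model with every training mode (ASSUMED full grid)."""
    return [
        {
            "model": model,
            "mode": mode.name,
            "pretrained": mode.pretrained,
            "freeze_backbone": mode.freeze_backbone,
            "num_classes": NUM_CLASSES,
        }
        for model, mode in product(models, modes)
    ]


experiments = build_experiment_grid(MODELS, MODES)
for experiment in experiments:
    print(experiment)
```

Keeping the methods as data rather than as separate scripts makes it easier to confirm that each model is trained under the same conditions in every mode.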
References
[1] Butterfly & Moths Image Classification 100 species, Kaggle. [Online].
Available: https://www.kaggle.com/datasets/gpiosenka/butterfly-images40-species