E(2) Equivariant Attention Models for Image Classification

Introduction

Inspired by the group equivariant attention model presented in Group Equivariant Vision Transformer (GE-ViT), we conduct experiments to validate the reported performance of this model. We provide many visualizations to build intuition for GE-ViTs and the other methods we present. Furthermore, we propose and evaluate several ways of making non-equivariant models equivariant by combining the latent embeddings or class probabilities of transformed copies of the input. We also speed up GE-ViT by first projecting the image to an artificial image with a smaller spatial resolution.
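As a rough illustration of the post-hoc idea (a minimal sketch, not the repository's implementation: the function name and the choice of averaging softmax probabilities over the dihedral group D4 are our own assumptions):

import torch
import torch.nn.functional as F

def posthoc_equivariant_predict(model, images):
    # Average class probabilities over the 8 transformations of the dihedral
    # group D4 (0/90/180/270 degree rotations, with and without a flip).
    # `model` is any classifier mapping (B, C, H, W) images to (B, num_classes) logits.
    probs = []
    for flip in (False, True):
        x = torch.flip(images, dims=[-1]) if flip else images
        for k in range(4):  # number of 90-degree rotations
            x_t = torch.rot90(x, k, dims=[-2, -1])
            probs.append(F.softmax(model(x_t), dim=-1))
    return torch.stack(probs).mean(dim=0)  # invariant to D4 transformations of the input

Because every group element appears exactly once in the average, transforming the input only permutes the terms of the sum, so the prediction is invariant to these transformations regardless of the underlying model.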

For the full analysis, see our blog post; as a preview:

  • 🎯 We evaluate and propose novel ways of making any image classification model globally E(2) equivariant, outperforming regular test-time augmentation and beating previous image classification benchmarks (the idea is sketched above).
  • ⚡ We speed up and improve GE-ViTs by projecting the image to an artificial image with a lower spatial resolution, which requires fewer attention computations (see the sketch after this list).
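To give an idea of why the projection helps (a simplified sketch only: a plain strided convolution stands in for whatever projection the repository actually uses, and all names below are made up for illustration):

import torch
import torch.nn as nn

class DownProject(nn.Module):
    # Map an image to an "artificial image" with a smaller spatial resolution,
    # so the subsequent attention layers operate on far fewer positions.
    def __init__(self, in_channels=3, embed_dim=64, stride=4):
        super().__init__()
        self.proj = nn.Conv2d(in_channels, embed_dim, kernel_size=stride, stride=stride)

    def forward(self, x):  # (B, 3, H, W) -> (B, embed_dim, H/stride, W/stride)
        return self.proj(x)

x = torch.randn(1, 3, 32, 32)
print(DownProject()(x).shape)  # torch.Size([1, 64, 8, 8])

Since self-attention scales quadratically with the number of spatial positions, shrinking a 32x32 grid to 8x8 cuts the number of positions by a factor of 16 and the attention cost by roughly a factor of 256.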

Reproducing results

Installation

Getting the code

Clone the repository:

git clone https://github.com/WouterBant/GEVit-DL2-Project.git

And go inside the directory:

cd GEVit-DL2-Project

Getting the environment

Unfortunately, we had to use two different environments. For running GE-ViT, create the environment with:

conda env create -f gevit_conda_env.yml 
conda activate gevit

For running the post-hoc experiments and training the equivariant modern ViT:

conda env create -f posthoc_conda_env.yml
conda activate posthoc

Demos

In the demos folder, we provide notebooks for visualizing the artifacts that appear for rotations that are not multiples of 90 degrees and for creating the video that compares a normal ViT to equivariant models on rotated inputs.

Running experiments

To reproduce the GE-ViT results, change directory to the src folder and execute the commands from the README there.

To reproduce the results of the post-hoc experiments, change directory to src/post_hoc_equivariant and follow the instructions in the README. Checkpoints and results for various models are also stored in this folder.

To reproduce the results of the modern equivariant ViT, change directory to src/modern_eq_vit and refer to the README for training instructions.

Important files

Acknowledgements

This repository contains the source code accompanying the paper Group Equivariant Vision Transformer (UAI 2023).

The original code, containing a small error, is from the GSA-Nets paper by David W. Romero and Jean-Baptiste Cordonnier.
