Official PyTorch implementation of DeBiFormer, from the following paper:

DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention. ACCV 2024.
Nguyen Huu Bao Long, Chenyu Zhang, Yuzhi Shi, Tsubasa Hirakawa, Takayoshi Yamashita, Hironobu Fujiyoshi, and Tohgoroh Matsui

News

  • 2024-09-21: The paper has been accepted at ACCV 2024!

Results and Pre-trained Models

ImageNet-1K trained models

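| name | resolution | acc@1 (%) | #params | FLOPs | model | log |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| DeBiFormer-T | 224x224 | 81.9 | 21.4 M | 2.6 G | model | log |
| DeBiFormer-S | 224x224 | 83.9 | 44 M | 5.4 G | model | log |
| DeBiFormer-B | 224x224 | 84.4 | 77 M | 11.8 G | model | log |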

Usage

First, clone the repository locally and install the dependencies:

git clone https://github.com/maclong01/DeBiFormer.git
cd DeBiFormer
pip3 install -r requirements.txt

Data preparation

Download and extract the ImageNet train and val images from http://image-net.org/. The directory structure follows the standard layout expected by torchvision's datasets.ImageFolder: training images go in the train/ folder and validation images in the val/ folder, each with one subfolder per class:

/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class2/
      img4.jpeg
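
A misplaced or misnamed class folder is easy to miss in a dataset this size, so it can help to check the layout before starting a long run. The following is a minimal sketch using only the Python standard library; it is not part of this repository, and the function name `check_imagefolder_layout` is hypothetical. It assumes images sit directly inside each class folder, as in the tree above, which is stricter than what torchvision itself requires.

```python
from pathlib import Path


def check_imagefolder_layout(root):
    """Return a list of problems found under an ImageFolder-style dataset root."""
    root = Path(root)
    problems = []
    classes = {}
    for split in ("train", "val"):
        split_dir = root / split
        if not split_dir.is_dir():
            problems.append(f"missing directory: {split}/")
            continue
        # Each immediate subdirectory of a split is treated as one class.
        names = sorted(p.name for p in split_dir.iterdir() if p.is_dir())
        if not names:
            problems.append(f"no class folders in {split}/")
        for name in names:
            # Assumption: images are stored directly in the class folder.
            if not any(f.is_file() for f in (split_dir / name).iterdir()):
                problems.append(f"no image files directly in {split}/{name}/")
        classes[split] = set(names)
    # Both splits should expose the same set of class names.
    if len(classes) == 2 and classes["train"] != classes["val"]:
        diff = sorted(classes["train"] ^ classes["val"])
        problems.append(f"class folders differ between train/ and val/: {diff}")
    return problems
```

Calling `check_imagefolder_layout("/path/to/imagenet")` returns an empty list when both splits exist, every class folder holds at least one file, and the class names match across splits.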

Training

To train DeBiFormer-S on ImageNet using 8 GPUs for 300 epochs, run:

cd classification/
bash train.sh 8 --model debiformer_small --batch-size 256 --lr 5e-4 --warmup-epochs 20 --weight-decay 0.1 --data-path your_imagenet_path

Evaluation

To evaluate DeBiFormer-S on ImageNet using 8 GPUs, run:

cd classification/
bash train.sh 8 --model debiformer_small --batch-size 256 --lr 5e-4 --warmup-epochs 20 --weight-decay 0.1 --data-path your_imagenet_path --resume ../checkpoints/debiformer_small_in1k_224.pth --eval
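
The acc@1 values in the results table are presumably top-1 accuracy on the ImageNet-1K validation set, expressed as a percentage. For readers unfamiliar with the metric, here is a minimal pure-Python sketch of how it is computed. This is an illustration only and does not reproduce the repository's evaluation code; the function name `top1_accuracy` and the list-of-lists input format are assumptions made for the example.

```python
def top1_accuracy(logits, targets):
    """Percentage of samples whose highest-scoring class equals the target."""
    if len(logits) != len(targets):
        raise ValueError("logits and targets must have the same length")
    if not logits:
        raise ValueError("need at least one sample")
    correct = 0
    for scores, target in zip(logits, targets):
        # Index of the largest score is the predicted class.
        predicted = max(range(len(scores)), key=scores.__getitem__)
        if predicted == target:
            correct += 1
    return 100.0 * correct / len(logits)
```

Here `logits` is one list of per-class scores per image and `targets` holds the integer class index for each image.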

Acknowledgement

This repository is built using the timm library and the DAT and BiFormer repositories.

License

This project is released under the MIT license. Please see the LICENSE file for more information.

Citation

If you find this repository helpful, please consider citing:

@InProceedings{BaoLong_2024_ACCV,
    author    = {BaoLong, NguyenHuu and Zhang, Chenyu and Shi, Yuzhi and Hirakawa, Tsubasa and Yamashita, Takayoshi and Matsui, Tohgoroh and Fujiyoshi, Hironobu},
    title     = {DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {December},
    year      = {2024},
    pages     = {4455-4472}
}