Code for EcoFormer on PVTv2

Dataset Preparation

Download the ImageNet 2012 dataset from here, and prepare the dataset based on this script. The file structure should look like:

imagenet
├── train
│   ├── class1
│   │   ├── img1.jpeg
│   │   ├── img2.jpeg
│   │   └── ...
│   ├── class2
│   │   ├── img3.jpeg
│   │   └── ...
│   └── ...
└── val
    ├── class1
    │   ├── img4.jpeg
    │   ├── img5.jpeg
    │   └── ...
    ├── class2
    │   ├── img6.jpeg
    │   └── ...
    └── ...
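
Before launching a long run, it can help to confirm that the folder layout matches the tree above. The following is a minimal sketch, not part of this repository: the helper names `summarize_split` and `check_imagenet_layout` and the set of accepted image extensions are assumptions made for illustration.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your copy of the dataset.
IMAGE_SUFFIXES = {".jpeg", ".jpg", ".png"}


def summarize_split(split_dir):
    """Return {class_name: image_count} for one split directory (train or val)."""
    split_dir = Path(split_dir)
    counts = {}
    for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


def check_imagenet_layout(root):
    """Check that root/train and root/val exist and hold the same class folders."""
    root = Path(root)
    summary = {}
    for split in ("train", "val"):
        split_dir = root / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing split directory: {split_dir}")
        summary[split] = summarize_split(split_dir)
    if set(summary["train"]) != set(summary["val"]):
        raise ValueError("train and val contain different class folders")
    return summary
```

Calling `check_imagenet_layout("imagenet")` returns the per-class image counts for both splits, or raises if a split is missing or the class folders disagree.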

Training

Activate your Python environment:

conda activate ecoformer

Train a PVTv2 model (e.g., PVTv2-B0) with standard self-attention for 100 epochs. The model is initialized from the corresponding pre-trained model in PVT.

# train with 8 GPUs
python -m torch.distributed.launch --nproc_per_node=8 --master_port=1236 \
    --use_env main.py \
    --config configs/pvt_v2/pvt_v2_b0_msa.py \
    --batch-size 32 \
    --data-path [path of imagenet] \
    --data-set IMNET \
    --epochs 100 \
    --lr 5e-5 \
    --warmup-lr 1e-7 \
    --min-lr 1e-6 \
    --finetune [path of pvt_v2 pre-trained models]
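
The command above starts 8 processes with `--batch-size 32`. Assuming `--batch-size` is the per-process (per-GPU) batch size, as is common for scripts launched with `torch.distributed.launch`, the global batch size and the number of optimizer steps per epoch can be estimated as in the sketch below. The helper names are hypothetical, and whether the last incomplete batch is dropped depends on the data loader settings in `main.py`, which this README does not state.

```python
import math

# Number of training images in ImageNet 2012 (ILSVRC-2012, 1000 classes).
IMAGENET_TRAIN_IMAGES = 1_281_167


def global_batch_size(nproc_per_node, per_gpu_batch_size, nnodes=1):
    """Images processed per optimizer step across all processes."""
    return nnodes * nproc_per_node * per_gpu_batch_size


def steps_per_epoch(num_images, batch_size, drop_last=True):
    """Optimizer steps per epoch; drop_last mirrors the usual DataLoader flag."""
    if drop_last:
        return num_images // batch_size
    return math.ceil(num_images / batch_size)


batch = global_batch_size(nproc_per_node=8, per_gpu_batch_size=32)
print(batch, steps_per_epoch(IMAGENET_TRAIN_IMAGES, batch))
```

If you train on fewer GPUs, the global batch size shrinks accordingly, and hyperparameters such as `--lr` may need to be adjusted.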

Finetune the pre-trained model obtained in Step 2 (the standard self-attention training above) with our EcoFormer.

# finetune with 8 GPUs for 30 epochs (note the difference from the 100 epochs in Step 2)
python -m torch.distributed.launch --nproc_per_node=8 --master_port=1236 \
    --use_env main.py \
    --config configs/pvt_v2/pvt_v2_b0_ecoformer.py \
    --batch-size 32 \
    --data-path [path of imagenet] \
    --data-set IMNET \
    --epochs 30 \
    --lr 5e-5 \
    --warmup-lr 1e-7 \
    --min-lr 1e-6 \
    --finetune [path of the pre-trained model in Step 2]

Evaluation

To evaluate a model, run:

python -m torch.distributed.launch --nproc_per_node=1 --master_port=1236 \
    --use_env main.py \
    --config configs/pvt_v2/pvt_v2_b0_ecoformer.py \
    --batch-size 32 \
    --data-path [path of imagenet] \
    --data-set IMNET \
    --resume [path/to/trained_weights] \
    --eval
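
For reference, top-1 accuracy is the percentage of validation images whose highest-scoring class equals the ground-truth label. The sketch below shows that definition in plain Python; it is an illustration only, not the evaluation loop in `engine.py`, and the function name `top1_accuracy` is an assumption.

```python
def top1_accuracy(logits, targets):
    """Percentage of samples whose arg-max class matches the target label.

    logits:  list of per-sample score lists, one score per class
    targets: list of integer class indices, same length as logits
    """
    if len(logits) != len(targets):
        raise ValueError("logits and targets must have the same length")
    if not targets:
        raise ValueError("at least one sample is required")
    correct = 0
    for scores, target in zip(logits, targets):
        # Index of the largest score is the predicted class.
        predicted = max(range(len(scores)), key=scores.__getitem__)
        correct += int(predicted == target)
    return 100.0 * correct / len(targets)
```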

To test the throughput, run:

python -m torch.distributed.launch --nproc_per_node=1 --master_port=1236 \
    --use_env main.py \
    --config configs/pvt_v2/pvt_v2_b0_ecoformer.py \
    --batch-size 32 \
    --data-path [path of imagenet] \
    --data-set IMNET \
    --throughput
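
Throughput here means images processed per second. A generic way to measure it is to run a few warm-up iterations and then time a fixed number of forward passes, as sketched below. This is an assumption about the general technique, not a copy of what `--throughput` does in `main.py`; `measure_throughput` and `run_batch` are hypothetical names.

```python
import time


def measure_throughput(run_batch, batch_size, warmup_iters=5, timed_iters=30):
    """Return images per second for a callable that processes one batch.

    run_batch is any zero-argument callable that runs one forward pass on a
    batch of `batch_size` images. When timing GPU code, run_batch should block
    until the device has finished (for example by calling
    torch.cuda.synchronize()); otherwise the timer only measures kernel launches.
    """
    for _ in range(warmup_iters):
        run_batch()  # warm-up: exclude one-off setup costs from the timing
    start = time.perf_counter()
    for _ in range(timed_iters):
        run_batch()
    elapsed = time.perf_counter() - start
    return batch_size * timed_iters / elapsed
```

Throughput depends heavily on the GPU, batch size, and software versions, so numbers are only comparable when measured under the same conditions.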

To obtain the number of multiplications and additions and the energy consumption, run:

python get_energy.py --config configs/pvt_v2/pvt_v2_b0_ecoformer.py
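
As a rough cross-check, energy can be estimated from the operation counts as a weighted sum. The sketch below assumes 3.7 pJ per 32-bit floating-point multiplication and 0.9 pJ per 32-bit floating-point addition, figures commonly cited for a 45 nm CMOS process. These constants are an assumption and are not stated in this README; `get_energy.py` may count operations differently. With them, the weighted sum closely matches the energy column in the Results and Models table.

```python
# Assumed per-operation energy costs in picojoules (45 nm CMOS, FP32).
MUL_ENERGY_PJ = 3.7
ADD_ENERGY_PJ = 0.9


def energy_billion_pj(mul_billions, add_billions,
                      mul_pj=MUL_ENERGY_PJ, add_pj=ADD_ENERGY_PJ):
    """Estimated energy in billions of pJ from operation counts in billions."""
    return mul_billions * mul_pj + add_billions * add_pj


# Operation counts for PVTv2-B0 taken from the results table.
print(round(energy_billion_pj(0.54, 0.56), 2))
```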

Results and Models

Model	#Mul. (B)	#Add. (B)	Energy (B pJ)	Throughput (images/s)	Top-1 Acc. (%)	Download
PVTv2-B0	0.54	0.56	2.5	1379	70.44	GitHub
PVTv2-B1	2.03	2.09	9.4	874	78.38	GitHub
PVTv2-B2	3.85	3.97	17.8	483	81.28	GitHub
PVTv2-B3	6.54	6.75	30.25	325	81.96	GitHub
PVTv2-B4	9.57	9.82	44.25	249	81.90	GitHub

License

This repository is released under the Apache 2.0 license as found in the LICENSE file.

Acknowledgement

This repository is built upon PVT. We thank the authors for their open-sourced code.
