Implementation of the MAE model in PyTorch for the IPU. This example is based on the models provided by the original MAE repository. The MAE model is described in the paper Masked Autoencoders Are Scalable Vision Learners.
First, install the Poplar SDK following the instructions in the Getting Started guide for your IPU system. Make sure to source the enable.sh scripts for Poplar and PopART.
Then, create a virtual environment, install the required packages and build the custom ops:

```bash
virtualenv venv -p python3.6
source venv/bin/activate
pip install -r requirements.txt
cd remap
make clean
make
cd ..
```
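After building, a quick sanity check can confirm that the PyTorch and PopTorch wheels are visible from the virtual environment. This is a minimal sketch, assuming `poptorch` was installed by `requirements.txt`:

```python
# Sanity check: confirm torch and poptorch import cleanly from this virtualenv.
import torch
import poptorch

print("torch:", torch.__version__)
print("poptorch:", poptorch.__version__)
```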
Download the datasets:
- ImageNet dataset (available at http://www.image-net.org/)
The ImageNet LSVRC 2012 dataset, which contains about 1.28 million images in 1000 classes, can be downloaded from the ImageNet website. It is approximately 150 GB for the training and validation sets. Please note that you need to register and request permission to download this dataset on the ImageNet website. You cannot download the dataset until ImageNet confirms your registration and sends you a confirmation email. If you do not get the confirmation email within a couple of days, contact ImageNet support to find out why your registration has not been confirmed. Once your registration is confirmed, go to the download site. The dataset is available for non-commercial use only. Full terms and conditions and more information are available on the ImageNet download page.
Please place or symlink the ImageNet data in `./data/imagenet1k`. The `imagenet1k` dataset folder contains `train` and `validation` folders, each of which holds 1000 sub-folders, one per image class.
```
imagenet1k
|-- train [1000 entries exceeds filelimit, not opening dir]
`-- validation [1000 entries exceeds filelimit, not opening dir]
```
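To confirm the data is laid out as expected, you can count the class folders in each split. This is a minimal sketch, assuming the data lives under `./data/imagenet1k`:

```python
# Check that each split contains the expected 1000 class sub-folders.
from pathlib import Path

DATA_DIR = Path("./data/imagenet1k")
for split in ("train", "validation"):
    classes = [d for d in (DATA_DIR / split).iterdir() if d.is_dir()]
    print(f"{split}: {len(classes)} class folders")
```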
To run a tested and optimised configuration and to reproduce the performance shown on our performance results page, please follow the setup instructions in this README to set up the environment, and then use the `examples_utils` module (installed automatically as part of the environment setup) to run one or more benchmarks. For example:

```bash
python3 -m examples_utils benchmark --spec <path to benchmarks.yml file>
```

Or to run a specific benchmark in the `benchmarks.yml` file provided:

```bash
python3 -m examples_utils benchmark --spec <path to benchmarks.yml file> --benchmark <name of benchmark>
```
For more information on using the examples-utils benchmarking module, please refer to its README.
- Pretraining
- Finetuning
- Validation
Set up your environment as explained above. You can run MAE on the ImageNet1k dataset.
Shell scripts that wrap the cluster setup arguments and the Python launcher configuration for POD16 and POD64 are provided: `scripts/mae_base_pod16.sh` and `scripts/mae_base_pod64.sh`. The arguments they accept are listed in the table below.
| argument | meaning |
|---|---|
| `-n` | Hostnames/IPs of the hosts |
| `-s` | Hostname/IP of the controller server |
| `-p` | Partition name |
| `-c` | Cluster name |
To run pre-training on a single host using a POD16:

```bash
bash scripts/mae_base_pod16.sh
```

To run pre-training on multiple hosts using a POD64:

```bash
bash scripts/mae_base_pod64.sh -n host1,host2,host3,host4 -s host0 -p partition_name -c cluster_name
```
Trained with the default fp16, the fine-tuning accuracy is 83.37% (top-1) after 1600 epochs. Trained with fp32, the fine-tuning accuracy is 83.4% (top-1) after 1600 epochs.
Once pre-training finishes, you can fine-tune the pre-trained model:

```bash
python main_finetune.py --finetune ${MODEL_DIR} --data_path ${IMAGENET_DIR} \
```

Then validate the fine-tuned checkpoint:

```bash
python finetune_validate.py --resume ${MODEL_DIR} --batch_size 16 --data_path ${IMAGENET_DIR}
```
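If you want to inspect a checkpoint before running validation, the sketch below lists the first few tensors it contains. It assumes the checkpoint is a standard PyTorch `.pth` file whose weights sit under a `"model"` key, as in the upstream MAE reference code; adjust the path and key if your checkpoint differs:

```python
# Inspect a saved checkpoint: print a few parameter names and shapes.
import torch

ckpt = torch.load("checkpoint.pth", map_location="cpu")  # hypothetical path
state_dict = ckpt.get("model", ckpt)  # fall back to the raw dict if no "model" key
print(f"{len(state_dict)} tensors, for example:")
for name in list(state_dict)[:5]:
    print(" ", name, tuple(state_dict[name].shape))
```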
This application is licensed under Apache License 2.0 and Attribution-NonCommercial 4.0 International. Please see the LICENSE file in this directory for full details of the license conditions.
The following files are created by Graphcore and are licensed under Apache License, Version 2.0 (* means additional license information stated following this list):
- main_pretrain.py
- options.py
- argparser.py
- configs.yml
- README.md
- requirements.txt
- core/utils.py
- core/gelu.py
- scripts/mae_base_pod16.sh
- scripts/mae_base_pod64.sh
- scripts/alignment.py
- scripts/alignment.sh
- scripts/eval.sh
- util/pos_embed.py
- util/crop.py
- util/checkpoint.py
- util/ipu_mixup.py
- test/test_mae.py
- test/conftest.py
- remap/remap_ops/TileMappingCommon.cpp
- remap/remap_ops/TileMappingCommon.hpp
- remap/remap_ops/remap_tensor_ce.cpp
- remap/remap_ops/remap_tensor_ce.hpp
The following files include code derived from the original MAE repository, which is licensed under the Attribution-NonCommercial 4.0 International license:
- util/log.py
- util/lr_decay.py
- util/lr_sched.py
- util/datasets.py
- core/vision_transformer.py
- core/models_mae.py
- core/models_vit.py
- main_finetune.py
- finetune_validate.py
External packages:
- `transformers` is licensed under Apache License, Version 2.0
- `pytest` is licensed under MIT License
- `timm` is licensed under MIT License
- `torchvision` is licensed under BSD 3-Clause License