HyperMask: Adaptive Hypernetwork-based Masks for Continual Learning

Generate a semi-binary mask for a target network using a hypernetwork.
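The README does not spell out the masking rule, so the following is only a rough sketch of what a semi-binary mask could look like: a chosen fraction of entries is set exactly to zero and the rest keep real values in (0, 1). The function name `semi_binary_mask`, the use of `|tanh|` as the squashing function, and the `sparsity` parameter are assumptions made for illustration. The list `raw_scores` stands in for the output of a hypernetwork.

```python
import math


def semi_binary_mask(raw_scores, sparsity):
    """Build an illustrative semi-binary mask (assumed rule, not the repository's code).

    raw_scores: list of floats, standing in for hypernetwork outputs.
    sparsity:   fraction in [0, 1] of entries to set exactly to zero.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must lie in [0, 1]")
    # Assumption: squash every score into [0, 1) with |tanh|.
    values = [abs(math.tanh(s)) for s in raw_scores]
    n_zero = int(sparsity * len(values))
    # Indices of the n_zero smallest values are zeroed (the "binary" part).
    smallest = sorted(range(len(values)), key=lambda i: values[i])[:n_zero]
    mask = list(values)
    for i in smallest:
        mask[i] = 0.0
    return mask


def apply_mask(weights, mask):
    """Element-wise product of target-network weights and the mask."""
    return [w * m for w, m in zip(weights, mask)]


raw_scores = [0.1, -2.0, 0.5, 3.0]   # hypothetical hypernetwork output
weights = [1.0, 1.0, -1.0, 2.0]      # hypothetical target-network weights
mask = semi_binary_mask(raw_scores, sparsity=0.5)
masked_weights = apply_mask(weights, mask)
```

With `sparsity=0.5`, the two entries with the smallest squashed scores are removed and the other two weights are scaled by a value below one.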

Scheme of HyperMask method

Use the environment.yml file to create a conda environment with the necessary libraries (for example, with conda env create -f environment.yml). One of the most essential packages is hypnettorch, which makes it easy to create hypernetworks in PyTorch.

DATASETS

The implemented experiments use four publicly available datasets for continual learning tasks: Permuted MNIST, Split MNIST, Split CIFAR-100 and Tiny ImageNet. The datasets can be downloaded when the algorithm runs.

USAGE

The description of HyperMask is included in the paper. To reproduce the results from the publication with the best hyperparameters found, for five different seed values, run main.py with the variable create_grid_search set to False and the variable dataset set to PermutedMNIST, SplitMNIST, CIFAR100 or TinyImageNet. For CIFAR100 and TinyImageNet, either ResNet-20 or ZenkeNet can be selected as the target network: set part = 0 to train a ResNet or part = 1 to train a ZenkeNet. For the remaining datasets, the variable part has no effect.

To run the CIFAR100 experiments according to the FeCAM scenario, set the variable dataset in main.py to CIFAR100_FeCAM_setup and use part = 6 to train a ResNet model or part = 7 to train a ZenkeNet model.
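The combinations of dataset and part described above can be summarized in a small lookup. This is a sketch for orientation only: the helper `target_network` and the two constants are hypothetical and not part of the repository. Only the dataset names and part values are taken from the text.

```python
# Hypothetical helper: restates the dataset/part combinations from the text.
TARGET_NETWORK_BY_PART = {
    "CIFAR100": {0: "ResNet-20", 1: "ZenkeNet"},
    "TinyImageNet": {0: "ResNet-20", 1: "ZenkeNet"},
    "CIFAR100_FeCAM_setup": {6: "ResNet", 7: "ZenkeNet"},
}

# For these datasets the value of `part` has no effect.
DATASETS_IGNORING_PART = {"PermutedMNIST", "SplitMNIST"}


def target_network(dataset, part=None):
    """Return the target network selected by (dataset, part).

    Returns None when `part` has no effect for the dataset, because the
    text does not name the target network used in that case.
    """
    if dataset in DATASETS_IGNORING_PART:
        return None
    if dataset not in TARGET_NETWORK_BY_PART:
        raise ValueError(f"unknown dataset: {dataset!r}")
    options = TARGET_NETWORK_BY_PART[dataset]
    if part not in options:
        raise ValueError(
            f"dataset {dataset!r} expects part in {sorted(options)}, got {part!r}"
        )
    return options[part]


# Settings one would place in main.py to train a ZenkeNet on CIFAR100.
create_grid_search = False
dataset = "CIFAR100"
part = 1
selected = target_network(dataset, part)
```

Calling `target_network("CIFAR100", 6)` raises a `ValueError`, since part = 6 is only described for CIFAR100_FeCAM_setup.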

Hyperparameter optimization with a grid search is also supported. To use it, set the variable create_grid_search to True in main.py and modify the lists of hyperparameters for the selected dataset in datasets.py.
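As a reminder of what a grid search over lists of hyperparameters amounts to, the sketch below enumerates every combination of values. The hyperparameter names and values are hypothetical placeholders and are not copied from datasets.py. The helper `iter_grid` is likewise an illustration, not the repository's implementation.

```python
from itertools import product

# Hypothetical hyperparameter lists; the real ones live in datasets.py.
grid = {
    "learning_rate": [0.001, 0.0001],
    "batch_size": [64, 128],
    "sparsity": [0, 30, 70],
}


def iter_grid(grid):
    """Yield one dict per combination of hyperparameter values."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


configs = list(iter_grid(grid))
# The number of runs is the product of the list lengths: 2 * 2 * 3 = 12.
print(len(configs))
```

Because the number of runs grows multiplicatively with each list, shortening one list is usually the quickest way to keep a search tractable.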

CITATION

If you use this library in your research project, please cite the following paper:

@misc{książek2023hypermask,  
     title={HyperMask: Adaptive Hypernetwork-based Masks for Continual Learning},  
     author={Kamil Książek and Przemysław Spurek},  
     year={2023},  
     eprint={2310.00113},  
     archivePrefix={arXiv},  
     primaryClass={cs.LG}  
}

LICENSE

Copyright 2023 Institute of Theoretical and Applied Informatics, Polish Academy of Sciences (ITAI PAS) https://www.iitis.pl and Group of Machine Learning Research (GMUM), Faculty of Mathematics and Computer Science of Jagiellonian University https://gmum.net/.

Authors:

  • Kamil Książek (ITAI PAS, ORCID ID: 0000-0002-0201-6220),
  • Przemysław Spurek (Jagiellonian University, ORCID ID: 0000-0003-0097-5521).

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.

The HyperMask repository includes parts of the code that come from or are based on external sources: hypnettorch, FeCAM, Tiny ImageNet preprocessing 1 and Tiny ImageNet preprocessing 2.