
Adversarial attack toolbox for Pytorch, Tensorflow, and Jax


metehancekic/deep-illusion


Adversarial Machine Learning

With the advent of more powerful parallel computation units and huge datasets, we are able to train much more complex and expressive deep neural networks. As a result, deep neural networks (DNNs) have found use in a wide variety of fields, ranging from computer vision to game-playing agents, and on some tasks they outperform even human experts. Despite this success, it is by now well known that they are susceptible to small, carefully designed perturbations that are imperceptible to humans. The fact that DNNs can be fooled so easily is a serious problem, since they are also used in security-critical applications such as self-driving cars. Recently, the research community has put great effort into making neural networks robust against these adversarial examples. Despite this attention, no strong defense mechanism has been found, and defending against adversarial examples has been shown to be a hard problem.

As another group working in this field, we share our attack code as a library. This library is a side product of our research, and since we use it in our own research as well, we made sure it works correctly and as described in the original papers. In short, deepillusion contains easy-to-use and properly implemented adversarial methods.

We are open to suggestions "[email protected]".

Deep Illusion

Deep Illusion is a toolbox for adversarial attacks in machine learning. The current version is implemented only for PyTorch models. DeepIllusion is a growing and developing Python module that aims to help the adversarial machine learning community accelerate its research. The module currently includes complete implementations of well-known attacks (PGD, FGSM, R-FGSM, CW, BIM, etc.). All attacks have an apex (amp) version that lets you run them fast and accurately. We strongly recommend using the amp versions only for adversarial training, since they may suffer from gradient masking once the neural net becomes confident about its decisions. All attack methods have an option (verbose, False by default) to check whether gradient masking is happening.

All attack code is written in a functional programming style, so users can simply call the attack function and pass in the input data and model to get perturbations. All code is documented, and each docstring contains an example of use. Users can access the documentation in IPython by typing "??" at the end of the method name (e.g., FGSM?? or PGD??). Output perturbations are already clipped for each image to prevent illegal pixel values. We welcome contributors who want to expand the arsenal of attack methods.
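To make the clipping statement concrete, here is a minimal NumPy sketch of one way a perturbation can be kept both inside an L-inf budget and inside the legal pixel range. The helper `clip_perturbation` and its arguments are hypothetical and written for illustration; this is not deepillusion's own code.

```python
import numpy as np

def clip_perturbation(x, delta, eps, x_min=0.0, x_max=1.0):
    """Return a perturbation with |delta| <= eps and x + delta inside [x_min, x_max].

    Hypothetical helper for illustration; not part of deepillusion.
    """
    delta = np.clip(delta, -eps, eps)          # stay inside the L-inf budget
    x_adv = np.clip(x + delta, x_min, x_max)   # stay inside the legal pixel range
    return x_adv - x                           # express the result as a perturbation again

x = np.array([0.0, 0.5, 1.0])
delta = np.array([-0.2, 0.2, 0.2])
safe_delta = clip_perturbation(x, delta, eps=0.1)
```

Only the middle pixel keeps a nonzero perturbation here: the first and last pixels already sit on the range boundary, so pushing them further would produce illegal values.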

We also include the most effective current approach to defending DNNs against adversarial perturbations, which is training the network on adversarially perturbed examples. Adversarial training and testing methods are included in the torchdefenses submodule.
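The idea of training on adversarially perturbed examples can be sketched on a toy problem. The example below assumes a logistic-regression model, so the input gradient has a closed form, and uses a single signed-gradient (FGSM-style) step as the inner attack. The function `adversarial_train` and the data are hypothetical; the torchdefenses submodule works on PyTorch models and is not reproduced here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adversarial_train(x, y, eps, lr=0.5, epochs=200):
    """Gradient descent on the logistic loss evaluated at perturbed inputs.

    Hypothetical toy routine for illustration; not part of deepillusion.
    """
    w, b = np.zeros(x.shape[1]), 0.0
    for _ in range(epochs):
        # Inner step: craft perturbations against the current parameters.
        grad_x = (sigmoid(x @ w + b) - y)[:, None] * w[None, :]
        x_adv = np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)
        # Outer step: update the parameters on the perturbed batch.
        err = sigmoid(x_adv @ w + b) - y
        w -= lr * (x_adv.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b

x = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]])
y = np.array([1.0, 1.0, 0.0, 0.0])

w, b = adversarial_train(x, y, eps=0.1)
clean_pred = (sigmoid(x @ w + b) > 0.5).astype(float)
```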

The current version has been tested against different defense methods and standard models for verification, and we observed the reported accuracies.

Maintainers: WCSL Lab, Metehan Cekic, Can Bakiskan, Soorya Gopal, Ahmet Dundar Sezer

Dependencies

numpy 1.16.4
tqdm 4.31.1

torchattacks

pytorch 1.4.0
apex 0.1 (optional)

tfattacks

tensorflow

jaxattacks

jax

Installation

The most recent stable version can be installed with the Python package installer "pip", or you can clone it from the git page.

pip install deepillusion

or

git clone [email protected]:metehancekic/deep-illusion.git

Example Use

As mentioned earlier, our adversarial methods are functional rather than modular. Therefore, all you need to do to get the perturbations is pass in the model, the input data, and its labels, along with the attack parameters.

To standardize the arguments across attacks, methods accept attack parameters as a dictionary named attack_params, which contains the parameters needed for each attack. Attack methods also take data properties, such as the maximum and minimum pixel values, as another dictionary named data_params. These dictionaries make function calls concise and uniform across all methods.
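As a rough illustration of this calling convention, the sketch below shows how a functional attack might read its settings from the two dictionaries. It assumes a logistic-regression model so that the input gradient can be written in closed form. The function `fgsm_linear` is hypothetical and is not deepillusion's implementation; only the dictionary keys (`x_min`, `x_max`, `norm`, `eps`) are taken from the README.

```python
import numpy as np

def fgsm_linear(w, b, x, y_true, data_params, attack_params):
    """FGSM-style L-inf perturbation for the model p = sigmoid(x @ w + b).

    Hypothetical helper for illustration; not part of deepillusion.
    """
    assert attack_params["norm"] == "inf"        # this sketch only handles L-inf
    eps = attack_params["eps"]
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))       # predicted probability of class 1
    grad_x = (p - y_true)[:, None] * w[None, :]  # d(cross-entropy)/dx in closed form
    delta = eps * np.sign(grad_x)                # one signed-gradient step of size eps
    # Keep x + delta inside the legal data range.
    x_adv = np.clip(x + delta, data_params["x_min"], data_params["x_max"])
    return x_adv - x

data_params = {"x_min": 0.0, "x_max": 1.0}
attack_params = {"norm": "inf", "eps": 8.0 / 255}

w, b = np.array([1.0, -2.0]), 0.0
x = np.array([[0.5, 0.5], [0.0, 1.0]])
y_true = np.array([1.0, 0.0])

perturbs = fgsm_linear(w, b, x, y_true, data_params, attack_params)
```

Because every setting travels in a dictionary, switching to a different attack would only mean adding keys to `attack_params`, not changing the call signature.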

The following code snippets show PGD and FGSM usage.

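from deepillusion.torchattacks import PGD, FGSM, RFGSM

# Assumed to exist already: `model` (a PyTorch model), `data` (an input batch), `target` (its labels).

##### PGD ######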
data_params = {"x_min": 0., "x_max": 1.}
attack_params = {
    "norm": "inf",
    "eps": 8./255,
    "step_size": 2./255,
    "num_steps": 7,
    "random_start": False,
    "num_restarts": 1}
    
pgd_args = dict(net=model,
                x=data,
                y_true=target,
                data_params=data_params,
                attack_params=attack_params,
                verbose=False,
                progress_bar=False)               
perturbs = PGD(**pgd_args)
data_adversarial = data + perturbs

##### FGSM #####
data_params = {"x_min": 0., "x_max": 1.}
attack_params = {"norm": "inf",
                 "eps": 8./255}
fgsm_args = dict(net=model,
                 x=data,
                 y_true=target,
                 data_params=data_params,
                 attack_params=attack_params)
perturbs = FGSM(**fgsm_args)
data_adversarial = data + perturbs
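For intuition about what `eps`, `step_size`, and `num_steps` control, here is a minimal NumPy sketch of L-inf projected gradient descent on a logistic-regression model, chosen so that the input gradient has a closed form. The functions `input_gradient` and `pgd_linf` are hypothetical; the sketch starts from the clean input (as with `random_start=False`) and does not reproduce `num_restarts` or any other detail of the library's PGD.

```python
import numpy as np

def input_gradient(w, b, x, y_true):
    """Gradient of the binary cross-entropy w.r.t. x for p = sigmoid(x @ w + b)."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    return (p - y_true)[:, None] * w[None, :]

def pgd_linf(w, b, x, y_true, data_params, attack_params):
    """Iterated signed-gradient ascent, projected back onto the eps-ball each step.

    Hypothetical helper for illustration; not part of deepillusion.
    """
    eps = attack_params["eps"]
    step_size = attack_params["step_size"]
    delta = np.zeros_like(x)                       # begin at the clean input
    for _ in range(attack_params["num_steps"]):
        grad = input_gradient(w, b, x + delta, y_true)
        delta = delta + step_size * np.sign(grad)  # ascend the loss
        delta = np.clip(delta, -eps, eps)          # project onto the L-inf ball
        # Project onto the valid data range as well.
        delta = np.clip(x + delta, data_params["x_min"], data_params["x_max"]) - x
    return delta

data_params = {"x_min": 0.0, "x_max": 1.0}
attack_params = {"norm": "inf", "eps": 8.0 / 255, "step_size": 2.0 / 255, "num_steps": 7}

w, b = np.array([1.0, -2.0]), 0.0
x = np.array([[0.5, 0.5], [0.2, 0.8]])
y_true = np.array([1.0, 0.0])

perturbs = pgd_linf(w, b, x, y_true, data_params, attack_params)
```

With seven steps of size 2/255, the accumulated step length exceeds the 8/255 budget, so the projection is what keeps the final perturbation inside the eps-ball.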

Analysis tools come in handy when you need to evaluate a model against adversarial examples. The whitebox and blackbox test functions live in the analysis submodule and can be used as follows.

from deepillusion.torchattacks import PGD, FGSM, RFGSM, BIM, PGD_EOT
from deepillusion.torchattacks.analysis import whitebox_test, substitute_test

##### PGD ######
data_params = dict(x_min=0.,
                   x_max=1.)

attack_params = dict(norm="inf",
                     eps=0.3,
                     alpha=0.4,
                     step_size=0.01,
                     num_steps=100,
                     random_start=False,
                     num_restarts=1,
                     EOT_size=20)

attack_args = dict(data_params=data_params,
                   attack_params=attack_params,
                   loss_function="cross_entropy",
                   verbose=False)

adversarial_args = dict(attack=PGD,
                        attack_args=attack_args)

whitebox_test_args = dict(model=model,
                          test_loader=test_loader,
                          adversarial_args=adversarial_args,
                          verbose=True,
                          progress_bar=True)

attack_loss, attack_acc = whitebox_test(**whitebox_test_args)

substitute_test_args = dict(model=model,
                            substitute_model=another_model,
                            test_loader=test_loader,
                            adversarial_args=adversarial_args,
                            verbose=True,
                            progress_bar=True)

attack_loss, attack_acc = substitute_test(**substitute_test_args)
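The difference between the two evaluations can be sketched on toy linear classifiers: a whitebox test crafts perturbations with the gradients of the model being evaluated, while a substitute (blackbox transfer) test crafts them on a different model and then applies them to the evaluated one. All names below (`predict`, `fgsm`, `accuracy_under_attack`) and the data are hypothetical; the sketch does not reproduce the behavior or return values of `whitebox_test` and `substitute_test`.

```python
import numpy as np

def predict(w, b, x):
    """Hard labels of a linear classifier: 1 if x @ w + b > 0, else 0."""
    return (x @ w + b > 0).astype(float)

def fgsm(w, b, x, y, eps):
    """Signed-gradient step against the logistic loss of the model (w, b)."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    return eps * np.sign((p - y)[:, None] * w[None, :])

def accuracy_under_attack(target, source, x, y, eps):
    """Craft perturbations on `source`, then measure the accuracy of `target` on them."""
    x_adv = np.clip(x + fgsm(*source, x, y, eps), 0.0, 1.0)
    return float(np.mean(predict(*target, x_adv) == y))

x = np.array([[0.6, 0.4], [0.4, 0.6], [0.9, 0.1], [0.1, 0.9]])
y = np.array([1.0, 0.0, 1.0, 0.0])

model = (np.array([1.0, -1.0]), 0.0)           # model being evaluated
another_model = (np.array([1.0, 1.0]), -1.0)   # substitute with different weights

whitebox_acc = accuracy_under_attack(model, model, x, y, eps=0.15)
substitute_acc = accuracy_under_attack(model, another_model, x, y, eps=0.15)
```

In this toy setup the substitute's weights are deliberately orthogonal to the evaluated model's, so its perturbations do not transfer at all while the whitebox attack flips the two low-margin points. How well perturbations transfer between real networks varies and has to be measured.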

Last but not least, you can check if the perturbations are legal by using get_perturbation_stats:

from deepillusion.torchattacks.analysis import get_perturbation_stats

get_perturbation_stats_args = dict(clean_data=clean_data, 
                                   adversarial_data=adversarial_data, 
                                   epsilon=epsilon, 
                                   norm="inf", 
                                   verbose=True)

perturbation_properties = get_perturbation_stats(**get_perturbation_stats_args)
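As a sketch of what such a legality check might compute, the snippet below measures the per-example L-inf norm of the perturbation and compares it with the budget. The function `linf_stats` and its returned keys are hypothetical; the actual return format of `get_perturbation_stats` is not described in this README.

```python
import numpy as np

def linf_stats(clean_data, adversarial_data, epsilon, tol=1e-8):
    """Per-example L-inf norm of the perturbation and whether it fits the budget.

    Hypothetical helper for illustration; not part of deepillusion.
    """
    delta = (adversarial_data - clean_data).reshape(len(clean_data), -1)
    norms = np.abs(delta).max(axis=1)
    return {"max_norm": float(norms.max()),
            "mean_norm": float(norms.mean()),
            "all_within_budget": bool(np.all(norms <= epsilon + tol))}

clean_data = np.zeros((2, 1, 2, 2))       # two tiny single-channel "images"
adversarial_data = clean_data.copy()
adversarial_data[0, 0, 0, 0] = 0.03       # inside a budget of 8/255
adversarial_data[1, 0, 1, 1] = 0.05       # exceeds a budget of 8/255

stats = linf_stats(clean_data, adversarial_data, epsilon=8.0 / 255)
```

The small tolerance `tol` guards against floating-point rounding when a perturbation sits exactly on the budget.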

Update

Deepillusion is a growing and developing library, so we strongly recommend upgrading it regularly:

pip install deepillusion --upgrade

Current Version

0.3.2

Module Structure

In case you need to inspect the source code, this is how the module is structured:

deep-illusion
│   README.md
│
├───deepillusion
│   │   _utils.py               Utility functions
│   │
│   ├───torchattacks
│   │   │   _fgsm.py                     Fast Gradient Sign Method
│   │   │   _rfgsm.py                    Random Start + Fast Gradient Sign Method
│   │   │   _pgd.py                      Projected Gradient Descent
│   │   │   _bim.py                      Basic Iterative Method
│   │   │   _soft_attacks.py             Soft attack functions
│   │   │
│   │   ├───amp
│   │   │   │   _fgsm.py                     Mixed Precision (Faster) - Fast Gradient Sign Method
│   │   │   │   _rfgsm.py                    MP - Random Start + Fast Gradient Sign Method
│   │   │   │   _cw.py                       MP - Carlini Wagner Linf
│   │   │   │   _pgd.py                      MP - Projected Gradient Descent
│   │   │   │   _soft_attacks.py             MP - Soft attack functions
│   │   │
│   │   └───analysis
│   │       │   _perturbation_statistics     Perturbations statistics functions
│   │       │   _evaluate                    Whitebox, blackbox evaluations codes (test functions)
│   │       │
│   │       └───plot
│   │               _loss_landscape.py       loss landscape plotter
│   │
│   ├───torchdefenses
│   │   │   _adversarial_train.py       Adversarial Training - Adversarial Testing
│   │   │   _trades_train.py            Trades Training - Trades Loss
│   │   │
│   │   └───amp
│   │           _adversarial_train.py     MP (Faster) - Adversarial Training - Adversarial Testing
│   │
│   ├───tfattacks
│   │
│   └───jaxattacks
│
└───tests
        test_....py                         Test functions

Sources
