TransferLearning Cifar10 #42

Open
maryamag85 opened this issue Apr 20, 2021 · 3 comments

Comments

@maryamag85

Thanks for your nice work!
I am having trouble following the repo and running the transfer learning code.
I downloaded the pretrained model for CIFAR-10 from the link and placed it in the appropriate location. Then I ran the following command from the repo:
CUDA_VISIBLE_DEVICES=0,1 transfer_learning.py --lr 0.05 --b 64 --num-classes 10 --img-size 224 --transfer-learning True --transfer-model /path/to/pretrained/T2T-ViT-19

(model path adjusted)
I get an error indicating that the keys in the state dictionary do not match the names used in the model. I do not understand why this happens. Is there a mismatch between the code and the pretrained model architecture?

del state_dict['head' + '.weight']
KeyError: 'head.weight'
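
The `KeyError` above means the loaded dictionary has no top-level `head.weight` entry. A minimal sketch of how one might look for the weights and drop the classifier head defensively follows. It uses plain dictionaries in place of a real checkpoint. The wrapper keys (`state_dict_ema`, `state_dict`, `model`), the `module.` prefix handling, and the helper names `unwrap_state_dict` and `strip_head` are assumptions for illustration, not something confirmed in this thread or taken from the repo.

```python
from typing import Any, Dict

# Assumed wrapper keys that a training script might nest the weights under.
WRAPPER_KEYS = ("state_dict_ema", "state_dict", "model")


def unwrap_state_dict(checkpoint: Dict[str, Any]) -> Dict[str, Any]:
    """Return the nested weight dictionary if one is found, else the input itself."""
    for key in WRAPPER_KEYS:
        inner = checkpoint.get(key)
        if isinstance(inner, dict):
            return inner
    return checkpoint


def strip_head(checkpoint: Dict[str, Any], head_prefix: str = "head") -> Dict[str, Any]:
    """Copy the weights, dropping classifier-head entries without raising KeyError."""
    cleaned: Dict[str, Any] = {}
    for name, value in unwrap_state_dict(checkpoint).items():
        # Checkpoints saved from a wrapped model may prefix every key with "module.".
        if name.startswith("module."):
            name = name[len("module."):]
        # Skip "head.weight", "head.bias", and so on; keep everything else.
        if name == head_prefix or name.startswith(head_prefix + "."):
            continue
        cleaned[name] = value
    return cleaned


if __name__ == "__main__":
    # Hypothetical checkpoint layout, used only to exercise the helper.
    fake = {"epoch": 3, "state_dict": {"module.blocks.0.weight": 1, "module.head.weight": 2}}
    print(sorted(strip_head(fake)))
```

Printing `sorted(checkpoint.keys())` before deleting anything is usually the quickest way to see whether the weights are nested or named differently from what the script expects.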

@yuanli2333
Collaborator

Hi,
Which pretrained model did you use for CIFAR-10? I guess you used the wrong pretrained model.
If you want to do transfer learning with our T2T-ViT on CIFAR, you should use the T2T-ViT pretrained on ImageNet, so you should use the model here.

@maryamag85
Author

Thanks for the clarification. I am still a bit unclear about the naming convention in the repo. When a model's name starts with "cifar", I assumed it was trained or fine-tuned on CIFAR.
I would expect a pretrained model, such as the ImageNet one, to have a single unique name regardless of the downstream task.
Also, main.py and transfer_learning.py contain many mismatched variable names, especially between the argument definitions and the main body. For example, an argument like transfer_model is defined with an underscore and then used in the body as transfer-learning with a hyphen, which makes the code inconsistent. (There are several examples like this in the code.)
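
For reference when checking such mismatches: Python's `argparse` derives the attribute name from a long flag by replacing hyphens with underscores, so a flag declared as `--transfer-model` is read as `args.transfer_model`. A minimal sketch, assuming a script that declares flags similar to the ones in the command above (the parser below is illustrative and is not the repo's actual argument list):

```python
import argparse

# Illustrative parser; the flag names mirror the command in this thread,
# but the defaults and types here are assumptions.
parser = argparse.ArgumentParser(description="hyphen vs. underscore demo")
parser.add_argument("--transfer-model", type=str, default=None)
parser.add_argument("--num-classes", type=int, default=1000)
parser.add_argument("--img-size", type=int, default=224)

# Parse an explicit list so the example does not depend on sys.argv.
args = parser.parse_args(
    ["--transfer-model", "/path/to/pretrained/T2T-ViT-19", "--num-classes", "10"]
)

# Hyphens on the command line, underscores on the namespace.
print(args.transfer_model)
print(args.num_classes)
print(hasattr(args, "transfer-model"))
```

So a hyphen in the flag and an underscore in the body is expected behavior. A real bug would be the body reading an attribute whose flag was never declared.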
main.py, which you mentioned can be used for validation, always initializes the train loader and so on, even though validation mode should not need to go through any of the training setup, in my opinion. I hope my points are clear.
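
One way the validation-only path could avoid building the training loader is sketched below. This is not the repo's code: the `--eval-only` flag, the `run` function, and the loader factories passed into it are all hypothetical names chosen for illustration.

```python
import argparse
from typing import Any, Callable, Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="train or validate")
    # Hypothetical flag: when set, skip everything related to training.
    parser.add_argument("--eval-only", action="store_true")
    return parser


def run(
    argv: Optional[Sequence[str]],
    make_train_loader: Callable[[], Any],
    make_eval_loader: Callable[[], Any],
    train_fn: Callable[[Any], None],
    validate_fn: Callable[[Any], Any],
) -> Any:
    """Build only the loaders the selected mode needs, then run that mode."""
    args = build_parser().parse_args(argv)
    eval_loader = make_eval_loader()

    if args.eval_only:
        # Validation mode: the training loader is never constructed.
        return validate_fn(eval_loader)

    # Training mode: construct the training loader lazily, right before use.
    train_loader = make_train_loader()
    train_fn(train_loader)
    return validate_fn(eval_loader)
```

Passing the loader factories as callables keeps the expensive dataset setup out of the validation path, and it also makes the control flow easy to test with stand-in functions.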
Finally, after fixing those issues, I was able to fine-tune the pretrained model on my own dataset, but the result at validation time is not good. Would you mind checking the code thoroughly and giving me a hint about where the issue might be?
I use 81.5_T2T_ViT_14.pth.tar as the pretrained model, set up a dataloader for my data, and fine-tune the model on it. Then, using the main.py sample code, I load the final checkpoint from transfer learning and test it on my own dataset, which gives me around 50% accuracy. I strictly followed the sample code you provided, with the minor fixes I mentioned.

Thank you for your time and effort. I hope this wonderful repo matures soon so that people can use this model more easily.

@yuanli2333
Collaborator

yuanli2333 commented May 2, 2021

Hi,

Thanks for your suggestions!
I will make the model names more consistent soon.
