BatchNorm doesn't work as expected #2650
Good explanation. Here is a related description from PyTorch: Is
The result is not wrong 🙂

The difference lies in the training vs. inference computation for a batchnorm module. If you use `m.eval()` instead for the PyTorch module, you should get equivalent results.

With Burn you have to be explicit when using autodiff. In PyTorch it's kind of the opposite: it will track the gradients and keep the autodiff graph by default unless you use the `with torch.no_grad()` context.

This is also explained in the autodiff section.
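To make the training vs. inference difference concrete, here is a minimal pure-Python sketch of the two computations for a single feature. It is not the Burn or PyTorch implementation. It assumes a freshly initialized module (running mean 0, running variance 1), the PyTorch default `eps` of 1e-5, and an affine scale of 1 and shift of 0, so the affine step is left out. The function names are made up for illustration.

```python
import math

EPS = 1e-5  # assumed: the default eps of torch.nn.BatchNorm1d


def batchnorm_train(xs, eps=EPS):
    """Training mode: normalize with the statistics of the current batch."""
    mean = sum(xs) / len(xs)
    # Biased (population) variance is what normalizes the batch in training mode.
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return [(x - mean) / math.sqrt(var + eps) for x in xs]


def batchnorm_eval(xs, running_mean=0.0, running_var=1.0, eps=EPS):
    """Inference mode: normalize with the stored running statistics."""
    return [(x - running_mean) / math.sqrt(running_var + eps) for x in xs]


batch = [1.0, 2.0, 3.0, 4.0]
train_out = batchnorm_train(batch)
eval_out = batchnorm_eval(batch)

print("train:", train_out)  # centered on zero, scaled by the batch std
print("eval: ", eval_out)   # almost unchanged with fresh running stats
```

With fresh running statistics the inference path leaves the input almost untouched, while the training path re-centers and re-scales it. That is the kind of mismatch you see when one framework runs the module in training mode and the other in inference mode.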