GPU memory usage for training #24
Comments
MegActor requires just 32GB of VRAM for training. In fact, our experimental setup during training consisted of 8 V100 GPUs. If you encounter a GPU out-of-memory error, there could be several causes, such as other processes occupying the GPU. Alternatively, you can turn off the motion layer in the 2D training stage and then turn it on in the 3D training stage. (The open-source version differs slightly from our paper, because we found that training 2D and 3D at the same time also works. You can train MegActor whichever way you prefer.)
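One way to rule out other processes occupying the GPU is to list the compute processes that `nvidia-smi` reports before starting training. The following is a minimal sketch and is not part of the MegActor repository; the helper names `parse_compute_apps` and `list_gpu_processes` are hypothetical, and it assumes `nvidia-smi` is on the `PATH`.

```python
import subprocess


def parse_compute_apps(csv_text):
    """Parse rows of 'pid, process_name, used_memory' (no header, no units)."""
    procs = []
    for line in csv_text.strip().splitlines():
        parts = line.split(",")
        if len(parts) < 3:
            continue  # skip blank or malformed rows
        pid, mem = parts[0].strip(), parts[-1].strip()
        # Rejoin the middle fields in case the process name contains a comma.
        name = ",".join(parts[1:-1]).strip()
        try:
            procs.append({"pid": int(pid), "name": name, "used_mib": int(mem)})
        except ValueError:
            continue  # skip rows with non-numeric fields such as "[N/A]"
    return procs


def list_gpu_processes():
    """Ask nvidia-smi for the compute processes currently holding GPU memory."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-compute-apps=pid,process_name,used_memory",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_compute_apps(result.stdout)


if __name__ == "__main__":
    try:
        for proc in list_gpu_processes():
            print(f"pid={proc['pid']} name={proc['name']} used={proc['used_mib']} MiB")
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print(f"Could not query nvidia-smi: {exc}")
```

If this prints any process other than your own training job, that process is holding part of the 32GB and may explain the out-of-memory error.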
Thanks for your quick reply! I'm not very familiar with the DeepSpeed settings. Should I uncomment these lines? It seems the training doesn't use them. (megactor/configs/accelerate_deepspeed.yaml, lines 23 to 35 in 16e7cdf)
Hello, have you succeeded in replicating the results? When I was processing the dataset, I ran into a problem in the fourth part: the generated swapped.mp4 videos all have a size of 0. Could you share the videos you generated?
I have a question about GPU memory usage for model training. I'm using a V100 32GB GPU, but I get "CUDA out of memory" errors when training the first stage with the default settings. This happens even when I set gradient_accumulation_steps to 1. I would like to know how much VRAM is really needed for model training. I'm not sure whether something is wrong with my setup, because your paper mentions that you also used V100 GPUs for training.
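As a rough aid for this kind of question, the memory held by weights, gradients, and optimizer state can be estimated from the parameter count alone. The sketch below assumes plain fp32 training with Adam (4 bytes for each weight, 4 for its gradient, 8 for the two Adam moments) and ignores activations, which depend on batch size and resolution. The parameter count used is purely hypothetical and is not MegActor's real size. Note that gradient_accumulation_steps does not change this figure or the peak activation memory; the per-device batch size does.

```python
def estimate_training_state_gib(num_params, bytes_weights=4, bytes_grads=4,
                                bytes_optimizer=8):
    """Estimate GiB used by weights, gradients, and optimizer state.

    Defaults assume fp32 weights and gradients plus two fp32 Adam moments.
    Activation memory is NOT included.
    """
    bytes_per_param = bytes_weights + bytes_grads + bytes_optimizer
    return num_params * bytes_per_param / 2**30


if __name__ == "__main__":
    # Hypothetical parameter count, for illustration only.
    hypothetical_params = 1_000_000_000
    gib = estimate_training_state_gib(hypothetical_params)
    print(f"about {gib:.1f} GiB before activations")
```

Whatever remains of the 32GB after this fixed cost is what activations have to fit into, which is why reducing the per-device batch size or frozen/disabled layers tends to matter more than the accumulation setting.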