CustomRandomSampler not working in huggingface Trainer and Accelerator #31

Open
YanshekWoo opened this issue Apr 25, 2024 · 1 comment

Comments

@YanshekWoo

Issue

When I test the custom sampler, it seems that the Hugging Face Trainer and Accelerator replace the sampler with a new object.
Please refer to the code: the get_train_dataloader function in Trainer and the prepare_data_loader function in accelerate.

When I print the dataloader's sampler before and after calling self.accelerator.prepare(), I get the following output:

<finetune.data.InTaskRandomSampler object at 0x7ff1a4b7c310>
<torch.utils.data.sampler.SequentialSampler object at 0x7ff1a4131c00>
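A check like this can be scripted. The sketch below uses a hypothetical helper, `sampler_class_name`, and assumes only that the dataloader exposes a `sampler` attribute, as torch.utils.data.DataLoader does. It is kept dependency-free, so a stand-in object plays the role of the dataloader.

```python
from types import SimpleNamespace


def sampler_class_name(dataloader) -> str:
    """Return the fully qualified class name of dataloader.sampler.

    Hypothetical helper for debugging: call it before and after
    accelerator.prepare() and compare the two strings.
    """
    sampler = getattr(dataloader, "sampler", None)
    cls = type(sampler)
    return f"{cls.__module__}.{cls.__qualname__}"


# Stand-in for a real DataLoader; only the `sampler` attribute matters here.
fake_loader = SimpleNamespace(sampler=[0, 1, 2])
print(sampler_class_name(fake_loader))
```

If the two names differ, as in the output above, the custom sampler was dropped during preparation.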

The same issue is reported at https://discuss.huggingface.co/t/accelerator-prepare-replaces-custom-dataloader-sampler/43392.

Solution

A possible solution is to write a custom version of torch.utils.data.distributed.DistributedSampler and avoid calling self.accelerator.prepare on the dataloader in the Trainer. This also requires overriding the get_train_dataloader function in the Trainer.
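As a rough illustration of the sampler half of this idea, here is a minimal, dependency-free sketch of a rank-aware sampler. The names `ShardedOrderSampler` and `seeded_shuffle` are hypothetical, and the strided split is modeled on how torch.utils.data.distributed.DistributedSampler divides indices across ranks. It assumes the custom ordering can be expressed as a function of the epoch that returns the same order on every rank. A real version would probably subclass torch.utils.data.Sampler.

```python
import math
import random
from typing import Callable, Iterator, List


class ShardedOrderSampler:
    """Yield this rank's share of a custom index order (hypothetical sketch)."""

    def __init__(self, make_order: Callable[[int], List[int]],
                 num_replicas: int, rank: int) -> None:
        if not 0 <= rank < num_replicas:
            raise ValueError("rank must be in [0, num_replicas)")
        self.make_order = make_order      # epoch -> full index order
        self.num_replicas = num_replicas  # world size
        self.rank = rank
        self.epoch = 0

    def set_epoch(self, epoch: int) -> None:
        # Call once per epoch so every rank reshuffles consistently.
        self.epoch = epoch

    def _padded_order(self) -> List[int]:
        order = list(self.make_order(self.epoch))
        total = math.ceil(len(order) / self.num_replicas) * self.num_replicas
        pad = total - len(order)
        if pad and order:
            # Repeat leading indices so every rank gets the same count.
            repeats = math.ceil(pad / len(order))
            order += (order * repeats)[:pad]
        return order

    def __iter__(self) -> Iterator[int]:
        # Strided split: rank r takes positions r, r + world_size, ...
        return iter(self._padded_order()[self.rank::self.num_replicas])

    def __len__(self) -> int:
        return len(self._padded_order()) // self.num_replicas


def seeded_shuffle(n: int, seed: int = 0) -> Callable[[int], List[int]]:
    """Placeholder for the custom ordering; deterministic per epoch."""
    def make_order(epoch: int) -> List[int]:
        order = list(range(n))
        random.Random(seed + epoch).shuffle(order)
        return order
    return make_order


shards = [list(ShardedOrderSampler(seeded_shuffle(10), 3, r)) for r in range(3)]
print(shards)  # three lists of four indices each, together covering 0..9
```

In an overridden get_train_dataloader, an object like this could be passed as the `sampler` argument of a DataLoader that is returned without going through self.accelerator.prepare, with rank and world size taken from torch.distributed.get_rank() and torch.distributed.get_world_size(). Skipping prepare also skips the device placement it normally does, so that would need to be handled separately.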

@Muennighoff
Collaborator

Sure, feel free to open a PR.
