TypeError: TextEncodeInput must be Union[TextInputSequence, Tuple[InputSequence, InputSequence]] #48
Comments
I just checked, and the command under GRIT here https://github.com/ContextualAI/gritlm?tab=readme-ov-file#run works fine for me.
Thanks for the quick response! Here is my config:
I printed out my toy data before it entered the tokenizer:
[default6]:['He He Me It I You You You You You', 'Me I He You Me He It Me It She']
There is an extra None in the batch of generative data, which is probably the cause of the error. Why does this happen? Is it related to the following warning?
[default4]:/home/code/.python_libs/conda_env/myenv/lib/python3.9/site-packages/accelerate/accelerator.py:447: FutureWarning: Passing the following arguments to
It seems like you're using your own custom data? Maybe there are None values in your data.
I checked my toy data and it does not contain any None values. I also tried the toy data you provided; it reported the same error, and the batch contained None:
[default1]:['What is the difference between a raspberry pi and an esp32? What is better suited for interfacing with a SD card? The Raspberry Pi is a single-board computer that runs a full-fledged operating system, while the ESP32 is a microcontroller that is typically used for IoT applications. The Raspberry Pi is better suited for interfacing with an SD card as it has a full-fledged operating system and a large number of libraries available for interfacing with various peripherals, including SD cards. The ESP32, on the other hand, has limited memory and processing power, and may require more effort to interface with an SD card.', None]
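For anyone trying to narrow this down, here is a minimal sketch of a check that could run on the batch right before it reaches the tokenizer. It assumes the batch is a plain list of texts, as in the printout above. `find_invalid_texts` and `check_batch` are hypothetical helper names for illustration, not part of gritlm or of any tokenizer library.

```python
from typing import List, Optional, Sequence


def find_invalid_texts(batch: Sequence[Optional[str]]) -> List[int]:
    """Return the positions of entries that are not plain strings (e.g. None)."""
    return [i for i, text in enumerate(batch) if not isinstance(text, str)]


def check_batch(batch: Sequence[Optional[str]], name: str = "batch") -> None:
    """Raise a descriptive error before the tokenizer sees a bad entry.

    Hypothetical helper: call it on the list of texts just before tokenizing.
    """
    bad = find_invalid_texts(batch)
    if bad:
        raise ValueError(f"{name} has non-string entries at positions {bad}")


# Shortened stand-in for the generative batch printed above.
generative_batch = ["What is the difference between a raspberry pi and an esp32?", None]
print(find_invalid_texts(generative_batch))  # [1]
```

Failing early like this reports which position holds the None, which may help trace it back to the collator or dataset step that produced it.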
@Muennighoff
Hi, I roughly looked at the code in gritlm/gritlm/training
Thank you for your contribution. I encountered the following error when training with toy data:
TypeError: TextEncodeInput must be Union[TextInputSequence, Tuple[InputSequence, InputSequence]]
I read online that the following reasons may be the cause:
However, I tried the solutions corresponding to the 4 reasons above, and the error is still reported. What could be causing it? Thank you very much!