Round-trip encoding of tokens [!] failed, Warning: lexer error: too many states: 10406 >= 10000; stopping #1042
Comments
Hi @Crista23, sorry you're dealing with this! Which version of the package are you using? Are you on our release candidate or installing from source? Even if a tokenizer isn't explicitly specified, we do need one for guidance to work properly. For transformers-based models, we try to load it automatically from the model config. However, sometimes this can act up, especially if new tokens were added to a model's vocabulary via fine-tuning (and not updated in the config...). Are you using a public/OSS model? Do you mind sharing the link to it so that we can try to debug it on our side?
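One quick way to check the "tokens added but config not updated" case is to compare the tokenizer's length against the vocabulary size recorded in the model config. The sketch below is only an illustration, not something guidance does itself; the helper name `tokens_missing_from_config` is made up here, and it assumes a Hugging Face style setup where `len(tokenizer)` counts added tokens and `model.config.vocab_size` is what the config reports.

```python
def tokens_missing_from_config(tokenizer_len: int, config_vocab_size: int) -> int:
    """Return how many tokenizer ids the model config does not account for.

    Hypothetical helper for illustration. A result of 0 means the config
    covers every id the tokenizer can produce (configs are often padded, so
    a config larger than the tokenizer is not treated as a problem here).
    """
    return max(0, tokenizer_len - config_vocab_size)


# With real objects this would be called roughly as:
#   tokens_missing_from_config(len(tokenizer), model.config.vocab_size)
# Example with made-up numbers: two tokens were added during fine-tuning
# but the config still reports the original vocabulary size.
missing = tokens_missing_from_config(128258, 128256)
if missing:
    print(f"{missing} tokenizer ids are not covered by the model config")
```

A non-zero result would be worth mentioning when reporting the issue, since it points at the fine-tuning scenario described above.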
Hi @Harsha-Nori, thanks a lot for your answer! I installed guidance with pip using --pre, and the installed version is 0.2.0rc1. I am using it in combination with publicly available models such as Llama-8B-Instruct, instantiated in the code using
It worked for a couple of examples until it crashed with this error: "Round-trip encoding of tokens [!] failed, Warning: lexer error: too many states: 10406 >= 10000; stopping". It looks like a tokenizer issue, and even though I tried replacing "!" with the empty string in the input, it still fails. I would appreciate your thoughts on how to fix this, thank you!
@Harsha-Nori Any thoughts? Sorry to ask again, it's a pressing issue.
Hi @Crista23, I can't seem to replicate this with a llama-8B model :(. Could you share some more details about your code, including the exact huggingface model and/or details of the
The error message can happen if the grammar you're constraining against is particularly complex, but I can't seem to replicate it on my side :(. Happy to also collaborate via email if you can't share publicly.
@Crista23 if you can't share details of your prompt, would you be able to share the full traceback? Thanks!
@Harsha-Nori @hudson-ai I get a similar warning when initializing the Llama 8B Instruct model with guidance 0.1.16 and transformers 4.45.2:
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, AutoTokenizer
import guidance.models
import torch
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)
llama3 = guidance.models.Transformers(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
The warning is the following
Can it be because the tokenizer encodes
if len(encoded_str) != 1:
    raise ValueError(f"Round-trip encoding of tokens [{token}] failed! Got {encoded_str}")
@jtbuter thanks for the repro -- I am able to reproduce the warning with transformers 4.45.2 (interestingly, not with my previously installed 4.44.0 version). We have a few methods for converting tokens into the form we need in order to support constrained decoding, and the warning here is just saying that our preferred approach is failing and falling back to an alternative approach. Will definitely look into what's going on under the hood here -- thank you for the suggestion on where to look. I think you have the right idea. Are you experiencing any downstream problems after seeing this warning? This being said, the
@Crista23 are you able to share any details about the constraints you are using? I would love to see us (1) improve robustness and (2) provide more helpful exceptions and warnings. A concrete example of what's causing this would really help to that end.
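For readers following along, the round-trip check quoted above (decode one token id, re-encode the text, expect exactly one id back) can be illustrated with a toy tokenizer. This is a minimal sketch, not guidance's or transformers' code: `ToyTokenizer` and `failed_round_trips` are hypothetical names, and the idea that an automatically prepended BOS id causes the failure is only an assumption used to make the check fail in the toy example.

```python
from typing import List


class ToyTokenizer:
    """Hypothetical stand-in for a real tokenizer (not a library API)."""

    def __init__(self, vocab: List[str], add_bos: bool):
        self.vocab = vocab
        self.add_bos = add_bos
        self.bos_id = len(vocab)  # extra id reserved for a BOS marker

    def decode(self, ids: List[int]) -> str:
        # The BOS marker carries no text, so it is skipped when decoding.
        return "".join(self.vocab[i] for i in ids if i != self.bos_id)

    def encode(self, text: str) -> List[int]:
        # Assumption for illustration: encode() may prepend a BOS id.
        ids = [self.bos_id] if self.add_bos else []
        pos = 0
        while pos < len(text):
            # Greedy longest match against the vocabulary.
            match = max((t for t in self.vocab if text.startswith(t, pos)), key=len)
            ids.append(self.vocab.index(match))
            pos += len(match)
        return ids


def failed_round_trips(tok: ToyTokenizer) -> List[str]:
    """Return the tokens whose decode -> encode round trip is not a single id."""
    failures = []
    for token_id, token in enumerate(tok.vocab):
        encoded = tok.encode(tok.decode([token_id]))
        if len(encoded) != 1:
            failures.append(token)
    return failures


vocab = ["!", "a", "b", "ab"]
plain = ToyTokenizer(vocab, add_bos=False)
with_bos = ToyTokenizer(vocab, add_bos=True)
print(failed_round_trips(plain))     # every token survives the round trip
print(failed_round_trips(with_bos))  # every token comes back as two ids
```

In the toy case the check fails for every token as soon as `encode` returns more than one id, which is consistent with the warning being about the conversion method rather than about the "!" character in any particular prompt.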
Thank you for the reply, I was not experiencing any other problems after this warning.
Hello @hudson-ai, @Harsha-Nori, I am having the same lexer state error as @Crista23:
As I see this issue is still open, I'll ask for details here. Here is what my code looks like, using models and gen from guidance, as simple as that:
language_model = models.Transformers(
    "google/gemma-2-9b-it",
    device_map="auto",
    max_memory={0: '30GiB', 'cpu': "80GiB"}
)
result = language_model + f'''
Q: {prompt}
A: {gen(
    name="answer",
    max_tokens=200
)}
'''
This error happens on inference over big prompts (roughly more than 3600 tokens, if you want to reproduce). It looks like the grammar has a size limit, but I can't find where to change the 10k state limit. See stacktrace below. Is there a way to bypass this limitation while still using guidance (using a subgrammar or something else)?
Thanks for your help!
My code is throwing the error below:
I can see this error is thrown in the code here
https://github.com/guidance-ai/guidance/blob/main/guidance/models/transformers/_transformers.py#L233
and it looks like a tokenizer issue; however, I am calling the guidance library without specifying a tokenizer:
llm = models.Transformers(args.model_path, device_map="auto", trust_remote_code=True)
I am wondering how to fix this. Any advice appreciated, thanks!