Is your feature request related to a problem? Please describe.
I would like to use guidance with VLM models such as Qwen2-VL or Llama-3.2-Vision.
However, these are difficult to use because they rely on the recent Transformers classes MllamaForConditionalGeneration and Qwen2VLForConditionalGeneration, which cause an error with the current version of guidance.
Would it be possible to update guidance to support these model classes more broadly?
Additional context
Or is it possible that I am simply using it incorrectly? If anyone has solved this, please share your solution. A minimal sketch of what I am attempting is shown below.
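For illustration only: the checkpoint name and exact call pattern below are assumptions on my part, but this is roughly the kind of usage that fails for me with current guidance.

```python
from guidance import models, gen

# Illustrative checkpoint; any Qwen2-VL or Llama-3.2-Vision model hits the same
# problem, since they are loaded through Qwen2VLForConditionalGeneration /
# MllamaForConditionalGeneration rather than a plain causal-LM class.
lm = models.Transformers("Qwen/Qwen2-VL-7B-Instruct")  # error occurs here

# The goal is then to constrain generation as usual, e.g.:
lm += "Describe the image in one word: " + gen(name="word", max_tokens=5)
print(lm["word"])
```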