Probably going to introduce a breaking change in 0.6 or 0.7 to how models are selected.
Now that OpenAI is releasing multiple versions of its models (gpt-3.5-turbo currently has four versions across the token-limit variants and the dated iterations), I think model configuration/fallback needs to change a bit.
If 3.5 parsed the results but didn't do well, the next fallback should be 4; if the token limit was exceeded, however, it should go to 3.5-16k. There are a lot of possible conditions here, and people who want 100% control can still explicitly pass a single model, but the way the fallback chain is traversed can improve to reduce redundant requests.
Something like:

```python
models = [GPT35T(allow_16k=True), GPT4(allow_32k=False)]
```
This would try gpt-3.5-turbo only once, at either 4k or 16k depending on the input size.
Then it would try gpt-4.
Selecting particular revisions could work this way as well.
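For illustration, here is one way that traversal could be sketched. Everything below is hypothetical: `ModelFamily`, `variant_for`, and the injected `call` hook are invented names for the sketch, not the library's actual API.

```python
# Rough sketch only: ModelFamily, variant_for, and the `call` hook are
# invented names for illustration, not the library's real API.
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ModelFamily:
    name: str                 # base model name, e.g. "gpt-3.5-turbo"
    context_window: int       # base token limit, e.g. 4096
    extended_name: str = ""   # larger variant, e.g. "gpt-3.5-turbo-16k"
    extended_window: int = 0  # its token limit; 0 means the variant is disallowed

    def variant_for(self, prompt_tokens: int) -> Optional[str]:
        """Pick the single variant worth trying for this input, if any."""
        if prompt_tokens <= self.context_window:
            return self.name
        if self.extended_name and prompt_tokens <= self.extended_window:
            return self.extended_name
        return None  # input fits no variant of this family


def run_chain(
    models: list[ModelFamily],
    prompt_tokens: int,
    call: Callable[[str], Optional[dict]],  # returns parsed output, None on failure
) -> dict:
    """Try each family at most once, escalating only when a call fails."""
    for family in models:
        variant = family.variant_for(prompt_tokens)
        if variant is None:
            continue  # input can't fit: skip rather than burn a redundant request
        result = call(variant)
        if result is not None:
            return result
    raise RuntimeError("all models in the fallback chain failed")


# Mirrors models=[GPT35T(allow_16k=True), GPT4(allow_32k=False)]:
models = [
    ModelFamily("gpt-3.5-turbo", 4096, "gpt-3.5-turbo-16k", 16384),
    ModelFamily("gpt-4", 8192),
]
```

The point of this shape is that the variant is chosen up front from the token count, so each family costs at most one request: a 4k overflow goes straight to 16k (or straight past the family entirely) instead of failing once and retrying.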
Probably makes the most sense to do this as part of #18