Latency and architecture idea from expert: while reviewing the Context service with T, I raised my concern that we are overloading this single small-fast model call, because it isn't fast at all. He suggested breaking it up into several parallel calls instead and moving ahead once all of them complete. This costs no more (the input and output token counts are the same), but it is faster, and it is architecturally cleaner to keep the functions separate, e.g. department classification separate from topic classification.
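A minimal sketch of the fan-out idea, assuming Python's `asyncio`. The function names (`classify_department`, `classify_topic`, `build_context`) and the returned labels are hypothetical stand-ins; the real service would invoke the small-fast model inside each coroutine. The point is that independent classifications run concurrently and the service moves ahead only when all are complete, so total latency is roughly the max of the calls rather than their sum:

```python
import asyncio


# Hypothetical stand-ins for the separate small-fast model calls;
# each simulates model latency with a short sleep.
async def classify_department(text: str) -> str:
    await asyncio.sleep(0.1)  # simulated model call
    return "sales"


async def classify_topic(text: str) -> str:
    await asyncio.sleep(0.1)  # simulated model call
    return "pricing"


async def build_context(text: str) -> dict:
    # Fan out the independent classifications in parallel and
    # proceed once all of them have completed.
    department, topic = await asyncio.gather(
        classify_department(text),
        classify_topic(text),
    )
    return {"department": department, "topic": topic}


if __name__ == "__main__":
    print(asyncio.run(build_context("What does the enterprise plan cost?")))
```

With two 100 ms calls, the gathered version finishes in about 100 ms instead of 200 ms, and each classifier stays a small, single-purpose function that can be tested and evolved on its own.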