Latency and architecture idea from expert: while reviewing the Context service with T, I raised my concern that we are overloading this single small-fast model call, because it isn't fast at all. He suggested breaking it up into several parallel calls instead and moving ahead once all of them complete. This costs no more (the input and output token counts are the same), but it is faster, and it is architecturally cleaner to keep the functions separate, e.g. department classification separate from topic classification.
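A minimal sketch of the fan-out idea, assuming Python's `asyncio`. The function names (`classify_department`, `classify_topic`, `build_context`) and the returned labels are hypothetical stand-ins; the real service would invoke the small-fast model inside each coroutine. The point is that independent classifications run concurrently and the service moves ahead only when all are complete, so total latency is roughly the max of the calls rather than their sum:

```python
import asyncio


# Hypothetical stand-ins for the separate small-fast model calls;
# each simulates model latency with a short sleep.
async def classify_department(text: str) -> str:
    await asyncio.sleep(0.1)  # simulated model call
    return "sales"


async def classify_topic(text: str) -> str:
    await asyncio.sleep(0.1)  # simulated model call
    return "pricing"


async def build_context(text: str) -> dict:
    # Fan out the independent classifications in parallel and
    # proceed once all of them have completed.
    department, topic = await asyncio.gather(
        classify_department(text),
        classify_topic(text),
    )
    return {"department": department, "topic": topic}


if __name__ == "__main__":
    print(asyncio.run(build_context("What does the enterprise plan cost?")))
```

With two 100 ms calls, the gathered version finishes in about 100 ms instead of 200 ms, and each classifier stays a small, single-purpose function that can be tested and evolved on its own.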