New UI for Observe #22
AgentOps.ai (Crew AI Observability)
crewAI - agentops-observability => AgentOps.ai => (repo), default UI data, Node SDK. Set via
Cons:
Pros:
Langtrace (Agent Monitoring with Langtrace)
=> Langtrace (repo), Evaluation docs page, but the UI is very naive.
Cons:
Pros:
OpenLIT
They have an auto-evaluation task, but it is not implemented yet.
Portkey
Some base guardrails are implemented in the UI (see the docs), but for custom ones they only provide a webhook solution. This is the only library that is configured in the LLM provider class.
------------------------ bonus -------------------------------------
Langfuse
Cons:
Pros:
I created an evaluation-observe integration proposal as a separate issue:
Observe UI
Current state
We currently use mlflow as the UI tool for traces. This approach has several limitations, which I describe below.
The motivation is to adopt software that eliminates these limitations.
mlflow limitations
Several things limit us right now:
UI solutions
bee-ui included ⛔
This solution creates a dependency between the framework and bee-ui, and it is needlessly complicated.
New UI app ✅
This is the better solution from my point of view. We can create a very simple app that becomes part of the bee-stack and bee-agent-framework-starter.
This application will have only one dependency: the bee-observer (API server).
Features
Trace list
A paginated table of traces with basic information about each one.
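As a rough illustration of what the trace list needs from the API, here is a minimal TypeScript sketch of the paginated fetch. The query parameters and the `TraceSummary` fields are assumptions for illustration; only the `/v1/traces` path itself appears elsewhere in this proposal.

```ts
// Sketch only: paging parameters and field names are assumptions,
// not the final Observe API contract.
interface TraceSummary {
  id: string;
  startTime: string; // ISO-8601 timestamp
  durationMs: number;
  status: "success" | "error";
}

interface TraceListPage {
  items: TraceSummary[];
  total: number;
}

async function fetchTraces(
  baseUrl: string,
  offset = 0,
  limit = 20,
): Promise<TraceListPage> {
  const res = await fetch(`${baseUrl}/v1/traces?offset=${offset}&limit=${limit}`);
  if (!res.ok) throw new Error(`Observe API returned ${res.status}`);
  return (await res.json()) as TraceListPage;
}
```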
Trace detail
The trace detail page with the dependency tree and selected data that are important for quick debugging.
The selected data for the trace execution:
The selected data for each iteration:
Features v2
The module_usage metric. We would also need to add the metric route to our Observe OTLP backend.
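A minimal sketch of how the framework could emit this metric through the OpenTelemetry metrics API; the meter name and the module attribute are illustrative assumptions, not settled names.

```ts
import { metrics } from "@opentelemetry/api";

// Meter and attribute names below are illustrative placeholders.
const meter = metrics.getMeter("bee-agent-framework");
const moduleUsage = meter.createCounter("module_usage", {
  description: "Number of times each framework module was used",
});

// Record one use of a module, e.g. when a tool or memory module runs.
export function recordModuleUsage(moduleName: string): void {
  moduleUsage.add(1, { module: moduleName });
}
```

The counter would travel through the same OTLP pipeline as the traces, which is why the backend needs a metrics route next to the existing trace route.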
Features v3 (Evaluation)
When Observe accepts a trace and saves it to the database, it enqueues a BullMQ evaluation job. The Python service will accept the traceId from BullMQ and call inference to get the evaluation metrics. The service then saves the evaluation metrics to the trace entity.
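A minimal sketch of the hand-off on the Observe side, assuming a BullMQ queue named `evaluation` and a local Redis connection; the queue name, payload shape, and the `enqueueEvaluation` hook are placeholders, not the actual Observe code.

```ts
import { Queue } from "bullmq";

// Queue name and Redis connection are placeholders for the real config.
const evaluationQueue = new Queue("evaluation", {
  connection: { host: "localhost", port: 6379 },
});

// Hypothetical hook called right after Observe persists a trace.
export async function enqueueEvaluation(traceId: string): Promise<void> {
  // The payload only carries the traceId; the Python worker loads the
  // trace itself and runs inference to compute the evaluation metrics.
  await evaluationQueue.add("evaluate", { traceId });
}
```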
Implementation:
- Evaluation metrics of the judge type without the expected answer (only a static list; it could be hardcoded for the first evaluation version).
- A patch /v1/traces/${traceId} route that will accept the list of evaluation metrics. TODO: specify the format (see the sketch after this list).
- An evaluation job. This job will be called automatically when the trace is created (without a bee-api dependency).
- Integration into bee-agent-framework-starter and bee-stack.
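To make the patch route more concrete, here is a Fastify-style sketch; the metric shape and the `saveMetrics` helper are hypothetical, since the format is still a TODO above.

```ts
import Fastify from "fastify";

// Placeholder metric shape; the actual format is still TODO.
interface EvaluationMetric {
  name: string; // e.g. "answer_relevance"
  value: number; // judge score
}

// Hypothetical persistence helper, stubbed for the sketch.
async function saveMetrics(
  traceId: string,
  metrics: EvaluationMetric[],
): Promise<void> {
  // write the metrics onto the stored trace entity
}

const app = Fastify();

// Accepts the evaluation metrics computed by the worker and
// attaches them to the trace entity saved earlier.
app.patch<{ Params: { traceId: string }; Body: { metrics: EvaluationMetric[] } }>(
  "/v1/traces/:traceId",
  async (request, reply) => {
    const { traceId } = request.params;
    await saveMetrics(traceId, request.body.metrics);
    return reply.code(200).send({ traceId, saved: request.body.metrics.length });
  },
);
```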
Evaluation (part 2)
Inspiration and opportunities
Based on this analysis, here is my summary of the right way forward for us.
More universal solution: we should stay connected to our framework and not build a universal observability tool. By supporting only our defined data shape, we can work with the data more efficiently and visualize it in a well-arranged way for the user.