Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge branch 'main' into steffen/exclude-prereleases-from-api
Showing 12 changed files with 8,412 additions and 8,170 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
name: "CI" | ||
|
||
on: | ||
pull_request: | ||
merge_group: | ||
|
||
jobs: | ||
build: | ||
runs-on: ubuntu-latest | ||
steps: | ||
- uses: actions/checkout@v3 | ||
with: | ||
fetch-depth: 0 | ||
|
||
- uses: pnpm/action-setup@v4 | ||
id: pnpm-install | ||
with: | ||
version: 9.5.0 | ||
run_install: false | ||
|
||
- uses: actions/setup-node@v4 | ||
with: | ||
node-version: 20 | ||
|
||
- run: pnpm install | ||
|
||
- name: Build next.js app | ||
# change this if your site requires a custom build command | ||
run: pnpm build |
168 changes: 168 additions & 0 deletions
pages/blog/2024-07-ai-agent-observability-with-langfuse.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,168 @@ | ||
--- | ||
title: "AI Agent Observability with Langfuse" | ||
date: 2024/07/31 | ||
description: Learn about agents and the importance of monitoring and tracking performance, cost, and user interactions. Explore tools like LangGraph, Llama Agents, Dify, Flowise, and Langflow, and see how Langfuse helps to trace and optimize your application. | ||
ogImage: /images/blog/ai-agent-observability/ai-agent-observability.png | ||
tag: showcase | ||
author: Jannik | ||
--- | ||
|
||
import { BlogHeader } from "@/components/blog/BlogHeader"; | ||
|
||
<BlogHeader | ||
title="AI Agent Observability with Langfuse" | ||
description="Easily monitor, trace and debug your AI agents. Explore tools like LangGraph, Llama Agents, Dify, Flowise, and Langflow, and see how Langfuse helps to monitor and optimize your application." | ||
date="July 31, 2024" | ||
authors={["jannikmaierhoefer"]} | ||
/> | ||
|
||
## What are AI Agents? | ||
|
||
An AI agent is a system that autonomously performs tasks by planning their execution and using available tools. AI agents use large language models (LLMs) to understand and respond to user inputs step by step and to decide when to call external tools. | ||
|
||
**To solve tasks, agents use:** | ||
|
||
- **planning** by devising step-by-step actions from the given task | ||
- **tools** to extend their capabilities like RAG, external APIs, or code interpretation/execution | ||
- **memory** to store and recall past interactions for additional contextual information | ||
|
||
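The three ingredients above can be sketched as a small loop. This is a minimal illustration, not how any particular framework implements agents: `run_agent`, `toy_plan`, and `toy_tools` are hypothetical names, and the planner is a hard-coded stand-in for what would normally be an LLM call.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, str]  # (tool name, tool input)


def run_agent(
    task: str,
    plan: Callable[[str, List[str]], List[Step]],
    tools: Dict[str, Callable[[str], str]],
    memory: List[str],
) -> List[str]:
    """Plan the task, execute each step with a tool, and record results in memory."""
    for tool_name, tool_input in plan(task, memory):
        result = tools[tool_name](tool_input)
        memory.append(f"{tool_name}({tool_input!r}) -> {result}")
    return memory


# Hypothetical planner: a real agent would ask an LLM to produce these steps.
def toy_plan(task: str, memory: List[str]) -> List[Step]:
    return [("search", task), ("summarize", "search results")]


# Hypothetical tools: stand-ins for RAG, external APIs, or code execution.
toy_tools = {
    "search": lambda query: f"3 documents about {query}",
    "summarize": lambda text: f"summary of {text}",
}

history = run_agent("agent observability", toy_plan, toy_tools, memory=[])
for entry in history:
    print(entry)
```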
## What are Agents Used For? | ||
|
||
**Common use cases include:** | ||
|
||
- **Customer Support:** AI agents use RAG to automate responses, autonomously take action and efficiently handle inquiries with accurate information. | ||
- **Market Research:** Agents collect and synthesize information from various sources, delivering accurate and concise summaries to users. | ||
- **Software Development:** AI agents break coding tasks into smaller sub-tasks and then recombine them to create a complete solution. | ||
|
||
## Design Patterns of AI Agents | ||
|
||
An AI agent usually consists of five parts: a language model with general-purpose capabilities that serves as the main brain or **coordinator**, and four sub-modules: a **planning module** to divide the task into smaller steps, an **action module** that enables the agent to use external tools, a **memory module** to store and recall past interactions, and a **profile module** to describe the behavior of the agent. | ||
|
||
<div className="max-w-sm my-10"> | ||
![AI Agent Design](/images/blog/ai-agent-observability/agent-design.png) | ||
</div> | ||
|
||
In **single-agent setups**, one agent is responsible for solving the entire task autonomously. In **multi-agent setups**, multiple specialized agents collaborate, each handling different aspects of the task to achieve a common goal more efficiently. These agents are also often referred to as state-based or stateful agents as they route the task through different states. | ||
|
||
<div className="my-10"> | ||
![Single and Multi Agent | ||
Designs](/images/blog/ai-agent-observability/multi-agent.png) | ||
</div> | ||
|
||
## What is AI Agent Observability? | ||
|
||
Observing agents means tracking and analyzing the performance, behavior, and interactions of AI agents. This includes real-time monitoring of multiple LLM calls, control flows, decision-making processes, and outputs to ensure agents operate efficiently and accurately. | ||
|
||
[Langfuse](/) is an open-source LLM engineering platform that provides deep insights into metrics such as latency, cost, and error rates, enabling developers to debug, optimize, and enhance their AI systems. Using Langfuse observability, teams can identify and resolve issues, streamline workflows, and maintain high-quality outputs by evaluating agent responses in complex, multi-step AI agents. | ||
|
||
### Why AI Agent Observability is Important | ||
|
||
#### Debugging and Edge Cases | ||
|
||
Agents use multiple steps to solve complex tasks, and inaccurate intermediary results can cause failures of the entire system. [Tracing](/docs/tracing) these intermediate steps and testing your application on known edge cases is essential. | ||
|
||
When deploying LLMs, some edge cases will always slip through initial testing. A proper analytics setup helps identify these cases, allowing you to add them to future test sets for more robust agent evaluations. With [Datasets](/docs/datasets/overview), Langfuse allows you to collect examples of inputs and expected outputs to benchmark new releases before deployment. Datasets can be incrementally updated with new edge cases found in production and integrated with existing CI/CD pipelines. | ||
|
||
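The idea of benchmarking against collected inputs and expected outputs can be sketched without any SDK. The snippet below does not use the Langfuse Datasets API; `evaluate`, `toy_agent`, and the sample data are hypothetical, and exact string matching is assumed as the scoring rule.

```python
from typing import Callable, List, Tuple

Dataset = List[Tuple[str, str]]  # (input, expected output)


def evaluate(agent: Callable[[str], str], dataset: Dataset) -> Tuple[float, List[str]]:
    """Return the accuracy and the inputs the agent answered incorrectly."""
    if not dataset:
        raise ValueError("dataset must not be empty")
    failures = [
        question
        for question, expected in dataset
        if agent(question).strip() != expected
    ]
    accuracy = 1 - len(failures) / len(dataset)
    return accuracy, failures


# Hypothetical examples, including one edge case the toy agent gets wrong.
dataset = [("2 + 2", "4"), ("capital of France", "Paris"), ("10 / 4", "2.5")]


# Hypothetical agent: a lookup table standing in for a real agent run.
def toy_agent(question: str) -> str:
    answers = {"2 + 2": "4", "capital of France": "Paris", "10 / 4": "2"}
    return answers.get(question, "")


accuracy, failures = evaluate(toy_agent, dataset)
print(f"accuracy={accuracy:.2f}, failures={failures}")
```

A check like this could run in a CI job and fail the build when accuracy drops below a threshold chosen by the team.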
#### Tradeoff of Accuracy and Costs | ||
|
||
LLMs are stochastic by nature: their outputs are sampled from a probability distribution, so they can produce errors or hallucinations. Calling a language model multiple times and selecting the best or most common answer can increase accuracy, which is a major advantage of agentic workflows. | ||
|
||
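Selecting the most common answer from several calls can be sketched as a simple majority vote. This is an illustration under stated assumptions: `majority_vote` and `fake_model` are hypothetical names, the model is replaced by canned responses, and answers are compared after trimming whitespace and lowercasing. Note that every extra sample is an extra model call.

```python
from collections import Counter
from typing import Callable, Tuple


def majority_vote(
    ask_model: Callable[[str], str], prompt: str, n_samples: int = 5
) -> Tuple[str, float]:
    """Call the model n_samples times; return the most common answer and its vote share."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    answers = [ask_model(prompt).strip().lower() for _ in range(n_samples)]
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / n_samples


# Hypothetical stand-in for a real LLM call: returns canned, slightly noisy answers.
_canned = iter(["Paris", "paris ", "Lyon", "Paris", "Paris"])


def fake_model(prompt: str) -> str:
    return next(_canned)


answer, agreement = majority_vote(fake_model, "What is the capital of France?")
print(answer, agreement)
```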
However, this comes with a cost. The tradeoff between accuracy and costs in LLM-based agents is crucial, as higher accuracy often leads to increased operational expenses. Often, the agent decides autonomously how many LLM calls or paid external API calls it needs to make to solve a task, potentially leading to high costs for single-task executions. Therefore, it is important to monitor model usage and costs in real time. | ||
|
||
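To make the cost side concrete, the arithmetic for a single task can be sketched as follows. The prices, token counts, and the names `LLMCall`, `task_cost`, and `over_budget` are all hypothetical; real prices depend on the model and provider.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical prices in USD per 1,000 tokens; real prices vary by model.
INPUT_PRICE_PER_1K = 0.005
OUTPUT_PRICE_PER_1K = 0.015


@dataclass
class LLMCall:
    input_tokens: int
    output_tokens: int


def call_cost(call: LLMCall) -> float:
    """Cost of one LLM call in USD."""
    return (
        call.input_tokens / 1000 * INPUT_PRICE_PER_1K
        + call.output_tokens / 1000 * OUTPUT_PRICE_PER_1K
    )


def task_cost(calls: List[LLMCall]) -> float:
    """Total cost of all LLM calls the agent made for one task."""
    return sum(call_cost(call) for call in calls)


def over_budget(calls: List[LLMCall], budget_usd: float) -> bool:
    """True if the task exceeded its per-task budget."""
    return task_cost(calls) > budget_usd


# Hypothetical task in which the agent decided to make three calls.
calls = [LLMCall(1200, 300), LLMCall(2500, 400), LLMCall(4000, 800)]
print(f"task cost: ${task_cost(calls):.4f}")
print("over budget:", over_budget(calls, budget_usd=0.05))
```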
Langfuse monitors both [costs](/docs/model-usage-and-cost) and [accuracy](/docs/scores/overview), enabling you to optimize your application for production. | ||
|
||
#### Understanding User Interactions | ||
|
||
AI agent analytics lets you capture how users interact with your LLM applications. This information is crucial for refining your AI application and tailoring responses to better meet user needs. | ||
|
||
[Langfuse Analytics](/docs/analytics/overview) derives insights from production data, helping you measure quality through user feedback and model-based [scoring](/docs/scores/overview) over time and across different versions. It also allows you to monitor cost and latency metrics in real time, broken down by user, session, geography, and model version, enabling precise optimizations for your LLM application. | ||
|
||
## Tools to Build AI Agents | ||
|
||
You do not need any specific tools to build AI agents. However, there are several open-source frameworks that can help you build complex, stateful, multi-agent applications. | ||
|
||
### Application Frameworks | ||
|
||
#### LangGraph | ||
|
||
LangGraph ([GitHub](https://github.com/langchain-ai/langgraph)) is an open-source framework by the LangChain team for building complex, stateful, multi-agent applications. LangGraph includes built-in persistence to save and resume state, which enables error recovery and human-in-the-loop workflows. | ||
|
||
LangGraph agents can be [monitored with Langfuse](/docs/integrations/langchain/example-python-langgraph) to observe and debug the steps of an agent. | ||
|
||
<div className="mt-10"> | ||
![LangGraph Example](/images/blog/ai-agent-observability/langgraph.png) | ||
</div> | ||
|
||
<span> | ||
_Example of an agent setup created with LangGraph and visualized with Mermaid, | ||
featuring two executing agents: one research agent and one agent to tell the | ||
current time._ | ||
</span> | ||
|
||
<div className="mt-10"> | ||
![Example Trace in | ||
Langfuse](/images/blog/ai-agent-observability/example-trace-in-langfuse.png) | ||
</div> | ||
|
||
<span> | ||
_Example traces in Langfuse ([view | ||
trace](https://cloud.langfuse.com/project/cloramnkj0002jz088vzn1ja4/traces/a8b0cc9e-da3b-485f-a642-35431a6f9289))_ | ||
</span> | ||
|
||
#### Llama Agents | ||
|
||
Llama Agents ([GitHub](https://github.com/run-llama/llama-agents)) is an open-source framework designed to simplify the process of building, iterating, and deploying multi-agent AI systems and turn your agents into production microservices. | ||
|
||
Langfuse offers a simple [integration](/docs/integrations/llama-index/get-started) for automatic capture of traces and metrics generated in LlamaIndex applications. | ||
|
||
<div className="mt-10"> | ||
![Llama Agent Example](/images/blog/ai-agent-observability/llama-agent.png) | ||
</div> | ||
|
||
<span>_Llama agent system layout._</span> | ||
|
||
### No-code Agent Builders | ||
|
||
For prototypes and development by non-developers, no-code builders can be a great starting point. | ||
|
||
#### Flowise | ||
|
||
Flowise ([GitHub](https://github.com/FlowiseAI/Flowise)) is a no-code builder that lets you build customized LLM flows with a drag-and-drop editor. With the native Langfuse [integration](/docs/integrations/flowise), you can use Flowise to quickly create complex LLM applications without writing code and then use Langfuse to analyze and improve them. | ||
|
||
<Frame fullWidth border className="my-10"> | ||
![Flowise Example](/images/blog/ai-agent-observability/flowise.jpg) | ||
</Frame> | ||
<span> | ||
_Example of a catalog chatbot created in Flowise to answer any questions | ||
related to shop products._ | ||
</span> | ||
|
||
#### Langflow | ||
|
||
Langflow ([GitHub](https://github.com/logspace-ai/langflow)) is a UI for LangChain, designed with react-flow to provide an effortless way to experiment and prototype flows. | ||
|
||
With the native [integration](/docs/integrations/langflow), you can use Langflow to quickly create complex LLM applications without writing code and then use Langfuse to monitor and debug them. | ||
|
||
<Frame fullWidth border className="my-10"> | ||
![Langflow Example](/images/blog/ai-agent-observability/langflow.jpg) | ||
</Frame> | ||
<span> | ||
_Example of a chat agent with chain-of-thought reasoning built in Langflow by | ||
[Cobus Greyling](https://cobusgreyling.medium.com)._ | ||
</span> | ||
|
||
#### Dify | ||
|
||
Dify ([GitHub](https://github.com/langgenius/dify)) is an open-source LLM app development platform. Using its Agent Builder and a variety of templates, you can easily build an AI agent and then grow it into a more complex system via Dify workflows. | ||
|
||
With the native Langfuse [integration](/docs/integrations/dify), you can use Dify to quickly create complex LLM applications and then use Langfuse to monitor and improve them. | ||
|
||
<Frame border fullWidth> | ||
![Dify Example](/images/blog/ai-agent-observability/dify.jpg) | ||
</Frame> | ||
|
||
<span>_Example of a Dify Agent that summarizes meetings._</span> | ||
|
||
## Get Started | ||
|
||
If you want to get started with building your AI Agents and monitoring them with Langfuse, check out our [end-to-end example](/docs/integrations/langchain/example-python-langgraph) of building a simple agent with LangGraph and tracking it with Langfuse. |