Tracing your AAQ calls with Langfuse
AAQ now integrates with Langfuse, a popular LLM observability tool with a generous free tier, so you can track all LLM calls made via AAQ.
What's in a Trace?
With Langfuse enabled, AAQ traces each query to the POST /search endpoint. Each query is represented as a Trace. If you click on the Trace ID, you can view the details of that trace. Here is an example of a /search call with generate_llm_response set to true:
On the right are the Generations associated with this trace. In AAQ, each generation corresponds to a call to the LiteLLM Proxy Server. You can see the series of input checks, the RAG steps ("get_similar_content_async" and "openai/generate-response"), and the output check we perform (you can learn more about our Guardrails here). The generation names come from the model names used in your LiteLLM Proxy Server config.
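For context, the LiteLLM Proxy Server exposes an OpenAI-compatible API, so the call behind one of those generations might look roughly like the sketch below. The proxy URL, API key, and model alias are assumptions for illustration, not AAQ defaults; the real model names live in your LiteLLM Proxy Server config.

# Rough sketch of a call to a LiteLLM Proxy Server via the OpenAI client.
from openai import OpenAI

# Hypothetical proxy address and key; replace with your own deployment's values.
client = OpenAI(base_url="http://localhost:4000", api_key="sk-litellm-placeholder")

# The model alias here refers to an entry in the LiteLLM Proxy config; as noted
# above, those model names are what appear as generation names in Langfuse.
completion = client.chat.completions.create(
    model="generate-response",  # assumed alias for illustration
    messages=[{"role": "user", "content": "How do I register for the program?"}],
)
print(completion.choices[0].message.content)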
Why does AAQ need observability?
As we begin piloting AAQ in various use cases, we want to be able to track LLM calls so that we can debug, analyze, and improve AAQ's question-answering ability. We are using it to test different prompt templates and guardrails. If you are interested in getting your hands dirty with AAQ's codebase, we imagine this will be useful to you too. (Langfuse has a generous free tier and is self-hostable!)
So how do I use it?
Sign up for Langfuse, then set the following environment variables in the backend app to get started.
export LANGFUSE=True
export LANGFUSE_PUBLIC_KEY=pk-...
export LANGFUSE_SECRET_KEY=sk-...
export LANGFUSE_HOST=https://cloud.langfuse.com # optional, depending on your Langfuse host
See more in Config options - Tracing with Langfuse.
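If you want to sanity-check the credentials before starting the backend, something like the following works with Langfuse's Python SDK. This is a standalone check for illustration, not part of AAQ's code.

# Langfuse's Python client reads LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, and
# LANGFUSE_HOST from the environment, so no arguments are needed here.
from langfuse import Langfuse

langfuse = Langfuse()  # picks up the environment variables set above
assert langfuse.auth_check(), "Langfuse credentials or host are misconfigured"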
What's next?
We want to explore the rich set of features that Langfuse offers, such as evaluation and scoring. One concrete next step is to trace AAQ's Feedback endpoint using Langfuse's Scores.
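To sketch the direction we have in mind, Langfuse's Python SDK (v2-style client) lets you attach a score to an existing trace. The trace ID, score name, and value scheme below are purely illustrative assumptions, not AAQ code.

# Rough sketch of recording user feedback against a trace as a Langfuse Score.
from langfuse import Langfuse

langfuse = Langfuse()

langfuse.score(
    trace_id="trace-id-from-the-original-search-call",  # hypothetical
    name="user-feedback",  # hypothetical score name
    value=1,               # e.g. 1 = thumbs up, 0 = thumbs down (assumed scheme)
    comment="Answer was helpful",
)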