Skip to content

2024

Adding a model proxy server

Instead of being handled directly in our code, our model calls are now routed through a LiteLLM Proxy server. This lets us change models on the fly and have retries, fallbacks, budget tracking, and more.

Ditching Qdrant for PgVector

In our latest infrastructure update, we decided to transition from Qdrant to pgvector for managing our vector databases. This move is part of our ongoing effort to reduce cost and simplify AAQ’s architecture.

Nginx out, Caddy in

By swapping out Nginx for Caddy, we substantially simplified the deployment steps and the architecture - which means fewer docker containers to run and manage.

No more hallucinations

Last week we rolled out another safety feature - checking consistency of the response from the LLM with the content it is meant to be using to generate it. This shoud catch hallucinations or when LLM uses it's pre-training to answer a question. But it also catches any prompt injection or jailbreaking - if it somehow got through our other checks.

Improved docs!

First, we have added this section that you are currently reading. Each week we'll post what we've rolled out - new features, bug fixes, and performance improvements.

The rest of the docs have now also been restructured to make it easy to parse.