How to use Hugging Face embeddings
To host Hugging Face embeddings, we use the text-embeddings-inference image from Hugging Face.
Prerequisite steps
Step 0. Update your LiteLLM Proxy config
To use Hugging Face embeddings instead of OpenAI embeddings, replace the OpenAI embeddings model in litellm_proxy_config.yaml.
This can be done by uncommenting the second embeddings model:
# - model_name: embeddings
#   litellm_params:
#     model: huggingface/huggingface-embeddings # model name not important
#     api_key: "os.environ/HUGGINGFACE_EMBEDDINGS_API_KEY" #pragma: allowlist secret
#     api_base: "os.environ/HUGGINGFACE_EMBEDDINGS_ENDPOINT"
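Once uncommented, the entry should read as follows (this simply mirrors the commented template above):
- model_name: embeddings
  litellm_params:
    model: huggingface/huggingface-embeddings # model name not important
    api_key: "os.environ/HUGGINGFACE_EMBEDDINGS_API_KEY" #pragma: allowlist secret
    api_base: "os.environ/HUGGINGFACE_EMBEDDINGS_ENDPOINT"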
The first embeddings model should be commented out unless you are using Hugging Face embeddings as a backup to OpenAI embeddings.
Step 1. Set the PGVECTOR_VECTOR_SIZE environment variable
Make sure that PGVECTOR_VECTOR_SIZE is set to the vector size produced by your Hugging Face embedding model of choice. This should be set in .core_backend.env (cf. Configuring AAQ).
Note that if the database is already set up with a different PGVECTOR_VECTOR_SIZE value, this will not work unless the database is destroyed and recreated.
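For example, if your chosen model were sentence-transformers/all-MiniLM-L6-v2 (used here purely as an illustration; it produces 384-dimensional embeddings), the setting in .core_backend.env would be:
# .core_backend.env -- must match the output dimension of the embedding model you serve
PGVECTOR_VECTOR_SIZE=384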
Deploying Hugging Face Embeddings
Make sure you've performed the prerequisite steps before proceeding.
To deploy Hugging Face embeddings, follow the deployment instructions in Quick Setup with the following additional steps:
On Step 4: Configure LiteLLM Proxy server, edit .litellm_proxy.env
by setting the following variables:
HUGGINGFACE_MODEL: your Hugging Face embedding model of choice.
HUGGINGFACE_EMBEDDINGS_API_KEY: API key for the Hugging Face Embeddings API.
HUGGINGFACE_EMBEDDINGS_ENDPOINT: API endpoint URL for the Hugging Face Embeddings container. The default value should work with docker compose.
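As an illustrative sketch of .litellm_proxy.env (the model and key below are placeholders, not values shipped with AAQ):
# .litellm_proxy.env -- placeholder values for illustration
HUGGINGFACE_MODEL="sentence-transformers/all-MiniLM-L6-v2"   # any embedding model served by text-embeddings-inference
HUGGINGFACE_EMBEDDINGS_API_KEY="your-api-key"                # your Hugging Face Embeddings API key
# HUGGINGFACE_EMBEDDINGS_ENDPOINT: leave at its default, which should work with docker compose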
If you are using an arm64 device, a Docker image should be built locally before deployment. This can be done by running the make command make build-embeddings-arm. In addition, the variable EMBEDDINGS_IMAGE_NAME should be uncommented in .core_backend.env.
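For example, on an Apple Silicon (arm64) machine the sequence would be:
# Build the embeddings image locally before deploying.
make build-embeddings-arm
# Then uncomment the EMBEDDINGS_IMAGE_NAME line in .core_backend.env so that
# the locally built image is used during deployment.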
On Step 6: Run docker-compose, add --profile huggingface-embeddings
to the
docker compose command:
docker compose -f docker-compose.yml -f docker-compose.dev.yml \
--profile huggingface-embeddings -p aaq-stack up -d --build
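Once the stack is up, you can confirm that the embeddings container started alongside the other services (the exact service name depends on the compose files):
docker compose -p aaq-stack ps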
Setting up Hugging Face embeddings for development
Make sure you've performed the prerequisite steps before proceeding.
To set up your development environment with Hugging Face embeddings, you can start the container manually by navigating to the ask-a-question repository root and executing the following make command:
If you are using an arm device, you can first build the image using:
make build-embeddings-arm
then:
Before running the commands above, you must export the environment variables HUGGINGFACE_MODEL and HUGGINGFACE_EMBEDDINGS_API_KEY.
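For example (the model is illustrative; the key is a placeholder):
export HUGGINGFACE_MODEL="sentence-transformers/all-MiniLM-L6-v2"
export HUGGINGFACE_EMBEDDINGS_API_KEY="your-api-key"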
By default, the embeddings API endpoint is at http://localhost:5000.
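To sanity-check the running container, you can call the /embed route exposed by the text-embeddings-inference server (verify the route against the version of the image you are running):
curl -X POST http://localhost:5000/embed \
  -H "Content-Type: application/json" \
  -d '{"inputs": "What is AAQ?"}'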