How to use Hugging Face embeddings
To host Hugging Face embeddings, we use the text-embeddings-inference image from Hugging Face.
Prerequisite steps
Step 0. Update your LiteLLM Proxy config
To use Hugging Face embeddings instead of OpenAI embeddings, replace the OpenAI embeddings model in litellm_proxy_config.yaml.
This can be done by uncommenting the second embeddings model:
# - model_name: embeddings
#   litellm_params:
#     model: huggingface/huggingface-embeddings # model name not important
#     api_key: "os.environ/HUGGINGFACE_EMBEDDINGS_API_KEY" #pragma: allowlist secret
#     api_base: "os.environ/HUGGINGFACE_EMBEDDINGS_ENDPOINT"
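Once uncommented, the entry should read as follows (this simply mirrors the commented template above):
- model_name: embeddings
  litellm_params:
    model: huggingface/huggingface-embeddings # model name not important
    api_key: "os.environ/HUGGINGFACE_EMBEDDINGS_API_KEY" #pragma: allowlist secret
    api_base: "os.environ/HUGGINGFACE_EMBEDDINGS_ENDPOINT"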
The first embeddings model should be commented out unless you are using Hugging Face embeddings as a backup to OpenAI embeddings.
Step 1. Set the PGVECTOR_VECTOR_SIZE environment variable
Make sure that PGVECTOR_VECTOR_SIZE is set to the vector size produced by your Hugging Face embedding model of choice. This should be set in .core_backend.env (cf. Configuring AAQ).
Note that if the database is already set up with a different PGVECTOR_VECTOR_SIZE value, this will not work unless the database is destroyed and recreated.
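For example, if your chosen model were sentence-transformers/all-MiniLM-L6-v2 (used here purely as an illustration; it produces 384-dimensional embeddings), the setting in .core_backend.env would be:
# .core_backend.env -- must match the output dimension of the embedding model you serve
PGVECTOR_VECTOR_SIZE=384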
Deploying Hugging Face Embeddings
Make sure you've performed the prerequisite steps before proceeding.
To deploy Hugging Face embeddings, follow the deployment instructions in Quick Setup with the following additional steps:
On Step 4: Configure LiteLLM Proxy server, edit .litellm_proxy.env
by setting the following variables:
HUGGINGFACE_MODEL: your Hugging Face embedding model of choice.
HUGGINGFACE_EMBEDDINGS_API_KEY: API key for the Hugging Face Embeddings API.
HUGGINGFACE_EMBEDDINGS_ENDPOINT: API endpoint URL for the Hugging Face Embeddings container. The default value should work with docker compose.
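As an illustrative sketch of .litellm_proxy.env (the model and key below are placeholders, not values shipped with AAQ):
# .litellm_proxy.env -- placeholder values for illustration
HUGGINGFACE_MODEL="sentence-transformers/all-MiniLM-L6-v2"   # any embedding model served by text-embeddings-inference
HUGGINGFACE_EMBEDDINGS_API_KEY="your-api-key"                # your Hugging Face Embeddings API key
# HUGGINGFACE_EMBEDDINGS_ENDPOINT: leave at its default, which should work with docker compose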
If you are using an arm64 device, a Docker image should be built locally before deployment. This can be done by running the make command make build-embeddings-arm. In addition, the variable EMBEDDINGS_IMAGE_NAME should be uncommented in .core_backend.env.
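For example, on an Apple Silicon (arm64) machine the sequence would be:
# Build the embeddings image locally before deploying.
make build-embeddings-arm
# Then uncomment the EMBEDDINGS_IMAGE_NAME line in .core_backend.env so that
# the locally built image is used during deployment.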
On Step 6: Run docker-compose, add --profile huggingface-embeddings
to the
docker compose command:
docker compose -f docker-compose.yml -f docker-compose.dev.yml \
--profile huggingface-embeddings -p aaq-stack up -d --build
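Once the stack is up, you can confirm that the embeddings container started alongside the other services (the exact service name depends on the compose files):
docker compose -p aaq-stack ps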
Setting up Hugging Face embeddings for development
Make sure you've performed the prerequisite steps before proceeding.
To set up your development environment with Hugging Face embeddings, you can start the container manually by navigating to the ask-a-question repository root and executing the following make command:
If you are using an arm device, you can first build the image using:
make build-embeddings-arm
then:
Before running the commands above, you must export the environment variables HUGGINGFACE_MODEL and HUGGINGFACE_EMBEDDINGS_API_KEY.
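For example (the model is illustrative; the key is a placeholder):
export HUGGINGFACE_MODEL="sentence-transformers/all-MiniLM-L6-v2"
export HUGGINGFACE_EMBEDDINGS_API_KEY="your-api-key"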
By default, the embeddings API endpoint is at http://localhost:5000.
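To sanity-check the running container, you can call the /embed route exposed by the text-embeddings-inference server (verify the route against the version of the image you are running):
curl -X POST http://localhost:5000/embed \
  -H "Content-Type: application/json" \
  -d '{"inputs": "What is AAQ?"}'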