How to use in-house speech models
This guide explains how to host and use our custom in-house Speech-to-Text (STT) and Text-to-Speech (TTS) models with our specialized Docker image.
Prerequisite steps
Configure environment variables
To set the CUSTOM_TTS_ENDPOINT and CUSTOM_STT_ENDPOINT environment variables, open the .core_backend.env file and locate the lines for these variables. If they are commented out, uncomment them and set their values to the endpoint URLs of your in-house TTS and STT models (cf. Configuring AAQ).
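As a quick sanity check after editing the file, the sketch below reports whether each variable is present, uncommented, and non-empty. The function name check_speech_env is hypothetical and not part of the project. It assumes the env file uses plain KEY=value lines, and it does not verify that the URLs themselves are reachable or correct.

```shell
# Hypothetical helper (not part of the project): report whether the two
# speech endpoint variables are uncommented and non-empty in an env file.
# Assumes plain KEY=value lines; the default path is .core_backend.env.
check_speech_env() {
  env_file="${1:-.core_backend.env}"
  rc=0
  for var in CUSTOM_TTS_ENDPOINT CUSTOM_STT_ENDPOINT; do
    # A line starting with '#' does not match, so commented-out
    # entries are reported as missing.
    if grep -Eqs "^[[:space:]]*${var}=.+" "$env_file"; then
      echo "$var is set"
    else
      echo "$var is missing or commented out"
      rc=1
    fi
  done
  return "$rc"
}
```

Running check_speech_env from the directory that contains .core_backend.env prints one line per variable and returns a non-zero status if either one still needs attention.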
Using In-house Speech Models in Deployment
To deploy in-house speech models, follow the deployment instructions in the QuickSetup with this additional step:
In "Step 5: Run docker-compose", add -f docker-compose.speech.yml before the -p option of the docker compose command, as below:
docker compose -f docker-compose.yml -f docker-compose.dev.yml \
  -f docker-compose.speech.yml -p aaq-stack up -d --build
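Because the same three compose files and project name are needed for every follow-up command, it can help to wrap them once. The sketch below is one possible approach. The function name aaq_speech_compose is hypothetical, and it assumes you run it from the directory that contains the compose files.

```shell
# Hypothetical wrapper (not part of the project): run any docker compose
# subcommand against the same file set and project name used above.
aaq_speech_compose() {
  docker compose \
    -f docker-compose.yml \
    -f docker-compose.dev.yml \
    -f docker-compose.speech.yml \
    -p aaq-stack "$@"
}

# Example invocations (commented out so that sourcing this file has no side effects):
#   aaq_speech_compose config --quiet   # validate the merged configuration
#   aaq_speech_compose up -d --build    # same as the command above
#   aaq_speech_compose ps               # list the project's containers and their state
```

Validating with config --quiet before starting the stack surfaces merge or syntax errors in the compose files without building or running any containers.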
Setting Up In-house Models for Development
Currently, the in-house models work only with the Docker Compose Watch dev setup. Use the following command:
docker compose -f docker-compose.yml -f docker-compose.dev.yml \
  -f docker-compose.speech.yml -p aaq-stack up -d --build