How to use External Speech models
This guide outlines the process for using the external Google Cloud Speech-to-Text and Text-to-Speech models within AAQ.
Prerequisite steps
Configure environment variables
To access the external models, ensure that the CUSTOM_TTS_ENDPOINT and CUSTOM_STT_ENDPOINT environment variables are not set (i.e. left blank).
You will also need to set the GOOGLE_APPLICATION_CREDENTIALS environment variable and make sure you have the .gcp_credentials.json file so that you can access Google Cloud services. These should be configured in the .core_backend.env and .litellm_proxy.env files respectively (cf. Configuring AAQ).
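The requirements above can be expressed as a small pre-flight check. This is only a sketch and not part of AAQ: the function name check_external_speech_env is hypothetical, and it assumes GOOGLE_APPLICATION_CREDENTIALS holds a path to a file that is readable from the current working directory.

```python
import os
from pathlib import Path
from typing import List, Mapping


def check_external_speech_env(env: Mapping[str, str]) -> List[str]:
    """Hypothetical helper: return a list of problems; empty means the check passed."""
    problems = []

    # The custom endpoints must be unset or blank for the external models to be used.
    for name in ("CUSTOM_TTS_ENDPOINT", "CUSTOM_STT_ENDPOINT"):
        if env.get(name, "").strip():
            problems.append(f"{name} is set; leave it blank to use the external models")

    # The credentials variable must be set and point at an existing file.
    credentials = env.get("GOOGLE_APPLICATION_CREDENTIALS", "").strip()
    if not credentials:
        problems.append("GOOGLE_APPLICATION_CREDENTIALS is not set")
    elif not Path(credentials).is_file():
        problems.append(f"credentials file not found: {credentials}")

    return problems


if __name__ == "__main__":
    # Check the environment of the current shell and report anything that looks wrong.
    for problem in check_external_speech_env(os.environ):
        print(problem)
```

Passing the environment in as a mapping, rather than reading os.environ inside the function, keeps the check easy to try against different configurations.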
Using External Speech Models in Deployment
To deploy external speech models, simply follow the deployment instructions in the QuickSetup. No additional steps are needed.
Setting up External Models for Development
Follow these steps to set up your development environment for external speech models.
Note: To use the Manual Setup method, you will need to add your GCP credentials file to your local environment manually, as follows:
- Place your gcp_credentials.json file inside the core_backend/ folder.
- Run export GOOGLE_APPLICATION_CREDENTIALS="core_backend/gcp_credentials.json" to set the environment variable.
- While in the root of the repository, run python core_backend/main.py.
Do not navigate into the core_backend folder with cd core_backend first. If you do, you will also have to change GOOGLE_APPLICATION_CREDENTIALS to "gcp_credentials.json", because a relative path is resolved against the terminal's current working directory.
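The working-directory caveat can be illustrated with a few lines of path arithmetic. This is a sketch only: resolve_from is a hypothetical helper, /home/me/aaq stands in for wherever the repository is checked out, and the credentials file is assumed to be named gcp_credentials.json inside core_backend/.

```python
from pathlib import PurePosixPath


def resolve_from(cwd: str, credentials: str) -> PurePosixPath:
    """Hypothetical helper: where a process started in `cwd` would look for the file."""
    path = PurePosixPath(credentials)
    # Absolute paths ignore the working directory; relative ones are joined onto it.
    return path if path.is_absolute() else PurePosixPath(cwd) / path


# Started from the repository root: the relative path points at the real file.
print(resolve_from("/home/me/aaq", "core_backend/gcp_credentials.json"))

# Started from inside core_backend/: the same value now has a doubled folder name.
print(resolve_from("/home/me/aaq/core_backend", "core_backend/gcp_credentials.json"))

# Started from inside core_backend/ with the adjusted value: correct again.
print(resolve_from("/home/me/aaq/core_backend", "gcp_credentials.json"))
```

Setting the variable to an absolute path would avoid the dependence on the working directory altogether, at the cost of a machine-specific value.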