Deploy AudioText with Docker

Use this quick start when you want a small server running AudioText with Docker Compose, SQLite, local persistent storage, and one API token for a client application.

The default Compose file binds the service to 127.0.0.1:8791, so it is reachable only from the server itself. That is a safe default when clients reach the service through a reverse proxy such as Caddy or nginx, an SSH tunnel, or a private network route.

1. Clone the repository

Run on the server:

git clone <private-repo-url> audio-to-text
cd audio-to-text

2. Create the environment file

Run:

cp deploy/audiotext.env.example .env

Generate secrets:

openssl rand -hex 32
openssl rand -hex 32

Edit .env and set at least:

AUDIOTEXT_ENV=production
AUDIOTEXT_PUBLIC_BASE_URL=https://transcription.example.com
AUDIOTEXT_TOKEN_PEPPER=<first generated value>
AUDIOTEXT_ADMIN_SESSION_SECRET=<second generated value>
AUDIOTEXT_ADMIN_CIDR_ALLOWLIST=127.0.0.1/32,::1/128

Keep AUDIOTEXT_ADMIN_CIDR_ALLOWLIST local-only unless the admin UI is behind a trusted private network or reverse proxy rule. The client API can still be exposed separately by the reverse proxy.
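
The two generated secrets can also be written into .env by a script instead of by hand. The following is a minimal sketch, assuming .env contains plain KEY=value lines; set_env_value is a hypothetical helper written for this example and is not part of AudioText.

```shell
# Hypothetical helper (not part of AudioText): set KEY=value in an env file.
# It drops any existing KEY= line and appends the new assignment at the end.
set_env_value() (
  file="$1"
  key="$2"
  value="$3"
  tmp="$(mktemp)" || exit 1
  # Copy every line except an existing assignment of this key.
  # grep exits non-zero when nothing is left to print, which is fine here.
  grep -v "^${key}=" "$file" > "$tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Overwrite the contents in place so the file keeps its permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
)

# Usage, from the repository root after copying the example file:
#   set_env_value .env AUDIOTEXT_TOKEN_PEPPER "$(openssl rand -hex 32)"
#   set_env_value .env AUDIOTEXT_ADMIN_SESSION_SECRET "$(openssl rand -hex 32)"
#   chmod 600 .env   # keep the secrets readable by the owner only
```

Because the helper appends the assignment at the end of the file, the key order in .env can change; the values are what matter.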

For the first simple deployment, keep:

AUDIOTEXT_ASYNC_JOB_RUNNER=background
AUDIOTEXT_MAX_CONCURRENT_TRANSCRIPTIONS=1
AUDIOTEXT_MAX_LOADED_MODELS=1
AUDIOTEXT_MODEL_IDLE_TTL_SECONDS=900

3. Start the service

Run:

docker compose up --build -d

Check the container:

docker compose ps
docker compose logs -f audiotext

Check the HTTP endpoints:

curl -fsS http://127.0.0.1:8791/healthz
curl -fsS http://127.0.0.1:8791/readyz
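
Right after docker compose up, the endpoints above may fail for a short time while the container starts. A deployment script can retry instead of failing on the first attempt. This sketch assumes /readyz returns a success status only once the service can accept requests, which is the usual convention but is not confirmed here; wait_until is a hypothetical helper.

```shell
# Hypothetical helper: retry a command, sleeping one second after each
# failure, until it succeeds or the given number of attempts is used up.
wait_until() (
  attempts="$1"
  shift
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Run the command quietly; stop as soon as it succeeds.
    if "$@" >/dev/null 2>&1; then
      exit 0
    fi
    i=$((i + 1))
    sleep 1
  done
  # Every attempt failed.
  exit 1
)

# Usage: give the service up to 60 attempts to report ready.
#   wait_until 60 curl -fsS http://127.0.0.1:8791/readyz && echo "ready"
```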

4. Create the admin user

Run:

docker compose exec audiotext audiotext admin create-user --email admin@example.com

The command prompts for the password. Do not pass the password on the command line on a shared server, where other users can read it from the process list and it can end up in shell history.

Open the admin UI from the server or through a private route:

http://127.0.0.1:8791/admin

5. Create an API token

Run:

docker compose exec audiotext audiotext token create \
  --name dictation-client-prod \
  --max-open-uploads 2 \
  --daily-audio-seconds-quota 7200 \
  --monthly-audio-seconds-quota 120000

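The quota flags in this example allow 7200 seconds (2 hours) of audio per day and 120000 seconds (about 33 hours) per month. The command prints the raw token once, so copy it immediately and store it in the client application's server-side secret store.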

For shell testing:

export AUDIOTEXT_API_TOKEN="att_tok_..."

6. Test the client API

List models:

curl -sS http://127.0.0.1:8791/v1/models \
  -H "Authorization: Bearer $AUDIOTEXT_API_TOKEN"

Submit an async job:

curl -sS http://127.0.0.1:8791/v1/transcription-jobs \
  -H "Authorization: Bearer $AUDIOTEXT_API_TOKEN" \
  -F model=cpu-lite \
  -F language=ca \
  -F file=@clip.webm

Poll the returned job, with JOB_ID set to the job identifier from the submit response:

curl -sS http://127.0.0.1:8791/v1/transcription-jobs/$JOB_ID \
  -H "Authorization: Bearer $AUDIOTEXT_API_TOKEN"

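The first real transcription can be slower because the selected model may need to be downloaded and loaded. After that, the model stays in memory until the model cache is full (see AUDIOTEXT_MAX_LOADED_MODELS) or the idle TTL unloads it (AUDIOTEXT_MODEL_IDLE_TTL_SECONDS, 900 seconds or 15 minutes in the settings above).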
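
To script the submit and poll calls, the job identifier has to be pulled out of the submit response. The sketch below assumes the responses are JSON objects with top-level string fields named "id" and "status"; those field names are assumptions, so check a real response and adjust them. json_string_field is a hypothetical helper.

```shell
# Assumption: job responses are JSON objects with string fields "id" and
# "status". These names are not confirmed; inspect a real response first.

# Print the value of a top-level string field from JSON read on stdin.
# The sed pattern only handles flat string values without escaped quotes.
json_string_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Usage:
#   JOB_ID="$(curl -sS http://127.0.0.1:8791/v1/transcription-jobs \
#     -H "Authorization: Bearer $AUDIOTEXT_API_TOKEN" \
#     -F model=cpu-lite -F language=ca -F file=@clip.webm \
#     | json_string_field id)"
#   curl -sS "http://127.0.0.1:8791/v1/transcription-jobs/$JOB_ID" \
#     -H "Authorization: Bearer $AUDIOTEXT_API_TOKEN" \
#     | json_string_field status
```

For nested or escaped values, a real JSON parser such as jq is the more robust choice; the sed version only avoids an extra dependency for quick shell tests.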

7. Connect a client application

Use these values in the client application's server-side configuration:

| Setting | Value |
| --- | --- |
| Base URL | `http://127.0.0.1:8791` from the same host, or the reverse-proxy URL |
| API token | the `att_tok_...` value printed once |
| Default model | `cpu-lite` first, then benchmark `cpu-turbo` |
| Default language | force `ca`, `es`, or `en` when the app knows it; otherwise `auto` |
| Preferred mode | async jobs for normal product flows |

Do not put the API token in browser code.
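
On the server side, the base URL and token can come from the environment so that neither is hard-coded. The following is a sketch of a small wrapper for shell-based checks; audiotext_api and the AUDIOTEXT_BASE_URL variable are names invented for this example, not part of AudioText.

```shell
# Hypothetical wrapper: call the AudioText API with the base URL and token
# taken from the environment, and refuse to run when the token is missing.
# AUDIOTEXT_BASE_URL is an assumed variable name for this sketch.
audiotext_api() (
  path="$1"
  shift
  base_url="${AUDIOTEXT_BASE_URL:-http://127.0.0.1:8791}"
  if [ -z "${AUDIOTEXT_API_TOKEN:-}" ]; then
    echo "AUDIOTEXT_API_TOKEN is not set" >&2
    exit 2
  fi
  # Extra arguments (for example -F form fields) are passed through to curl.
  curl -sS "${base_url}${path}" \
    -H "Authorization: Bearer ${AUDIOTEXT_API_TOKEN}" "$@"
)

# Usage:
#   audiotext_api /v1/models
#   audiotext_api /v1/transcription-jobs \
#     -F model=cpu-lite -F language=auto -F file=@clip.webm
```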

8. Stop, update, or reset

Stop:

docker compose down

Update:

git pull
docker compose up --build -d

The SQLite database, uploads, and cached runtime data live in the Docker volume audiotext-data, so they survive docker compose down and image rebuilds.

Remove the local deployment data only when you deliberately want a clean reset. The -v flag also deletes the data volume, including the SQLite database and uploads:

docker compose down -v
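
Since a reset deletes the data volume, it can be worth archiving the volume first. The sketch below is one way to do that under stated assumptions: the volume name is taken from docker volume ls (Compose usually prefixes it with the project name, so it may not be exactly audiotext-data), and the public alpine image is used as a throwaway helper container. Both function names are hypothetical.

```shell
# Hypothetical helper: build a timestamped archive file name.
backup_archive_name() {
  printf 'audiotext-data-%s.tar.gz\n' "$(date +%Y%m%d-%H%M%S)"
}

# Sketch: archive a Docker volume into the current directory.
# Assumptions: $1 is the volume name as shown by `docker volume ls`, and the
# alpine image is acceptable as a temporary helper container.
backup_volume() (
  volume="$1"
  archive="$(backup_archive_name)"
  # Mount the volume read-only and write the archive to the host directory.
  docker run --rm \
    -v "${volume}:/data:ro" \
    -v "$(pwd):/backup" \
    alpine tar czf "/backup/${archive}" -C /data . || exit 1
  echo "$archive"
)

# Usage: stop the stack first so SQLite is not written during the copy.
#   docker compose down
#   docker volume ls          # find the exact volume name
#   backup_volume audiotext-data
```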

9. Add a separate worker later

For a busier server, switch queued jobs to a separate worker process:

AUDIOTEXT_ASYNC_JOB_RUNNER=external docker compose --profile worker up --build -d

Keep the API and worker on the same Docker volume so they share SQLite and uploaded audio files.