Deploy AudioText with Docker

Use this quick start when you want a small server running AudioText with Docker Compose, SQLite, local persistent storage, and one API token for a client application.

The default Compose file binds the service to 127.0.0.1:8791, so it is reachable only from the server itself. That is the safe default for a server behind Caddy, nginx, SSH tunneling, or a private network route.

1. Clone the repository

Run on the server:

git clone <private-repo-url> audio-to-text
cd audio-to-text

2. Create the environment file

Run:

cp deploy/audiotext.env.example .env

Generate two secrets, one per run (each run prints a different 64-character hex string):

openssl rand -hex 32
openssl rand -hex 32

Edit .env and set at least:

AUDIOTEXT_ENV=production
AUDIOTEXT_PUBLIC_BASE_URL=https://transcription.example.com
AUDIOTEXT_TOKEN_PEPPER=<first generated value>
AUDIOTEXT_ADMIN_SESSION_SECRET=<second generated value>
AUDIOTEXT_ADMIN_CIDR_ALLOWLIST=127.0.0.1/32,::1/128

Keep AUDIOTEXT_ADMIN_CIDR_ALLOWLIST local-only unless the admin UI is behind a trusted private network or reverse proxy rule. The client API can still be exposed separately by the reverse proxy.
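The secret generation and .env edits above can be scripted so that no value is pasted by hand. The following is a minimal sketch, not part of AudioText itself: set_env_var is a hypothetical helper name, and it assumes the values contain no | or & characters (true for the hex output of openssl rand and for a plain https URL).

```shell
# Set KEY=VALUE in an env file: replace the line if KEY is already present,
# append a new line otherwise.
# ASSUMPTION: VALUE contains no '|' or '&' (they are special in the sed expression).
set_env_var() {
  key="$1"
  value="$2"
  file="$3"
  if grep -q "^${key}=" "$file" 2>/dev/null; then
    # Write through a temp file because in-place sed flags differ between GNU and BSD.
    tmp="$(mktemp)" || return 1
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "$tmp" && mv "$tmp" "$file"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Usage, from the repository root after copying the example file:
#   set_env_var AUDIOTEXT_ENV production .env
#   set_env_var AUDIOTEXT_TOKEN_PEPPER "$(openssl rand -hex 32)" .env
#   set_env_var AUDIOTEXT_ADMIN_SESSION_SECRET "$(openssl rand -hex 32)" .env
#   chmod 600 .env
```

Commented-out keys in the example file (lines starting with #) are not matched, so in that case the helper appends a fresh line at the end of the file.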

For the first simple deployment, keep:

AUDIOTEXT_ASYNC_JOB_RUNNER=background
AUDIOTEXT_MAX_CONCURRENT_TRANSCRIPTIONS=1
AUDIOTEXT_MAX_LOADED_MODELS=1
AUDIOTEXT_MODEL_IDLE_TTL_SECONDS=900

3. Start the service

Run:

docker compose up --build -d

Check the container:

docker compose ps
docker compose logs -f audiotext

Check the HTTP endpoints:

curl -fsS http://127.0.0.1:8791/healthz
curl -fsS http://127.0.0.1:8791/readyz
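Right after startup the readiness endpoint may fail for a few seconds, which matters if the checks run from a deploy script. A small retry loop, sketched below, waits until a probe command succeeds or a timeout passes. The wait_until name is illustrative, and the function works with any command, not only curl.

```shell
# Retry a command once per second until it succeeds or the timeout (seconds) is reached.
# Returns 0 on success, 1 on timeout.
wait_until() {
  timeout_s="$1"
  shift
  elapsed=0
  until "$@" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout_s" ]; then
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 0
}

# Usage: wait up to 60 seconds for the service to report ready.
#   wait_until 60 curl -fsS --max-time 2 http://127.0.0.1:8791/readyz \
#     || docker compose logs --tail 50 audiotext
```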

4. Create the admin user

Run:

docker compose exec audiotext audiotext admin create-user --email admin@example.com

The command prompts for the password. Do not pass the password on the command line on a shared server.

Open the admin UI from the server or through a private route:

http://127.0.0.1:8791/admin
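One way to get a private route from a workstation is an SSH local port forward. The sketch below assumes you have SSH access to the server. The function name, the example target, and the SSH override variable (used only to make the function easy to dry-run) are illustrative.

```shell
# Forward a local port to the server's loopback-bound AudioText port.
#   $1: SSH target, e.g. deploy@server.example.com (hypothetical)
#   $2: local port to listen on (defaults to 8791)
open_admin_tunnel() {
  ssh_target="$1"
  local_port="${2:-8791}"
  # -N: do not run a remote command; -L: local_port -> 127.0.0.1:8791 on the server.
  ${SSH:-ssh} -N -L "${local_port}:127.0.0.1:8791" "$ssh_target"
}

# Usage (runs until interrupted with Ctrl-C):
#   open_admin_tunnel deploy@server.example.com
# Then open http://127.0.0.1:8791/admin in a browser on the workstation.
```

Whether the admin allowlist accepts the tunneled request depends on the source address the container sees, so check the service logs if the admin UI denies access.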

5. Create an API token

Run:

docker compose exec audiotext audiotext token create \
  --name dictation-client-prod \
  --max-open-uploads 2 \
  --daily-audio-seconds-quota 7200 \
  --monthly-audio-seconds-quota 120000

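The quotas are expressed in seconds of audio: 7200 is 2 hours per day and 120000 is about 33 hours per month. The command prints the raw token once, so copy it immediately and store it in the client application's server-side secret store.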

For shell testing:

export AUDIOTEXT_API_TOKEN="att_tok_..."

6. Test the client API

List models:

curl -sS http://127.0.0.1:8791/v1/models \
  -H "Authorization: Bearer $AUDIOTEXT_API_TOKEN"

Submit an async job:

curl -sS http://127.0.0.1:8791/v1/transcription-jobs \
  -H "Authorization: Bearer $AUDIOTEXT_API_TOKEN" \
  -F model=cpu-lite \
  -F language=ca \
  -F file=@clip.webm

Poll the returned job, with JOB_ID set to the job identifier from the submit response:

curl -sS "http://127.0.0.1:8791/v1/transcription-jobs/$JOB_ID" \
  -H "Authorization: Bearer $AUDIOTEXT_API_TOKEN"

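The first real transcription can be slower because the selected model may need to download and load. After that, the model stays in memory until another model displaces it from the cache (AUDIOTEXT_MAX_LOADED_MODELS) or it has been idle longer than AUDIOTEXT_MODEL_IDLE_TTL_SECONDS, which is 900 seconds (15 minutes) in the settings above.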
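The submit-then-poll flow can be wrapped in a loop for shell testing. This guide does not show the job response body, so the sketch below rests on loudly stated assumptions: that the response is JSON with string fields named id and status, and that completed and failed are the terminal status values. Inspect a real response and adjust these names before relying on it. All function names are hypothetical, and jq is a sturdier choice than the sed extraction if it is installed.

```shell
# Extract a string field from JSON on stdin (minimal sed approach, flat objects only).
json_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# ASSUMPTION: these terminal status names are guesses, not confirmed by the guide.
is_terminal_status() {
  case "$1" in
    completed|failed) return 0 ;;
    *) return 1 ;;
  esac
}

# Run a fetch command repeatedly until the job reaches a terminal status.
#   $1: maximum number of attempts; remaining args: command that prints the job JSON.
# Prints the final JSON and returns 0, or returns 1 if attempts run out.
poll_job() {
  max_attempts="$1"
  shift
  attempt=0
  while [ "$attempt" -lt "$max_attempts" ]; do
    body="$("$@")" || return 1
    status="$(printf '%s\n' "$body" | json_field status)"
    if is_terminal_status "$status"; then
      printf '%s\n' "$body"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "${POLL_INTERVAL_SECONDS:-2}"
  done
  return 1
}

# Fetch one job by id, using the same request as the polling step above.
fetch_job() {
  curl -sS "http://127.0.0.1:8791/v1/transcription-jobs/$1" \
    -H "Authorization: Bearer $AUDIOTEXT_API_TOKEN"
}

# Usage: allow about five minutes, since the first job may include a model download.
#   poll_job 150 fetch_job "$JOB_ID"
```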

7. Connect a client application

Use these values in the client application's server-side configuration:

| Setting | Value |
| --- | --- |
| Base URL | http://127.0.0.1:8791 from the same host, or the reverse-proxy URL |
| API token | the att_tok_... value printed once |
| Default model | cpu-lite first, then benchmark cpu-turbo |
| Default language | force ca, es, or en when the app knows it; otherwise auto |
| Preferred mode | async jobs for normal product flows |

Do not put the API token in browser code.
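For server-side scripts, the base URL and token can be read from the environment so that neither is hard-coded. The wrapper below is a sketch: AUDIOTEXT_BASE_URL and audiotext_get are hypothetical names (only AUDIOTEXT_API_TOKEN appears earlier in this guide), and the CURL override exists only to allow a dry run.

```shell
# GET a client API path using values from the environment.
#   AUDIOTEXT_API_TOKEN: required, the att_tok_... value
#   AUDIOTEXT_BASE_URL:  optional (hypothetical name), defaults to the loopback address
audiotext_get() {
  if [ -z "${AUDIOTEXT_API_TOKEN:-}" ]; then
    echo "AUDIOTEXT_API_TOKEN is not set" >&2
    return 1
  fi
  base_url="${AUDIOTEXT_BASE_URL:-http://127.0.0.1:8791}"
  ${CURL:-curl} -sS "${base_url}$1" \
    -H "Authorization: Bearer $AUDIOTEXT_API_TOKEN"
}

# Usage:
#   audiotext_get /v1/models
```

Failing early on a missing token avoids sending an empty Authorization header.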

8. Stop, update, or reset

Stop:

docker compose down

Update:

git pull
docker compose up --build -d

The SQLite database, uploads, and cached runtime data live in the Docker volume audiotext-data, so they survive both docker compose down and an update.
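Because all state sits in one volume, an archive of that volume is a simple backup before an update or reset. The sketch below is an assumption-laden example rather than a documented AudioText procedure: backup_volume is a hypothetical helper, it uses the public alpine image to run tar, and the real volume name may carry a Compose project prefix (check with docker volume ls). Stop the service first so the SQLite file is not copied mid-write.

```shell
# Archive a Docker volume into a timestamped tar.gz in a host directory.
#   $1: volume name (ASSUMPTION: may be prefixed by the Compose project name)
#   $2: absolute path of an existing host directory for the archive
backup_volume() {
  volume="$1"
  dest_dir="$2"
  stamp="$(date -u +%Y%m%dT%H%M%SZ)"
  # Mount the volume read-only and write the archive to the bind-mounted directory.
  ${DOCKER:-docker} run --rm \
    -v "${volume}:/data:ro" \
    -v "${dest_dir}:/backup" \
    alpine tar -czf "/backup/${volume}-${stamp}.tar.gz" -C /data .
}

# Usage:
#   docker compose down
#   mkdir -p "$PWD/backups"
#   backup_volume audiotext-data "$PWD/backups"
#   docker compose up -d
```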

Remove the local deployment data only when you deliberately want a clean reset:

docker compose down -v

9. Add a separate worker later

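For a busier server, switch queued jobs to a separate worker process. The inline variable applies to this one command only, so also change AUDIOTEXT_ASYNC_JOB_RUNNER in .env if the setting should persist across later restarts: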

AUDIOTEXT_ASYNC_JOB_RUNNER=external docker compose --profile worker up --build -d

Keep the API and worker on the same Docker volume so they share SQLite and uploaded audio files.