Connect an application to AudioText

Applications should call AudioText from their own backend or API layer. Browser code should not hold AudioText API tokens.

Provider configuration

Applications usually need these settings for an AudioText provider:

| Setting | Example |
| --- | --- |
| Base URL | `http://127.0.0.1:8791` |
| API token | `att_tok_...` |
| Default model | `cpu-lite` |
| Default language | `auto`, `ca`, `es`, or `en` |
| Preferred mode | `async` for normal product flows, `sync` for short controlled clips |

Store the token as a server-side secret. Treat transcripts and uploaded audio as sensitive data.
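One way to hold these settings on the backend is a small immutable object loaded from environment variables. This is a minimal sketch, not part of AudioText itself: only `AUDIOTEXT_API_TOKEN` appears in this document, and the other variable names (`AUDIOTEXT_BASE_URL`, `AUDIOTEXT_MODEL`, `AUDIOTEXT_LANGUAGE`, `AUDIOTEXT_MODE`) as well as the `AudioTextSettings` class are assumptions made for illustration.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AudioTextSettings:
    """Hypothetical container for the provider settings in the table above."""

    base_url: str
    api_token: str = field(repr=False)  # keep the secret out of repr() and logs
    default_model: str = "cpu-lite"
    default_language: str = "auto"
    preferred_mode: str = "async"


def load_settings(env: Mapping[str, str] = os.environ) -> AudioTextSettings:
    """Build settings from environment variables (names other than the token are assumed)."""
    token = env.get("AUDIOTEXT_API_TOKEN", "")
    if not token:
        raise RuntimeError("AUDIOTEXT_API_TOKEN is not set")

    mode = env.get("AUDIOTEXT_MODE", "async")
    if mode not in ("async", "sync"):
        raise ValueError(f"unsupported mode: {mode!r}")

    return AudioTextSettings(
        base_url=env.get("AUDIOTEXT_BASE_URL", "http://127.0.0.1:8791").rstrip("/"),
        api_token=token,
        default_model=env.get("AUDIOTEXT_MODEL", "cpu-lite"),
        default_language=env.get("AUDIOTEXT_LANGUAGE", "auto"),
        preferred_mode=mode,
    )
```

Excluding the token from `repr()` means that an accidental `print(settings)` or a logged exception does not leak the secret.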

Sync request

Use sync requests only for short clips where the application is ready to wait for the response:

POST /v1/audio/transcriptions
Authorization: Bearer <token>
Content-Type: multipart/form-data

Recommended fields:

model=cpu-lite
language=ca
file=@clip.webm
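As an illustration, a backend could assemble this sync request with only the Python standard library. The sketch below builds the request without sending it. The helper names `encode_multipart` and `build_sync_request` are hypothetical, the `audio/webm` part type is an assumption based on the `clip.webm` example, and response handling is left out because the response shape is not described here.

```python
import urllib.request
import uuid


def encode_multipart(fields: dict, filename: str, audio: bytes, part_type: str):
    """Encode text fields plus one file part as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode("utf-8")
        )
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {part_type}\r\n\r\n".encode("utf-8")
        + audio
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_sync_request(base_url: str, token: str, audio: bytes,
                       filename: str = "clip.webm",
                       model: str = "cpu-lite",
                       language: str = "ca") -> urllib.request.Request:
    """Build (but do not send) a POST to the sync transcription endpoint."""
    body, content_type = encode_multipart(
        {"model": model, "language": language}, filename, audio, "audio/webm"
    )
    return urllib.request.Request(
        f"{base_url}/v1/audio/transcriptions",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": content_type,
        },
    )
```

The request can then be sent with `urllib.request.urlopen(request, timeout=...)`. Because the call blocks until AudioText responds, an explicit timeout suited to the expected clip length is advisable.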

Async request

For normal product flows, create a job:

POST /v1/transcription-jobs
Authorization: Bearer <token>
Content-Type: multipart/form-data

Then poll:

GET /v1/transcription-jobs/{job_id}
Authorization: Bearer <token>

Use the result endpoint when the application only needs the transcript payload:

GET /v1/transcription-jobs/{job_id}/result
Authorization: Bearer <token>
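The create-then-poll flow could be driven by a loop like the one below, which backs off between polls and gives up after a deadline. This is a sketch under stated assumptions: the job schema is not documented here, so the `status` field and the `completed` and `failed` values are hypothetical placeholders, as are the `fetch_job` and `poll_job` helper names and all timing values.

```python
import json
import time
import urllib.request

# Hypothetical terminal states; check the real job schema before relying on them.
TERMINAL_STATES = {"completed", "failed"}


def fetch_job(base_url: str, token: str, job_id: str) -> dict:
    """GET /v1/transcription-jobs/{job_id} and decode the JSON body."""
    req = urllib.request.Request(
        f"{base_url}/v1/transcription-jobs/{job_id}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def poll_job(fetch, interval: float = 1.0, max_interval: float = 10.0,
             deadline: float = 300.0, sleep=time.sleep, clock=time.monotonic) -> dict:
    """Call fetch() until the job reaches a terminal state or the deadline passes."""
    start = clock()
    while True:
        job = fetch()
        if job.get("status") in TERMINAL_STATES:  # assumed field name
            return job
        if clock() - start >= deadline:
            raise TimeoutError("transcription job did not finish in time")
        sleep(interval)
        # Double the wait after each poll, capped at max_interval.
        interval = min(interval * 2, max_interval)
```

A caller would pass a closure such as `poll_job(lambda: fetch_job(base_url, token, job_id))`. Injecting `fetch`, `sleep`, and `clock` keeps the loop testable without a running service.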

Provider smoke test

Run this against a local AudioText service after creating a token:

AUDIOTEXT_API_TOKEN=att_tok_... \
  uv run python scripts/provider_contract_smoke.py \
  --base-url http://127.0.0.1:8791 \
  --model cpu-lite \
  --language ca \
  --list-only

Remove the `--list-only` flag only once the real Faster-Whisper backend is installed and you are ready for the selected model to load or download.

Application implementation checklist

- The browser records audio and uploads it to the application backend, not directly to AudioText.
- The application backend forwards the audio using a server-side provider token.
- Secrets in provider settings are write-only.
- The dictation UI can force `ca`, `es`, or `en` when the page already knows the language.
- Async jobs are preferred when clips can arrive concurrently.
- Transcript text is logged only in explicit debugging flows, never by default.
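The last checklist item can be enforced with a small helper that redacts transcript text unless a debugging flag is explicitly set. This is one possible approach rather than anything AudioText prescribes; the `describe_transcript` function and its `debug_transcripts` flag are hypothetical names.

```python
import logging

logger = logging.getLogger("audiotext.client")  # logger name is an assumption


def describe_transcript(text: str, debug_transcripts: bool = False) -> str:
    """Return a log-safe description of a transcript.

    The raw text is returned only when debug_transcripts is explicitly True;
    otherwise only its length is exposed.
    """
    if debug_transcripts:
        return text
    return f"<transcript redacted, {len(text)} chars>"


def log_transcript(text: str, debug_transcripts: bool = False) -> None:
    """Log a transcript event without leaking its content by default."""
    logger.info("transcription finished: %s",
                describe_transcript(text, debug_transcripts))
```

Routing every transcript log line through one function keeps the default safe and makes the debugging opt-in easy to audit.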