Connect an application to AudioText¶
Applications should call AudioText from their own backend or API layer. Browser code should not hold AudioText API tokens.
Provider configuration¶
Applications usually need these settings for an AudioText provider:
| Setting | Example |
|---|---|
| Base URL | http://127.0.0.1:8791 |
| API token | att_tok_... |
| Default model | cpu-lite |
| Default language | auto, ca, es, or en |
| Preferred mode | async for normal product flows, sync for short controlled clips |
Store the token as a server-side secret. Treat transcripts and uploaded audio as sensitive data.
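One way to keep the token server-side is to load all provider settings from environment variables. This is a minimal sketch; the variable names (`AUDIOTEXT_*`) are illustrative assumptions, not names AudioText mandates:

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioTextSettings:
    """Provider settings from the table above; field names are illustrative."""
    base_url: str
    api_token: str
    model: str = "cpu-lite"
    language: str = "auto"


def load_settings() -> AudioTextSettings:
    # Read the token from the environment so it never reaches browser code.
    # AUDIOTEXT_API_TOKEN is an assumed variable name, not a documented one.
    return AudioTextSettings(
        base_url=os.environ.get("AUDIOTEXT_BASE_URL", "http://127.0.0.1:8791"),
        api_token=os.environ["AUDIOTEXT_API_TOKEN"],
        model=os.environ.get("AUDIOTEXT_MODEL", "cpu-lite"),
        language=os.environ.get("AUDIOTEXT_LANGUAGE", "auto"),
    )
```

Failing fast with a `KeyError` when the token is missing is deliberate: a backend without a token cannot call AudioText at all.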
Sync request¶
Use sync requests only for short clips where the application is ready to wait for the response. The recommended fields mirror the provider configuration above: the audio payload plus the model and language settings.
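A minimal sketch of a sync call from the application backend. The endpoint path (`/v1/transcriptions`) and the JSON body with base64-encoded audio are assumptions for illustration; the real AudioText contract may use multipart uploads and different field names, so check the API reference:

```python
import base64
import json
import urllib.request


def build_sync_request(base_url, token, audio_bytes, model="cpu-lite", language="auto"):
    # Endpoint path and field names below are assumptions, not the
    # documented AudioText contract.
    url = f"{base_url}/v1/transcriptions"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "audio": base64.b64encode(audio_bytes).decode("ascii"),
        "model": model,
        "language": language,
    }).encode("utf-8")
    return url, headers, body


def transcribe_sync(base_url, token, audio_bytes, **opts):
    url, headers, body = build_sync_request(base_url, token, audio_bytes, **opts)
    req = urllib.request.Request(url, data=body, headers=headers, method="POST")
    # Sync calls block until the transcript is ready, so keep a firm timeout.
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read())
```

Keeping the request builder separate from the network call makes the token handling and payload shape easy to unit-test without a running service.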
Async request¶
For normal product flows, create a job, then poll it until it reaches a terminal state. Use the result endpoint when the application only needs the transcript payload rather than the full job record.
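The poll-then-fetch flow can be sketched as below. The job endpoint paths (`/v1/jobs/{id}`, `/v1/jobs/{id}/result`) and the status names (`completed`, `failed`) are assumptions for illustration, not the documented AudioText contract:

```python
import json
import time
import urllib.request

# Assumed terminal status names; verify against the AudioText API reference.
TERMINAL_STATUSES = {"completed", "failed"}


def job_urls(base_url, job_id):
    # Assumed endpoint layout for the job status and result resources.
    return {
        "status": f"{base_url}/v1/jobs/{job_id}",
        "result": f"{base_url}/v1/jobs/{job_id}/result",
    }


def _get_json(url, token):
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read())


def wait_for_transcript(base_url, token, job_id, interval=2.0, max_wait=600.0):
    """Poll the job until it finishes, then fetch only the transcript payload."""
    urls = job_urls(base_url, job_id)
    deadline = time.monotonic() + max_wait
    while time.monotonic() < deadline:
        status = _get_json(urls["status"], token).get("status")
        if status in TERMINAL_STATUSES:
            if status == "failed":
                raise RuntimeError(f"AudioText job {job_id} failed")
            return _get_json(urls["result"], token)
        time.sleep(interval)
    raise TimeoutError(f"AudioText job {job_id} did not finish within {max_wait}s")
```

A fixed polling interval keeps the sketch short; a production backend would likely add jitter or exponential backoff when many clips arrive concurrently.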
Provider smoke test¶
Run this against a local AudioText service after creating a token:
```shell
AUDIOTEXT_API_TOKEN=att_tok_... \
uv run python scripts/provider_contract_smoke.py \
  --base-url http://127.0.0.1:8791 \
  --model cpu-lite \
  --language ca \
  --list-only
```
Drop `--list-only` only when the real Faster-Whisper backend is installed and you are ready for the selected model to load or download.
Application implementation checklist¶
- Browser records audio and uploads it to the application backend, not directly to AudioText.
- The application backend forwards the audio with a server-side provider token.
- Secret provider settings such as the API token are write-only: they can be replaced but never read back through the UI or API.
- Dictation UI can force ca, es, or en when the page already knows the language.
- Async jobs are preferred when clips can arrive concurrently.
- Transcript text is logged only in explicit debugging flows, never by default.
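The language-forcing and logging rules in the checklist can be enforced with small backend helpers. This is a sketch; the function names and the log format are illustrative assumptions:

```python
from typing import Optional

# Languages the dictation UI is allowed to force, per the checklist.
ALLOWED_FORCED_LANGUAGES = {"ca", "es", "en"}


def choose_language(page_language: Optional[str]) -> str:
    """Force ca/es/en when the page already knows the language, else auto-detect."""
    if page_language in ALLOWED_FORCED_LANGUAGES:
        return page_language
    return "auto"


def loggable_job_summary(job_id: str, status: str) -> str:
    """Build a log line from job metadata only; transcript text is never included."""
    return f"audiotext job={job_id} status={status}"
```

Centralizing these checks in one module makes it harder for a new code path to accidentally pass an unsupported language or log transcript text by default.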