# AudioText documentation
AudioText is a CPU-first transcription service that other applications can call over HTTP. It is designed for English, Spanish, and Catalan dictation workflows, with an OpenAI-compatible sync endpoint and an async job API for product flows that need queueing.
Use these docs when you need to install the service, connect an application, operate it on a server, or work with the Python package internals.
## Start with the shortest path
- Install and run locally
- Deploy with Docker
- Publish the docs to the VPS
- Call the HTTP API
- Operate the service
- Choose a model
- Integrate an application
- Read the Python package reference
## What AudioText includes
- FastAPI service with `/v1/audio/transcriptions` and async job endpoints.
- Scoped bearer API tokens with hashed storage.
- Admin login, CSRF protection, and CIDR controls for management routes.
- SQLite persistence for tokens, jobs, settings, model registry entries, audit events, and usage counters.
- CLI commands for setup, tokens, models, settings, workers, health checks, and benchmarks.
- Model runtime caching with unload-all and idle TTL controls.
- Docker, systemd, and launchd deployment templates.
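Because the service exposes an OpenAI-compatible sync endpoint with scoped bearer tokens, a basic client call can be sketched as below. Only the `/v1/audio/transcriptions` path comes from this page; the helper names, the base URL, the `model` form field, and the `text` response key are illustrative assumptions modeled on the OpenAI convention, not a documented contract.

```python
import requests  # third-party HTTP client, assumed available


def build_transcription_request(base_url: str, token: str, model: str = "base") -> dict:
    """Hypothetical helper: assemble the URL, headers, and form fields
    for a sync transcription call. Only the /v1/audio/transcriptions
    path is taken from the docs; the rest mirrors the OpenAI-compatible
    convention and may differ from your deployment."""
    return {
        "url": f"{base_url.rstrip('/')}/v1/audio/transcriptions",
        "headers": {"Authorization": f"Bearer {token}"},  # scoped bearer token
        "data": {"model": model},
    }


def transcribe(audio_path: str, base_url: str, token: str) -> str:
    # Send the audio file as multipart form data and return the transcript.
    req = build_transcription_request(base_url, token)
    with open(audio_path, "rb") as f:
        resp = requests.post(
            req["url"],
            headers=req["headers"],
            data=req["data"],
            files={"file": f},
        )
    resp.raise_for_status()
    return resp.json()["text"]
```

For queued product flows, the async job endpoints follow the same bearer-token scheme; see the API page for the exact routes.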
## What is intentionally out of scope
AudioText does not ship first-party client SDK packages; client apps should call the HTTP API using the OpenAI-compatible examples in these docs. Real-time streaming, speaker diarization, GPU deployment profiles, and paid-provider passthrough are planned extensions, not V1 defaults.