AudioText documentation

AudioText is a CPU-first transcription service that other applications can call over HTTP. It is designed for English, Spanish, and Catalan dictation workflows, with an OpenAI-compatible sync endpoint and an async job API for product flows that need queueing.

Use these docs when you need to install the service, connect an application, operate it on a server, or work with the Python package internals.
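As a quick orientation, the sync endpoint mirrors the OpenAI transcription API shape. The sketch below builds the URL and auth headers for such a call; the base URL, token value, and model name are placeholders, and the multipart request shape is assumed to match the OpenAI API since the endpoint is described as OpenAI-compatible.

```python
# Minimal client sketch for the OpenAI-compatible sync endpoint.
# Host, port, token, and model name below are illustrative assumptions;
# substitute the values from your own deployment.

def build_transcription_request(base_url: str, token: str):
    """Return the URL and headers for a sync transcription call."""
    url = f"{base_url}/v1/audio/transcriptions"
    headers = {"Authorization": f"Bearer {token}"}  # scoped bearer token
    return url, headers

url, headers = build_transcription_request("http://localhost:8000", "example-token")

# The audio itself goes in a multipart/form-data body. With the `requests`
# package installed, the actual call would look roughly like:
#   requests.post(url, headers=headers,
#                 files={"file": open("clip.wav", "rb")},
#                 data={"model": "small"})
```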

Start with the shortest path: install the service, create a token, and send a first transcription request before exploring the rest of the docs.

What AudioText includes

  • FastAPI service with /v1/audio/transcriptions and async job endpoints.
  • Scoped bearer API tokens with hashed storage.
  • Admin login, CSRF protection, and CIDR controls for management routes.
  • SQLite persistence for tokens, jobs, settings, model registry entries, audit events, and usage counters.
  • CLI commands for setup, tokens, models, settings, workers, health checks, and benchmarks.
  • Model runtime caching with unload-all and idle TTL controls.
  • Docker, systemd, and launchd deployment templates.
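For product flows that use the async job API, a client typically submits a job and then polls until it reaches a terminal state. The helper below is a minimal polling sketch; the status field name and terminal values ("done", "failed") are assumptions for illustration, and `fetch_status` stands in for whatever HTTP GET your client makes against the job endpoints.

```python
# Sketch of an async-job polling loop. The "status" field and the
# terminal values "done"/"failed" are assumptions; check the service's
# OpenAPI schema for the real response shape.
import time

def wait_for_job(fetch_status, job_id: str,
                 poll_seconds: float = 1.0, timeout: float = 60.0):
    """Poll fetch_status(job_id) until the job reports a terminal status."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = fetch_status(job_id)
        if job["status"] in ("done", "failed"):
            return job
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
```

In a real client, `fetch_status` would issue an authenticated GET against the job endpoint using the same bearer token as the sync call.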

What is intentionally out of scope

AudioText does not ship first-party client SDK packages. Client applications should call the HTTP API directly, following the OpenAI-compatible examples in these docs. Streaming transcription, speaker diarization, GPU deployment profiles, and paid-provider passthrough are planned extensions, not V1 defaults.