Benchmark AudioText

Use benchmarks to compare models and tuning settings. Run them after the models have already been downloaded so download time does not skew the results, and do not include private audio clips in the repository.

Keep local audio out of git

Put private clips under a local ignored directory:

mkdir -p audio/benchmarks
cp ~/Downloads/clip-1.opus audio/benchmarks/
cp ~/Downloads/clip-2.opus audio/benchmarks/

The repository ignores common audio file extensions and local audio folders.
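The exact patterns depend on the repository's actual .gitignore; a typical set for this layout might look like the following (a sketch, not the repo's verbatim rules):

```gitignore
# Common audio formats (illustrative patterns; check the repo's real .gitignore)
*.opus
*.mp3
*.wav
*.flac
*.m4a
*.ogg

# Local audio folders
audio/
```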

Run the benchmark CLI

Preload the model first if you want to exclude first-load time from the transcription run:

uv run audiotext models preload cpu-lite
uv run audiotext benchmark run \
  --audio audio/benchmarks/clip-1.opus \
  --audio audio/benchmarks/clip-2.opus \
  --models cpu-lite,cpu-turbo \
  --languages auto,ca,es,en \
  --beam-sizes 1,5 \
  --cpu-threads 0,2,4 \
  --num-workers 1 \
  --vad-modes true,false \
  --output markdown

The command reuses one model manager for the whole run, so repeated cases for the same runtime show cache_hit=true after the first load. It records wall time, CPU seconds, approximate CPU percent, model load time, transcription time, peak RSS, cache-hit state, and a transcript preview. Use --output json when you want machine-readable results.
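The per-case timing metrics can be reproduced with the standard library alone. Below is a minimal sketch, assuming a `transcribe` callable standing in for the real transcription call; the function and field names are illustrative, not the CLI's internals:

```python
import resource
import sys
import time


def measure(transcribe, audio_path):
    """Measure one benchmark case; `transcribe` is a stand-in for the real call."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # user + system CPU seconds for this process

    result = transcribe(audio_path)

    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    # Approximate CPU percent: CPU seconds consumed per wall-clock second.
    cpu_percent = 100.0 * cpu / wall if wall > 0 else 0.0

    # ru_maxrss is reported in kilobytes on Linux but bytes on macOS.
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    peak_rss_mb = peak / (1024 if sys.platform != "darwin" else 1024 * 1024)

    return {
        "wall_s": wall,
        "cpu_s": cpu,
        "cpu_percent": cpu_percent,
        "peak_rss_mb": peak_rss_mb,
        "preview": str(result)[:80],
    }
```

Note that a multi-threaded transcription run can show an approximate CPU percent well above 100, since CPU seconds accumulate across all threads while wall time does not.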

Compare speed settings

For the API, compare:

  • forced language=ca, language=es, or language=en against auto,
  • beam_size=1 against larger beams,
  • vad_filter=true against vad_filter=false,
  • cpu_threads=0 against fixed thread counts,
  • num_workers=1 against larger worker counts for concurrent service loads.
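Keep in mind that the sweep is a Cartesian product of the flag values, so cases multiply quickly. A sketch of the expansion for the example invocation above (the variable names are illustrative, not the CLI's internals):

```python
from itertools import product

# Values taken from the example benchmark invocation.
models = ["cpu-lite", "cpu-turbo"]
languages = ["auto", "ca", "es", "en"]
beam_sizes = [1, 5]
cpu_threads = [0, 2, 4]
vad_modes = [True, False]
audio_files = ["clip-1.opus", "clip-2.opus"]

cases = list(product(models, languages, beam_sizes, cpu_threads, vad_modes, audio_files))
print(len(cases))  # 2 * 4 * 2 * 3 * 2 * 2 = 192 transcription cases
```

Trimming any one axis (for example, dropping `--cpu-threads 0,2,4` to a single value) divides the total run count accordingly.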

Treat model quality as a product benchmark, not just a speed benchmark. Keep a small labelled set of representative clips for manual review or future WER scoring, but do not commit audio or transcripts that contain sensitive data.
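If you later add WER scoring over the labelled set, word error rate is the word-level edit distance between hypothesis and reference, divided by the reference word count. A minimal self-contained sketch:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        return 0.0 if not hyp else 1.0

    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1] / len(ref)
```

A perfect transcript scores 0.0; note that heavy insertion errors can push WER above 1.0, since the distance is normalised by reference length only.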