AI Transcription Engine
Whisper transcription → LLM refinement → SSE token stream, end to end in under 3 seconds on CPU.
Full-stack speech-to-text pipeline with real-time streaming. Audio is transcribed by faster-whisper, then refined token by token by a local LLM, with each token streamed to the client over Server-Sent Events. Five domain presets (meetings, interviews, lectures, technical, general) shape the output. Fully offline via Ollama, and swappable to any OpenAI-compatible endpoint. Single-command Docker Compose deploy.
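The token-by-token SSE stream can be sketched as below. This is a minimal illustration of the wire format only; the event payload shape (`{"token": ...}`) and the `[DONE]` sentinel are assumptions for the sketch, not this project's actual protocol.

```python
import json
from typing import Iterator


def sse_events(tokens: Iterator[str]) -> Iterator[str]:
    """Format refined LLM tokens as Server-Sent Events frames.

    Each frame is a `data:` line followed by a blank line, per the
    SSE format. A final sentinel frame signals end of stream.
    """
    for tok in tokens:
        # One JSON payload per token; the client appends tokens as they arrive.
        yield f"data: {json.dumps({'token': tok})}\n\n"
    # Sentinel so the client knows the refinement pass is complete.
    yield "data: [DONE]\n\n"


# Example: stream three refined tokens.
frames = list(sse_events(iter(["Hello", ",", " world"])))
```

Served with a `text/event-stream` content type, frames like these are what lets the browser render the refined transcript incrementally instead of waiting for the full LLM pass.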