AI Transcription Engine
Whisper transcription → LLM refinement → SSE token stream, end to end in under 3 seconds on CPU.
Full-stack speech-to-text pipeline with real-time streaming. Audio is transcribed by faster-whisper, then refined token by token by a local LLM, with each token streamed to the client over Server-Sent Events. Five domain presets (meetings, interviews, lectures, technical, general) shape the output. Fully offline via Ollama, and swappable to any OpenAI-compatible endpoint. Single-command Docker Compose deploy.
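The token-by-token SSE stream can be sketched as below. This is a minimal illustration of the wire format only; the event payload shape (`{"token": ...}`) and the `[DONE]` sentinel are assumptions for the sketch, not this project's actual protocol.

```python
import json
from typing import Iterator


def sse_events(tokens: Iterator[str]) -> Iterator[str]:
    """Format refined LLM tokens as Server-Sent Events frames.

    Each frame is a `data:` line followed by a blank line, per the
    SSE format. A final sentinel frame signals end of stream.
    """
    for tok in tokens:
        # One JSON payload per token; the client appends tokens as they arrive.
        yield f"data: {json.dumps({'token': tok})}\n\n"
    # Sentinel so the client knows the refinement pass is complete.
    yield "data: [DONE]\n\n"


# Example: stream three refined tokens.
frames = list(sse_events(iter(["Hello", ",", " world"])))
```

Served with a `text/event-stream` content type, frames like these are what lets the browser render the refined transcript incrementally instead of waiting for the full LLM pass.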