Available · Lebanon

Backend engineer.
Applied AI.
End to end.

I build the infrastructure that makes AI systems production-ready — from ML inference pipelines and real-time streaming APIs to the distributed backends they depend on.

transcription/stream.py
@router.post("/transcribe")
async def stream_transcription(
    audio: UploadFile,
    preset: DomainPreset = DomainPreset.GENERAL,
):
    segments = await whisper.transcribe(audio)

    async def generate():
        async for token in llm.stream(
            text=segments, preset=preset
        ):
            yield f"data: {token}\n\n"

    return StreamingResponse(
        generate(),
        media_type="text/event-stream",
    )
Wassim Bannout

About

Correctness first.
Performance always.

I'm Wassim — a software engineer from Lebanon with a focus on backend systems and applied machine learning. My work lives at the intersection of reliable infrastructure and intelligent systems: building the pipelines, APIs, and automation that make AI useful in production.

I care about the engineering decisions that hold under load — async I/O patterns, fault-tolerant ML pipelines, and system design that scales without needing constant human intervention. I'm not building demos; I'm building things meant to run.

< 3s · Audio → refined text, end to end
22 · Engineered ML features (OctoPredict)
4 · Automated ML lifecycle jobs

Projects

Selected work

AI · Backend

AI Transcription Engine

Whisper transcription → LLM refinement → SSE token stream, end to end in under 3 seconds on CPU.

Full-stack speech-to-text pipeline with real-time streaming. Audio is processed through faster-Whisper, then refined token-by-token by a local LLM via Server-Sent Events — five domain presets (meetings, interviews, lectures, technical, general) shape the output. Fully offline via Ollama; swappable to any OpenAI-compatible endpoint. Single-command Docker Compose deploy.

Python · FastAPI · faster-Whisper · SSE · TypeScript · React 19 · Ollama · Docker
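On the consuming side, the token stream is plain Server-Sent Events. A minimal sketch of how a client might pull tokens out of the stream (the helper name and framing details are my assumptions, not part of the project):

```python
def sse_tokens(lines):
    """Yield token payloads from the lines of an SSE stream.

    Minimal parser for the 'data: <token>' frames the endpoint emits;
    skips comment lines and blank keep-alive separators.
    """
    for line in lines:
        if line.startswith("data: "):
            yield line[len("data: "):]
```

Any HTTP client that exposes the response line-by-line (e.g. an iterator over a streaming response) can feed this directly.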
ML · Prediction

OctoPredict

Self-healing ML pipeline: 22 features, weekly retraining, four automated jobs — zero manual work post-deploy.

Football match prediction platform powered by XGBoost with isotonic calibration. Generates three-way win/draw/loss probabilities from 22 engineered features — Elo ratings, rolling form, head-to-head history, venue advantage. Four background jobs fully automate the ML lifecycle: fixture ingestion, result updates, model retraining, and prediction generation on schedule.

Python · XGBoost · FastAPI · Next.js 14 · aiosqlite · Docker
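Elo ratings are one of the engineered feature families. A sketch of a single Elo update with a home-advantage bonus, using standard textbook form; the k-factor and home bonus here are illustrative defaults, not OctoPredict's actual parameters:

```python
def elo_update(home_elo, away_elo, outcome, k=20.0, home_adv=60.0):
    """One Elo rating update after a match.

    outcome: 1.0 home win, 0.5 draw, 0.0 away win.
    k and home_adv are illustrative, not the project's real config.
    """
    # Expected home score, with a fixed home-advantage bonus applied
    expected = 1.0 / (1.0 + 10 ** ((away_elo - home_elo - home_adv) / 400))
    delta = k * (outcome - expected)
    return home_elo + delta, away_elo - delta
```

The update is zero-sum: whatever the home side gains, the away side loses, which keeps the rating pool stable across seasons.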
Dev Tooling · OSS

OSS Intelligence Dashboard

Concurrent GitHub ingestion, TTL caching, and a weighted scoring model to surface trending repos in real time.

Real-time GitHub repository monitor that flags trending OSS projects through a deliberate signal-over-noise model. Repos are scored (stars × 0.5 + forks × 0.3 + issues × 0.2) and flagged as trending above 50K points. Concurrent API ingestion via ThreadPoolExecutor, TTL caching for rate-limit control, and AI-generated health summaries via OpenAI API.

Python · Flask · GitHub API · ThreadPoolExecutor · TTL Cache · OpenAI API
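The weighted score above can be written down directly. The weights and the 50K threshold come from the description; the function names are mine:

```python
def repo_score(stars: int, forks: int, open_issues: int) -> float:
    # Weighted signal model: stars dominate, forks and issue activity follow
    return stars * 0.5 + forks * 0.3 + open_issues * 0.2

def is_trending(stars: int, forks: int, open_issues: int) -> bool:
    # Repos scoring above 50K points are flagged as trending
    return repo_score(stars, forks, open_issues) > 50_000
```

Keeping the score a pure function of three integers is what makes the concurrent ingestion path easy: workers fetch, score, and flag independently.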

Stack

Tools & Technologies

Languages
Java · Python · C · JavaScript · SQL · Bash
Frameworks
Spring · Spring Boot · Flask · Django
ML / Data
scikit-learn · XGBoost · Pandas · NumPy
Infrastructure
Docker · Linux · Git · GitHub

Contact

Let's work on
something worth
shipping.

Open to full-time engineering roles, technical collaborations, and hard problems. If the work is serious, so am I.