Tarek Masryo tarekmasryo

AI/ML Engineer building production-ready ML and Generative AI systems across modeling, serving, evaluation, monitoring, and decision support.
From validated data pipelines → model evaluation → deployed APIs → decision-support systems.

🧭 What I build

Area	What it means in practice
Production ML systems	Leakage-safe pipelines, calibrated outputs, threshold policies, reproducible artifacts, and decision-ready outputs
Generative AI & RAG	Retrieval attribution, hallucination exposure, tool-calling agents, structured outputs, and quality checks before rollout
APIs & serving	Dockerized FastAPI services, strict schemas, versioned artifacts, and CI-friendly delivery
Monitoring & ops	Telemetry, drift signals, cost/latency trade-offs, triage thresholds, and operator-facing workflows with clear handoff logic
Applied NLP & CV	Text classification, semantic search, threshold tuning, explainability, image restoration, and practical computer vision apps

🌟 Featured work

Project	What it demonstrates
Fraud Risk Ops Platform	FastAPI inference, Streamlit analyst UI, calibrated risk scores, threshold policies, audit logs, batch jobs, and monitoring hooks
RAG QA Command Center	Retrieval quality evaluation, hallucination exposure, trace review, configuration trade-offs, and policy simulation
LLMOps Telemetry Command Center	Reliability, latency, cost, routing-policy review, drift signals, triage thresholds, and evidence exports
Advanced ML Sentiment Lab	TF-IDF pipelines, ROC/PR evaluation, threshold tuning, error review, live prediction, and exportable artifacts
Health Intelligence Platform	Behavioral risk analytics, cohort KPIs, threshold diagnostics, feature importance, trends, and scenario simulation
Old Photo Restorer	Gradio computer vision app with GFPGAN restoration, optional upscaling, before/after preview, and batch ZIP export

📊 Analytics & decision apps

Project	Focus
Short-Video Intelligence Dashboard	Virality scoring, engagement metrics, creator leaderboards, timing patterns, and segment benchmarks
EV Charging Analytics	Geospatial infrastructure analytics, fast-DC allocation scenarios, market slices, and network planning
Football Matches Dashboard	European football and UCL analytics: KPIs, standings, team explorer, head-to-head, and interactive match tables
Seaborn & Matplotlib Visual Lab	Interactive Streamlit lab to build, compare, and export Seaborn vs Matplotlib charts with UI controls and generated code snippets
Hugging Face QuickStart Tool	Gradio tool that converts model/repo URLs into run commands, download snippets, file views, risk hints, and ZIP scaffolds

🧠 Selected ML, NLP & healthcare-style workflows

Project	Focus
Road Accident Risk Prediction	Two-stage risk scoring with LightGBM, XGBoost residual modeling, NNLS blending, stable OOF evaluation, and interpretable risk features
Cancer Risk Analysis	Clean tabular data, validation, leakage-aware benchmarking, and interpretable risk modeling for educational and analytical use
Clinical Deterioration Early Warning	12-hour deterioration baseline with tabular models, probability ensembling, and cost-based threshold policy tables
Pima Diabetes Pipeline	End-to-end diabetes risk pipeline with EDA, feature engineering, calibration, cost-aware thresholding, and deployable artifacts
SMS Spam Detection	Dual TF-IDF pipeline with calibrated Linear SVM, nested CV, threshold tuning, explainability, and robustness checks
Text Sentiment Analysis	IMDB sentiment pipeline with calibrated TF-IDF baselines, threshold tuning, explainability, and a BiLSTM baseline

📦 Selected data products

Dataset	What it enables
RAG QA Logs & Corpus	RAG evaluation with QA logs, retrieval events, corpus documents, and evidence-style review workflows
LLM Production Telemetry	Offline LLMOps telemetry for reliability, latency, cost, routing, drift, and triage-policy review
Cancer Risk Factors	Health, lifestyle, environmental, and genetic features for leakage-aware risk modeling
Global EV Infrastructure	Standardized EV charging data for geospatial analytics, planning, and network modeling
YouTube Shorts & TikTok Trends 2025	Short-form content analytics, trend exploration, creator benchmarks, and virality analysis
Digital Lifestyle & Mental Wellness	Behavioral signals for wellbeing analytics, cohort exploration, and predictive workflows

🛠️ Stack

Category	Tools
Languages & Core
Data & Analytics
ML / DL
NLP / CV / LLM
Apps & Interfaces
APIs & Serving
Monitoring & Quality

🤝 Open to collaborating on

🚀 Production ML & GenAI systems: FastAPI services, Dockerized delivery, evaluation-first workflows, and review-ready outputs
🧠 RAG reliability: retrieval attribution, grounded outputs, guardrails, and regression-friendly review
🗂️ Validated data products: clean schemas, documented pipelines, reusable notebooks, and ML-ready artifacts
📊 Decision-support tooling: monitoring, threshold policies, analytics interfaces, and operator workflows

Best contact: LinkedIn

If the work is useful, a ⭐ helps others find it.
