I build applied
ML Systems
that ship.
Specializing in model deployment, data pipelines, and backend systems that scale in production.
0 → 1 Ownership
Case Study: Failure-Aware ML System
How I built a cost-sensitive classification pipeline that reduced manual review by 68% while maintaining >96% recall on high-risk outcomes.
Phase 1: Problem Framing & Data
Framed the problem as asymmetric loss optimization, not accuracy. Loaded 1.3M Lending Club records with stratified train/val/test splits.
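The setup above can be sketched as follows. The column shapes, the 60/20/20 split ratios, and the 20% default rate are illustrative assumptions, not the original pipeline:

```python
# Sketch of the Phase 1 data setup: stratified splits that preserve the
# rare high-risk class in every fold. Shapes and rates are stand-ins for
# the 1.3M Lending Club records, not the actual dataset.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 5))
y = (rng.random(10_000) < 0.2).astype(int)  # 1 = default (high-cost class)

# Stratification keeps the default rate near-identical across splits,
# which matters when the loss is asymmetric and the positive class is rare.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=42
)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, stratify=y_tmp, random_state=42
)
```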
Phase 2: Cascade Architecture
Built two-stage inference: Logistic Regression gatekeeper for fast "easy" decisions, XGBoost specialist for uncertain cases.
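A minimal sketch of that cascade, with assumed values: the 0.15/0.85 confidence band is illustrative, and sklearn's GradientBoostingClassifier stands in for XGBoost to keep the example self-contained:

```python
# Two-stage cascade sketch: a cheap linear gatekeeper decides confident
# cases; only the uncertain band is routed to the expensive specialist.
# Band edges and the specialist model are illustrative assumptions.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(2_000, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=2_000) > 0).astype(int)

gatekeeper = LogisticRegression().fit(X, y)
specialist = GradientBoostingClassifier().fit(X, y)

def cascade_predict(X, low=0.15, high=0.85):
    """Gatekeeper handles confident cases; specialist handles the rest."""
    p = gatekeeper.predict_proba(X)[:, 1]
    confident = (p <= low) | (p >= high)
    out = np.where(p >= high, 1, 0)
    if (~confident).any():
        out[~confident] = specialist.predict(X[~confident])
    return out, confident.mean()

preds, frac_easy = cascade_predict(X)
```

The payoff is latency and cost: most traffic never touches the expensive model.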
Phase 3: Three-Way Triage
Implemented Pass/Defer/Reject decision policy. Optimized thresholds for expected business loss, not accuracy.
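The policy can be sketched as two thresholds scored by expected loss. The cost matrix below (10x penalty for a missed high-risk case, fixed review cost) is an illustrative assumption, not the production values:

```python
# Pass/Defer/Reject sketch: probabilities map to three actions, and
# threshold pairs are scored by expected business loss, not accuracy.
# Cost values are illustrative assumptions.
import numpy as np

C_FN, C_FP, C_DEFER = 10.0, 1.0, 0.3

def triage(p, t_pass, t_reject):
    """0 = Pass, 1 = Defer to manual review, 2 = Reject."""
    return np.where(p >= t_reject, 2, np.where(p >= t_pass, 1, 0))

def expected_loss(p, y, t_pass, t_reject):
    d = triage(p, t_pass, t_reject)
    fn = ((d == 0) & (y == 1)).sum()   # auto-passed a high-risk case
    fp = ((d == 2) & (y == 0)).sum()   # auto-rejected a good case
    return (C_FN * fn + C_FP * fp + C_DEFER * (d == 1).sum()) / len(y)

rng = np.random.default_rng(0)
y = (rng.random(5_000) < 0.2).astype(int)
p = np.clip(0.3 + 0.4 * y + 0.15 * rng.normal(size=5_000), 0, 1)

wide = expected_loss(p, y, 0.2, 0.8)   # large defer band, more review cost
none = expected_loss(p, y, 0.5, 0.5)   # degenerate: every case forced
```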
Phase 4: Validation & Impact
Validated on the held-out test set. The system achieves 98.7% recall while automating 67% of decisions, a 68% reduction in manual review.
Outcome: 32% reduction in high-cost false negatives with bounded precision trade-offs.
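For clarity on how the two headline numbers are computed on a held-out set: recall on the high-risk class, and the share of cases decided without a human. The labels and decision policy below are synthetic stand-ins, not the reported results:

```python
# Metric computation sketch. Decisions: 0 = auto-Pass, 1 = Defer to
# human, 2 = auto-Reject. The stand-in policy below never auto-passes a
# high-risk case; real numbers come from the actual triage output.
import numpy as np

rng = np.random.default_rng(0)
y = (rng.random(10_000) < 0.2).astype(int)  # 1 = high-risk outcome
decision = np.where(y == 1,
                    rng.choice([1, 2], size=10_000, p=[0.3, 0.7]),
                    rng.choice([0, 1], size=10_000, p=[0.7, 0.3]))

automation_rate = (decision != 1).mean()     # cases needing no human
# A high-risk case counts as caught unless it was auto-passed.
recall = ((decision != 0) & (y == 1)).sum() / (y == 1).sum()
```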
Key Tradeoffs
- Threshold Policy over Model Complexity: Decision policy dominates representation under asymmetric loss. Tuning thresholds beat adding model layers.
- Cascade over Single Model: Two-stage inference lets a fast gatekeeper handle easy cases, reserving the expensive specialist for uncertain inputs.
- Defer over Forced Decision: Three-way triage (Pass/Defer/Reject) acknowledges uncertainty rather than forcing bad predictions.
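The first tradeoff can be made concrete: under asymmetric loss, moving the decision threshold away from the 0.5 that accuracy implicitly assumes buys more than extra model capacity. The 10:1 cost ratio and synthetic scores are assumptions for illustration:

```python
# Threshold policy vs model complexity: with a 10x false-negative
# penalty (assumed), a tuned threshold on the same scores beats the
# accuracy-default threshold of 0.5.
import numpy as np

C_FN, C_FP = 10.0, 1.0

rng = np.random.default_rng(1)
y = (rng.random(20_000) < 0.2).astype(int)
# Imperfect scores correlated with the label.
p = np.clip(0.3 + 0.4 * y + 0.15 * rng.normal(size=20_000), 0, 1)

def loss(th):
    pred = (p >= th).astype(int)
    fn = ((pred == 0) & (y == 1)).sum()
    fp = ((pred == 1) & (y == 0)).sum()
    return (C_FN * fn + C_FP * fp) / len(y)

ths = np.linspace(0.05, 0.95, 91)
best_th = ths[np.argmin([loss(t) for t in ths])]
```

Because misses cost 10x false alarms, the loss-optimal threshold lands well below 0.5; no additional model layers were needed to capture that gain.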
What I Cut (Scope Discipline)
"Optimize for decision quality, not model sophistication."
- No deep learning (XGBoost sufficient for tabular)
- No real-time inference (batch decisions OK for credit)
- No automated retraining (concept drift v2 roadmap)
Selected Projects
Focused on bridging the gap between research innovation and production-grade software.
Systemic Lifecycle
How I think about building end-to-end ML products.
Ingestion & Data Flow
Automated pipelines that ingest terabytes of raw data, ensuring high fidelity and low-latency storage.
Spark, Kafka, Snowflake
Training & Optimization
Scalable training loops with distributed GPU support and integrated hyperparameter tuning.
PyTorch, CUDA, Horovod
Evaluation & Validation
Rigorous testing frameworks that detect drift, bias, and edge-case failures before deployment.
MLflow, Weights & Biases
Deployment & Inference
Optimized model serving using TensorRT and Triton for sub-millisecond response times.
K8s, Triton, FastAPI
Monitoring & Observability
Full-stack monitoring of model performance and hardware health in live production environments.
Prometheus, Grafana
Publications & Research
Subtrajectory Clustering with ML on QC
SSTDM25 - Short Paper (Camera Ready)
Key Contributions
- Integrated Quantum Machine Learning kernels into traditional clustering workflows.
- Developed a novel distance metric for high-dimensional subtrajectory similarity.
- Optimized tensor network contractions, achieving 40% speedup in simulation.
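The general pattern behind the first contribution can be sketched without quantum hardware: a kernel matrix computed elsewhere (e.g. by a quantum feature map) is consumed as a precomputed affinity by a classical clustering step. The RBF stand-in kernel and the choice of SpectralClustering are illustrative assumptions, not the paper's method:

```python
# Precomputed-kernel clustering sketch. A quantum kernel would be
# substituted for the classical RBF evaluation below; the clustering
# step is agnostic to where the kernel matrix came from.
import numpy as np
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(0)
# Stand-in subtrajectory features: two well-separated groups.
X = np.vstack([rng.normal(0, 0.3, size=(30, 4)),
               rng.normal(3, 0.3, size=(30, 4))])

# Kernel matrix (here: classical RBF as a placeholder).
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq / 2.0)

labels = SpectralClustering(n_clusters=2, affinity="precomputed",
                            random_state=0).fit_predict(K)
```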
My Role
- Lead developer for the open-source implementation in Python & Qiskit.
- Designed and executed large-scale benchmarking across 4 synthetic datasets.
- Authored the methodologies and results sections for the final manuscript.