Your evals chose the wrong model
Model A beats Model B on benchmarks, but performs worse in production. Offline metrics don't capture real-world failure modes or distribution shifts.
System-level ML audits by a Principal Machine Learning Engineer. I focus on model choice, evaluation, and inference decisions—and how they interact with deployment—to cut latency, reduce costs, and stabilize production behavior. No fluff. Just results.
Book a 30-Minute ML Systems Audit
Principal ML Engineer · Former NSF-funded researcher · Production AI at scale
If your ML system behaves strangely, costs too much, or falls apart at scale—I've probably seen why. Here are patterns I diagnose regularly:
The model is fast, but end-to-end latency is terrible. Preprocessing, tokenization, or serialization overhead destroys performance.
Your fine-tuned model scores better but users complain. Objective mismatch or overfitting masked by evaluation blind spots.
Retrieval quality degrades at scale. Context windows overflow silently. Latency balloons with corpus size.
Nothing changed in code, but outputs degraded. Unversioned prompts and templates create non-reproducible behavior.
What worked in development becomes financially unsustainable in production. No one modeled the real cost structure.
30 minutes
Rapid system-level diagnosis of your ML pipeline. Live session identifying bottlenecks, risks, and immediate wins.
1-2 weeks
Comprehensive analysis of your ML system: evals, model choices, inference paths, deployment, and observability.
2-4 weeks
Fix a specific critical issue. Design the solution, review implementation, unblock hard problems.
Davix Labs is led by Marcel Bischoff, a Principal Machine Learning Engineer with a background in mathematical physics and NSF-funded research. Marcel focuses on real-world AI systems in production—diagnosing ML decisions that break at scale and optimizing for performance, reliability, and cost.
Ready to diagnose what's breaking in your ML system?
Schedule a 30-Minute Audit
Or email: marcel@localconformal.net