Staff Data Scientist specializing in experimentation systems, causal inference, and AI evaluation frameworks.
LLM Experimentation Platform - Conversational behavioral metrics and causal inference for evaluating prompt strategies, temperature settings, and model scaling
AI Evals v2 — Behavioral reliability and context-length evaluation for LLM systems
Experimentation Platform — CUPED / DiD / A/B testing decision framework
NYC TLC Forecasting — Demand modeling and product analytics