Karthik Mettu
I build production ML systems that deliver measurable business outcomes — from real-time fraud scoring at PayPal to demand forecasting across 60K+ SKUs at Accenture.
By The Numbers
Professional Experience
Data Scientist — Fraud Detection & Generative AI
2024
- Improved fraud recall by 12–18% through feature engineering (transaction velocity, device behavior, merchant risk) with zero increase in false positives
- Designed a Generative AI fraud explanation system using AWS Bedrock with structured prompt-engineering pipelines grounded in verified model outputs
- Reduced analyst investigation time by 25–35% and clarification requests by 22% through human-readable LLM explanations
- Built explainability data layer on Snowflake (materialized views, partitioned tables) meeting millisecond-level SLAs
- Implemented hallucination-control metrics via MLflow; reduced misleading explanations by 30% through A/B testing
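As an illustration of the transaction-velocity features mentioned above, here is a minimal sketch. It is not the production implementation: the `(account_id, timestamp)` input shape, the one-hour window, and the function name `transaction_velocity` are all assumptions made for this example.

```python
from collections import defaultdict, deque


def transaction_velocity(transactions, window_seconds=3600):
    """Count each account's prior transactions inside a trailing time window.

    `transactions` is an iterable of (account_id, unix_timestamp) pairs
    (a hypothetical schema). Returns (account_id, timestamp, velocity)
    tuples in timestamp order; the current transaction is not counted.
    """
    recent = defaultdict(deque)  # account_id -> timestamps still in the window
    velocities = []
    for account_id, ts in sorted(transactions, key=lambda t: t[1]):
        window = recent[account_id]
        # Drop timestamps that have fallen out of the trailing window.
        while window and ts - window[0] > window_seconds:
            window.popleft()
        velocities.append((account_id, ts, len(window)))
        window.append(ts)
    return velocities


sample = [("a", 0), ("a", 100), ("b", 150), ("a", 4000), ("a", 200)]
result = transaction_velocity(sample)
```

A burst of activity on one account shows up as a rising count, which a downstream model can use alongside device and merchant signals.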
Data Scientist — Demand Forecasting & Supply Chain
2021 – 2023
- Improved forecast accuracy by 23% over legacy system using hybrid ARIMA/SARIMA/Prophet + CatBoost models across 60K+ SKUs and 1,200+ stores
- Reduced peak-season forecast errors by 27% with probabilistic boosting for high-volatility SKUs
- Designed hierarchical forecasting (SKU → store → region), improving regional accuracy by 19%
- Built automated ETL pipelines (Python, SQL, Airflow) processing 100M+ records/day
- Implemented anomaly detection (Isolation Forest, z-score) reducing forecast failures by 34%
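The z-score half of the anomaly detection step can be sketched roughly as follows. This is a simplified, standard-library illustration and not the pipeline itself: the function name `zscore_anomalies`, the default threshold, and the sample data are assumptions, and the Isolation Forest component is not shown.

```python
from statistics import mean, stdev


def zscore_anomalies(values, threshold=3.0):
    """Return indices of values whose absolute z-score exceeds `threshold`.

    Uses the sample standard deviation. Returns an empty list when there
    are fewer than two points or when all values are identical.
    """
    if len(values) < 2:
        return []
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical daily unit sales for one SKU; the last value is a spike.
daily_units = [10, 11, 9, 10, 12, 10, 11, 9, 10, 100]
flagged = zscore_anomalies(daily_units, threshold=2.5)
```

With short series a single extreme point also inflates the standard deviation, so a lower threshold (2.5 here) or a robust variant based on the median may be a better fit than the usual cutoff of 3.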
Featured Projects
Each project tells a story — click any card to read the full case study with problem, approach, results, and business value.
Audiobook Pipeline (Flagship)
Emotion-aware text-to-speech with sentence-level sentiment, Kokoro-82M voice blending, and Karpathy-style autoresearch optimization. Shipped with Streamlit demo.
Movie Recommender System
Dual-approach recommendation engine combining content-based similarity with collaborative filtering, deployed as an interactive Streamlit web app.
Sentiment Analysis Engine
Transformer-based NLP classifier achieving 95% accuracy on large-scale social media data. Fine-tuned for nuanced sentiment detection beyond positive/negative/neutral.
Diabetes Risk Predictor
ML-powered risk assessment for diabetes using logistic regression, random forest, and clinical feature engineering for early intervention.
Multivariate Statistical Analysis
Applied PCA, factor analysis, and clustering to high-dimensional datasets, reducing noise and uncovering hidden structure for downstream ML pipelines.
Data Storytelling Dashboards
Interactive dashboards and data storytelling across Tableau, Power BI, and Python, translating complex analytics into stakeholder-friendly narratives.
Suicide Rate Forecasting
Time-series forecasting with ARIMA and Prophet for public health resource allocation, delivering reliable multi-year projections for intervention planning.
Technical Skills
Languages
Machine Learning & AI
Data Engineering
Cloud & MLOps
Visualization
Statistics
Education
MS in Statistics & Data Science
30-credit-hour program covering the full data science stack, from statistical theory through deep learning to production analytics.
Get In Touch
Open to Data Scientist, ML Engineer, and Applied Scientist roles across the US. Let's talk.