Lalith Kumar

Available for work

Data Scientist

Fairfax, Virginia

Data Scientist with 5 years of experience building, validating, and deploying statistical and machine learning models that quantify credit risk and forecast loan portfolio performance at scale. Skilled in predictive modeling, time series forecasting, clustering, and model governance across billion-record datasets using Python, Conda, AWS, Spark, and H2O. Experienced partnering with cross-functional teams of data scientists, software engineers, and product managers to translate ambiguous business problems into validated, production-ready solutions for bank volume, revenue, and expense forecasting.

01 Intro

About

02 Stack

Skills & Tools

Programming Languages

Python

SQL

PySpark

Scala

ML Frameworks & Libraries

Scikit-learn

XGBoost

TensorFlow

PyTorch

H2O AutoML

LightGBM

pandas & NumPy

Big Data & Cloud

Apache Spark

AWS (SageMaker, EC2, S3, Glue, EMR)

GCP (Vertex AI, BigQuery)

MLOps & Model Governance

MLflow

CI/CD & Git

Databases & Visualization

Power BI

PostgreSQL

Snowflake

Tableau

Amazon Redshift

Statistical & ML Methods

Time Series Forecasting

Classification & Regression

SHAP & Model Explainability

A/B Testing & Backtesting

03 Career

Experience

Data Scientist

2023-11 — Present

Apollo Global Management

Partnered with a cross-functional team of data scientists, software engineers, and product managers to build machine learning models forecasting loan portfolio volume, revenue, and credit losses across a multi-billion-dollar book.
Used Python, Conda, AWS, H2O AutoML, and Apache Spark to surface insights from large volumes of structured and unstructured loan, market, and transaction data, informing credit risk and portfolio strategy decisions.
Built classification and time-series forecasting models end to end on Amazon SageMaker, from design through training, evaluation, validation, and deployment, improving credit loss forecast precision by 22%.
Developed challenger models to benchmark against deployed champion models, applying rigorous backtesting and statistical validation, including confusion matrix and ROC/AUC analysis, to recommend production updates.
Built scalable PySpark and SQL pipelines on AWS Glue and EMR to retrieve, combine, and analyze data from structured and unstructured sources, improving data availability and reproducibility by 40% across the team.
Presented loss outlook and allowance impacts to non-technical executives through clear visualizations and storytelling, translating complex model risk findings into actionable decisions for senior stakeholders.

Data Scientist

2020-03 — 2023-08

Elevance Health

Built, validated, and backtested classification and risk-scoring models using Scikit-learn, XGBoost, and logistic regression on large-scale claims datasets, achieving an 87% F1-score and 92% recall in production.
Developed a clinical risk stratification platform that automated patient risk scoring, improving population risk segmentation accuracy by 22% and cutting manual triage time by 30% for the care management team.
Built scalable ETL pipelines with Python, PySpark, and SQL on AWS to combine claims, billing codes, and clinical notes into high-quality, analysis-ready datasets for downstream modeling and reporting.
Tuned models via grid search and Bayesian optimization and applied SHAP explainability to deliver statistically grounded interpretations to clinical and compliance stakeholders ahead of model approval.
Partnered with product, actuarial, and engineering teams across Agile sprints to translate model outputs into actionable insights, presenting findings to leadership through Tableau and Power BI dashboards.

04 Studies

Education

Master of Science in Computer Science (STEM)

University of Central Missouri

05 Credentials

Certificates

Google Cloud Professional Machine Learning Engineer

Google Cloud

Databricks Generative AI Fundamentals

Databricks

Databricks AI Agent Fundamentals

Databricks

06 Say hi

Let's Connect

Got an idea, a role, or a problem worth solving? Drop me a message.