NYU Stern MSBAI Capstone
DriftBreaker
Credit Risk Model Drift Detection Using Survival Analysis
Framework Architecture
BI Analytics
Portfolio metrics, originations, defaults, book composition, cumulative PD curves
Model Engine
Discrete-time survival, PSI monitoring, early default sentinel, macro/micro attribution
Strategy
Margin analysis, variance attribution, required rate calculation, action matrix
Project Objectives
- Detect Model Drift: Build a framework to identify when credit risk models deviate from expected performance using survival analysis techniques.
- Attribution Analysis: Decompose drift into macro (economic) and micro (underwriting) components to inform remediation strategies.
- P&L Impact: Translate model drift into financial terms — margin compression, variance attribution, and required rate calculations.
- Actionable Decisions: Provide clear EXIT / REPRICE / MONITOR recommendations by segment based on quantitative analysis.
Lending Club
Pioneer in Peer-to-Peer Lending
Lending Club was founded in 2006 and became the world's largest peer-to-peer lending platform, facilitating over $50 billion in loans before transitioning to a neobank model in 2020. The platform connected borrowers seeking personal loans with investors looking for yield, disrupting traditional banking by removing the intermediary.
Company Timeline
Platform Statistics
Grading System
Lending Club assigned grades A through G to borrowers based on creditworthiness, with subgrades 1-5 within each letter grade. This risk stratification determined interest rates and was central to investor decision-making.
Portfolio Summary
Lending Club data (2007-2018)
By Segment
| Segment | Loans | Exposure | Default Rate (Count) | Default Rate ($) |
|---|---|---|---|---|
| High Risk | 188,630 | $3.41B | 30.38% | 31.53% |
| Medium Risk | 973,157 | $14.86B | 15.96% | 16.07% |
| Low Risk | 1,096,132 | $15.72B | 6.47% | 6.19% |
By Term
| Term | Loans | Exposure | Default Rate |
|---|---|---|---|
| 36 months | 1,607,316 | $20.50B | 10.70% |
| 60 months | 650,603 | $13.50B | 17.15% |
Timing Metrics
Originations vs Defaults
Annual and cumulative volume analysis
Annual Metrics
| Vintage | Loans | Volume | Defaults | Default Rate |
|---|---|---|---|---|
| 2007 | 251 | $2.2M | 45 | 17.93% |
| 2008 | 1,562 | $14.4M | 247 | 15.81% |
| 2009 | 4,716 | $46.4M | 594 | 12.60% |
| 2010 | 11,536 | $122.1M | 1,487 | 12.89% |
| 2011 | 21,721 | $261.7M | 3,297 | 15.18% |
| 2012 | 53,367 | $718.4M | 8,644 | 16.20% |
| 2013 | 134,814 | $1.98B | 21,030 | 15.60% |
| 2014 | 235,629 | $3.50B | 41,408 | 17.57% |
| 2015 | 421,095 | $6.42B | 76,851 | 18.25% |
| 2016 | 434,407 | $6.40B | 71,666 | 16.50% |
| 2017 | 443,579 | $6.59B | 44,854 | 10.11% |
| 2018 | 495,242 | $7.94B | 13,460 | 2.72% |
Early Default Analysis
First 12-month performance indicators
Early Default Rates by Vintage
| Vintage | PD6 | PD12 | Ultimate PD | Early/Ultimate Ratio |
|---|---|---|---|---|
| 2014 | 1.56% | 5.20% | 17.57% | 29.6% |
| 2015 | 1.72% | 5.43% | 18.25% | 29.8% |
| 2016 | 2.16% | 6.68% | 16.50% | 40.5% |
| 2017 | 2.15% | 6.27% | 10.11% | 62.0% |
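The Early/Ultimate Ratio column is consistent with dividing PD12 by Ultimate PD. A minimal sketch that reproduces it from the table values (the dictionary layout and function name are illustrative, not taken from the project code):

```python
# Vintage figures copied from the table above, in percent.
vintages = {
    2014: {"pd12": 5.20, "ultimate": 17.57},
    2015: {"pd12": 5.43, "ultimate": 18.25},
    2016: {"pd12": 6.68, "ultimate": 16.50},
    2017: {"pd12": 6.27, "ultimate": 10.11},
}


def early_to_ultimate(pd12, ultimate_pd):
    """Share of ultimate defaults that had already occurred by month 12."""
    return pd12 / ultimate_pd


# Ratio per vintage, in percent, rounded to one decimal like the table.
ratios = {
    year: round(100 * early_to_ultimate(v["pd12"], v["ultimate"]), 1)
    for year, v in vintages.items()
}
```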
Book Composition
Risk segment distribution over time
Default Curves
Cumulative PD development by vintage
Hazard Distribution
Survival Analysis
Discrete-time hazard modeling framework
Model Specification
Where:
h(t|X) = hazard at time t given covariates X
αt = baseline hazard (time-varying intercept)
β = coefficient vector for predictors
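The equation these definitions refer to is not present in this copy. A common discrete-time hazard specification that matches the three symbols is the logit-link form below. The choice of the logit link is an assumption (the complementary log-log link is another standard option), so treat this as a sketch rather than the project's exact formula.

```latex
\operatorname{logit} h(t \mid X)
  = \log \frac{h(t \mid X)}{1 - h(t \mid X)}
  = \alpha_t + \beta^{\top} X
```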
Key Covariates
- Grade (A-G risk classification)
- DTI ratio
- Annual income
- Employment length
- Home ownership status
- Loan purpose
Model Performance
Drift Detection
Population Stability Index and performance monitoring
| Metric | Threshold | Current | Status |
|---|---|---|---|
| PSI (Overall) | < 0.10 | 0.100 | MODERATE |
| A/E Ratio | 0.8 - 1.2 | 1.15x | SIGNIFICANT |
| KS Statistic (Avg) | < 0.05 | 0.074 | SIGNIFICANT |
Drift Detection Techniques Explained
The DriftSentinel system uses multiple statistical tests ("Dirty Dozen") to comprehensively detect distribution shifts between training and test data:
1. Population Stability Index (PSI)
Measures the difference between expected (training) and actual (test) score distributions. PSI < 0.10 indicates a stable population, values between 0.10 and 0.25 indicate a moderate shift, and PSI > 0.25 signals a significant shift requiring model recalibration.
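A minimal PSI sketch in plain Python. The equal-width binning over the baseline range, the bin count, and the `eps` floor for empty bins are assumptions; the source does not say how DriftSentinel bins scores.

```python
import math


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index between baseline and current scores."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins

    def shares(scores):
        counts = [0] * n_bins
        for s in scores:
            idx = int((s - lo) / width) if width > 0 else 0
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range scores
            counts[idx] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(scores), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Synthetic scores for illustration only.
baseline = [i / 1000 for i in range(1000)]
shifted = [s + 0.2 for s in baseline]
psi_same = psi(baseline, baseline)
psi_shifted = psi(baseline, shifted)
```

Each term `(a - e) * log(a / e)` is non-negative, so PSI is zero only when the binned shares match exactly.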
2. Actual/Expected (A/E) Ratio
Compares the actual default rate to the model-predicted default rate. A/E = 1.0 indicates perfect calibration. Values outside the 0.8-1.2 range suggest model miscalibration or population shift.
3. Kolmogorov-Smirnov (KS) Statistic
Tests the maximum difference between cumulative distribution functions (CDFs) of baseline and current score distributions. KS < 0.05 indicates similar distributions, while higher values detect significant shifts.
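The two-sample KS statistic can be sketched without any dependencies. This version evaluates both empirical CDFs at every observed score, which is enough because the CDFs only change at those points. The function name is illustrative.

```python
import bisect


def ks_statistic(baseline, current):
    """Maximum absolute gap between the two empirical CDFs."""
    b, c = sorted(baseline), sorted(current)
    d = 0.0
    for x in b + c:
        # Empirical CDF at x: share of observations less than or equal to x.
        cdf_b = bisect.bisect_right(b, x) / len(b)
        cdf_c = bisect.bisect_right(c, x) / len(c)
        d = max(d, abs(cdf_b - cdf_c))
    return d
```

For production use, `scipy.stats.ks_2samp` returns the same statistic together with a p-value.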
4. Jensen-Shannon Divergence (JSD)
Symmetric measure of distribution distance based on Kullback-Leibler divergence. With base-2 logarithms, JSD ranges from 0 (identical) to 1 (no overlap at all). Values > 0.12 indicate critical drift.
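A short sketch of the divergence on already-binned score shares, using base-2 logarithms so the result lies in [0, 1]. Whether the 0.12 threshold applies to the divergence or to its square root is not stated in the source; this sketch assumes the divergence.

```python
import math


def jsd(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(a, b):
        # Terms with a_i == 0 contribute nothing to the KL sum.
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


identical = jsd([0.5, 0.5], [0.5, 0.5])
disjoint = jsd([1.0, 0.0], [0.0, 1.0])
```

Note that `scipy.spatial.distance.jensenshannon` returns the Jensen-Shannon distance, which is the square root of this quantity.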
5. Wasserstein Distance
Measures the minimum "cost" to transform one distribution into another. Captures both location and shape differences. Higher values indicate larger magnitude of distribution shift.
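For one-dimensional scores with equal sample sizes, the first-order Wasserstein distance reduces to the mean absolute difference between the sorted samples. The sketch below covers only that special case, as an assumption for brevity; `scipy.stats.wasserstein_distance` handles unequal sizes.

```python
def wasserstein_1d(baseline, current):
    """W1 distance between two equal-size 1-D samples."""
    if len(baseline) != len(current):
        raise ValueError("this sketch assumes equal sample sizes")
    pairs = zip(sorted(baseline), sorted(current))
    return sum(abs(b - c) for b, c in pairs) / len(baseline)
```

A pure location shift of size d gives a distance of exactly d, which makes the metric easy to read in score units.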
6. Variance Ratio
Compares variance of current distribution to baseline. Ratio > 2.0 indicates increased volatility in model scores, suggesting unstable population characteristics.
7. Levene's Test
Tests homogeneity of variance between baseline and current distributions. P-value < 0.05 flags significant variance differences, indicating distribution instability.
8. Tail Shift (95th Percentile Ratio)
Compares extreme values (95th percentile) between distributions. Detects shifts in high-risk tail behavior, critical for credit risk models where tail events drive losses.
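The variance ratio and the tail shift are simple enough to sketch with the standard library. The function names, the use of the sample variance, and the percentile interpolation method are assumptions, not details taken from DriftSentinel.

```python
import statistics


def variance_ratio(baseline, current):
    """Sample variance of current scores divided by that of the baseline."""
    return statistics.variance(current) / statistics.variance(baseline)


def p95(scores):
    # quantiles(n=20) returns 19 cut points; the last is the 95th percentile.
    return statistics.quantiles(scores, n=20)[-1]


def tail_shift(baseline, current):
    """Ratio of 95th percentiles (assumes a positive baseline percentile)."""
    return p95(current) / p95(baseline)


# Synthetic example: doubling every score quadruples the variance
# and doubles the 95th percentile.
baseline = [i / 100 for i in range(1, 101)]
current = [2 * x for x in baseline]
vr = variance_ratio(baseline, current)
ts = tail_shift(baseline, current)
```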
9. Adversarial AUC
Trains a Random Forest classifier to distinguish baseline scores from current scores. An AUC near 0.5 means the classifier cannot tell the two samples apart. AUC > 0.55 indicates meaningful drift, while AUC > 0.62 signals critical drift requiring immediate action.
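A hedged sketch of the adversarial check with scikit-learn. The forest settings, the 5-fold out-of-fold scoring, and the synthetic data are assumptions chosen for illustration; the source only states that a Random Forest is used.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict


def adversarial_auc(baseline_scores, current_scores, seed=42):
    """AUC of a classifier trained to separate baseline from current scores."""
    X = np.concatenate([baseline_scores, current_scores]).reshape(-1, 1)
    y = np.concatenate([
        np.zeros(len(baseline_scores), dtype=int),  # 0 = baseline
        np.ones(len(current_scores), dtype=int),    # 1 = current
    ])
    clf = RandomForestClassifier(
        n_estimators=100, min_samples_leaf=20, random_state=seed
    )
    # Out-of-fold probabilities keep the AUC from being inflated by overfitting.
    proba = cross_val_predict(clf, X, y, cv=5, method="predict_proba")[:, 1]
    return roc_auc_score(y, proba)


# Synthetic scores: the current sample is shifted by one standard deviation.
rng = np.random.default_rng(0)
base = rng.normal(0.0, 1.0, 1000)
shifted = rng.normal(1.0, 1.0, 1000)
auc_shifted = adversarial_auc(base, shifted)
```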
Status Classification: The system aggregates these metrics to classify drift as STABLE (no action needed), WARNING (monitor closely), or CRITICAL (immediate recalibration required). Drift is measured against model scores on training data as the baseline.
Macro/Micro Attribution
Decomposing drift sources into economic vs underwriting factors
Attribution Methodology
When model drift is detected, the system decomposes the total drift into two components using log-space decomposition:
macro_multiplier = avg_macro_scalar
micro_multiplier = total_drift / macro_multiplier
log(total) = log(macro) + log(micro)
macro_attribution_pct = (log(macro) / log(total)) × 100
micro_attribution_pct = (log(micro) / log(total)) × 100
Total Drift: Ratio of drifted hazard rate (current period) to base hazard rate (baseline). Captures overall change in default probability.
Macro Attribution: Percentage of drift explained by economic conditions (unemployment, spreads, sentiment, etc.). Reflects external economic stress affecting all borrowers.
Micro Attribution: Percentage of drift explained by underwriting/population shifts. Reflects changes in borrower quality, underwriting standards, or product mix.
Example Interpretation: If macro attribution = 30% and micro attribution = 70%, this means economic stress accounts for 30% of increased defaults, while 70% is due to borrower quality changes or underwriting deterioration. This helps prioritize remediation: macro-driven drift may require portfolio-level adjustments, while micro-driven drift suggests underwriting policy review.
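The log-space decomposition above can be sketched as follows. The total drift of 1.20 roughly matches the drifted-to-base hazard ratios in the P&L table, while the macro multiplier of 1.05 is a hypothetical input chosen for illustration.

```python
import math


def attribute_drift(total_drift, macro_multiplier):
    """Split total drift into macro and micro shares in log space."""
    if total_drift <= 0 or macro_multiplier <= 0:
        raise ValueError("multipliers must be positive")
    log_total = math.log(total_drift)
    if log_total == 0:
        raise ValueError("total_drift of 1.0 leaves no drift to attribute")
    micro_multiplier = total_drift / macro_multiplier
    macro_pct = 100 * math.log(macro_multiplier) / log_total
    micro_pct = 100 * math.log(micro_multiplier) / log_total
    return micro_multiplier, macro_pct, micro_pct


micro, macro_pct, micro_pct = attribute_drift(total_drift=1.20,
                                              macro_multiplier=1.05)
```

The two percentages always sum to 100, but either one can fall outside the 0-100 range when macro and micro effects push in opposite directions (for example, a macro multiplier below 1.0 while total drift is above 1.0).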
Macro Factors
Economic conditions affecting all borrowers uniformly
- Unemployment rate changes
- Interest rate environment
- GDP growth fluctuations
- Housing market conditions
- Consumer sentiment shifts
- High-yield spread movements
Micro Factors
Underwriting and population characteristics
- Underwriting standard changes
- Applicant population shift
- Product mix changes
- Geographic expansion
- Credit policy modifications
- Target market adjustments
Model Methodology
Forensic Portfolio Engine - Architecture & Key Assumptions
Three-Component Framework
1. Micro Scorecard
- Logistic Regression (L2, C=0.1)
- 18 borrower features
- StandardScaler preprocessing
- Output: 0-1 probability
2. Vintage Curve
- Empirical default rate by quarter
- Forensic date calculation
- Normalized to mean = 1.0
- Range: 0.4-1.5x multiplier
3. Macro Overlay
- 5 economic indicators
- Log-linear model (z-scores)
- 1-quarter lag effect
- Range: 0.7-1.8 scalar
Key Assumptions
1. Multiplicative Independence
Assumes independence between micro, vintage, and macro components. No interaction terms modeled.
2. Quarterly Discretization
Continuous time converted to discrete quarters. Months on book (MOB) mapped to quarters (q = floor(MOB/3)).
3. Fixed 1-Quarter Lag
Economic conditions affect loans issued in the next quarter. Q1 conditions impact Q2 originations.
4. Forensic Date Accuracy
Assumes `last_pymnt_d` accurately reflects true loan duration. Captures early payoffs and extended terms (1-60 months).
5. Default Assignment at Final Month
Defaults assigned only at the final month of loan duration, not throughout the loan lifecycle.
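Assumptions 1 and 2 can be sketched together: months on book map to a quarter index, and the three components combine by plain multiplication with no interaction terms. The function names and the cap at 1.0 are illustrative assumptions rather than the engine's actual code.

```python
def quarter_on_book(mob):
    """Map months on book to a quarter index: q = floor(MOB / 3)."""
    return mob // 3


def combined_hazard(micro_pd, vintage_multiplier, macro_scalar):
    """Multiply the micro, vintage, and macro components.

    Independence between components is assumed, so there are no
    interaction terms. The result is capped at 1.0 (an added safeguard).
    """
    return min(1.0, micro_pd * vintage_multiplier * macro_scalar)


# Hypothetical inputs for illustration only.
q = quarter_on_book(7)                    # month 7 falls in quarter index 2
hazard = combined_hazard(0.05, 1.2, 1.1)  # 0.05 * 1.2 * 1.1
```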
Key Facts
Feature Set (18 Features)
- Core Financial (4): dti, annual_inc, loan_amnt, avg_cur_bal
- Credit History (4): tot_cur_bal, tot_hi_cred_lim, bc_open_to_buy, credit_utilization_ratio
- Account Behavior (3): acc_open_past_24mths, num_tl_30dpd, mths_since_recent_bc
- Derogatory Marks (3): delinq_2yrs, pub_rec, collections_12_mths_ex_med
- Engineered (4): has_collections, has_delinq_history, has_public_record, income_to_loan_ratio
Macro Indicators (5)
- Unemployment: β = +0.15 (Higher → More defaults)
- HY Spread: β = +0.10 (Wider → More defaults)
- Yield Curve: β = -0.05 (Inverted → More defaults)
- Consumer Sentiment: β = -0.08 (Lower → More defaults)
- Real Income: β = -0.05 (Lower → More defaults)
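A sketch of how these coefficients could produce the macro scalar. The exact functional form is an assumption: here the scalar is the exponential of the beta-weighted z-scores, clipped to the 0.7-1.8 range quoted earlier. The dictionary keys are hypothetical names.

```python
import math

# Coefficients from the list above; key names are illustrative.
BETAS = {
    "unemployment": 0.15,
    "hy_spread": 0.10,
    "yield_curve": -0.05,
    "consumer_sentiment": -0.08,
    "real_income": -0.05,
}


def macro_scalar(z_scores, lo=0.7, hi=1.8):
    """Log-linear macro overlay: exp(sum of beta * z), clipped to [lo, hi].

    z_scores should come from the prior quarter to reflect the 1-quarter lag.
    Missing indicators are treated as z = 0 (at their historical mean).
    """
    log_scalar = sum(beta * z_scores.get(name, 0.0)
                     for name, beta in BETAS.items())
    return max(lo, min(hi, math.exp(log_scalar)))


neutral = macro_scalar({})
mild = macro_scalar({"unemployment": 1.0})
stress = macro_scalar({
    "unemployment": 2.0, "hy_spread": 2.0, "yield_curve": -2.0,
    "consumer_sentiment": -2.0, "real_income": -2.0,
})
```

In the stress case the unclipped value would be exp(0.86), so the upper bound of 1.8 is what binds.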
Model Training Details
Hyperparameters
- C = 0.1 (L2 regularization)
- Solver = 'lbfgs'
- max_iter = 1000
- random_state = 42
Data Processing
- StandardScaler (zero mean, unit variance)
- Missing values filled with 0
- Binary classification target
- No class weights (maintains calibration)
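The training setup listed above maps onto a short scikit-learn pipeline. This is a sketch under stated assumptions: the step order (fill with 0, then scale) follows the list, L2 is scikit-learn's default penalty, and the data below is synthetic noise rather than Lending Club features.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def build_scorecard():
    """Micro scorecard: zero-fill, standardize, L2 logistic regression."""
    return make_pipeline(
        SimpleImputer(strategy="constant", fill_value=0),
        StandardScaler(),
        # No class_weight argument, so predicted probabilities stay calibrated
        # to the observed default rate.
        LogisticRegression(C=0.1, solver="lbfgs", max_iter=1000,
                           random_state=42),
    )


# Synthetic stand-in for the 18 borrower features.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 18))
X[rng.random(X.shape) < 0.05] = np.nan        # inject some missing values
y = (rng.random(500) < 0.15).astype(int)      # binary default flag

model = build_scorecard().fit(X, y)
pd_hat = model.predict_proba(X)[:, 1]         # probability of default, 0-1
```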
Economics & P&L Calculation
Note: P&L uses actual default rates (realized performance), not predicted PD, to reflect true portfolio economics.
Drift Detection (DriftSentinel)
The system computes 7+ statistical metrics ("Dirty Dozen") to detect distribution shift:
Shape & Stability
- Jensen-Shannon Divergence (JSD)
- Wasserstein Distance
- Kolmogorov-Smirnov Statistic
Volatility & Adversarial
- Variance Ratio
- Levene's Test (homogeneity)
- Tail Shift (95th percentile)
- Adversarial AUC (Random Forest)
Status Classification: CRITICAL (AUC > 0.62 or JSD > 0.12) | WARNING (AUC > 0.55 or JSD > 0.08) | STABLE (otherwise)
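The status rule above translates directly into a small function. The function and argument names are illustrative; the thresholds are the ones stated in the rule.

```python
def classify_drift(adversarial_auc, jsd):
    """Map adversarial AUC and JSD to a drift status."""
    if adversarial_auc > 0.62 or jsd > 0.12:
        return "CRITICAL"
    if adversarial_auc > 0.55 or jsd > 0.08:
        return "WARNING"
    return "STABLE"
```

Checking the CRITICAL condition first matters, because every CRITICAL case also satisfies the WARNING thresholds.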
P&L Analysis
Margin impact and variance attribution
Margin Waterfall (High Risk Segment)
Hazard & Margin Analysis by Segment
| Segment | Q Hazard (Base) | Q Hazard (Drifted) | Annual PD | Coupon | Margin | Required |
|---|---|---|---|---|---|---|
| High Risk | 8.86% | 10.63% | 36.20% | 23.7% | -18.49% | 47.20% |
| Medium Risk | 4.63% | 5.55% | 20.42% | 15.2% | -11.23% | 31.42% |
| Low Risk | 1.99% | 2.39% | 9.23% | 9.1% | -6.14% | 20.23% |
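The Annual PD column is consistent with compounding the drifted quarterly hazard over four quarters. A minimal sketch of that conversion, which matches the table to within rounding of the displayed hazards (the function name is illustrative):

```python
def annual_pd(quarterly_hazard):
    """Probability of defaulting within four quarters at a constant hazard."""
    return 1 - (1 - quarterly_hazard) ** 4


# Drifted quarterly hazards from the table above.
drifted = {"High Risk": 0.1063, "Medium Risk": 0.0555, "Low Risk": 0.0239}
annual = {seg: 100 * annual_pd(h) for seg, h in drifted.items()}
```

The survival probability over a year is the product of four quarterly survival probabilities, which is why the annual figure is slightly less than four times the quarterly hazard.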
Decision Matrix
Strategic recommendations by segment
High Risk
EXIT
- 30.38% default rate
- -18.49% margin
- Required: 47.20%
Medium Risk
EXIT
- 15.96% default rate
- -5.96% margin
- Required: 26.96%
Low Risk
REPRICE
- 6.47% default rate
- -2.47% margin
- Required: 17.47%
DB
AI credit risk assistant powered by Google Gemini
What is DB?
DB (DriftBreaker) is an intelligent AI assistant that helps you understand and analyze your credit risk portfolio. Powered by Google's Gemini 2.5 Flash model, DB has access to comprehensive portfolio context and can answer questions about:
Portfolio Analysis
- Portfolio summary metrics
- Default rates by segment
- Vintage performance trends
- Book composition changes
Drift Detection
- Explain drift metrics (PSI, KS, JSD)
- Interpret drift status
- Identify drift root causes
- Recommend remediation steps
Attribution Analysis
- Macro vs micro attribution
- Economic impact assessment
- Underwriting quality analysis
- Factor decomposition
Strategic Recommendations
- Pricing adjustments
- Underwriting policy changes
- Portfolio rebalancing
- Risk mitigation strategies
How it works: DB has access to your complete portfolio context including model architecture, drift metrics, attribution analysis, P&L data, and decision matrix recommendations. Simply ask questions in natural language, and DB will provide detailed explanations, insights, and actionable recommendations based on your portfolio's current state.