Credit Risk Default Prediction System

This repository contains an end-to-end machine learning system for predicting mortgage loan default using Fannie Mae data. It includes training pipelines, model artifacts, and a Streamlit-based scoring app.

At-a-Glance (Results)

Task: Predict 24-month mortgage default probability (origination-time features only; no leakage)
Model: Logistic Regression (class-weighted baseline)
Data: Fannie Mae Single-Family Loan Performance (processed features; raw excluded)
Scale: 338,473 loans • default rate 0.78%
Performance (holdout): ROC-AUC 0.819 • PR-AUC 0.0536
Artifacts: artifacts/model.joblib, artifacts/feature_schema.json, artifacts/metadata.json
Demo: Streamlit scoring app (app/app.py) — upload CSV, get PD + flag + risk bucket

Repro proof: Metrics and training metadata are persisted in artifacts/metadata.json after running python3 -m src.train.

Overview

This is an end-to-end credit risk modeling project built on Fannie Mae mortgage data.
It demonstrates a full pipeline, from data preparation and model training to a deployable inference application for scoring new loans.

The project is intentionally structured to reflect production-oriented workflows: training and inference are separated, model artifacts are persisted, and scoring is exposed through a lightweight application interface.

Project Overview

Objective:
Estimate the probability that a mortgage loan defaults within 24 months of origination, using only origination-time features (no post-origination data leakage).

Key components:

Data ingestion, labeling, and feature engineering
Baseline probability-of-default (PD) model
Reproducible training pipeline with persisted artifacts
Browser-based scoring app for inference

Repository Structure

.
├── src/
│   ├── train.py              # Training pipeline (produces model artifacts)
│   └── config.py             # Paths and configuration
├── app/
│   └── app.py                # Streamlit inference/scoring app
├── artifacts/
│   ├── model.joblib          # Trained model + preprocessing pipeline
│   ├── feature_schema.json   # Required input feature contract
│   └── metadata.json         # Training metadata and metrics
├── notebooks/
│   ├── 01_data_ingestion_and_labeling.ipynb
│   ├── 02_feature_engineering.ipynb
│   └── 03_modeling_and_evaluation.ipynb
├── data/
│   └── processed/            # Processed feature data (raw files excluded)
├── requirements.txt
└── README.md

Data

The model is trained using Fannie Mae Single-Family Loan Performance data.

Raw quarterly loan files are not included in this repository due to size.
Labeling logic and feature construction are demonstrated in the notebooks.
The final model uses origination-level features only, avoiding look-ahead bias and data leakage.

Model

Algorithm: Logistic Regression (baseline)
Target: 24-month default indicator
Class imbalance: handled via class weighting
Evaluation metrics: ROC-AUC and PR-AUC (precision-recall AUC)
Thresholding: separated from training and configurable at inference time

The baseline model is intentionally simple and interpretable. The emphasis of this project is on end-to-end system design and reproducibility, not model complexity.
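
The setup described above can be sketched with scikit-learn on synthetic data. This is a minimal illustration, not the code in src/train.py: the synthetic dataset, the scaler, the class_weight="balanced" setting, the split, and all variable names are assumptions. PR-AUC is estimated here with average precision, which is one common choice.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for the loan data (a few percent positives).
X, y = make_classification(
    n_samples=20_000,
    n_features=8,
    n_informative=4,
    n_redundant=0,
    n_clusters_per_class=1,
    weights=[0.98, 0.02],
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# class_weight="balanced" weights each class inversely to its frequency
# (assumed here; the repository only states that class weighting is used).
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(class_weight="balanced", max_iter=1000),
)
model.fit(X_train, y_train)

# Evaluate on the holdout split with threshold-free ranking metrics.
scores = model.predict_proba(X_test)[:, 1]
roc_auc = roc_auc_score(y_test, scores)
pr_auc = average_precision_score(y_test, scores)
print(f"ROC-AUC={roc_auc:.3f}  PR-AUC={pr_auc:.3f}")
```

Both metrics are computed from scores rather than hard labels, which is consistent with keeping the decision threshold out of training.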

Training

Train and generate artifacts

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python3 -m src.train

Validate artifacts + metrics

ls -lah artifacts
python3 -c "import json; print(json.dumps(json.load(open('artifacts/metadata.json')), indent=2))"

Scoring App (Streamlit)

A lightweight inference application is included to score new loans using the trained model artifact. The app loads the persisted preprocessing + model pipeline and applies it deterministically at inference time.
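
As a rough illustration of why persisting the preprocessing and model together gives repeatable scoring, the sketch below fits a small pipeline, saves it with joblib, reloads it, and checks that the scores match. The data, the pipeline steps, and the temporary file location are assumptions made for the example; the actual contents of artifacts/model.joblib may differ.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Toy training data (hypothetical; stands in for processed loan features).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = (X[:, 0] + rng.normal(scale=0.5, size=500) > 1.0).astype(int)

# Preprocessing and model live in one object, so they are saved together.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(class_weight="balanced"))
pipeline.fit(X, y)

# Round-trip the fitted pipeline through a joblib file.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.joblib"
    joblib.dump(pipeline, path)
    restored = joblib.load(path)

# The restored pipeline should reproduce the original scores.
before = pipeline.predict_proba(X)[:, 1]
after = restored.predict_proba(X)[:, 1]
same_scores = bool(np.allclose(before, after))
print("scores identical after reload:", same_scores)
```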

Run the app

streamlit run app/app.py

The application opens in a browser and allows users to upload a CSV for scoring.

App Input Contract

The app expects a clean origination-level feature CSV with the following columns:

orig_interest_rate
orig_upb
orig_loan_term
property_type
loan_purpose
property_state
loan_type

Raw Fannie Mae quarterly performance files are intentionally not accepted by the app.
In a production system, ETL and labeling occur upstream; the scoring service operates on validated feature tables only.
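
A check of the column contract above might look like the following sketch. It hard-codes the column list for illustration; the app may instead read it from artifacts/feature_schema.json, whose format is not shown here. The function name and the sample values are hypothetical.

```python
import csv
import io

# Column names taken from the input contract listed above.
REQUIRED_COLUMNS = [
    "orig_interest_rate",
    "orig_upb",
    "orig_loan_term",
    "property_type",
    "loan_purpose",
    "property_state",
    "loan_type",
]


def missing_columns(csv_text: str) -> list[str]:
    """Return required columns absent from the CSV header, in contract order."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return [col for col in REQUIRED_COLUMNS if col not in present]


# Sample inputs with made-up values, for illustration only.
good = (
    "orig_interest_rate,orig_upb,orig_loan_term,property_type,"
    "loan_purpose,property_state,loan_type\n"
    "6.5,250000,360,SF,P,TX,FRM\n"
)
bad = "orig_interest_rate,orig_upb\n6.5,250000\n"

print(missing_columns(good))
print(missing_columns(bad))
```

Rejecting a file up front when the returned list is non-empty keeps schema errors separate from scoring errors.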

App Output

For each loan, the app produces:

pd_default_24m — predicted probability of default within 24 months
flag — binary decision based on a configurable threshold
risk_bucket — coarse risk category (Low / Medium / High) for reporting

The decision threshold can be adjusted at runtime to support different risk policies (e.g., screening vs. underwriting).

Scored results can be downloaded as a CSV.
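
The three output fields can be derived from a predicted probability roughly as follows. This is a sketch only: the function name, the default threshold of 0.5, and the bucket cutoffs of 0.3 and 0.6 are assumptions, not values taken from app/app.py.

```python
def score_row(
    pd_default_24m: float,
    threshold: float = 0.5,
    bucket_edges: tuple[float, float] = (0.3, 0.6),
) -> dict:
    """Turn one predicted PD into the flag and risk bucket fields."""
    low_edge, high_edge = bucket_edges  # hypothetical cutoffs
    if pd_default_24m < low_edge:
        bucket = "Low"
    elif pd_default_24m < high_edge:
        bucket = "Medium"
    else:
        bucket = "High"
    return {
        "pd_default_24m": pd_default_24m,
        "flag": int(pd_default_24m >= threshold),  # 1 when PD meets the threshold
        "risk_bucket": bucket,
    }


# A stricter policy lowers the threshold without retraining the model.
print(score_row(0.45, threshold=0.4))
```

Because the threshold is an argument rather than a trained parameter, the same scores can serve several risk policies.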

Notes on Modeling Choices

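Probabilities are uncalibrated: class weighting shifts predicted PDs away from the observed default rate, so scores are suited to ranking and screening rather than to use as literal default probabilities.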
Class imbalance is handled via class weighting.
Threshold selection is policy-dependent and intentionally separated from training.
The baseline model is intentionally simple; the emphasis of this project is on end-to-end system design and reproducibility rather than model complexity.
Because default is rare (~0.78%), PR-AUC is emphasized alongside ROC-AUC to reflect real screening performance under class imbalance.
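
To put the reported PR-AUC in context, it helps to compare it with the default rate, since a model that ranks loans at random has an expected PR-AUC close to the positive-class prevalence. The short calculation below uses only the two figures from the results summary; the variable names are illustrative.

```python
default_rate = 0.0078  # 0.78% default rate from the results summary
pr_auc = 0.0536        # holdout PR-AUC from the results summary

# Random ranking gives an expected PR-AUC near the prevalence,
# so the ratio approximates the lift over a no-skill baseline.
lift_over_random = pr_auc / default_rate
print(f"PR-AUC lift over random ranking: {lift_over_random:.1f}x")
```

By this rough measure the baseline model is close to seven times better than random ranking, even though the absolute PR-AUC looks small.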

License

MIT License
