Adaptive Checkout Risk Engine

A cost-aware, multi-dimensional fraud detection framework that moves beyond binary classification to deliver personalized, explainable transaction routing.

Overview

Traditional fraud systems rely on static rules like "block if amount > $50,000 AND device is new". While simple, these rules generate excessive false positives: they decline legitimate customers, increase cart abandonment, and erode trust.

This project replaces that paradigm with an Adaptive Risk Engine that evaluates each transaction across four dimensions (fraud risk, behavioral trust, customer relationship value, and graph risk) and routes it to one of three actions (Approve, OTP, or Block) using a mathematically defined Composite Decision Score (CDS).

Key Differentiator: The system optimizes for business cost (fraud losses, false-positive declines, and verification friction), not just statistical accuracy.

System Architecture

graph TD
    A[Customer Checkout] --> B[Transaction Data]
    B --> C[Feature Engineering]
    C --> D[Fraud Risk Model]
    C --> T[Trust Engine]
    C --> V[CRV Assessment]
    C --> N[Graph Intelligence]
    
    D --> E[Composite Decision Engine]
    T --> E
    V --> E
    N --> E
    
    E --> F{Cost-Aware Routing}
    
    F -->|Low Risk| G[Approve]
    F -->|Medium Risk| H[OTP]
    F -->|High Risk| I[Block]

    classDef approve fill:#d4edda,stroke:#28a745,stroke-width:2px,color:#000;
    classDef otp fill:#fff3cd,stroke:#ffc107,stroke-width:2px,color:#000;
    classDef block fill:#f8d7da,stroke:#dc3545,stroke-width:2px,color:#000;

    class G approve;
    class H otp;
    class I block;

Composite Decision Score (CDS)

The final routing decision is computed as:

$$CDS = w_1 \cdot \text{Risk} - w_2 \cdot \text{Trust} - w_3 \cdot \text{CRV} + w_4 \cdot \text{GraphRisk}$$

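where the weights $w_1, \dots, w_4$ are tuned via the Business Cost Function to jointly minimize fraud loss, false-positive declines, and verification overhead. Risk and GraphRisk raise the score, while Trust and CRV lower it, so a higher CDS routes the transaction toward stricter actions (OTP, then Block).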
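
As a rough illustration of the formula, the sketch below computes a CDS and maps it to an action. The weight values, the two thresholds, and the assumption that every input is scaled to [0, 1] are hypothetical placeholders chosen for the example; they are not the tuned values used in `composite_score.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    # Hypothetical values; the project tunes these via its business cost function.
    risk: float = 0.5
    trust: float = 0.2
    crv: float = 0.1
    graph: float = 0.2


def composite_decision_score(risk, trust, crv, graph_risk, w=Weights()):
    """CDS = w1*Risk - w2*Trust - w3*CRV + w4*GraphRisk (inputs assumed in [0, 1])."""
    return w.risk * risk - w.trust * trust - w.crv * crv + w.graph * graph_risk


def route(cds, otp_threshold=0.15, block_threshold=0.40):
    """Map a score to an action. Both thresholds are illustrative assumptions."""
    if cds >= block_threshold:
        return "Block"
    if cds >= otp_threshold:
        return "OTP"
    return "Approve"


# A trusted, high-value customer with a moderate raw fraud score.
trusted = composite_decision_score(risk=0.3, trust=0.9, crv=0.8, graph_risk=0.1)
# A new account with a high fraud score and risky graph neighbourhood.
suspicious = composite_decision_score(risk=0.9, trust=0.1, crv=0.1, graph_risk=0.8)
print(route(trusted), route(suspicious))
```

Because Trust and CRV enter with negative signs, the same raw fraud probability can lead to different actions for different customers, which is the personalization the score is meant to provide.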

Results

The model was trained on a 150,000-transaction sample of the IEEE-CIS Fraud Detection dataset (~2.65% fraud rate) and evaluated on a held-out test split.

| Metric | Value |
| --- | --- |
| ROC-AUC | 0.9430 |
| PR-AUC | 0.7044 |
| Test Samples | 30,000 |
| Fraud in Test | 794 (2.65%) |

ROC Curve: results/roc_curve.png

Precision-Recall Curve: results/pr_curve.png

Top 20 Feature Importances: results/feature_importance.png

Innovation Layer

Four components extend the system beyond standard fraud classification:

| Component | What It Does | File |
| --- | --- | --- |
| Behavioral Trust Engine | Computes a dynamic trust score from account age, transaction history, device/location consistency, and chargeback history. Acts as a counterweight to raw fraud probability. | `trust_engine.py` |
| Customer Relationship Value (CRV) | A proxy for customer lifetime value derived from transaction volume, longevity, and behavioral consistency. High-CRV customers receive reduced friction. | `crv_assessment.py` |
| Fraud Network Intelligence | Builds entity graphs (Customer ↔ Card ↔ Device ↔ IP ↔ Email) and extracts topological features like shared-device counts, fraud-neighbor scores, and centrality metrics. | `graph_network.py` |
| Business Cost Optimizer | Evaluates routing decisions against actual dollar losses: `(Fraud Loss × Missed Fraud) + (FP Cost × Declines) + (OTP Cost × Challenges)` | `cost_optimization.py` |
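
The cost expression used by the Business Cost Optimizer can be sketched as a small function over a batch of routing decisions. The unit costs below are made-up example figures, and reading "Declines" as blocked legitimate transactions (false positives) is an assumption; `cost_optimization.py` may define these terms differently.

```python
def business_cost(actions, is_fraud, fraud_loss=500.0, fp_cost=25.0, otp_cost=0.5):
    """Total cost of a batch of routing decisions.

    actions:  list of "Approve", "OTP", or "Block"
    is_fraud: list of booleans, the ground-truth label per transaction
    The three unit costs are hypothetical example values in dollars.
    """
    pairs = list(zip(actions, is_fraud))
    # Fraud that was approved without any challenge.
    missed_fraud = sum(1 for action, fraud in pairs if action == "Approve" and fraud)
    # Legitimate transactions that were blocked outright (assumed meaning of "Declines").
    false_declines = sum(1 for action, fraud in pairs if action == "Block" and not fraud)
    # Every OTP challenge carries a friction cost, whatever the label.
    challenges = sum(1 for action, _ in pairs if action == "OTP")
    return fraud_loss * missed_fraud + fp_cost * false_declines + otp_cost * challenges


actions = ["Approve", "Approve", "Block", "OTP", "Block"]
is_fraud = [False, True, False, True, True]
total = business_cost(actions, is_fraud)
print(total)
```

Evaluating this function over a grid of candidate weights or thresholds, and keeping the setting with the lowest total, is one simple way such a cost could drive tuning.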

Repository Structure

├── data/                       # IEEE-CIS dataset (git-ignored)
├── model/                      # Saved LightGBM weights
├── results/                    # Evaluation plots & metrics
│   ├── roc_curve.png
│   ├── pr_curve.png
│   ├── feature_importance.png
│   └── metrics.json
├── fraud_engine/               # Core ML & API logic
│   ├── run_pipeline.py         # End-to-end training & evaluation
│   ├── train.py                # LightGBM training with class weighting
│   ├── composite_score.py      # CDS routing logic
│   ├── trust_engine.py         # Behavioral trust scoring
│   ├── crv_assessment.py       # Customer value proxy
│   ├── graph_network.py        # NetworkX graph features
│   ├── temporal_velocity.py    # Time-based feature extraction
│   ├── cost_optimization.py    # Business cost function
│   ├── baseline_model.py       # Rule-based baseline benchmark
│   ├── explainability.py       # SHAP model explainer
│   └── api.py                  # FastAPI inference endpoint
├── requirements.txt
└── README.md

Quick Start

1. Clone & Install

git clone https://github.com/RudrakshChugh/Fraud_Detection.git
cd Fraud_Detection
python -m venv venv && .\venv\Scripts\activate   # Windows
# macOS/Linux: python -m venv venv && source venv/bin/activate
pip install -r requirements.txt

2. Train the Model

Place the IEEE-CIS dataset CSVs inside data/, then run:

python fraud_engine/run_pipeline.py

This will train the model, save weights to model/, and generate all evaluation plots in results/.

3. Launch the API

cd fraud_engine
uvicorn api:app --reload

4. Test a Transaction

curl -X POST http://localhost:8000/evaluate_risk \
  -H "Content-Type: application/json" \
  -d '{"transaction_id":"TXN-001","customer_id":"C-42","amount":1200.0,"device_id":"D-99","ip_address":"192.168.1.1","email_domain":"gmail.com"}'
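
For scripted testing, the same request can be built with Python's standard library. This is a minimal sketch that mirrors the curl call above; the `build_request` helper is a name introduced here for illustration, and the response schema is not documented in this README, so the (commented-out) send step simply prints whatever comes back.

```python
import json
import urllib.request


def build_request(payload, base_url="http://localhost:8000"):
    """Build a POST request for the /evaluate_risk endpoint shown above."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/evaluate_risk",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = {
    "transaction_id": "TXN-001",
    "customer_id": "C-42",
    "amount": 1200.0,
    "device_id": "D-99",
    "ip_address": "192.168.1.1",
    "email_domain": "gmail.com",
}
request = build_request(payload)

# Uncomment once the API from step 3 is running.
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```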

ML Methodology

Data & Preprocessing

Dataset: IEEE-CIS Fraud Detection (590K+ transactions, 400+ features)
Preprocessing: Median imputation, label encoding, removal of columns with >80% null values
Class Imbalance: Handled via dynamic scale_pos_weight and stratified train/test splits
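
The two imbalance techniques can be sketched with the standard library alone. This assumes that "dynamic" means the weight is computed from the training labels as the ratio of negatives to positives, which is the common convention for LightGBM's `scale_pos_weight` parameter; the helper names and the 20% test fraction are illustrative, and `train.py` may do this differently (for example with scikit-learn's `train_test_split(..., stratify=y)`).

```python
import random
from collections import defaultdict


def scale_pos_weight(labels):
    """Ratio of negative to positive samples, computed from the data."""
    positives = sum(1 for y in labels if y == 1)
    if positives == 0:
        raise ValueError("no positive samples in labels")
    return (len(labels) - positives) / positives


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx) that preserve the class ratio in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, y in enumerate(labels):
        by_class[y].append(index)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels with a 5% fraud rate.
labels = [1] * 50 + [0] * 950
train_idx, test_idx = stratified_split(labels)
weight = scale_pos_weight([labels[i] for i in train_idx])
print(len(train_idx), len(test_idx), weight)
```

Computing the weight on the training split only, rather than on the full dataset, keeps test-set information out of the model configuration.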

Feature Engineering

| Category | Features |
| --- | --- |
| Temporal & Velocity | Transactions per hour/day, time since last transaction, weekend flags |
| Behavioral | Average spend, spending deviation, amount percentiles |
| Graph-Derived | Accounts per device, devices per card, shared-IP count, fraud-neighbor count |
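
To make two of these feature families concrete, here is a small sketch of velocity features and an accounts-per-device count. The function names, input shapes, and output keys are assumptions made for the example; the project's own versions live in `temporal_velocity.py` and `graph_network.py` (which uses NetworkX rather than plain dictionaries).

```python
from collections import defaultdict
from datetime import datetime, timedelta


def velocity_features(history, now):
    """Velocity features for one customer.

    history: datetimes of the customer's earlier transactions (all before `now`)
    now:     datetime of the transaction being scored
    """
    return {
        "txn_last_hour": sum(1 for t in history if now - t <= timedelta(hours=1)),
        "txn_last_day": sum(1 for t in history if now - t <= timedelta(days=1)),
        "seconds_since_last": (now - max(history)).total_seconds() if history else None,
        "is_weekend": now.weekday() >= 5,  # Monday is 0, so 5 and 6 are the weekend
    }


def accounts_per_device(transactions):
    """Count distinct customers seen on each device.

    transactions: iterable of (customer_id, device_id) pairs
    """
    customers_by_device = defaultdict(set)
    for customer_id, device_id in transactions:
        customers_by_device[device_id].add(customer_id)
    return {device: len(customers) for device, customers in customers_by_device.items()}


now = datetime(2024, 1, 6, 12, 0)  # a Saturday
history = [now - timedelta(minutes=30), now - timedelta(hours=5), now - timedelta(days=3)]
features = velocity_features(history, now)
device_counts = accounts_per_device(
    [("C-1", "D-1"), ("C-2", "D-1"), ("C-1", "D-2"), ("C-1", "D-1")]
)
print(features, device_counts)
```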

Explainability (SHAP)

Every flagged transaction is accompanied by human-readable explanations:

Risk Score: 87/100

Multiple accounts linked to current device (Graph Alert)

Unusual transaction amount vs. historical baseline

High Customer Relationship Value (Mitigating Factor)
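
One way such messages could be produced is to rank per-feature contributions (for example SHAP values) by magnitude and look up a message for each. The sketch below assumes hypothetical feature names and a hand-written message table; it does not call SHAP itself and is not taken from `explainability.py`.

```python
# Hypothetical feature names mapped to reader-facing messages.
REASONS = {
    "accounts_per_device": "Multiple accounts linked to current device (Graph Alert)",
    "amount_deviation": "Unusual transaction amount vs. historical baseline",
    "crv_score": "High Customer Relationship Value (Mitigating Factor)",
}


def explain(contributions, top_k=3):
    """Return messages for the strongest contributions that have a mapping.

    contributions: feature name -> signed contribution, where positive values
    are assumed to push toward fraud and negative values away from it.
    """
    mapped = [item for item in contributions.items() if item[0] in REASONS]
    ranked = sorted(mapped, key=lambda item: abs(item[1]), reverse=True)
    return [REASONS[name] for name, _ in ranked[:top_k]]


contributions = {
    "accounts_per_device": 0.42,
    "amount_deviation": 0.31,
    "crv_score": -0.12,
    "hour_of_day": 0.05,  # no message defined, so it is skipped
}
messages = explain(contributions)
for message in messages:
    print(message)
```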

Tech Stack

| Layer | Technology |
| --- | --- |
| ML Framework | LightGBM, XGBoost, Scikit-learn |
| Graph Analysis | NetworkX |
| Explainability | SHAP |
| API | FastAPI + Uvicorn |
| Visualization | Matplotlib, Seaborn |

Future Work

Online Learning: continuous model updates to adapt to concept drift
Streaming Inference: Apache Kafka integration for low-latency event processing
Federated Learning: privacy-preserving collaborative fraud detection across merchant networks

License

This project is for educational and portfolio purposes. The IEEE-CIS dataset is subject to its own Kaggle competition rules.
