RIFT 2026 — Money Muling Detection Engine (Graph Theory Track)

Team Name: Basant

Tech Stack: FastAPI · NetworkX · Pandas · React · Cytoscape.js


Overview

A production-grade AML (Anti-Money Laundering) graph intelligence engine that detects money mule networks in financial transaction data using purely deterministic, explainable graph algorithms — zero machine learning.

Detected Patterns

| Pattern | Algorithm | Description |
|---|---|---|
| Circular Fund Routing | Depth-limited DFS | Directed cycles of length 3–5 |
| Smurfing | Sliding-window fan-in/fan-out | ≥10 unique counterparties within 72 h |
| Layered Shell Networks | Shell-DFS | Chains ≥3 through low-activity nodes |

Architecture Diagram

flowchart TD
    User(["👤 Analyst / User"])
    Upload["📁 CSV Upload\n(React Frontend)"]
    API["⚡ FastAPI\nPOST /analyze"]
    Validator["🔍 Validator\nSchema · Type Coercion · Timestamp Parsing"]
    GraphBuilder["🕸️ TransactionGraph\nNetworkX DiGraph\nAdjacency Lookups O(n)"]

    subgraph Detectors["Detection Layer (Parallel)"]
        Cycle["🔄 CycleDetector\nDepth-limited DFS\nCycles len 3–5"]
        Smurf["💸 SmurfDetector\nSliding-window 72h\nFan-in / Fan-out ≥10"]
        Shell["🐚 ShellDetector\nShell-DFS\nChains ≥3 low-activity nodes"]
    end

    FeatureExtractor["📊 FeatureExtractor\nStructural · Flow · Temporal"]
    RiskScorer["⚖️ RiskScorer\nWeighted Rules\n+ MerchantDampeningRule"]
    Aggregator["🗂️ RingAggregator\n+ RingConsolidator"]
    JSONOut["📦 JSON Response"]

    subgraph Frontend["React Frontend"]
        SummaryCards["📈 SummaryCards"]
        RingTable["📋 RingTable"]
        GraphView["🌐 Cytoscape.js\nGraph Visualisation"]
        SuspiciousAcct["🚨 SuspiciousAccounts"]
    end

    User --> Upload
    Upload -->|"HTTP POST multipart/form-data"| API
    API --> Validator
    Validator -->|"Validated DataFrame"| GraphBuilder
    GraphBuilder --> Cycle
    GraphBuilder --> Smurf
    GraphBuilder --> Shell
    Cycle --> FeatureExtractor
    Smurf --> FeatureExtractor
    Shell --> FeatureExtractor
    FeatureExtractor --> RiskScorer
    RiskScorer --> Aggregator
    Aggregator --> JSONOut
    JSONOut -->|"JSON"| SummaryCards
    JSONOut -->|"JSON"| RingTable
    JSONOut -->|"JSON"| GraphView
    JSONOut -->|"JSON"| SuspiciousAcct

    style Detectors fill:#1e3a5f,stroke:#3b82f6,color:#fff
    style Frontend fill:#1a3a2a,stroke:#22c55e,color:#fff
    style Validator fill:#3b1f1f,stroke:#ef4444,color:#fff
    style GraphBuilder fill:#2a2a3a,stroke:#a78bfa,color:#fff
    style FeatureExtractor fill:#2a2a3a,stroke:#a78bfa,color:#fff
    style RiskScorer fill:#2a2a3a,stroke:#a78bfa,color:#fff
    style Aggregator fill:#2a2a3a,stroke:#a78bfa,color:#fff

Tech Stack

Backend

  • FastAPI — REST API server
  • NetworkX — directed graph construction and analysis
  • Pandas — data loading and temporal sliding windows
  • NumPy — statistical feature computation
  • Uvicorn — ASGI server

Frontend

  • React 18 — UI framework
  • Cytoscape.js — interactive graph visualisation
  • cytoscape-cola — force-directed layout
  • Axios — HTTP client

Algorithm Details

1. Cycle Detection (Circular Fund Routing)

Algorithm: Depth-limited DFS
Complexity: O(V × d^5), where d = average out-degree (fast for sparse graphs)

For each node v:
  DFS(v, path=[v], depth_limit=5)
    For each successor u of current node:
      If u == v AND len(path) ∈ [3,5]:
        → record cycle
      Elif u not in path:
        → recurse

Normalise: rotate each cycle so its lexicographically smallest node comes first, then deduplicate with a set

Parameters: min cycle length = 3, max = 5
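The pseudocode above can be sketched in plain Python as follows. This is a minimal illustration, not the project's `cycle_detector.py`: it assumes the graph is a plain adjacency-list dict rather than a NetworkX DiGraph, and the names `find_cycles` and `normalise` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Graph = Dict[str, List[str]]  # assumed representation: node -> successors


def normalise(cycle: List[str]) -> Tuple[str, ...]:
    """Rotate the cycle so its smallest node id comes first."""
    i = cycle.index(min(cycle))
    return tuple(cycle[i:] + cycle[:i])


def find_cycles(graph: Graph, min_len: int = 3, max_len: int = 5) -> Set[Tuple[str, ...]]:
    """Return all simple directed cycles with min_len..max_len nodes."""
    found: Set[Tuple[str, ...]] = set()

    def dfs(start: str, path: List[str]) -> None:
        for nxt in graph.get(path[-1], []):
            if nxt == start:
                # Closing edge back to the start node: record if long enough.
                if min_len <= len(path) <= max_len:
                    found.add(normalise(path))
            elif nxt not in path and len(path) < max_len:
                dfs(start, path + [nxt])

    for node in graph:
        dfs(node, [node])
    return found


example = {"A": ["B"], "B": ["C"], "C": ["A", "D"], "D": ["E"], "E": []}
print(find_cycles(example))  # {('A', 'B', 'C')}
```

Each cycle is discovered once per member node; the rotation in `normalise` maps all of those discoveries to the same tuple, so the set keeps a single copy while preserving edge direction.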

2. Smurfing Detection (Fan-in / Fan-out)

Algorithm: Two-pointer sliding window over each account's sorted transactions
Complexity: O(n log n), dominated by the per-account timestamp sort

For each account a:
  Sort transactions by timestamp
  Two-pointer window of width 72 hours:
    Count unique counterparties in window
    If count ≥ 10 → SMURFING RING

Thresholds: 72-hour window, ≥10 unique counterparties
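One way to realise the two-pointer window is sketched below, using only the standard library. It assumes each account's transactions are already available as `(timestamp, counterparty)` pairs; the function name `exceeds_fan_threshold` is hypothetical, while `FAN_THRESHOLD` follows the constant mentioned under Known Limitations.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import List, Tuple

WINDOW = timedelta(hours=72)
FAN_THRESHOLD = 10


def exceeds_fan_threshold(
    txs: List[Tuple[datetime, str]],
    window: timedelta = WINDOW,
    threshold: int = FAN_THRESHOLD,
) -> bool:
    """True if any `window`-wide span holds >= threshold unique counterparties."""
    txs = sorted(txs)                # sort by timestamp
    in_window: Counter = Counter()   # counterparty -> tx count inside the window
    left = 0
    for ts, party in txs:
        in_window[party] += 1
        # Advance the left pointer until the window is at most `window` wide.
        while ts - txs[left][0] > window:
            old = txs[left][1]
            in_window[old] -= 1
            if in_window[old] == 0:
                del in_window[old]
            left += 1
        if len(in_window) >= threshold:
            return True
    return False


base = datetime(2026, 1, 1)
burst = [(base + timedelta(hours=i), f"ACC_{i}") for i in range(10)]
spread = [(base + timedelta(days=i), f"ACC_{i}") for i in range(10)]
print(exceeds_fan_threshold(burst))   # True
print(exceeds_fan_threshold(spread))  # False
```

The counter keeps the unique-counterparty count up to date as transactions enter and leave the window, so the scan after sorting is linear.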

3. Layered Shell Network Detection

Algorithm: Shell-filtered DFS
Complexity: O(V × k × d^k) where k = max_chain_depth (capped at 8)

Precompute shell_accounts = {a : tx_count(a) ≤ 3}

For each non-shell start node s:
  DFS through only shell intermediaries
  If path length ≥ 3 → record chain

Shell threshold: accounts with total_tx_count ≤ 3
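The shell-filtered DFS might look roughly like the sketch below. Several details are assumptions rather than facts from this README: the graph is a plain adjacency dict, chain length is measured in hops, a chain may end at any account, and the name `find_shell_chains` is hypothetical.

```python
from typing import Dict, List, Tuple


def find_shell_chains(
    graph: Dict[str, List[str]],
    tx_count: Dict[str, int],
    shell_max_tx: int = 3,
    min_hops: int = 3,
    max_depth: int = 8,
) -> List[Tuple[str, ...]]:
    """Find paths that start at a non-shell account and pass only through shells."""
    shells = {a for a, n in tx_count.items() if n <= shell_max_tx}
    chains: List[Tuple[str, ...]] = []

    def dfs(path: List[str]) -> None:
        hops = len(path) - 1
        if hops >= min_hops:
            chains.append(tuple(path))
        if hops >= max_depth:
            return
        # Only the start node or a shell account may forward funds onward.
        if hops > 0 and path[-1] not in shells:
            return
        for nxt in graph.get(path[-1], []):
            if nxt not in path:
                dfs(path + [nxt])

    for start in graph:
        if start not in shells:
            dfs([start])
    return chains


graph = {"SRC": ["S1"], "S1": ["S2"], "S2": ["DST"], "DST": []}
tx_count = {"SRC": 40, "S1": 2, "S2": 2, "DST": 25}
print(find_shell_chains(graph, tx_count))  # [('SRC', 'S1', 'S2', 'DST')]
```

Stopping the search at any non-shell intermediary is what keeps the branching factor small in practice, since most accounts in a real dataset have more than three transactions.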


Suspicion Score Methodology

No ML. Fully deterministic weighted scoring:

| Signal | Score |
|---|---|
| Cycle length 3 | +40 |
| Cycle length 4 | +35 |
| Cycle length 5 | +30 |
| Smurfing fan-in hub | +45 |
| Smurfing fan-out hub | +40 |
| Smurfing member | +20 |
| Shell intermediary | +25 |
| Shell final beneficiary | +20 |
| High clustering coefficient (>0.5) | +10 |
| High burst score (>0.7) | +15 |
| High uniformity score (>0.8) | +10 |
| High tx velocity (>10/hr) | +10 |

  • Scores capped at 100, floored at 0
  • Rounded to 2 decimal places
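As an illustration of the additive scoring, the sketch below sums the table's weights and then clamps and rounds the result. Only `cycle_length_3` and `high_velocity` appear as pattern names in this README's sample JSON; the other dictionary keys, and the choice to count each signal at most once, are assumptions.

```python
from typing import Iterable

# Weights copied from the signal table. Key names other than
# "cycle_length_3" and "high_velocity" are hypothetical.
SIGNAL_WEIGHTS = {
    "cycle_length_3": 40,
    "cycle_length_4": 35,
    "cycle_length_5": 30,
    "smurfing_fan_in_hub": 45,
    "smurfing_fan_out_hub": 40,
    "smurfing_member": 20,
    "shell_intermediary": 25,
    "shell_final_beneficiary": 20,
    "high_clustering": 10,
    "high_burst": 15,
    "high_uniformity": 10,
    "high_velocity": 10,
}


def suspicion_score(signals: Iterable[str]) -> float:
    """Sum the weights of the distinct signals, clamp to [0, 100], round to 2 dp."""
    raw = sum(SIGNAL_WEIGHTS[s] for s in set(signals))
    return round(max(0.0, min(100.0, float(raw))), 2)


print(suspicion_score(["cycle_length_4", "shell_intermediary"]))  # 60.0
print(suspicion_score(["cycle_length_3", "smurfing_fan_in_hub", "shell_intermediary"]))  # 100.0
```

The second call sums to 110 before clamping, which shows the cap at 100 in action.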

False Positive Control — Merchant Dampening

High-volume legitimate merchants (e.g. payment processors) can exhibit fan-out patterns without being money mules. The dampening rule fires when:

distributor_flag = True
AND time_span_hours > 720   # active > 30 days
AND amount_variance > median_variance   # diverse transaction sizes
AND unique_receivers > 50   # genuinely broad customer base

Effect: multiply the raw score by 0.70 (a 30% reduction) and add "merchant_dampening_applied" to detected_patterns for auditability.
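The rule can be expressed as a small pure function. In this sketch the field names mirror the conditions listed above, but the `AccountFeatures` container and the function name `apply_merchant_dampening` are hypothetical, and `median_variance` is assumed to be computed elsewhere across all accounts.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AccountFeatures:  # hypothetical container for the per-account features
    distributor_flag: bool
    time_span_hours: float
    amount_variance: float
    unique_receivers: int


def apply_merchant_dampening(
    raw_score: float,
    features: AccountFeatures,
    median_variance: float,
    patterns: List[str],
) -> Tuple[float, List[str]]:
    """Return the (possibly dampened) score and the updated pattern list."""
    is_merchant_like = (
        features.distributor_flag
        and features.time_span_hours > 720            # active > 30 days
        and features.amount_variance > median_variance
        and features.unique_receivers > 50
    )
    if is_merchant_like:
        return round(raw_score * 0.70, 2), patterns + ["merchant_dampening_applied"]
    return raw_score, patterns


merchant = AccountFeatures(True, 2000.0, 950.0, 120)
score, tags = apply_merchant_dampening(40.0, merchant, 300.0, ["smurfing_fan_out_hub"])
print(score, tags)  # 28.0 ['smurfing_fan_out_hub', 'merchant_dampening_applied']
```

Returning a new pattern list instead of mutating the input keeps the rule side-effect free, which makes it straightforward to audit and unit-test.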


Complexity Analysis

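| Operation | Complexity |
|---|---|
| Graph construction | O(n) |
| Cycle detection | O(V × d^5) |
| Smurfing detection | O(n log n) |
| Shell detection | O(V × k × d^k), k ≤ 8, with pruning |
| Feature extraction | O(n + E) |
| Risk scoring | O(A), where A = number of accounts |
| Total | ≈ O(n log n) in practice on sparse graphs; the DFS detectors dominate in the worst case |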

Designed to handle 10,000 transactions in ≤ 30 seconds.


JSON Output Format

{
  "suspicious_accounts": [
    {
      "account_id": "ACC_F001",
      "suspicion_score": 87.50,
      "detected_patterns": ["cycle_length_3", "high_velocity"],
      "ring_id": "RING_001"
    }
  ],
  "fraud_rings": [
    {
      "ring_id": "RING_001",
      "member_accounts": ["ACC_F001", "ACC_F002", "ACC_F003"],
      "pattern_type": "cycle",
      "risk_score": 87.50
    }
  ],
  "summary": {
    "total_accounts_analyzed": 500,
    "suspicious_accounts_flagged": 15,
    "fraud_rings_detected": 4,
    "processing_time_seconds": 2.30
  }
}
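A consumer of this report may want to sanity-check it before use. The sketch below is one possible check, not part of the project: the function name `check_report` is hypothetical, and it assumes that `ring_id` may be null for accounts outside any ring.

```python
import json
from typing import List


def check_report(report: dict) -> List[str]:
    """Return a list of human-readable problems found in a detection report."""
    problems: List[str] = []
    ring_ids = {ring["ring_id"] for ring in report["fraud_rings"]}
    for acct in report["suspicious_accounts"]:
        score = acct["suspicion_score"]
        if not 0 <= score <= 100:
            problems.append(f"{acct['account_id']}: score {score} outside 0-100")
        ring_id = acct.get("ring_id")
        if ring_id is not None and ring_id not in ring_ids:
            problems.append(f"{acct['account_id']}: unknown ring {ring_id}")
    return problems


sample = json.loads("""
{
  "suspicious_accounts": [
    {"account_id": "ACC_F001", "suspicion_score": 87.5,
     "detected_patterns": ["cycle_length_3"], "ring_id": "RING_001"}
  ],
  "fraud_rings": [
    {"ring_id": "RING_001", "member_accounts": ["ACC_F001"],
     "pattern_type": "cycle", "risk_score": 87.5}
  ],
  "summary": {"total_accounts_analyzed": 500}
}
""")
print(check_report(sample))  # []
```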

Installation Guide

Prerequisites

  • Python 3.11+
  • Node.js 18+

Backend

# From project root
pip install -r requirements.txt

# Run development server
uvicorn backend.main:app --reload --port 8000

Backend available at http://localhost:8000
Swagger UI at http://localhost:8000/docs

Frontend

cd frontend
npm install
npm start

Frontend available at http://localhost:3000

The React dev server proxies /graph-data and /analyze to localhost:8000.


Usage

  1. Prepare a CSV with columns: transaction_id, sender_id, receiver_id, amount, timestamp

  2. Alternatively, generate the included sample:

    python generate_sample.py
    # → sample_transactions.csv (500 rows with embedded fraud patterns)
  3. Open the web app, upload the CSV, click Analyse Transactions.

  4. Explore:

    • Graph: red nodes = high risk, purple border = in a ring
    • Click any node to see its scores and patterns
    • Fraud Rings table: expand rows to see all member accounts
    • Download JSON button: exports the full detection report
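Before uploading, a CSV can be checked locally against the required columns from step 1. The following is a standard-library sketch and not the project's `validator.py`; the function name `validate_csv` is hypothetical, and timestamps are left unparsed because this README does not specify their format.

```python
import csv
import io
from typing import Dict, List

REQUIRED_COLUMNS = ["transaction_id", "sender_id", "receiver_id", "amount", "timestamp"]


def validate_csv(text: str) -> List[Dict[str, object]]:
    """Check the header and coerce `amount` to float; raise ValueError on problems."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows: List[Dict[str, object]] = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            row["amount"] = float(row["amount"])
        except (TypeError, ValueError):
            raise ValueError(f"line {line_no}: amount is not numeric") from None
        rows.append(row)
    return rows


good = (
    "transaction_id,sender_id,receiver_id,amount,timestamp\n"
    "T1,ACC_A,ACC_B,250.00,2026-01-01 10:00:00\n"
)
print(validate_csv(good)[0]["amount"])  # 250.0
```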

API Endpoints

| Method | Path | Description |
|---|---|---|
| GET | /health | Liveness check |
| POST | /analyze | Full analysis → JSON report |
| POST | /graph-data | Full analysis + Cytoscape elements |

Both POST endpoints accept multipart/form-data with a file field.
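For scripted use without the web UI, an upload request could be built with the standard library alone, as sketched here. The helper name `build_upload_request` is hypothetical, and the `text/csv` part type is an assumption; the `file` field name and the `/analyze` path come from the table above. Sending the request requires the backend to be running.

```python
import urllib.request
import uuid


def build_upload_request(url: str, filename: str, csv_bytes: bytes) -> urllib.request.Request:
    """Build a multipart/form-data POST carrying one CSV in the `file` field."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: text/csv\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return urllib.request.Request(
        url,
        data=head + csv_bytes + tail,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


req = build_upload_request(
    "http://localhost:8000/analyze", "sample_transactions.csv", b"a,b\n1,2\n"
)
print(req.get_method(), req.full_url)  # POST http://localhost:8000/analyze

# With the backend running, the report could then be fetched with:
#   import json
#   with urllib.request.urlopen(req) as resp:
#       report = json.load(resp)
```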


Deployment (Render - Free Tier)

Backend (Web Service)

  1. Push to GitHub
  2. New Web Service on render.com
  3. Root directory: backend
  4. Build: pip install -r requirements.txt
  5. Start: uvicorn main:app --host 0.0.0.0 --port $PORT

Frontend (Vercel or Render Static Site)

Option A: Vercel (Recommended for frontend)

  1. Go to vercel.com
  2. Import your GitHub repository
  3. Root directory: frontend
  4. Env var: VITE_API_URL=https://mm-detection.onrender.com
  5. Deploy! See VERCEL.md for details.

Option B: Render Static Site

  1. New Static Site on Render
  2. Root directory: frontend
  3. Build: npm install && npm run build
  4. Publish: frontend/dist
  5. Env var: VITE_API_URL=https://mm-detection.onrender.com

No credit card required! See DEPLOYMENT.md or VERCEL.md for detailed instructions.


Folder Structure

MM_Detection/
├── backend/
│   ├── main.py                  # FastAPI app
│   ├── requirements.txt
│   ├── graph/
│   │   └── builder.py           # TransactionGraph (DiGraph + lookups)
│   ├── detection/
│   │   ├── cycle_detector.py    # DFS cycle detection
│   │   ├── smurf_detector.py    # Sliding-window smurfing
│   │   └── shell_detector.py    # Layered shell chains
│   ├── scoring/
│   │   ├── feature_extractor.py # Per-account feature computation
│   │   └── risk_scorer.py       # Deterministic weighted scoring
│   └── utils/
│       ├── validator.py         # CSV schema validation
│       └── aggregator.py        # Ring merging + risk aggregation
├── frontend/
│   ├── package.json
│   └── src/
│       ├── App.js
│       ├── index.js
│       ├── index.css
│       └── components/
│           ├── Upload.js        # Drag-and-drop CSV upload
│           ├── GraphView.js     # Cytoscape.js graph
│           ├── RingTable.js     # Fraud ring summary
│           └── SuspiciousAccounts.js
├── generate_sample.py           # Sample data generator
├── requirements.txt             # Root-level (for Render)
├── Procfile                     # Heroku/Railway start command
├── render.yaml                  # Render deployment config
└── README.md

Known Limitations

  1. Cycle detection depth: capped at 5 hops. Longer laundering cycles are not detected (by design, a hackathon constraint). The cap can be raised via MAX_CYCLE_LEN.

  2. Smurfing threshold: 10 unique counterparties. Adjust FAN_THRESHOLD in smurf_detector.py for different regulatory contexts.

  3. Shell threshold: ≤3 total transactions classifies an account as a shell. May cause false positives for genuinely new accounts.

  4. Graph layout: Cytoscape cola layout can be slow for >1,000 nodes. For very large graphs the fallback cose layout is used.

  5. No persistence: Results are not stored; each upload is a fresh analysis session.

  6. CSV size: Tested up to 10,000 transactions. Beyond this, performance is not guaranteed within 30 seconds.


Team / Hackathon

RIFT 2026 · Money Muling Detection Challenge · Graph Theory Track
Submission date: February 2026