ShaikhWarsi/IEEE-CIS-Fraud-detection

IEEE-CIS-Fraud-detection

Build a high-performance fraud detection model with XGBoost, using unique cardholder identifiers and feature engineering on the IEEE-CIS dataset. The pipeline normalizes temporal features, applies frequency encoding and group aggregations, and produces a submission file ready for the Kaggle competition.

Project Overview

The core of this solution is the identification of unique cardholders (UIDs) and the aggregation of their transaction behavior over time. Normalizing temporal features and analyzing per-cardholder transaction patterns helps the model distinguish legitimate users from fraudulent actors.
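A minimal sketch of how D-column normalization and UID creation can fit together. The column choices (`TransactionDT`, `D1`, `card1`, `addr1`) are real IEEE-CIS columns, but which ones the script actually combines is an assumption here, and the helper name `add_uid` and the output column `D1n` are hypothetical. The idea is that a "days since some reference event" delta, subtracted from the transaction day, gives a fixed point in time that stays the same across all of a card's transactions.

```python
import numpy as np
import pandas as pd

SECONDS_PER_DAY = 24 * 60 * 60


def add_uid(df: pd.DataFrame) -> pd.DataFrame:
    """Add a normalized D1 column and a cardholder UID (illustrative only)."""
    out = df.copy()
    # TransactionDT is a time offset in seconds; convert it to days.
    day = out["TransactionDT"] / SECONDS_PER_DAY
    # Assumption: D1 is a day delta relative to the transaction, so
    # (transaction day - D1) is a fixed reference day for a given card.
    out["D1n"] = np.floor(day - out["D1"])
    # Assumption: card1 + addr1 + normalized D1 approximates one cardholder.
    out["uid"] = (
        out["card1"].astype(str)
        + "_" + out["addr1"].astype(str)
        + "_" + out["D1n"].astype(str)
    )
    return out


# Toy data: rows 0 and 1 are the same card seen five days apart.
toy = pd.DataFrame({
    "TransactionDT": [86400 * 10, 86400 * 15, 86400 * 12],
    "D1": [3.0, 8.0, 0.0],
    "card1": [1000, 1000, 2000],
    "addr1": [325.0, 325.0, 204.0],
})
toy = add_uid(toy)
print(toy[["D1n", "uid"]])
```

In the toy frame, the first two rows have different `D1` values but the same normalized reference day, so they receive the same `uid`.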

Key Features

  • D-Column Normalization: Converts relative time deltas to absolute points in time for stability.
  • Cardholder UID Creation: Combines multiple card and address features to track individual credit cards.
  • Advanced Encodings:
    • Frequency Encoding for high-cardinality features.
    • Group Aggregations (Mean, Std, Nunique) based on cardholder UIDs.
  • Optimized Pipeline: Uses pd.concat to avoid DataFrame fragmentation and improve performance.
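The encodings listed above can be sketched roughly as follows. The helper names (`frequency_encode`, `group_aggregate`), the `_FE` suffix, and the example columns are hypothetical and not taken from the repository; the sketch only illustrates the general pattern of collecting new columns in a dict and attaching them with a single `pd.concat`, rather than assigning them one at a time.

```python
import pandas as pd


def frequency_encode(df: pd.DataFrame, cols: list) -> pd.DataFrame:
    """Replace each category by its relative frequency (new *_FE columns)."""
    new_cols = {}
    for col in cols:
        freq = df[col].value_counts(normalize=True)
        new_cols[f"{col}_FE"] = df[col].map(freq)
    # One concat instead of many single-column inserts avoids fragmentation.
    return pd.concat([df, pd.DataFrame(new_cols, index=df.index)], axis=1)


def group_aggregate(df, col, group_col="uid", aggs=("mean", "std")):
    """Broadcast per-group statistics of `col` back onto every row."""
    grouped = df.groupby(group_col)[col]
    new_cols = {
        f"{col}_{group_col}_{agg}": grouped.transform(agg) for agg in aggs
    }
    return pd.concat([df, pd.DataFrame(new_cols, index=df.index)], axis=1)


# Toy data; column names mirror the dataset but are used here as examples.
toy = pd.DataFrame({
    "uid": ["a", "a", "b"],
    "TransactionAmt": [10.0, 30.0, 5.0],
    "P_emaildomain": ["x.com", "y.com", "x.com"],
})
toy = frequency_encode(toy, ["P_emaildomain"])
toy = group_aggregate(toy, "TransactionAmt", aggs=("mean", "std"))
toy = group_aggregate(toy, "P_emaildomain", aggs=("nunique",))
print(toy)
```

Note that the standard deviation is `NaN` for a UID with a single transaction; XGBoost handles missing values natively, so leaving it unfilled is one reasonable choice.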

Kaggle Results

The model's performance on the Kaggle leaderboard is shown below:

[Image: Kaggle Result]

Getting Started

Prerequisites

  • Python 3.x
  • pandas
  • numpy
  • xgboost
  • scikit-learn

Usage

  1. Place the competition datasets (train_transaction.csv, train_identity.csv, test_transaction.csv, test_identity.csv) in the root directory.
  2. Run the training script:
    python xgb_magic_model.py
  3. The script will generate a submission_xgb_magic.csv file ready for Kaggle submission.
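For orientation, the transaction and identity files share a `TransactionID` key, and identity rows exist only for a subset of transactions. The sketch below shows one plausible way to combine them with a left join; the helper name `load_split` is hypothetical, and the repository's script may load the data differently. In-memory CSV text stands in for the real files so the example runs on its own.

```python
import io

import pandas as pd


def load_split(transaction_src, identity_src) -> pd.DataFrame:
    """Read one split and left-join identity info onto the transactions."""
    transactions = pd.read_csv(transaction_src)
    identity = pd.read_csv(identity_src)
    # Left join keeps every transaction, even those without identity rows.
    return transactions.merge(identity, on="TransactionID", how="left")


# Stand-ins for train_transaction.csv and train_identity.csv.
tx_csv = io.StringIO("TransactionID,TransactionAmt\n1,50.0\n2,20.0\n3,75.5\n")
id_csv = io.StringIO("TransactionID,DeviceType\n1,mobile\n3,desktop\n")

merged = load_split(tx_csv, id_csv)
print(merged)
```

With the real data, the same helper would be called with file paths, for example `load_split("train_transaction.csv", "train_identity.csv")`.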

Model Report

A detailed technical report of the model architecture and feature engineering process can be found in XGBoost_Model_Report.md.
