GyaneshSamanta/PDT-Breach-Risk-Mitigation

PDT Breach Risk Mitigation

Predicting which orders will breach their Promised Delivery Time — before they do.


About

A supply-chain analytics project that mines a hypothetical e-commerce dataset to identify, explain, and predict Promised Delivery Time (PDT) breaches — the orders that arrive late and quietly erode customer trust.

  • Who? Anyone working on logistics analytics, fulfilment ops, or churn-via-experience.
  • What? EDA, feature engineering, and a model bake-off (Neural Networks, Random Forest, SVM, plus baselines) to predict delivery delays.
  • Where? Built on the public DataCo Supply Chain dataset.
  • When? 2024.
  • Why? A late delivery costs more than the delivery — it costs the next order. Knowing which orders are at risk lets ops intervene before the breach.

The Story

Every fulfilment team lives the same loop: orders flow in, promises go out, and somewhere between warehouse and doorstep, some of those promises break. The interesting question isn't how many; it's which ones, and why.

This project walks that loop end-to-end:

  1. EDA uncovers where delays cluster — by region, product category, shipping mode, day-of-week.
  2. Feature engineering turns raw fields into signals: time-of-order, route distance proxies, category encodings, customer history.
  3. Model bake-off pits a Neural Network against Random Forest and SVM (with logistic regression and gradient boosting as sanity-check baselines).
  4. Evaluation compares them on recall, F1, and ROC-AUC, because in this domain false negatives (missed breaches) are the expensive kind of wrong.
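
Steps 2 to 4 above can be sketched on synthetic data as follows. This is a minimal illustration under stated assumptions, not the notebook's actual code: the toy table, every column name (`order_ts`, `shipping_mode`, `scheduled_days`, `breached`) and the helper functions are hypothetical, and the neural network and SVM are left out to keep the example short and dependent only on scikit-learn.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def make_toy_orders(n=600, seed=0):
    """Synthetic stand-in for the order table; all column names are hypothetical."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "order_ts": pd.Timestamp("2024-01-01")
        + pd.to_timedelta(rng.integers(0, 24 * 90, n), unit="h"),
        "shipping_mode": rng.choice(["Standard", "Second Class", "Same Day"], n),
        "scheduled_days": rng.integers(0, 5, n),
    })
    # Toy rule: tight promises and weekend orders breach more often.
    weekend = df["order_ts"].dt.dayofweek >= 5
    p = 0.25 + 0.3 * (df["scheduled_days"] <= 1) + 0.2 * weekend
    df["breached"] = (rng.random(n) < p).astype(int)
    return df


def add_time_features(df):
    """Step 2: turn the raw order timestamp into time-of-order signals."""
    out = df.copy()
    out["order_hour"] = out["order_ts"].dt.hour
    out["order_dow"] = out["order_ts"].dt.dayofweek
    return out.drop(columns=["order_ts"])


def make_preprocessor():
    """One-hot encode the categorical column and scale the numeric ones."""
    return ColumnTransformer([
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["shipping_mode"]),
        ("num", StandardScaler(), ["scheduled_days", "order_hour", "order_dow"]),
    ])


def bake_off(df, seed=0):
    """Steps 3 and 4: fit each model on one split, score it on the held-out part."""
    X = add_time_features(df.drop(columns=["breached"]))
    y = df["breached"]
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    models = {
        "logreg": LogisticRegression(max_iter=1000),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "grad_boost": GradientBoostingClassifier(random_state=seed),
    }
    rows = {}
    for name, model in models.items():
        pipe = Pipeline([("prep", make_preprocessor()), ("model", model)])
        pipe.fit(X_tr, y_tr)
        pred = pipe.predict(X_te)
        prob = pipe.predict_proba(X_te)[:, 1]  # probability of the breach class
        rows[name] = {
            "recall": recall_score(y_te, pred),
            "f1": f1_score(y_te, pred),
            "roc_auc": roc_auc_score(y_te, prob),
        }
    return pd.DataFrame(rows).T  # one row per model, one column per metric


results = bake_off(make_toy_orders())
print(results.round(3))
```

Keeping the preprocessing inside each `Pipeline` means the encoder and scaler are fitted on the training split only, so the held-out scores are not inflated by leakage. Sorting the resulting table by `recall` puts the models that miss the fewest breaches at the top.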

The recommendations that fall out — inventory positioning, ship-mode reassignment, region-level SLAs — are where the analytics turn into action.

Gallery

Open Notebooks/main.ipynb for the full visual story — heatmaps, region breakdowns, model comparisons, and feature importances.
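
As a rough idea of what sits behind a region-by-shipping-mode heatmap, the sketch below computes the breach rate for each combination with pandas. The column names (`region`, `shipping_mode`, `breached`) and the toy rows are hypothetical and do not come from the notebook.

```python
import pandas as pd


def breach_rate_table(df, row="region", col="shipping_mode", target="breached"):
    """Mean of a 0/1 breach flag per (row, col) cell, i.e. the breach rate."""
    return df.pivot_table(index=row, columns=col, values=target, aggfunc="mean")


# Hypothetical toy data: six orders across two regions and two shipping modes.
toy = pd.DataFrame({
    "region": ["East", "East", "East", "West", "West", "West"],
    "shipping_mode": ["Standard", "Standard", "Same Day",
                      "Standard", "Same Day", "Same Day"],
    "breached": [1, 0, 1, 0, 0, 1],
})

table = breach_rate_table(toy)
print(table)
```

The resulting table can be passed directly to `seaborn.heatmap(table, annot=True)` to draw the chart.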


Tech Stack

| Layer | Tools |
| --- | --- |
| Language | Python 3 |
| Notebook | Jupyter |
| Data | pandas, numpy |
| Viz | matplotlib, seaborn, plotly |
| ML | scikit-learn (RF, SVM, LR, GBM), TensorFlow / Keras (NN) |
| Stats | statsmodels |

Repo Structure

PDT-Breach-Risk-Mitigation/
├── Dataset/
│   └── DataCoSupplyChainDataset.csv
├── Notebooks/
│   └── main.ipynb              # EDA + feature engineering + models
├── requirements.txt
├── LICENSE
└── README.md

Getting Started

git clone https://github.com/GyaneshSamanta/PDT-Breach-Risk-Mitigation.git
cd PDT-Breach-Risk-Mitigation

python -m venv .venv && source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -r requirements.txt

jupyter notebook Notebooks/main.ipynb
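
For a quick look at the data outside the notebook, something like the following may help. It is a sketch under assumptions: the two column names and the `latin-1` encoding are guesses about the CSV that should be checked against its header, the `pdt_breach` label is a hypothetical name, and the notebook may define a breach differently.

```python
import pandas as pd

# Assumed column names; verify them against the CSV header before relying on this.
ACTUAL_COL = "Days for shipping (real)"
PROMISED_COL = "Days for shipment (scheduled)"


def load_orders(path, encoding="latin-1"):
    """Read the order CSV; the encoding is an assumption, adjust if decoding fails."""
    return pd.read_csv(path, encoding=encoding)


def label_breaches(df, actual=ACTUAL_COL, promised=PROMISED_COL):
    """Add a 0/1 column that is 1 when actual shipping days exceed the promise."""
    out = df.copy()
    out["pdt_breach"] = (out[actual] > out[promised]).astype(int)
    return out
```

From the repository root, `label_breaches(load_orders("Dataset/DataCoSupplyChainDataset.csv"))` would return the labelled table, and the mean of `pdt_breach` gives the overall breach rate.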

Contributing

PRs welcome that:

  • Swap the model bake-off for gradient-boosted trees with proper cross-validation.
  • Add SHAP explanations for individual breach predictions.
  • Wrap the model in a tiny FastAPI endpoint for ops integration.
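
A contribution along the lines of the first item could start from a sketch like this one, which scores a gradient-boosted classifier with stratified k-fold cross-validation. It runs on synthetic data from `make_classification`; the `cv_report` helper, the fold count, and the class weights are illustrative assumptions rather than anything taken from the repository.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import StratifiedKFold, cross_validate

METRICS = ["recall", "f1", "roc_auc"]


def cv_report(X, y, n_splits=5, seed=0):
    """Return {metric: (mean, std)} across stratified folds."""
    # Stratification keeps the breach share roughly equal in every fold.
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = cross_validate(
        GradientBoostingClassifier(random_state=seed),
        X, y, cv=cv, scoring=METRICS,
    )
    return {
        m: (scores[f"test_{m}"].mean(), scores[f"test_{m}"].std())
        for m in METRICS
    }


# Synthetic, mildly imbalanced stand-in for the engineered feature matrix.
X, y = make_classification(
    n_samples=400, n_features=8, weights=[0.7, 0.3], random_state=0
)
report = cv_report(X, y)
for metric, (mean, std) in report.items():
    print(f"{metric}: {mean:.3f} +/- {std:.3f}")
```

Reporting the spread across folds alongside the mean makes it easier to tell whether a gap between two models is larger than the fold-to-fold noise.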

License

See LICENSE. The DataCo dataset is third-party — check its original terms before redistribution.

Credits

  • Gyanesh Samanta — analysis, modelling, write-up.
  • DataCo — for the public supply-chain dataset that anchors the analysis.
