This project performs RFM analysis (Recency, Frequency, Monetary) to segment customers based on their purchasing behavior.
Goal: identify Loyal, New, At Risk, and Lost customers and quantify their value to the business.
- Python: pandas, numpy
- Visualization: matplotlib, seaborn
- Environment: Jupyter Notebook
Customer_Segmentation_RFM/
│── data/
│   └── online_retail.xlsx
│── images/
│   └── rfm_segments.png
│── notebooks/
│   └── rfm_analysis.ipynb
│── requirements.txt
└── README.md
- Clean data: remove returns and duplicates, keep only rows with positive quantities, compute TotalPrice.
- Build RFM table at the CustomerID level:
  - R (Recency): days since last purchase
  - F (Frequency): number of invoices
  - M (Monetary): total spend
- Score each metric into quartiles → R_Score, F_Score, M_Score.
- Combine into RFM_Score and map to Segments (Loyal, New, At Risk, Lost, Regular).
- Visualize distribution of segments and summarize average R/F/M per segment.
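
The steps above can be sketched roughly as follows. This is a minimal illustration on a tiny synthetic table, not the notebook's actual code. The column names (InvoiceNo, Quantity, UnitPrice, InvoiceDate), the "C" prefix used to flag returns, the use of a sum for RFM_Score, and the segment rules are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for data/online_retail.xlsx (column names are assumed).
raw = pd.DataFrame({
    "InvoiceNo": ["1001", "1001", "1002", "1003", "1004", "1005", "1006", "1007", "C1008"],
    "CustomerID": [1, 1, 1, 2, 3, 3, 3, 4, 4],
    "Quantity": [2, 2, 1, 5, 1, 3, 2, 10, -10],
    "UnitPrice": [10.0, 10.0, 20.0, 4.0, 50.0, 30.0, 25.0, 1.5, 1.5],
    "InvoiceDate": pd.to_datetime([
        "2011-11-01", "2011-11-01", "2011-12-01", "2011-06-01",
        "2011-10-01", "2011-11-15", "2011-12-08", "2011-03-01", "2011-03-05",
    ]),
})

def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Drop returns and exact duplicates, keep positive quantities, add TotalPrice."""
    # Assumption: return invoices are marked with a leading "C".
    is_return = df["InvoiceNo"].astype(str).str.startswith("C")
    df = df[~is_return & (df["Quantity"] > 0)].drop_duplicates().copy()
    df["TotalPrice"] = df["Quantity"] * df["UnitPrice"]
    return df

def build_rfm(df: pd.DataFrame, snapshot: pd.Timestamp) -> pd.DataFrame:
    """Aggregate to one row per customer: Recency, Frequency, Monetary."""
    return df.groupby("CustomerID").agg(
        Recency=("InvoiceDate", lambda s: (snapshot - s.max()).days),
        Frequency=("InvoiceNo", "nunique"),
        Monetary=("TotalPrice", "sum"),
    )

def quartile_score(values: pd.Series, labels: list) -> pd.Series:
    """Score a metric into quartiles; ranking first breaks ties between equal values."""
    ranked = values.rank(method="first")
    return pd.qcut(ranked, 4, labels=labels).astype(int)

orders = clean(raw)
# Assumption: recency is measured from the day after the last recorded purchase.
snapshot = orders["InvoiceDate"].max() + pd.Timedelta(days=1)
rfm = build_rfm(orders, snapshot)

# Lower Recency is better, so its labels run from 4 down to 1.
rfm["R_Score"] = quartile_score(rfm["Recency"], [4, 3, 2, 1])
rfm["F_Score"] = quartile_score(rfm["Frequency"], [1, 2, 3, 4])
rfm["M_Score"] = quartile_score(rfm["Monetary"], [1, 2, 3, 4])
# Assumption: RFM_Score is the sum of the three scores.
rfm["RFM_Score"] = rfm[["R_Score", "F_Score", "M_Score"]].sum(axis=1)

# Hypothetical segment rules; the first matching condition wins.
r, f = rfm["R_Score"], rfm["F_Score"]
conditions = [(r >= 3) & (f >= 3), (r == 4) & (f <= 2), r == 2, r == 1]
choices = ["Loyal", "New", "At Risk", "Lost"]
rfm["Segment"] = np.select(conditions, choices, default="Regular")

# Average R/F/M per segment.
summary = rfm.groupby("Segment")[["Recency", "Frequency", "Monetary"]].mean()
print(rfm)
print(summary)
```

Ranking before `pd.qcut` is one way to avoid duplicate bin edges when many customers share the same value (for example, a Frequency of 1). The trade-off is that tied customers can land in different quartiles.
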
- Loyal customers show the highest Monetary value and above-average Frequency.
- At Risk customers have long Recency but still meaningful Monetary value, so targeted reactivation can pay off.
- A small VIP-like group (high F & M) contributes disproportionately to revenue (Pareto pattern).
- New customers can be guided to the next purchase via cross-sell bundles.
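
One way to check the Pareto pattern mentioned above is to measure the share of total revenue that comes from the top-spending customers. The sketch below uses made-up Monetary values, and `top_share` is a hypothetical helper name rather than something from the notebook.

```python
import pandas as pd

def top_share(monetary: pd.Series, top_fraction: float = 0.2) -> float:
    """Return the fraction of total spend contributed by the top customers."""
    ordered = monetary.sort_values(ascending=False)
    # Always include at least one customer in the top group.
    n_top = max(1, int(round(len(ordered) * top_fraction)))
    return ordered.iloc[:n_top].sum() / ordered.sum()

# Illustrative per-customer spend (not real results from the dataset).
monetary = pd.Series(
    [5000, 3000, 400, 300, 300, 250, 250, 200, 200, 100], name="Monetary"
)
share = top_share(monetary)
print(f"Top 20% of customers account for {share:.0%} of revenue")
```

Applied to the Monetary column of the RFM table, a value well above `top_fraction` would support the Pareto observation.
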
git clone https://github.com/Blladerunner/customer-segmentation-rfm-analysis.git
cd customer-segmentation-rfm-analysis
python -m venv .venv
.\.venv\Scripts\Activate.ps1 # Windows
# source .venv/bin/activate # macOS/Linux
pip install -r requirements.txt
python -m notebook notebooks/rfm_analysis.ipynb
