Lyse777/FUTURE_DS_02

Customer Retention & Churn Analysis

Project Overview

This project is my submission for Future Interns Data Science & Analytics - Task 2: Customer Retention & Churn Analysis.

The goal of this project is to analyze customer and subscription behavior for a SaaS-style business and turn the data into retention insights that a product manager, founder, or business stakeholder could actually use.

I treated this project like a real retention analytics case. Instead of only calculating churn, I looked at churn from different angles: account status, subscription churn, churn-event history, customer lifetime, support experience, feature usage, acquisition source, and customer cohorts.

Business Problem

For subscription-based businesses, growth does not only come from acquiring new customers. Growth also depends on keeping customers active, helping them experience value quickly, and understanding why they leave.

This analysis answers questions such as:

  • Why are customers leaving?
  • Which customer segments show higher churn risk?
  • How long do customers typically stay active?
  • Which retention drivers are visible in support, usage, and subscription data?
  • What actions can the business take to reduce customer loss?

Dataset

The dataset used in this project is the RavenStack Synthetic SaaS Dataset.

Dataset credit: River @ Rivalytics

The dataset is fully synthetic and designed for SaaS analytics practice. It includes multi-table customer, subscription, churn, support, and feature-usage data.

Data Files Used

| File | Description |
| --- | --- |
| ravenstack_accounts.csv | Customer account profile, signup date, plan, country, referral source, and churn status |
| ravenstack_subscriptions.csv | Subscription-level start dates, end dates, billing details, plan changes, and churn status |
| ravenstack_feature_usage.csv | Product feature activity, usage volume, duration, errors, and beta-feature usage |
| ravenstack_support_tickets.csv | Support ticket response time, resolution time, priority, satisfaction, and escalation data |
| ravenstack_churn_events.csv | Churn events, churn reasons, refund amounts, upgrades, downgrades, reactivations, and feedback |

Tools and Methods Used

  • Python
  • Pandas
  • NumPy
  • Matplotlib
  • ReportLab
  • GitHub
  • Customer retention analysis
  • Cohort analysis
  • Churn segmentation
  • KPI reporting
  • Business insight generation

Dashboard

The main dashboard is available here:

Customer Retention & Churn Dashboard

Key Performance Indicators

| Metric | Result |
| --- | --- |
| Accounts analyzed | 500 |
| Subscriptions analyzed | 5,000 |
| Feature usage records | 25,000 |
| Support tickets | 2,000 |
| Churn events | 600 |
| Snapshot account churn rate | 22.0% |
| Subscription churn rate | 9.7% |
| Accounts with churn-event history | 70.4% |
| Median time to first churn event | 2.7 months |
| 3-month average subscription retention | 97.2% |
| 12-month average subscription retention | 90.9% |
| MRR tied to churned subscriptions | $1,179,139 |
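
The N-month retention figures can be read as "the share of subscriptions still active N months after they started". The sketch below shows one possible way to compute that. It is not necessarily the definition used in src/churn_retention_analysis.py. The column names `start_date` and `end_date`, the helper `retention_at`, and the sample rows are all assumptions made for illustration.

```python
import pandas as pd

# Made-up subscriptions; a missing end_date means the subscription is still active.
subs = pd.DataFrame({
    "start_date": pd.to_datetime(["2023-01-10", "2023-02-01", "2023-06-15", "2023-11-20"]),
    "end_date": pd.to_datetime(["2023-02-20", None, "2023-12-01", None]),
})

def retention_at(subs, months, as_of):
    """Share of subscriptions still active `months` after start.

    Only subscriptions old enough to reach the checkpoint by `as_of`
    are counted, so recent signups do not inflate the rate.
    """
    as_of = pd.Timestamp(as_of)
    checkpoint = subs["start_date"] + pd.DateOffset(months=months)
    observable = checkpoint <= as_of
    survived = subs["end_date"].isna() | (subs["end_date"] > checkpoint)
    return survived[observable].mean()

retention_3m = retention_at(subs, 3, as_of="2024-06-30")
retention_12m = retention_at(subs, 12, as_of="2024-06-30")
print(f"3-month retention: {retention_3m:.1%}, 12-month retention: {retention_12m:.1%}")
```

Restricting the denominator to observable subscriptions is a design choice. Without it, subscriptions that started last month would count as "retained at 12 months" simply because they have not had time to end.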

Important Data Note

The dataset contains more than one way to view churn:

  1. Snapshot account churn from accounts.churn_flag
  2. Subscription churn from ended subscriptions in subscriptions
  3. Churn-event history from churn_events

I kept these views separate because each one answers a different business question.

  • Snapshot churn shows the current customer status.
  • Subscription churn shows how often subscription records ended.
  • Churn-event history shows churn behavior over time, including possible reactivation patterns.

This distinction makes the analysis more honest and useful for business decision-making.
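
As a rough illustration of why the three views give different numbers, the sketch below computes each rate on a tiny made-up dataset. Only `accounts.churn_flag` is a column named in this README. The other column names (`account_id`, `end_date`, `churn_date`) and all rows are assumptions.

```python
import pandas as pd

# Made-up tables that mimic the three sources of churn information.
accounts = pd.DataFrame({
    "account_id": ["A1", "A2", "A3", "A4"],
    "churn_flag": [True, False, False, False],
})
subscriptions = pd.DataFrame({
    "subscription_id": ["S1", "S2", "S3", "S4", "S5"],
    "account_id": ["A1", "A1", "A2", "A3", "A4"],
    "end_date": pd.to_datetime(["2024-03-01", None, None, "2024-05-15", None]),
})
churn_events = pd.DataFrame({
    "account_id": ["A1", "A3", "A3"],
    "churn_date": pd.to_datetime(["2024-03-01", "2024-02-10", "2024-05-15"]),
})

# 1. Snapshot churn: share of accounts currently flagged as churned.
snapshot_churn_rate = accounts["churn_flag"].mean()

# 2. Subscription churn: share of subscription records that have ended.
subscription_churn_rate = subscriptions["end_date"].notna().mean()

# 3. Churn-event history: share of accounts with at least one churn event.
churn_history_rate = accounts["account_id"].isin(churn_events["account_id"]).mean()

print(snapshot_churn_rate, subscription_churn_rate, churn_history_rate)
```

In this toy data, account A3 has two churn events but is not flagged as churned, which is the kind of pattern that makes the churn-history rate higher than the snapshot rate.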

Main Insights

1. Account churn exists, but churn-event history is much broader

The snapshot account churn rate is 22.0%, based on 110 churned accounts out of 500 accounts.

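However, 70.4% of accounts have at least one churn event in the churn-events table. Because that share is far higher than the 22.0% snapshot rate, many accounts with a recorded churn event are not currently flagged as churned, which is consistent with reactivations or repeated churn activity.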

2. Subscription churn is moderate, but churned subscriptions leave early

The subscription churn rate is 9.7%.

The median duration of churned subscriptions is only 1.4 months, which means a lot of churn risk happens early in the customer lifecycle.

This makes onboarding, first value, and early product adoption very important.
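
A minimal sketch of how a median duration for churned subscriptions could be derived, assuming hypothetical `start_date` and `end_date` columns and an average month length of 30.44 days. The rows are made up and the project script may convert days to months differently.

```python
import pandas as pd

# Made-up subscriptions; a missing end_date means the subscription has not churned.
subs = pd.DataFrame({
    "start_date": pd.to_datetime(["2024-01-01", "2024-01-15", "2024-02-01", "2024-03-01"]),
    "end_date": pd.to_datetime(["2024-01-31", "2024-04-14", None, "2024-04-30"]),
})

DAYS_PER_MONTH = 30.44  # assumption: average calendar month length

# Keep only subscriptions that ended, then measure how long each one lasted.
churned = subs.dropna(subset=["end_date"])
duration_months = (churned["end_date"] - churned["start_date"]).dt.days / DAYS_PER_MONTH

median_churned_months = duration_months.median()
print(f"Median churned-subscription duration: {median_churned_months:.1f} months")
```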

3. Feature gaps are the top recorded churn reason

The largest churn reason is features, with 114 of the 600 churn events (19.0%).

Other major churn reasons include budget, support, unknown, competitor, and pricing.

This suggests that retention is not driven by only one issue. The business needs to improve product fit, support experience, and value communication together.
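
A ranking like this can be produced with a simple frequency count. The sketch below assumes a hypothetical `reason_code` column in the churn-events table. The reason labels come from this README, but the counts are made up.

```python
import pandas as pd

# Made-up churn reasons, one entry per churn event.
reasons = pd.Series(
    ["features", "budget", "features", "support",
     "pricing", "features", "unknown", "budget"],
    name="reason_code",
)

# Count events per reason (sorted descending) and convert to a share of all events.
counts = reasons.value_counts()
summary = pd.DataFrame({"events": counts, "share": counts / counts.sum()})
print(summary)
```

Reporting the share next to the raw count makes it easier to see whether the top reason dominates or only leads by a small margin.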

4. Event-sourced accounts have the highest snapshot churn rate

Accounts acquired from events show a snapshot churn rate of 30.2%.

Partner-sourced accounts have a lower snapshot churn rate of 14.6%.

This suggests that event-based acquisition may bring in customers with weaker fit or different expectations.
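
Segment comparisons like this one reduce to a grouped mean of the churn flag. The helper `churn_rate_by` below is hypothetical, the `referral_source` column name is an assumption, and the rows are made up.

```python
import pandas as pd

# Made-up accounts with an acquisition channel and a snapshot churn flag.
accounts = pd.DataFrame({
    "referral_source": ["event", "event", "event",
                        "partner", "partner", "partner", "partner",
                        "organic", "organic"],
    "churn_flag": [True, True, False,
                   True, False, False, False,
                   False, False],
})

def churn_rate_by(accounts, segment_col):
    """Accounts, churned accounts, and churn rate per segment, highest rate first."""
    return (
        accounts.groupby(segment_col)["churn_flag"]
        .agg(accounts="size", churned="sum", churn_rate="mean")
        .sort_values("churn_rate", ascending=False)
    )

by_source = churn_rate_by(accounts, "referral_source")
print(by_source)
```

Keeping the account count next to each rate matters because a high rate in a small segment is weaker evidence than the same rate in a large one. The same helper could be pointed at an industry or country column.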

5. DevTools accounts show the highest industry churn risk

The DevTools segment has the highest snapshot churn rate at 31.0%.

This segment may need more technical onboarding, clearer documentation, and stronger product education.

6. Downgrades are a meaningful churn warning sign

Downgraded subscriptions have a churn rate of 11.5%, compared with 9.6% for non-downgraded subscriptions.

A downgrade should be treated as a retention risk signal, not just a plan change.
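
To put the gap in perspective, it helps to express it both in percentage points and as a relative difference. The short calculation below uses the two rounded rates reported above, so the results are approximate. It says nothing about statistical significance, which would also need the number of subscriptions in each group.

```python
# Rounded churn rates taken from the insight above.
downgraded_rate = 0.115
non_downgraded_rate = 0.096

# Absolute gap in percentage points.
gap_pp = (downgraded_rate - non_downgraded_rate) * 100

# Relative difference: how much higher the downgraded rate is.
relative_lift = downgraded_rate / non_downgraded_rate - 1

print(f"Gap: {gap_pp:.1f} percentage points, relative lift: {relative_lift:.0%}")
```

The gap is about 1.9 percentage points, or roughly 20% higher churn among downgraded subscriptions.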

Recommendations

1. Build a 30-day onboarding and activation program

Because churned subscriptions have a short median duration, the business should focus on the first 30 days.

Recommended actions:

  • Create a structured onboarding checklist
  • Track first feature usage within the first week
  • Send targeted guidance to inactive new customers
  • Offer onboarding calls for high-value accounts
  • Monitor new accounts that have low usage or many early support tickets
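
The monitoring ideas above could be prototyped as a simple rule-based flag. Everything in the sketch below is an assumption: the column names (`signup_date`, `usage_date`, `submitted_at`), the 7-day and 30-day windows, the threshold of three early tickets, and the sample rows.

```python
import pandas as pd

accounts = pd.DataFrame({
    "account_id": ["A1", "A2", "A3"],
    "signup_date": pd.to_datetime(["2024-01-01", "2024-01-05", "2024-01-10"]),
})
usage = pd.DataFrame({
    "account_id": ["A1", "A1", "A3"],
    "usage_date": pd.to_datetime(["2024-01-02", "2024-01-20", "2024-01-25"]),
})
tickets = pd.DataFrame({
    "account_id": ["A2", "A2", "A2", "A3"],
    "submitted_at": pd.to_datetime(["2024-01-06", "2024-01-08", "2024-01-12", "2024-02-20"]),
})

# First feature use per account, compared against the signup date.
first_use = (
    usage.groupby("account_id", as_index=False)["usage_date"].min()
    .rename(columns={"usage_date": "first_use"})
)
df = accounts.merge(first_use, on="account_id", how="left")
# Accounts with no usage have a missing first_use, so the comparison is False.
df["used_in_first_week"] = (df["first_use"] - df["signup_date"]) <= pd.Timedelta(days=7)

# Support tickets opened within 30 days of signup.
t = tickets.merge(accounts, on="account_id")
early = t[(t["submitted_at"] - t["signup_date"]) <= pd.Timedelta(days=30)]
early_counts = early.groupby("account_id").size().rename("early_tickets").reset_index()
df = df.merge(early_counts, on="account_id", how="left").fillna({"early_tickets": 0})

# Flag accounts with no first-week usage or many early tickets.
df["at_risk"] = ~df["used_in_first_week"] | (df["early_tickets"] >= 3)
print(df[["account_id", "used_in_first_week", "early_tickets", "at_risk"]])
```

A rule like this is easy to explain to a customer success team, and the thresholds can be tuned once real onboarding data shows which early signals relate to churn.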

2. Create feature adoption playbooks

Since feature-related churn is the largest churn reason, customers need to experience the most valuable product features earlier.

Recommended actions:

  • Identify the top features used by retained customers
  • Build in-app tips for underused features
  • Create use-case based onboarding flows
  • Track feature adoption by account segment
  • Trigger customer success outreach when adoption is low

3. Treat downgrades as churn-risk signals

Downgrades should trigger a retention workflow.

Recommended actions:

  • Automatically flag downgraded subscriptions
  • Ask customers why they downgraded
  • Offer a right-sized plan recommendation
  • Monitor usage after downgrade
  • Follow up before renewal

4. Improve acquisition quality from event channels

Event-sourced accounts have the highest snapshot churn rate.

Recommended actions:

  • Review event messaging and customer expectations
  • Improve qualification before conversion
  • Create event-specific onboarding
  • Compare event leads against partner and organic leads
  • Focus on attracting better-fit customers, not only more signups

5. Reduce churn caused by pricing and budget pressure

Pricing, budget, and competitor-related reasons together represent a major part of churn.

Recommended actions:

  • Build save-offer playbooks
  • Offer annual discounts for customers with stable usage
  • Improve value-based messaging
  • Show ROI or productivity benefits more clearly
  • Create flexible plans for budget-sensitive customers

6. Strengthen support experience for at-risk accounts

Support is one of the major churn reasons.

Recommended actions:

  • Monitor high-priority tickets for churn risk
  • Improve first-response times for urgent issues
  • Track satisfaction after ticket closure
  • Route repeated support issues to customer success
  • Create support dashboards by segment and plan tier

7. Create a retention command center

The business should monitor retention continuously instead of reviewing churn only after customers leave.

Recommended dashboard sections:

  • Monthly churn rate
  • Churn reasons
  • Cohort retention
  • Downgrade-risk accounts
  • Support-risk accounts
  • Feature adoption by segment
  • Churn by acquisition channel
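
For the monthly churn rate section, one common definition is "subscriptions that ended during the month, divided by subscriptions active at the start of the month". The sketch below uses that definition, which may differ from the one used in this project. The helper `monthly_churn`, the column names, and the rows are assumptions.

```python
import pandas as pd

# Made-up subscriptions; a missing end_date means still active.
subs = pd.DataFrame({
    "subscription_id": ["S1", "S2", "S3", "S4", "S5"],
    "start_date": pd.to_datetime(
        ["2023-11-01", "2023-12-01", "2023-12-15", "2024-01-10", "2023-10-01"]),
    "end_date": pd.to_datetime(
        ["2024-01-20", None, "2024-02-10", "2024-02-25", None]),
})

def monthly_churn(subs, months):
    """Churn rate per month: ended in month / active at the start of the month."""
    rows = []
    for month in months:
        start, end = month.start_time, month.end_time
        active = (subs["start_date"] < start) & (
            subs["end_date"].isna() | (subs["end_date"] >= start)
        )
        ended = active & subs["end_date"].between(start, end)
        n_active, n_ended = int(active.sum()), int(ended.sum())
        rows.append({
            "month": str(month),
            "active_at_start": n_active,
            "ended": n_ended,
            "churn_rate": n_ended / n_active if n_active else float("nan"),
        })
    return pd.DataFrame(rows).set_index("month")

trend = monthly_churn(subs, pd.period_range("2024-01", "2024-02", freq="M"))
print(trend)
```

Subscriptions that start and end within the same month are excluded from this rate because they were not active at the start of the month. A dashboard might track those separately as very early churn.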

Project Structure

FUTURE_DS_02_customer_retention_churn_analysis/
├── analysis/
│   ├── analysis_summary.json
│   ├── account_churn_by_country.csv
│   ├── account_churn_by_industry.csv
│   ├── account_churn_by_referral_source.csv
│   ├── account_churn_by_trial_status.csv
│   ├── churn_reason_summary.csv
│   ├── cohort_retention_matrix.csv
│   ├── feature_usage_summary.csv
│   ├── monthly_churn_trends.csv
│   ├── retention_curve.csv
│   ├── retention_driver_churn_history_comparison.csv
│   ├── retention_driver_snapshot_comparison.csv
│   ├── subscription_churn_by_billing_frequency.csv
│   ├── subscription_churn_by_country.csv
│   ├── subscription_churn_by_industry.csv
│   └── subscription_churn_by_plan.csv
├── charts/
│   ├── churn_by_industry.png
│   ├── churn_by_referral_source.png
│   ├── churn_reasons.png
│   ├── cohort_retention_heatmap.png
│   ├── customer_lifetime_distribution.png
│   ├── monthly_churn_trend.png
│   └── retention_curve.png
├── dashboard/
│   ├── Customer_Retention_Churn_Dashboard.pdf
│   ├── Customer_Retention_Churn_Dashboard.png
│   └── index.html
├── data/
│   ├── README.md
│   ├── raw/
│   │   ├── README.md
│   │   ├── ravenstack_accounts.csv
│   │   ├── ravenstack_churn_events.csv
│   │   ├── ravenstack_feature_usage.csv
│   │   ├── ravenstack_subscriptions.csv
│   │   └── ravenstack_support_tickets.csv
│   └── processed/
│       ├── account_level_retention_dataset.csv
│       ├── cohort_retention_long.csv
│       ├── monthly_retention_trends.csv
│       ├── subscription_level_cleaned.csv
│       └── support_usage_account_metrics.csv
├── docs/
│   ├── analysis_report.md
│   ├── linkedin_post.md
│   └── submission_summary.md
├── reports/
│   └── Customer_Retention_Churn_Analysis_Report.pdf
├── src/
│   └── churn_retention_analysis.py
├── .gitignore
├── LICENSE_DATA_NOTE.md
├── README.md
├── README_Customer_Retention_Churn_Analysis.md
└── requirements.txt

How to Run the Project

Install dependencies:

pip install -r requirements.txt

Run the analysis:

python src/churn_retention_analysis.py

The script reads the raw CSV files from data/raw/, regenerates the processed datasets, creates analysis tables, and exports dashboard visuals.
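
For readers who want to explore the raw tables on their own, a loading step might look like the sketch below. It is not taken from src/churn_retention_analysis.py. The helper `load_raw` and the dictionary keys are hypothetical, while the file names are the ones listed under Data Files Used.

```python
from pathlib import Path

import pandas as pd

# File names as listed in this README; the short keys are chosen for convenience.
RAW_FILES = {
    "accounts": "ravenstack_accounts.csv",
    "subscriptions": "ravenstack_subscriptions.csv",
    "feature_usage": "ravenstack_feature_usage.csv",
    "support_tickets": "ravenstack_support_tickets.csv",
    "churn_events": "ravenstack_churn_events.csv",
}

def load_raw(data_dir="data/raw"):
    """Read the five raw CSV files into a dict of DataFrames keyed by table name."""
    data_dir = Path(data_dir)
    # Fail early with a clear message if any expected file is absent.
    missing = [name for name in RAW_FILES.values() if not (data_dir / name).exists()]
    if missing:
        raise FileNotFoundError(f"Missing raw files in {data_dir}: {missing}")
    return {key: pd.read_csv(data_dir / name) for key, name in RAW_FILES.items()}
```

From the repository root, `tables = load_raw()` would then give access to `tables["accounts"]`, `tables["subscriptions"]`, and so on.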

What I Learned

This task helped me understand how retention analytics connects directly to business growth. I learned that churn is not only a number, but a signal that can come from product value, customer expectations, pricing pressure, support experience, and onboarding quality.

I also learned the importance of separating different churn definitions. A snapshot churn flag, a subscription churn flag, and a churn-event table can each tell a different part of the customer story.

Author

Umuhire Gatesi Lyse
GitHub: Lyse777

About

Customer retention and churn analysis project using SaaS subscription data. Includes cohort analysis, churn patterns, customer lifetime trends, dashboard visuals, and business recommendations.
