Business SQL Analytics — PostgreSQL · Python · Tableau · Excel

What This Project Does

End-to-end business analytics pipeline built on a simulated retail dataset: customers, orders, products, and transactions across 4 normalized tables.

26+ SQL queries organized from data validation → aggregations → transaction analysis → multi-table joins, with Python automation for data generation and Tableau dashboards for stakeholder reporting.

Pipeline: PostgreSQL → SQL Analytics → Python Export → Tableau + Excel

Dataset: 2,314 customers · 5,000 transactions · $2.58M revenue · 14 months

Business Questions & Findings

Question	Finding
Who are the top 10 customers by revenue?	Top customer: $5,133 total spend · avg customer $1,115
Which payment methods dominate?	Across 5,000 transactions, avg ticket $515.85
What is the monthly revenue trend?	Peak month: $240,920 (446 orders) · low: $79,918
Which product categories drive most sales?	$2,579,272 total revenue across all categories
Returning vs one-time customers?	59.8% returning (1,384 of 2,314) vs 40.2% one-time

SQL — Advanced Patterns

Window Function: Customer Revenue Ranking

SELECT
    c.customer_name,
    SUM(t.amount)                                               AS total_spent,
    RANK() OVER (ORDER BY SUM(t.amount) DESC)                  AS revenue_rank,
    ROUND(SUM(t.amount) / SUM(SUM(t.amount)) OVER () * 100, 1) AS pct_of_total
FROM customers c
JOIN transactions t ON c.customer_id = t.customer_id
GROUP BY c.customer_name
ORDER BY revenue_rank;

CTE: Monthly Revenue with Month-over-Month Change

WITH monthly_revenue AS (
    SELECT
        DATE_TRUNC('month', transaction_date) AS month,
        SUM(amount)                           AS revenue,
        COUNT(DISTINCT customer_id)           AS active_customers
    FROM transactions
    GROUP BY 1
)
SELECT
    month,
    revenue,
    active_customers,
    ROUND(revenue - LAG(revenue) OVER (ORDER BY month), 0) AS mom_change
FROM monthly_revenue
ORDER BY month;

Multi-Table JOIN: Full Customer Order Profile

SELECT
    c.customer_name,
    COUNT(DISTINCT o.order_id)  AS total_orders,
    SUM(t.amount)               AS total_spent,
    ROUND(AVG(t.amount), 2)    AS avg_transaction,
    MAX(t.transaction_date)     AS last_purchase
FROM customers c
JOIN orders o       ON c.customer_id = o.customer_id
JOIN transactions t ON o.order_id    = t.order_id
GROUP BY c.customer_name
ORDER BY total_spent DESC
LIMIT 10;

Project Structure

mlops_sql_project/
├── sql/
│   ├── ddl/                    # schema, tables, constraints
│   ├── dml/                    # data inserts & resets
│   └── dql/
│       ├── a_checks/           # data validation (nulls, FK, indexes)
│       ├── b_aggregations/     # counts, min/max, category stats
│       ├── c_transactions/     # order & payment analysis
│       └── d_joins/            # multi-table joins, top customers
├── python/
│   ├── generate_customers.py
│   ├── generate_orders.py
│   ├── generate_products.py
│   ├── generate_transactions.py
│   ├── sales_summary.py
│   └── customer_behavior.py
├── Tableau/
│   ├── sales_dashboard.twb
│   ├── transaction_analysis.twb
│   └── customer_insights.twb
├── excel/
│   ├── sales_summary.xlsx
│   ├── transactions_overview.xlsx
│   └── customer_behavior.xlsx
└── env/                        # DB config, logging, settings

Architecture

PostgreSQL (4 tables: customers, orders, products, transactions)
    └── SQL layer: DDL → DML → DQL (26+ queries)
            └── Python automation (data generation + export)
                    ├── Tableau dashboards (sales, transactions, customers)
                    └── Excel reports (summary exports)

How to Run

# 1. Clone and set up environment
git clone https://github.com/evgeniimatveev/business-sql-analytics.git
cd business-sql-analytics
conda env create -f environment.yaml
conda activate mlops_env

# 2. Configure DB credentials
cp env/.env.example env/.env

# 3. Generate synthetic data
python python/generate_customers.py
python python/generate_orders.py
python python/generate_products.py
python python/generate_transactions.py

# 4. Run SQL analytics
# Open sql/dql/ queries in DBeaver or psql

Stack

Layer	Technology
Database	PostgreSQL
Analytics	SQL (CTEs, Window Functions, Multi-table JOINs)
Automation	Python (Faker, pandas, psycopg2)
Visualization	Tableau
Reporting	Excel
CI/CD	GitHub Actions

Connect

GitHub: evgeniimatveev
Portfolio: datascienceportfol.io/evgeniimatveevusa
LinkedIn: Evgenii Matveev

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Business SQL Analytics — PostgreSQL · Python · Tableau · Excel

What This Project Does

Business Questions & Findings

SQL — Advanced Patterns

Window Function: Customer Revenue Ranking

CTE: Monthly Revenue with Month-over-Month Change

Multi-Table JOIN: Full Customer Order Profile

Project Structure

Architecture

How to Run

Stack

Connect

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 42 Commits
.github/workflows		.github/workflows
Tableau		Tableau
env		env
excel		excel
python		python
sql		sql
README.md		README.md
environment.yaml		environment.yaml
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

Business SQL Analytics — PostgreSQL · Python · Tableau · Excel

What This Project Does

Business Questions & Findings

SQL — Advanced Patterns

Window Function: Customer Revenue Ranking

CTE: Monthly Revenue with Month-over-Month Change

Multi-Table JOIN: Full Customer Order Profile

Project Structure

Architecture

How to Run

Stack

Connect

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages