📦 Logistics Analytics Platform – End-to-End Azure Data Engineering Project

This project demonstrates a modern data engineering workflow using the Medallion Architecture (Bronze → Silver → Gold) to deliver operational logistics insights.
It's designed for enterprise-scale logistics or transport companies to monitor shipment performance, delays, and fleet/vendor KPIs using:

🚛 Azure Data Factory
🚀 Azure Databricks + PySpark
📁 Delta Lake + Parquet
📊 Power BI for Visualization

🧱 Architecture

📁 Folder Structure

logistics-analytics-platform/
│
├── data/             # Dummy CSVs (drivers, vendors, routes, shipments)
├── notebooks/        # Databricks Notebooks for each layer
├── adf_pipelines/    # ADF JSON definitions
├── powerbi/          # Power BI screenshots or .pbix files
├── architecture/     # Architecture diagram image
├── README.md         # This file
└── .gitignore

📊 Power BI Dashboard Features

✅ KPI Cards: Total Shipments, Avg Delay, On-Time %
📊 Vendor performance bar charts
📈 Monthly delivery trend lines
🗺️ Route-level delay matrix (simulated map)
🎯 Filters: Vendor, Route Type, Origin, Destination

📌 Pipeline Stages

🔹 Bronze Layer (Raw Ingestion)

Raw files: drivers.csv, vendors.csv, routes.csv, shipments.csv
Ingested using ADF Copy Activity with ForEach loop
Stored in Azure Data Lake under the bronze/ container
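
The ForEach-plus-Copy pattern above can be sketched as a small Python script that builds a pipeline definition from a file list. This is a minimal illustration, not the JSON shipped in /adf_pipelines/: the pipeline, activity, and dataset names (`pl_ingest_bronze`, `CopyRawCsv`, `ds_raw_csv`, `ds_bronze_csv`) are hypothetical, and the dictionary only approximates the shape of an ADF pipeline export.

```python
import json

# File names come from this README; every other name below is a hypothetical placeholder.
RAW_FILES = ["drivers.csv", "vendors.csv", "routes.csv", "shipments.csv"]


def build_ingest_pipeline(files):
    """Return a dict shaped like an ADF pipeline: one ForEach wrapping one Copy."""
    copy_activity = {
        "name": "CopyRawCsv",  # hypothetical activity name
        "type": "Copy",
        "inputs": [{
            "referenceName": "ds_raw_csv",  # hypothetical source dataset
            "type": "DatasetReference",
            "parameters": {"fileName": "@item()"},  # current ForEach item
        }],
        "outputs": [{
            "referenceName": "ds_bronze_csv",  # hypothetical sink dataset
            "type": "DatasetReference",
            "parameters": {"fileName": "@item()"},
        }],
        "typeProperties": {
            "source": {"type": "DelimitedTextSource"},
            "sink": {"type": "DelimitedTextSink"},
        },
    }
    return {
        "name": "pl_ingest_bronze",  # hypothetical pipeline name
        "properties": {
            # The file list is a pipeline parameter, so adding a file needs no new activity.
            "parameters": {
                "fileList": {"type": "Array", "defaultValue": list(files)}
            },
            "activities": [{
                "name": "ForEachRawFile",
                "type": "ForEach",
                "typeProperties": {
                    "items": {
                        "value": "@pipeline().parameters.fileList",
                        "type": "Expression",
                    },
                    "activities": [copy_activity],
                },
            }],
        },
    }


pipeline = build_ingest_pipeline(RAW_FILES)
print(json.dumps(pipeline, indent=2))
```

Driving the loop from a parameter rather than hard-coding four Copy activities is what makes the pipeline metadata-driven: a new source file only changes the list.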

🔸 Silver Layer (Cleansed)

Renamed fields and standardized schemas
Converted timestamps and derived metrics (e.g., delay in minutes)
Stored as partitioned Parquet files in the silver/ container
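
The Silver steps can be illustrated in plain Python on a single row. This is a sketch of the logic only; the project notebooks use PySpark, and the column names, timestamp format, and year/month partitioning scheme here are assumptions rather than the actual schema.

```python
from datetime import datetime

# Hypothetical raw -> standardized column names; the real CSV headers may differ.
RENAMES = {
    "ShipmentID": "shipment_id",
    "VendorID": "vendor_id",
    "RouteID": "route_id",
    "ScheduledDelivery": "scheduled_delivery_ts",
    "ActualDelivery": "actual_delivery_ts",
}
TS_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format


def clean_shipment(raw_row):
    """Rename fields, parse timestamps, and derive the delay in minutes."""
    row = {RENAMES.get(key, key): value for key, value in raw_row.items()}
    scheduled = datetime.strptime(row["scheduled_delivery_ts"], TS_FORMAT)
    actual = datetime.strptime(row["actual_delivery_ts"], TS_FORMAT)
    row["scheduled_delivery_ts"] = scheduled
    row["actual_delivery_ts"] = actual
    # Negative values mean the shipment arrived early.
    row["delay_minutes"] = (actual - scheduled).total_seconds() / 60
    return row


def partition_path(row, root="silver/shipments"):
    """Build a Hive-style year/month directory (an assumed partitioning scheme)."""
    ts = row["scheduled_delivery_ts"]
    return f"{root}/year={ts.year}/month={ts.month}"


raw = {
    "ShipmentID": "S001",
    "VendorID": "V01",
    "RouteID": "R10",
    "ScheduledDelivery": "2024-03-05 10:00:00",
    "ActualDelivery": "2024-03-05 10:45:00",
}
cleaned = clean_shipment(raw)
print(cleaned["delay_minutes"])  # 45.0
print(partition_path(cleaned))   # silver/shipments/year=2024/month=3
```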

🟡 Gold Layer (Aggregated)

Enriched metrics like:
- On-Time %
- Delay by route
- Vendor KPIs
- Monthly trends
Written to gold/ container, ready for Power BI
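
The Gold metrics listed above reduce to a few group-by aggregations. The sketch below shows them in plain Python on made-up rows. The column names (`vendor_id`, `route_id`, `ship_month`, `delay_minutes`) and the on-time rule (delay of zero minutes or less) are assumptions for illustration, not the definitions used in the notebooks.

```python
from collections import defaultdict


def on_time_pct(rows, threshold_minutes=0):
    """Percentage of shipments whose delay does not exceed the threshold."""
    if not rows:
        return 0.0
    on_time = sum(1 for r in rows if r["delay_minutes"] <= threshold_minutes)
    return 100.0 * on_time / len(rows)


def avg_delay_by(rows, key):
    """Average delay in minutes grouped by `key` (e.g. route or month)."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[key]].append(r["delay_minutes"])
    return {k: sum(v) / len(v) for k, v in groups.items()}


def vendor_kpis(rows):
    """Shipment count, average delay, and on-time % per vendor."""
    groups = defaultdict(list)
    for r in rows:
        groups[r["vendor_id"]].append(r)
    return {
        vendor: {
            "shipments": len(group),
            "avg_delay_minutes": sum(r["delay_minutes"] for r in group) / len(group),
            "on_time_pct": on_time_pct(group),
        }
        for vendor, group in groups.items()
    }


# Made-up sample rows, standing in for cleansed Silver data.
rows = [
    {"vendor_id": "V01", "route_id": "R10", "ship_month": "2024-01", "delay_minutes": 0.0},
    {"vendor_id": "V01", "route_id": "R20", "ship_month": "2024-01", "delay_minutes": 30.0},
    {"vendor_id": "V02", "route_id": "R10", "ship_month": "2024-02", "delay_minutes": 10.0},
    {"vendor_id": "V02", "route_id": "R10", "ship_month": "2024-02", "delay_minutes": -5.0},
]

print(on_time_pct(rows))                 # 50.0
print(avg_delay_by(rows, "ship_month"))  # {'2024-01': 15.0, '2024-02': 2.5}
print(vendor_kpis(rows)["V01"])
```

Keeping the on-time threshold as a parameter makes it easy to apply a grace period (for example 15 minutes) without touching the aggregation code.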

🧠 Skills Demonstrated

✅ Metadata-driven ADF pipelines (parameterized + looped)
🧠 Databricks data transformation using PySpark
🗃️ Medallion architecture (Bronze → Silver → Gold)
💾 Delta Lake & partitioned Parquet files
📅 Incremental + batch pipeline logic
📈 Clean and professional Power BI dashboards
🔐 Secure handling of storage & access configuration

📤 How to Reproduce

1. Clone this repo
2. Upload /data/ CSVs to Azure Data Lake Gen2 (raw container)
3. Import and run ADF pipelines from /adf_pipelines/
4. Execute transformation notebooks in /notebooks/ inside Databricks
5. Open Power BI file from /powerbi/ and connect to gold container

📩 Contact

📧 Email: rao.mohsin.54@gmail.com
🌐 LinkedIn
✍️ Medium Profile

⭐ Star This Repo

If you found this project useful, give it a ⭐ on GitHub, and feel free to fork or adapt it!
