A dual-workflow data cleaning project using Python (Pandas) and SQL Server to standardize retail sales data, fix missing values, and validate transactions.
This project implements a data cleaning pipeline twice: once in Python (Pandas) and once in SQL. Each workflow takes raw, messy retail store sales data, identifies inconsistencies, and transforms the data into a clean, structured format suitable for analysis.
├── Data/ # Raw CSV files (retail_store_sales.csv, Product_master.csv)
├── Scripts-python/ # Jupyter Notebooks for Python-based cleaning
│ └── Data-Cleaning.ipynb
├── Scripts-SQL/ # SQL scripts for database-based cleaning
│ ├── Data-import.sql
│ ├── Cleaning-Price-per-unit.sql
│ ├── Item-CLeaning.sql
│ ├── Cleaned-Discount-applied.sql
│ └── Save-Cleaned-Data.sql
└── README.md
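
To give a feel for the kind of cleaning described above (standardizing columns, fixing missing values, validating transactions), here is a minimal Pandas sketch. It is not taken from `Data-Cleaning.ipynb`: the sample rows, the column names (`Price Per Unit`, `Quantity`, `Total Spent`), and the `clean_sales` helper are all assumptions made for illustration, and the real dataset may use a different schema and different rules.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows. Column names are assumed, not read from
# retail_store_sales.csv.
raw = pd.DataFrame({
    "Transaction ID": ["T1", "T2", "T3", "T4"],
    "Item": ["Item_1", "Item_1", "Item_2", "Item_1"],
    "Price Per Unit": [5.0, 5.0, None, 5.0],
    "Quantity": [2, 1, 3, 4],
    "Total Spent": [10.0, 5.0, 24.0, 99.0],
})


def clean_sales(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative cleaning pass (hypothetical helper, not the project's code)."""
    out = df.copy()

    # Standardize column names: "Price Per Unit" -> "price_per_unit".
    out.columns = (
        out.columns.str.strip().str.lower().str.replace(" ", "_", regex=False)
    )

    # Fix missing values: recover a missing unit price from total / quantity.
    # Assumes quantity is non-zero wherever the price is missing.
    missing_price = out["price_per_unit"].isna()
    out.loc[missing_price, "price_per_unit"] = (
        out.loc[missing_price, "total_spent"] / out.loc[missing_price, "quantity"]
    )

    # Validate transactions: flag rows where price * quantity != total.
    out["is_valid"] = np.isclose(
        out["price_per_unit"] * out["quantity"], out["total_spent"]
    )
    return out


cleaned = clean_sales(raw)
print(cleaned)
```

In this sample, the missing price in row `T3` is recovered as 8.0, and row `T4` is flagged as invalid because 5.0 × 4 does not equal 99.0. Flagging invalid rows rather than dropping them is a choice made for this sketch so that suspicious transactions stay visible for review.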
