Data Analysis • Business Intelligence • Tableau • SQL
This was a learning project I built to practice Tableau and SQL.
The analysis includes:
- Delivery timeliness and delay patterns
- Customer satisfaction impact analysis
- Seller performance segmentation
- Geographic distribution of issues
- Trend analysis over time
- Delivery Performance Overview
  - Global delay rates and trends
  - Customer satisfaction correlation
  - Key performance indicators
- Seller Performance Segmentation
  - Seller-tier performance comparison
  - Delay rate distribution
  - Geographic concentration analysis
Final dashboard images and the Tableau workbook are available in `/03_dashboard/`.
| Dataset | Records | Description | File |
|---|---|---|---|
| Full Delivery Transactions | 200,000+ | Order and delivery timestamps | full_delivery_table.csv |
Note: The Excel file is not included in this repository because it exceeds GitHub's 25 MB file size limit.
Source: Kaggle (synthetic e-commerce dataset)
Processing was performed using SQL to standardize all datasets into a clean analytical model.
Key steps included:
- Timestamp standardization and validation
- Delay calculation and categorization
- Customer satisfaction score normalization
- Seller performance tier assignments
- Geographic region mapping
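
The delay calculation and categorization step can be illustrated with a minimal Python sketch. The project itself did this in SQL; the field names, sample dates, and bucket thresholds below are hypothetical and only show the shape of the logic.

```python
from datetime import date

# Hypothetical delay buckets: the thresholds are illustrative assumptions,
# not the ones used in the project's SQL scripts.
def delay_days(estimated: date, delivered: date) -> int:
    """Days past the estimated delivery date (0 if on time or early)."""
    return max((delivered - estimated).days, 0)

def delay_category(days_late: int) -> str:
    """Map a delay in days to a coarse category."""
    if days_late == 0:
        return "on_time"
    if days_late <= 3:
        return "minor_delay"
    if days_late <= 7:
        return "moderate_delay"
    return "severe_delay"

# Made-up sample orders, not rows from the dataset.
orders = [
    {"order_id": "A1", "estimated": date(2024, 1, 10), "delivered": date(2024, 1, 9)},
    {"order_id": "A2", "estimated": date(2024, 1, 10), "delivered": date(2024, 1, 12)},
    {"order_id": "A3", "estimated": date(2024, 1, 10), "delivered": date(2024, 1, 25)},
]
for order in orders:
    order["days_late"] = delay_days(order["estimated"], order["delivered"])
    order["category"] = delay_category(order["days_late"])
```

Early deliveries are clamped to zero so they count as on time rather than producing negative delays.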
Data Validation:
- Removed incomplete records
- Validated timestamp logic consistency
- Ensured rating scale uniformity
- Verified seller ID integrity
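
As a rough sketch of what such validation checks can look like, the example below runs SQL conditions against an in-memory SQLite table from Python. The table name, column names, sample rows, and the 1-5 rating scale are all assumptions made for illustration; they are not taken from the project's scripts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema; dates are stored as ISO 8601 text so that
# string comparison matches chronological order.
conn.execute(
    "CREATE TABLE deliveries ("
    "order_id TEXT, seller_id TEXT, shipped_at TEXT, delivered_at TEXT, rating INTEGER)"
)
conn.executemany(
    "INSERT INTO deliveries VALUES (?, ?, ?, ?, ?)",
    [
        ("A1", "S1", "2024-01-02", "2024-01-05", 5),
        ("A2", "S1", "2024-01-04", "2024-01-03", 4),  # delivered before shipped
        ("A3", None, "2024-01-02", "2024-01-06", 3),  # missing seller ID
        ("A4", "S2", "2024-01-02", None, 2),          # missing delivery date
        ("A5", "S2", "2024-01-02", "2024-01-04", 9),  # rating outside assumed 1-5 scale
    ],
)

# Each check is a WHERE condition that selects the rows violating a rule.
CHECKS = {
    "incomplete_record": "seller_id IS NULL OR shipped_at IS NULL OR delivered_at IS NULL",
    "delivered_before_shipped": "delivered_at < shipped_at",
    "rating_out_of_range": "rating NOT BETWEEN 1 AND 5",
}

def run_checks(connection):
    """Return the order IDs that fail each check."""
    results = {}
    for name, condition in CHECKS.items():
        rows = connection.execute(
            f"SELECT order_id FROM deliveries WHERE {condition} ORDER BY order_id"
        ).fetchall()
        results[name] = [row[0] for row in rows]
    return results

report = run_checks(conn)
```

Listing the failing IDs per rule, instead of silently dropping rows, makes it easier to judge how much of the dataset each problem affects.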
- Global delay rate
- Average delay duration
- Customer satisfaction scores (on-time vs. delayed)
- Seller performance tiers
- Regional delay concentration
- Calculated delivery performance metrics using SQL
- Segmented sellers by performance levels
- Correlated delays with satisfaction scores
- Analyzed geographic and temporal patterns
- KPI performance tiles
- Trend analysis over time
- Seller performance distributions
- Geographic heat maps
- Satisfaction impact charts
✔ Delivery performance analysis
✔ Customer satisfaction correlation
✔ Seller performance segmentation
✔ Operational KPI tracking
✔ Geographic distribution analysis
❌ Forecasting future performance
❌ Customer demographic analysis
❌ Competitor benchmarking
❌ Cost impact analysis
The following insights come from the initial analysis of the raw dataset. After I discovered data quality issues (documented below), they may not be fully reliable. They are presented here as part of the learning process, to demonstrate analytical thinking while acknowledging the data's limitations.
UrbanZen's global delay rate is 9.57%, and delayed orders score 1.71 points lower on customer satisfaction than on-time deliveries.
Implication: Delivery timeliness is a critical driver of customer experience and requires immediate attention.
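
For reference, the three headline metrics (delay rate, average delay duration, and the satisfaction gap between on-time and delayed orders) can be computed along these lines. This is a minimal Python sketch on made-up rows; it does not reproduce the figures above, and the tuple layout is an assumption.

```python
from statistics import mean

# Illustrative rows only: (days_late, rating). Not taken from the project's data.
orders = [(0, 5), (0, 4), (0, 5), (4, 2), (0, 4), (12, 1)]

def delivery_kpis(rows):
    """Compute delay rate, average delay, and the on-time vs. delayed rating gap.

    Assumes at least one on-time and one delayed order; otherwise
    statistics.mean raises StatisticsError.
    """
    delayed = [row for row in rows if row[0] > 0]
    on_time = [row for row in rows if row[0] == 0]
    return {
        "delay_rate_pct": round(100 * len(delayed) / len(rows), 2),
        "avg_delay_days": mean(days for days, _ in delayed),
        "rating_gap": round(
            mean(rating for _, rating in on_time)
            - mean(rating for _, rating in delayed),
            2,
        ),
    }

kpis = delivery_kpis(orders)
```

Note that the average delay is taken over delayed orders only, so on-time orders do not pull it toward zero.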
While top performers maintain >95% on-time rates, the worst-performing sellers show 23-24% delay rates, indicating significant operational inconsistencies.
Implication: Targeted seller management and performance standards are needed to address the performance gap.
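
A seller segmentation of this kind can be sketched as follows. The tier names and cut-offs are hypothetical, loosely inspired by the ">95% on-time" figure above, and the sample rows are invented; the project's actual tier rules may differ.

```python
from collections import defaultdict

# Hypothetical tier cut-offs; the project's real thresholds may differ.
def seller_tier(delay_rate: float) -> str:
    if delay_rate < 0.05:
        return "top"
    if delay_rate < 0.15:
        return "average"
    return "underperforming"

def delay_rate_by_seller(rows):
    """rows: iterable of (seller_id, is_delayed) pairs, one per order."""
    totals = defaultdict(int)
    delays = defaultdict(int)
    for seller_id, is_delayed in rows:
        totals[seller_id] += 1
        delays[seller_id] += int(is_delayed)
    return {seller: delays[seller] / totals[seller] for seller in totals}

# Made-up orders: S1 has 1 delay in 40 orders, S2 has 1 delay in 4.
rows = [("S1", False)] * 39 + [("S1", True)] + [("S2", False)] * 3 + [("S2", True)]
rates = delay_rate_by_seller(rows)
tiers = {seller: seller_tier(rate) for seller, rate in rates.items()}
```

With few orders per seller, as for S2 here, a single late delivery moves the rate a lot, so a minimum order count per seller is worth considering before ranking anyone as "worst".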
The average delay of 10 days exceeds reasonable customer expectations, which likely explains the substantial satisfaction drop.
Implication: Both prevention and communication strategies are needed for delayed orders.
Delivery issues are seller-driven rather than region-specific, pointing to operational rather than logistical challenges.
Implication: Improvement efforts should focus on seller operations rather than geographic optimization.
Initial analysis suggested that addressing the top 10% of underperforming sellers could reduce delays by 40%+. However, this finding is based on a dataset later found to have quality issues (see Data Quality section below). This insight should be validated with cleaner data before any operational decision.
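
One way to re-check a claim like this on cleaner data is to measure what share of all delayed orders comes from the worst 10% of sellers. The sketch below assumes sellers are ranked by their count of delayed orders, which may not match how the original estimate was derived; the seller IDs and counts are invented.

```python
import math

def delay_share_of_worst(delays_by_seller, fraction=0.10):
    """Share of all delayed orders attributable to the worst `fraction` of sellers."""
    counts = sorted(delays_by_seller.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    # Always include at least one seller, even for very small seller lists.
    k = max(1, math.ceil(len(counts) * fraction))
    return sum(counts[:k]) / total

# Made-up delayed-order counts for ten hypothetical sellers.
delays = {
    "S1": 30, "S2": 15, "S3": 12, "S4": 10, "S5": 9,
    "S6": 8, "S7": 6, "S8": 5, "S9": 3, "S10": 2,
}
share = delay_share_of_worst(delays)
```

This share is only an upper bound on the achievable reduction, since it assumes the selected sellers' delays could be eliminated entirely.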
After completing the dashboard and analysis, I identified significant data quality issues that affect the validity of the insights above:
- Seller status unknown: The dataset contained 100 sellers with delay records, but no information on whether these sellers were still active or had already been deactivated by the platform. This makes the "worst seller" ranking potentially invalid.
- Missing timestamp validation: The order status timestamps were never checked for logical consistency (e.g., some records have a delivered date earlier than the shipped date).
- Incomplete geographic data: Some seller regions had incomplete or inconsistent naming, affecting the accuracy of geographic concentration analysis.
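
The geographic naming issue is the kind of problem a small profiling pass can surface before any dashboard work. The sketch below groups raw region labels that differ only by case or whitespace and flags missing values. The region labels are invented for illustration, and the normalization rule is an assumption about what "inconsistent naming" looked like.

```python
from collections import defaultdict

def find_inconsistent_regions(names):
    """Group raw region labels that collapse to the same normalized key."""
    groups = defaultdict(set)
    for raw in names:
        if raw is None or not raw.strip():
            groups["<missing>"].add(raw)
            continue
        # Collapse repeated whitespace and ignore case.
        key = " ".join(raw.split()).casefold()
        groups[key].add(raw)
    # Report keys with more than one spelling, plus any missing values.
    return {
        key: variants
        for key, variants in groups.items()
        if len(variants) > 1 or key == "<missing>"
    }

# Invented labels, not values from the dataset.
raw_regions = ["North Region", "north region", "North  Region ", "South", None, ""]
issues = find_inconsistent_regions(raw_regions)
```

Labels that appear with a single spelling, such as "South" here, are left out of the report so that only the values needing cleanup are shown.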
This project taught me more from its failure than my successful projects did:
- Data validation is Step Zero, not Step One — I should have profiled the data using SQL queries BEFORE building the dashboard
- Always ask critical questions — Is this data complete? What context is missing? What assumptions must I verify?
- A beautiful dashboard with bad data is still a bad dashboard — Visual polish does not compensate for data quality issues
- Kaggle datasets are not always production-ready — They require the same rigor as any other data source
This project is preserved as a learning artifact — to show my ability to reflect on mistakes and extract lessons from failure. The dashboard itself is not recommended for operational decision-making without first addressing the data quality issues documented above.
- SQL transformation scripts
- Tableau Dashboard
All deliverables organized in labeled project folders.
Tableau – Dashboarding and visualization
SQL – Data extraction and transformation
Excel – Initial data validation
Notion – Project documentation
Hi, I’m Syahraini, transitioning into Business Analysis from an accounting background. I specialize in building clear, executive-ready dashboards and turning complex datasets into practical insights.
- LinkedIn: https://linkedin.com/in/nsyahraini
- Portfolio: https://syahrainiaini.framer.website/
- Email: syahraini.nur@outlook.com
Project completed as part of professional portfolio development.