Project Overview
This system provides an end-to-end solution for collecting, processing, storing, and visualizing reporting data. The workflow begins with data collection via Excel forms, processes the data using Python scripts, stores it in a PostgreSQL database, exports required datasets, and finally creates interactive dashboards using Power BI.
System Architecture
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│                 │     │                 │     │                 │     │                 │     │                 │
│   Excel Forms   │────▶│   Python Data   │────▶│   PostgreSQL    │────▶│  Python Export  │────▶│    Power BI     │
│  (Data Input)   │     │   Processing    │     │    Database     │     │     Scripts     │     │   Dashboards    │
│                 │     │     Scripts     │     │    (Storage)    │     │                 │     │                 │
└─────────────────┘     └─────────────────┘     └─────────────────┘     └─────────────────┘     └─────────────────┘

Technologies Used
- Data Collection: Microsoft Excel (.xlsx files)
- Data Processing: Python 3.8+ with pandas, openpyxl, SQLAlchemy
- Database: PostgreSQL 12+
- Data Export: Python scripts with pandas
- Visualization: Microsoft Power BI Desktop
- Version Control: Git/GitHub
Installation & Setup
Prerequisites
- Python 3.8 or higher
- PostgreSQL 12 or higher
- Microsoft Excel (for form creation)
- Power BI Desktop
- Git
1. Clone the repository:
git clone https://github.com/mscbuild/reporting-system.git
cd reporting-system

2. Create and activate a virtual environment:
python -m venv venv
# Windows
venv\Scripts\activate
# Linux/MacOS
source venv/bin/activate

3. Install Python dependencies:
pip install -r requirements.txt

4. Set up the PostgreSQL database:
CREATE DATABASE reporting_system;
CREATE USER reporting_user WITH PASSWORD 'secure_password';
GRANT ALL PRIVILEGES ON DATABASE reporting_system TO reporting_user;

5. Configure environment variables:
Create a .env file in the project root:
DB_HOST=localhost
DB_PORT=5432
DB_NAME=reporting_system
DB_USER=reporting_user
DB_PASSWORD=secure_password
EXCEL_INPUT_DIR=./data/input
EXPORT_OUTPUT_DIR=./data/export
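The database configuration module reads these variables. Below is a minimal sketch of what src/config/database.py could look like; it assumes the python-dotenv package (not listed among the technologies above) to load the .env file and a PostgreSQL driver such as psycopg2 behind SQLAlchemy:

# Illustrative sketch of src/config/database.py -- the actual module may differ.
import os

from dotenv import load_dotenv          # assumes python-dotenv is installed
from sqlalchemy import create_engine

load_dotenv()  # read DB_* and directory settings from the project's .env file

def get_engine():
    """Build a SQLAlchemy engine from the DB_* environment variables."""
    url = (
        f"postgresql://{os.environ['DB_USER']}:{os.environ['DB_PASSWORD']}"
        f"@{os.environ['DB_HOST']}:{os.environ['DB_PORT']}/{os.environ['DB_NAME']}"
    )
    return create_engine(url)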
Project Structure
reporting-system/
├── docs/                        # Documentation files
│   ├── architecture.md
│   └── user_guide.md
├── data/
│   ├── input/                   # Excel input files
│   └── export/                  # Exported CSV/Excel files for Power BI
├── src/
│   ├── config/
│   │   └── database.py          # Database configuration
│   ├── ingestion/
│   │   └── excel_processor.py   # Excel processing scripts
│   ├── storage/
│   │   └── db_loader.py         # Database loading scripts
│   ├── export/
│   │   └── data_exporter.py     # Data export scripts
│   └── utils/
│       └── helpers.py           # Utility functions
├── tests/                       # Unit tests
├── requirements.txt             # Python dependencies
├── .env.example                 # Environment variables template
├── .gitignore
├── README.md
└── powerbi/                     # Power BI files and documentation
    ├── dashboards/
    └── documentation/

Usage
1. Data Collection (Excel Forms)
- Use the provided Excel template (templates/reporting_template.xlsx) to collect data
- Save completed forms in the data/input/ directory
- Ensure all required fields are filled according to the data dictionary
2. Data Processing and Loading
Run the data processing pipeline:
# Process Excel files and load to database
python src/ingestion/excel_processor.py
python src/storage/db_loader.py

Or run the complete pipeline:
python run_pipeline.py

3. Data Export
Export required datasets for Power BI:
# Export all reports
python src/export/data_exporter.py --all
# Export specific report for date range
python src/export/data_exporter.py --report monthly_summary --start-date 2025-01-01 --end-date 2025-09-30

4. Power BI Dashboard Creation
1. Open Power BI Desktop
2. Import exported CSV/Excel files from data/export/
3. Create relationships between tables as needed
4. Build visualizations and dashboards
5. Save the .pbix file in powerbi/dashboards/
Excel Processing (src/ingestion/excel_processor.py)
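This README does not reproduce the module's code. As a rough, assumed sketch only, the processor might read every workbook in EXCEL_INPUT_DIR into a single pandas DataFrame:

# Illustrative sketch only -- the real excel_processor.py may differ.
import os
from pathlib import Path

import pandas as pd

def load_input_workbooks(input_dir=None):
    """Read every .xlsx form in the input directory into one DataFrame."""
    input_dir = Path(input_dir or os.environ.get("EXCEL_INPUT_DIR", "./data/input"))
    frames = []
    for path in sorted(input_dir.glob("*.xlsx")):
        df = pd.read_excel(path, engine="openpyxl")  # openpyxl handles .xlsx files
        df["source_file"] = path.name                # keep provenance for troubleshooting
        frames.append(df)
    return pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()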
Database Loading (src/storage/db_loader.py)
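Again as an assumed sketch only, reusing the hypothetical get_engine() and load_input_workbooks() helpers from the examples above and a reports table whose name is also an assumption, loading could rely on pandas' to_sql:

# Illustrative sketch only -- table and column names are assumptions.
from src.config.database import get_engine
from src.ingestion.excel_processor import load_input_workbooks

def load_to_database():
    """Append all processed rows to a staging table in PostgreSQL."""
    df = load_input_workbooks()
    engine = get_engine()
    df.to_sql("reports", engine, if_exists="append", index=False)

if __name__ == "__main__":
    load_to_database()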
Data Export (src/export/data_exporter.py)
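The command-line flags shown in the usage steps (--all, --report, --start-date, --end-date) suggest an argparse interface along these lines; the query and column names below are placeholders, not the project's actual logic:

# Illustrative sketch only.
import argparse
import os

import pandas as pd

from src.config.database import get_engine  # hypothetical helper from the configuration sketch

def main():
    parser = argparse.ArgumentParser(description="Export datasets for Power BI")
    parser.add_argument("--all", action="store_true", help="export every report")
    parser.add_argument("--report", help="name of a single report to export")
    parser.add_argument("--start-date")
    parser.add_argument("--end-date")
    args = parser.parse_args()

    out_dir = os.environ.get("EXPORT_OUTPUT_DIR", "./data/export")
    # Placeholder query -- the real exporter presumably selects report-specific columns.
    df = pd.read_sql("SELECT * FROM reports", get_engine())
    if args.start_date and args.end_date:
        # Assumes a report_date column, as in the schema illustration below.
        dates = pd.to_datetime(df["report_date"])
        df = df[(dates >= pd.to_datetime(args.start_date)) & (dates <= pd.to_datetime(args.end_date))]
    name = args.report if args.report else "all_reports"
    df.to_csv(os.path.join(out_dir, f"{name}.csv"), index=False)

if __name__ == "__main__":
    main()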
Database Schema
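The schema itself is not included in this README. Purely as a hypothetical illustration of the kind of table the pipeline might populate, expressed with SQLAlchemy (already a project dependency):

# Hypothetical example only -- the project's real schema is not shown here.
from sqlalchemy import Column, Date, Integer, Numeric, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Report(Base):
    __tablename__ = "reports"          # table name assumed in the sketches above

    id = Column(Integer, primary_key=True)
    report_date = Column(Date, nullable=False)
    department = Column(String(100))
    metric_name = Column(String(100))
    metric_value = Column(Numeric)
    source_file = Column(String(255))  # which Excel form the row came from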
Power BI Dashboard Guidelines
1. Data Refresh: Configure Power BI to automatically refresh from the export directory
2. Parameters: Create date parameters for dynamic period selection
3. Measures: Implement DAX measures for KPIs and calculations
4. Visualizations:
   - Time series charts for trend analysis
   - Bar charts for departmental comparisons
   - Tables for detailed data review
Run unit tests:
python -m pytest tests/

For production deployment:
1. Set up scheduled tasks (cron jobs or Windows Task Scheduler) to run the pipeline daily
2. Configure database backups
3. Implement error logging and monitoring (see the sketch after this list)
4. Set up Power BI service for cloud-based dashboard sharing
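For item 3, a minimal and purely illustrative starting point for pipeline logging could look like this (file name, format, and helper names are assumptions):

# Illustrative sketch only -- adapt the log destination and handlers to your environment.
import logging

def configure_logging(log_file="pipeline.log"):
    """Send pipeline messages to a log file."""
    logging.basicConfig(
        filename=log_file,
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )

def run_with_logging(step_name, func):
    """Run one pipeline step and record its success or failure."""
    logging.info("Starting step: %s", step_name)
    try:
        func()
        logging.info("Finished step: %s", step_name)
    except Exception:
        logging.exception("Step failed: %s", step_name)
        raise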
Contributing
1. Fork the repository
2. Create a feature branch
3. Commit your changes
4. Push to the branch
5. Create a Pull Request
This project is licensed under the MIT License.
Note
Remember to copy .env.example to .env and update it with your actual configuration, and never commit the .env file or other sensitive information to version control.