Lead Data Engineer | AI Data Platform Modernization | LLM-Assisted ETL & BI Migration | Azure | Power BI | IDMC | OAS
I'm a data engineering and data platform modernization professional with 13+ years of experience building scalable cloud data platforms, lakehouse architectures, ETL/ELT pipelines, BI modernization programs, and analytics systems across Azure, Databricks, Snowflake, Oracle, Power BI, and AWS.
My recent focus is on modernizing legacy enterprise data ecosystems (including Informatica, IDMC, OBIEE, OAS, SSIS, Azure Data Factory, and Power BI) using practical LLM-assisted workflows for migration analysis, SQL refactoring, mapping documentation, test case generation, validation, and reporting modernization.
I enjoy building systems that turn complex legacy data environments into scalable, governed, AI-ready platforms that reduce manual effort, improve trust, and accelerate business decision-making.
- AI Data Platform Modernization: LLM-assisted ETL migration, legacy-to-cloud modernization, AI-assisted validation, migration documentation
- Cloud Data Platforms: Azure Data Factory, Azure Databricks, Azure Synapse, Snowflake, AWS, IDMC
- Data Engineering: Python, SQL, PySpark, Spark SQL, ETL/ELT, batch pipelines, orchestration
- BI & Analytics Modernization: Power BI, Oracle Analytics Server, OBIEE, BI Publisher, semantic models
- Lakehouse & Warehouse Architecture: Delta Lake, medallion architecture, dimensional modeling, curated data products
- Governance & Reliability: data quality, source-to-target validation, monitoring, alerting, lineage, compliance-ready delivery
Production-style SaaS platform that converts legacy ETL pipeline definitions into cloud-native workflow artifacts.
Built to solve a real enterprise modernization challenge: understanding and converting legacy ETL logic buried inside Informatica-style XML before migrating to platforms like Azure Data Factory, Airflow, Databricks Workflows, Dagster, Prefect, and AWS Glue.
Key capabilities:
- Parses legacy ETL XML and metadata from Informatica, SSIS, Talend, DataStage, and Ab Initio
- Converts parsed metadata into a platform-neutral canonical JSON model
- Generates first-draft target artifacts for ADF, Airflow, Databricks, Dagster, Prefect, and AWS Glue
- Produces validation reports, gap analysis, unsupported logic detection, and remediation suggestions
- Tracks conversion runs, pipeline steps, errors, duration, and project history
- Uses LLM-assisted workflows for documentation, artifact generation, and migration analysis
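
As a rough illustration of the parse-then-normalize idea above, here is a minimal sketch that reads a simplified, hypothetical Informatica-style XML fragment and emits a platform-neutral JSON model with an unsupported-logic list. The sample XML, the `SUPPORTED_TYPES` set, the `to_canonical` function, and the JSON field names are all assumptions made for this example, not the platform's actual schema or code.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified mapping export; real exports carry far more detail.
SAMPLE_XML = """
<MAPPING NAME="m_load_customers">
  <TRANSFORMATION NAME="SQ_customers" TYPE="Source Qualifier"/>
  <TRANSFORMATION NAME="EXP_clean" TYPE="Expression"/>
  <TRANSFORMATION NAME="JTX_custom" TYPE="Java"/>
  <CONNECTOR FROMINSTANCE="SQ_customers" TOINSTANCE="EXP_clean"/>
  <CONNECTOR FROMINSTANCE="EXP_clean" TOINSTANCE="JTX_custom"/>
</MAPPING>
"""

# Assumed set of transformation types a target generator knows how to convert.
SUPPORTED_TYPES = {"Source Qualifier", "Expression", "Filter", "Joiner"}


def to_canonical(xml_text):
    """Convert one mapping element into a platform-neutral dict."""
    root = ET.fromstring(xml_text.strip())
    steps = [
        {"name": t.get("NAME"), "type": t.get("TYPE")}
        for t in root.iter("TRANSFORMATION")
    ]
    edges = [
        {"from": c.get("FROMINSTANCE"), "to": c.get("TOINSTANCE")}
        for c in root.iter("CONNECTOR")
    ]
    # Anything outside the supported set is flagged for manual review.
    unsupported = [s["name"] for s in steps if s["type"] not in SUPPORTED_TYPES]
    return {
        "pipeline": root.get("NAME"),
        "steps": steps,
        "edges": edges,
        "unsupported": unsupported,
    }


canonical = to_canonical(SAMPLE_XML)
print(json.dumps(canonical, indent=2))
```

Keeping the canonical model free of target-specific fields is what would let one parse feed several generators (ADF, Airflow, and so on); the `unsupported` list is a stand-in for the gap analysis step.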
Modeled business impact:
- Designed for enterprise-scale programs with 500+ workflows, 1,000+ mappings, and 2,000+ transformations
- Modeled 40-60% reduction in manual migration analysis and rewrite planning
- Modeled $450K+ cost avoidance through reduced consulting effort, faster assessment, and less rework
- Estimated AI token cost of under $1 to about $3 per workflow conversion, making the approach practical at scale
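
To put the per-workflow token estimate in context, the arithmetic can be sketched as below. It simply multiplies the modeled figures quoted above (a 500-workflow program, $1 to $3 per conversion); the function name `conversion_cost_range` is made up for this example, and the inputs are modeled estimates rather than measured costs.

```python
def conversion_cost_range(workflows, low_per_workflow, high_per_workflow):
    """Return the (low, high) total token cost for a batch of conversions."""
    return workflows * low_per_workflow, workflows * high_per_workflow


# 500 workflows at the modeled $1-$3 per conversion.
low, high = conversion_cost_range(500, 1.0, 3.0)
print(f"${low:,.0f} to ${high:,.0f}")  # prints "$500 to $1,500"
```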
Tech Stack: Next.js, TypeScript, PostgreSQL, Prisma, Supabase, Stripe, OpenAI, Tailwind CSS
End-to-end AI SaaS platform that generates ATS-optimized resumes and cover letters.
Built with: Next.js, Supabase, Prisma, Stripe, and LLM workflows
Focus areas:
- ATS keyword matching
- Resume and cover letter generation
- AI-assisted content optimization
- SaaS product architecture
- Authentication, billing, and user workflow design
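
A minimal sketch of what ATS keyword matching can look like, assuming a plain token-overlap approach: it reports the share of target keywords found in the resume text and which ones are missing. The `tokenize` and `keyword_coverage` helpers and the sample text are hypothetical; the actual product may use a different method (for example, LLM-based matching).

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return set(re.findall(r"[a-z][a-z0-9+#]*", text.lower()))


def keyword_coverage(resume_text, keywords):
    """Return (coverage score between 0 and 1, list of missing keywords)."""
    tokens = tokenize(resume_text)
    missing = [k for k in keywords if k.lower() not in tokens]
    score = 1 - len(missing) / len(keywords) if keywords else 0.0
    return score, missing


# Illustrative inputs only.
resume = "Built ETL pipelines in Python and SQL on Azure."
score, missing = keyword_coverage(resume, ["python", "sql", "airflow", "azure"])
print(score, missing)  # prints: 0.75 ['airflow']
```

This only handles single-word keywords; multi-word phrases such as "data quality" would need phrase matching on the raw text instead.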
Production-style ETL pipeline using Airflow, Python, PostgreSQL, and Docker.
Features:
- API ingestion from OpenWeather
- Airflow TaskFlow API orchestration
- Dynamic task mapping
- Idempotent upserts
- Layered raw, staging, and curated tables
- Dockerized local development environment
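
The idempotent upsert pattern listed above can be sketched as follows. To keep the example self-contained it uses Python's built-in `sqlite3` as a stand-in for PostgreSQL; the `INSERT ... ON CONFLICT ... DO UPDATE` clause shown here is accepted by both engines. The table name, columns, and sample readings are invented for illustration and are not taken from the project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE weather_staging (
        city        TEXT NOT NULL,
        observed_at TEXT NOT NULL,
        temp_c      REAL,
        PRIMARY KEY (city, observed_at)
    )
    """
)

# The natural key (city, observed_at) decides whether a row is inserted or updated.
UPSERT_SQL = """
INSERT INTO weather_staging (city, observed_at, temp_c)
VALUES (?, ?, ?)
ON CONFLICT (city, observed_at) DO UPDATE SET temp_c = excluded.temp_c
"""


def load(rows):
    """Upsert a batch inside one transaction so reruns are safe."""
    with conn:
        conn.executemany(UPSERT_SQL, rows)


batch = [
    ("Austin", "2024-01-01T00:00Z", 11.5),
    ("Boston", "2024-01-01T00:00Z", -2.0),
]
load(batch)
load(batch)  # replaying the same batch adds no duplicates
load([("Austin", "2024-01-01T00:00Z", 12.0)])  # a corrected reading overwrites

row_count = conn.execute("SELECT COUNT(*) FROM weather_staging").fetchone()[0]
austin_temp = conn.execute(
    "SELECT temp_c FROM weather_staging WHERE city = 'Austin'"
).fetchone()[0]
print(row_count, austin_temp)  # prints: 2 12.0
```

Because the load is keyed on the natural key, an Airflow task retry or backfill can rerun the same batch without extra cleanup.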
Focus: Data ingestion, orchestration, transformation, and analytics-ready modeling
Real-time analytics pipeline using Kafka, Spark, Airflow, PostgreSQL, and Superset.
Focus areas:
- Streaming ingestion
- Distributed processing
- Near-real-time analytics
- End-to-end data pipeline design
- Operational dashboards
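
One common building block for near-real-time dashboards is a tumbling-window aggregation. The sketch below shows the idea in plain Python over an in-memory list of events; it is an assumption about the kind of aggregation involved, not code from the pipeline, where Spark would do this work over a Kafka stream. The `tumbling_counts` function and the sample events are hypothetical.

```python
from collections import Counter


def tumbling_counts(events, window_seconds=60):
    """Count events per (window start, key) using fixed, non-overlapping windows."""
    counts = Counter()
    for timestamp, key in events:
        window_start = timestamp - (timestamp % window_seconds)
        counts[(window_start, key)] += 1
    return dict(counts)


# (epoch seconds, event type) pairs, illustrative only.
events = [(0, "click"), (30, "click"), (59, "view"), (60, "click"), (125, "view")]
window_totals = tumbling_counts(events)
print(window_totals)
```

With 60-second windows, the first three events land in the window starting at 0, the fourth in the window starting at 60, and the last in the window starting at 120.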
I like building data systems that:
- Modernize legacy platforms without losing business logic
- Reduce manual migration and reporting effort
- Improve data trust through validation and reconciliation
- Scale reliably across batch, streaming, and BI workloads
- Support business decisions with clean, governed, analytics-ready data
- Apply AI and LLMs where they create real engineering leverage
- Reduced ETL pipeline maintenance and support overhead by 25% through SSIS-to-ADF modernization
- Improved platform uptime to 99.9% through cloud data platform modernization
- Reduced infrastructure spend by 43% through Azure platform optimization
- Improved time-to-insight by 60% through curated lakehouse and warehouse data products
- Supported 40+ Power BI reports across 10+ business functions
- Modeled $450K+ cost avoidance with AI-assisted ETL modernization patterns
Languages: Python, SQL, PySpark, TypeScript
Cloud/Data: Azure Data Factory, Azure Databricks, Azure Synapse, Snowflake, AWS, IDMC
BI/Analytics: Power BI, Oracle Analytics Server, OBIEE, BI Publisher, Tableau
AI/LLM: OpenAI, LLM-assisted migration workflows, AI-assisted validation, Microsoft Copilot
Engineering: Airflow, Kafka, Spark, PostgreSQL, Prisma, Supabase, Docker, GitHub Actions, Terraform
Architecture: Lakehouse, Data Warehouse, Semantic Models, ETL/ELT, Data Quality, Governance