Data Engineer · ETL Pipelines · Distributed Systems · AI Orchestration
🎓 B.Tech in Artificial Intelligence & Data Science — Panimalar Engineering College
🚀 Aspiring Data Engineer with hands-on experience building real-time data pipelines and cloud-based data platforms
💡 Passionate about solving real-world big data problems using scalable, distributed architectures
📊 Strong foundation in ETL/ELT pipelines, distributed data processing, and data modeling
🤖 Exploring AI Orchestration and LLM-powered data workflows using LangChain & MCP
- LangChain: Agents, Chains, RAG (Retrieval-Augmented Generation)
- LangFlow: Low-code LLM workflow orchestration
- MCP: Context-driven AI system integration
Stack: Azure IoT Hub · PySpark Structured Streaming · PostgreSQL · Grafana
- Ingested real-time sensor data (vibration, temperature, strain) via Azure IoT Hub
- Processed streams with PySpark Structured Streaming for structural damage detection
- Persisted results to PostgreSQL and built live monitoring dashboards in Grafana
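
A minimal sketch of the windowed flagging idea behind the stream-processing step, written in plain Python rather than PySpark so it runs anywhere. The detection rule the project actually uses is not stated here; the tumbling 10-second window, the `VIBRATION_LIMIT` threshold, and the `(sensor_id, timestamp, vibration)` record shape are all assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical values: the project's real window size and threshold are not stated.
WINDOW_SECONDS = 10
VIBRATION_LIMIT = 5.0


def flag_windows(readings, window=WINDOW_SECONDS, limit=VIBRATION_LIMIT):
    """Group (sensor_id, timestamp, vibration) readings into tumbling windows
    and return the (sensor_id, window_start) keys whose mean vibration
    exceeds the limit."""
    buckets = defaultdict(list)
    for sensor_id, ts, vibration in readings:
        # Align each timestamp to the start of its tumbling window.
        window_start = int(ts // window) * window
        buckets[(sensor_id, window_start)].append(vibration)
    return sorted(key for key, values in buckets.items() if mean(values) > limit)


readings = [
    ("s1", 1, 2.0), ("s1", 4, 3.0),    # window starting at 0, mean 2.5
    ("s1", 12, 6.0), ("s1", 15, 7.0),  # window starting at 10, mean 6.5
    ("s2", 3, 1.0),                    # window starting at 0, mean 1.0
]
flagged = flag_windows(readings)
```

In a streaming engine the same grouping would typically be expressed as a windowed aggregation over event time, with late data handled by a watermark.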
Stack: Delta Lake · Apache Airflow · SCD Type 2 · Medallion Architecture
- Implemented Bronze–Silver–Gold medallion architecture for layered data quality
- Built SCD Type 2 historical tracking for slowly changing property dimensions
- Automated end-to-end orchestration with Apache Airflow over ACID-compliant Delta Lake tables
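
The SCD Type 2 idea above can be sketched in plain Python: when a tracked attribute changes, the current row is closed out and a new version is appended, so history is preserved. This is only an illustration under assumptions. The column names (`property_id`, `owner`, `valid_from`, `valid_to`, `is_current`) and the `apply_scd2` helper are hypothetical, and a Delta Lake pipeline would more likely express this as a merge against the table.

```python
from datetime import date


def apply_scd2(dimension, incoming, key, tracked, as_of):
    """Return a new dimension list after applying incoming rows with
    SCD Type 2 semantics. The input list is left unmodified."""
    result = [dict(row) for row in dimension]
    current = {row[key]: row for row in result if row["is_current"]}
    for new in incoming:
        old = current.get(new[key])
        if old is not None and all(old[c] == new[c] for c in tracked):
            continue  # No tracked attribute changed; keep the current version.
        if old is not None:
            # Close out the previous version instead of overwriting it.
            old["valid_to"] = as_of
            old["is_current"] = False
        version = {
            key: new[key],
            **{c: new[c] for c in tracked},
            "valid_from": as_of,
            "valid_to": None,
            "is_current": True,
        }
        result.append(version)
        current[new[key]] = version
    return result


# Hypothetical property dimension with one existing row.
dim = [{"property_id": 1, "owner": "A", "valid_from": date(2023, 1, 1),
        "valid_to": None, "is_current": True}]
updates = [{"property_id": 1, "owner": "B"}, {"property_id": 2, "owner": "C"}]
history = apply_scd2(dim, updates, key="property_id", tracked=["owner"],
                     as_of=date(2024, 1, 1))
```

After the call, `history` holds the closed-out version of property 1, its new current version, and a first version for property 2.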
Stack: Apache Kafka · Cassandra · Tableau · Schema Design
- Simulated e-commerce user activity through a real-time Kafka streaming pipeline
- Designed an analytics-ready schema optimized for high-throughput reads in Cassandra
- Delivered behavioral insights and funnel metrics via Tableau dashboards
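
As a rough illustration of the funnel metrics mentioned above, the sketch below counts how many users reached each step of a funnel from a list of `(user_id, event_type)` events. The step names in `FUNNEL_STEPS` and the event shape are assumptions, not the project's actual schema, and for simplicity the sketch ignores event ordering within a user.

```python
from collections import defaultdict

# Hypothetical funnel; the project's real event types are not stated.
FUNNEL_STEPS = ["view", "add_to_cart", "purchase"]


def funnel_counts(events, steps=FUNNEL_STEPS):
    """Count users who reached each step, requiring every earlier step too."""
    seen = defaultdict(set)
    for user_id, event_type in events:
        seen[user_id].add(event_type)
    counts = []
    for i, step in enumerate(steps):
        required = set(steps[: i + 1])
        # A user counts for this step only if all required events were seen.
        counts.append(sum(1 for types in seen.values() if required <= types))
    return dict(zip(steps, counts))


events = [
    ("u1", "view"), ("u1", "add_to_cart"), ("u1", "purchase"),
    ("u2", "view"), ("u2", "add_to_cart"),
    ("u3", "view"),
    ("u4", "purchase"),  # skipped earlier steps, so not counted at "purchase"
]
result = funnel_counts(events)
```

Dividing each count by the previous one gives step-to-step conversion rates.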
| Email | kishorsenthilkumar104@gmail.com |
| LinkedIn | linkedin.com/in/kishor-s-5b4332274 |
| 💻 GitHub | github.com/Kishorsenthilkumar |
Building scalable data systems · Exploring AI orchestration · Solving real-world problems with data


