kishorsenthilkumar/README.md

Data Engineer  ·  ETL Pipelines  ·  Distributed Systems  ·  AI Orchestration

     


👨‍💻 About Me

🎓  B.Tech in Artificial Intelligence & Data Science — Panimalar Engineering College

🚀  Aspiring Data Engineer with hands-on experience building real-time data pipelines and cloud-based data platforms

💡  Passionate about solving real-world big data problems using scalable, distributed architectures

📊  Strong foundation in ETL/ELT pipelines, distributed data processing, and data modeling

🤖  Exploring AI Orchestration and LLM-powered data workflows using LangChain & MCP


🛠️ Tech Stack

Languages

Python Java SQL

Data Engineering

Apache Kafka Apache Spark Apache Hadoop Apache Hive Apache Airflow Docker Delta Lake

Databases

MySQL PostgreSQL MongoDB Cassandra

Cloud & Platforms

AWS Azure Snowflake Git

AI Orchestration & LLM Tools

LangChain LangFlow MCP

  • LangChain: Agents, Chains, RAG (Retrieval-Augmented Generation)
  • LangFlow: Low-code LLM workflow orchestration
  • MCP: Context-driven AI system integration


🚀 Featured Projects

🔹 Real-Time IoT Damage Monitoring Pipeline

Stack: Azure IoT Hub · PySpark Structured Streaming · PostgreSQL · Grafana

  • Ingested real-time sensor data (vibration, temperature, strain) via Azure IoT Hub
  • Processed streams with PySpark Structured Streaming for structural damage detection
  • Persisted results to PostgreSQL and built live monitoring dashboards in Grafana
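
As a rough illustration of the detection step, the sketch below flags a sensor when the rolling mean of its vibration readings crosses a limit. It is plain Python rather than PySpark Structured Streaming, and the `Reading` fields, the window size, and the `vibration_limit` value are all assumptions made for the example, not values taken from the project.

```python
from collections import deque
from dataclasses import dataclass
from statistics import mean


@dataclass
class Reading:
    """One sensor event; field names and units are hypothetical."""
    sensor_id: str
    vibration: float
    temperature: float
    strain: float


def detect_damage(readings, window=3, vibration_limit=5.0):
    """Return (sensor_id, rolling_mean) alerts for sensors whose mean
    vibration over the last `window` readings exceeds `vibration_limit`."""
    windows = {}  # sensor_id -> most recent vibration values
    alerts = []
    for r in readings:
        buf = windows.setdefault(r.sensor_id, deque(maxlen=window))
        buf.append(r.vibration)
        # Only evaluate once the window is full, to avoid noisy early alerts.
        if len(buf) == window and mean(buf) > vibration_limit:
            alerts.append((r.sensor_id, round(mean(buf), 2)))
    return alerts
```

In a streaming engine the same idea would typically be expressed as a windowed aggregation per sensor, with the alerts written to the results store.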

🔹 Real Estate Lakehouse Pipeline

Stack: Delta Lake · Apache Airflow · SCD Type 2 · Medallion Architecture

  • Implemented Bronze–Silver–Gold medallion architecture for layered data quality
  • Built SCD Type 2 historical tracking for slowly-changing property dimensions
  • Automated end-to-end orchestration with Apache Airflow on Delta Lake (ACID-compliant)
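
The SCD Type 2 idea can be sketched in a few lines of plain Python: when a tracked attribute changes, the current row is closed out and a new current row is appended, so history is never overwritten. The `PropertyRow` fields and the `apply_scd2` helper are hypothetical names for illustration; on Delta Lake this would more likely be done with a merge operation.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class PropertyRow:
    """One version of a property record; column names are hypothetical."""
    property_id: str
    price: int
    valid_from: date
    valid_to: Optional[date] = None
    is_current: bool = True


def apply_scd2(history, property_id, new_price, as_of):
    """Return a new history list with an SCD Type 2 update applied."""
    current = next(
        (r for r in history if r.property_id == property_id and r.is_current),
        None,
    )
    # No change in the tracked attribute: keep history as it is.
    if current is not None and current.price == new_price:
        return list(history)
    out = []
    for r in history:
        if r is current:
            # Close the old version instead of overwriting it.
            out.append(replace(r, valid_to=as_of, is_current=False))
        else:
            out.append(r)
    # Append the new current version.
    out.append(PropertyRow(property_id, new_price, as_of))
    return out
```

Each property then has exactly one row with `is_current=True`, and earlier prices remain queryable through their `valid_from` and `valid_to` range.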

🔹 Real-Time Clickstream Analytics Pipeline

Stack: Apache Kafka · Cassandra · Tableau · Schema Design

  • Simulated e-commerce user activity through a real-time Kafka streaming pipeline
  • Designed an analytics-ready schema optimized for high-throughput reads in Cassandra
  • Delivered behavioral insights and funnel metrics via Tableau dashboards
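
To make the funnel metrics concrete, here is a minimal sketch that counts how many users reach each stage, in order, from a time-ordered list of events. The stage names in `FUNNEL` and the `(user_id, event_type)` event shape are assumptions for the example; the project's actual event schema is not shown here.

```python
from collections import defaultdict

# Hypothetical funnel stages, in the order a user must complete them.
FUNNEL = ["view", "add_to_cart", "purchase"]


def funnel_counts(events):
    """Count users reaching each funnel stage.

    `events` is an iterable of (user_id, event_type) pairs in time order.
    A stage only counts if the user completed every earlier stage first.
    """
    reached = defaultdict(int)  # user_id -> number of stages completed
    for user_id, event_type in events:
        idx = reached[user_id]
        if idx < len(FUNNEL) and event_type == FUNNEL[idx]:
            reached[user_id] = idx + 1
    return {
        stage: sum(1 for n in reached.values() if n > i)
        for i, stage in enumerate(FUNNEL)
    }
```

Dividing each stage's count by the previous one gives the step-to-step conversion rate that a dashboard would usually plot.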


📫 Let's Connect

📧 Email: kishorsenthilkumar104@gmail.com
💼 LinkedIn: linkedin.com/in/kishor-s-5b4332274
💻 GitHub: github.com/Kishorsenthilkumar

Building scalable data systems · Exploring AI orchestration · Solving real-world problems with data

Pinned

  1. DE-IOT-heavymachinary-monitoring (Public)

    Mirrors the architecture used by Hitachi Lumada, Bosch Connected Industry, and Siemens MindSphere for real-time heavy machinery damage detection.

    Python

  2. real-estate-lakehouse-pipeline (Public)

    End-to-end data engineering pipeline that collects real estate listings, stores raw data in Azure Data Lake, processes it with Spark and Delta Lake, and detects property price changes using histori…

    Jupyter Notebook 1

  3. SalesPulse (Public)

    Forked from dataeng-begineers/SalesPulse

    Supermarket sales and stock data management project for ETL and real-time analytics.

    Python

  4. Docker-workshop (Public)

    Jupyter Notebook

  5. Building-Redis (Public)

    Building a custom cache, streaming engine, and database.

    Python