I build production-grade ML and GenAI systems across AWS and Azure, focused on shipping reliable models that solve real business problems.
My work spans LLM-based systems, classical ML, data pipelines, and MLOps, with experience delivering across energy, telecommunications, and technology sectors.
Target roles: Data Scientist, ML Engineer, AI Engineer
Location: Raleigh, NC | Open to Remote and Hybrid
Status: Actively seeking new opportunities
- Master's in Data Analytics, Northeastern University
- 3.5+ years building production ML systems across consulting and product teams
- Deployed 9 production ML pipelines on AWS SageMaker
- Azure Data Scientist Associate
- HashiCorp Certified: Terraform Associate
- IBM Data Science Professional Certificate
- Google Data Analytics Professional Certificate
These repositories reflect how I design and ship real-world AI systems, not notebook demos.
A self-correcting Text-to-SQL agent that converts natural language business questions into executable SQL, runs them against a real 9-table relational database, and automatically retries with error context when queries fail.
Live demo: https://huggingface.co/spaces/pranshu2230/sql-copilot
- Qwen2.5-Coder-7B via HF Inference API for SQL generation
- DuckDB in-process query engine with zero infrastructure overhead
- SQLGlot AST transformation for deterministic alias stripping and column fixing
- Self-correction loop: reads its own SQL errors and rewrites the query up to 3 times
- Data-agnostic: drop any CSV files in and the app queries them with no code changes
- Deployed via GitHub Actions to Hugging Face Spaces
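The retry loop described above can be sketched roughly like this, with sqlite3 standing in for DuckDB to keep the sketch dependency-free (the function names and prompt handling are illustrative, not the project's actual code):

```python
import sqlite3

MAX_RETRIES = 3

def run_with_self_correction(con, question, generate_sql):
    """Execute LLM-generated SQL, retrying with error context on failure.

    generate_sql(question, error_context) is a placeholder for the model
    call; on retries it receives the prior query and engine error.
    """
    error_context = None
    for attempt in range(1, MAX_RETRIES + 1):
        sql = generate_sql(question, error_context)
        try:
            return con.execute(sql).fetchall()
        except sqlite3.Error as exc:
            # Feed the engine's own error message back into the next prompt,
            # so the model sees exactly why its last attempt failed.
            error_context = f"Attempt {attempt} failed: {sql}\nError: {exc}"
    raise RuntimeError(f"No valid SQL after {MAX_RETRIES} attempts")
```

Passing the raw database error back to the model is what makes the loop self-correcting: the retry prompt contains the failing query and the exact failure reason rather than a generic "try again".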
Why this matters: demonstrates how to build reliable LLM-powered systems that handle failure explicitly rather than relying on happy-path-only generation.
View repository: https://github.com/pranshu1921/sql-copilot
Production RAG system for customer support over 150K+ product, review, and policy documents.
- LangChain-based retrieval with vector search and source-grounded answers
- Sub-10ms cached responses, cold queries under 1.5 seconds
- 85%+ retrieval accuracy on domain-specific test queries, 60-80% cache hit rate
- Evaluation harness with 15+ domain-specific test queries
- Dockerized FastAPI backend with Streamlit UI
- API cost of approximately $0.50 per 1,000 queries
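The caching behavior that drives the sub-10ms responses and 60-80% hit rate can be sketched as an exact-match cache in front of the slower retrieval path (class name, normalization, and hashing scheme are assumptions for illustration; the real system's cache policy may differ):

```python
import hashlib

class CachedRetriever:
    """Exact-match query cache wrapped around a slower retrieval call."""

    def __init__(self, retrieve_fn):
        self.retrieve_fn = retrieve_fn  # slow path: vector search + LLM
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def query(self, text: str):
        # Normalize before hashing so trivially different phrasings
        # ("Return policy?" vs " return policy? ") share one cache entry.
        key = hashlib.sha256(text.strip().lower().encode()).hexdigest()
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        answer = self.retrieve_fn(text)
        self.cache[key] = answer
        return answer

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

Every cache hit also skips an LLM call, which is where the per-query API cost savings come from.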
Why this matters: demonstrates how to build cost-aware RAG systems that remain debuggable and transparent in production.
View repository: https://github.com/pranshu1921/shopassist-rag
Two-stage e-commerce fraud detection system built on 590K IEEE-CIS transactions. It combines supervised and unsupervised ML to catch both known fraud patterns and novel attacks for which no labeled data yet exists.
- PyTorch autoencoder trained exclusively on legitimate transactions; it learns normal behavior without labels and flags novel fraud via reconstruction error before any labeled examples of that pattern exist.
- XGBoost classifier with temporal cross-validation (not random split, which causes data leakage on fraud data), scale_pos_weight for 3.5% class imbalance, and SHAP explainability per flagged transaction.
- Logistic meta-learner ensemble combines both scores. ROC-AUC 0.91 on held-out test set.
- Great Expectations data validation, MLflow experiment tracking, Evidently AI drift monitoring, FastAPI inference endpoint, Streamlit analyst dashboard.
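The two-stage score combination can be sketched as a logistic blend of the supervised probability and the anomaly score (the weights and bias below are illustrative placeholders, not the trained meta-learner's parameters):

```python
import math

def meta_score(p_supervised: float, anomaly_score: float,
               w=(2.0, 1.5), bias=-2.5) -> float:
    """Combine an XGBoost fraud probability and an autoencoder anomaly
    score with a logistic meta-learner. A transaction that looks risky on
    either signal pushes the blended score toward 1."""
    z = w[0] * p_supervised + w[1] * anomaly_score + bias
    return 1.0 / (1.0 + math.exp(-z))
```

In practice the weights come from fitting a logistic regression on held-out scores, so the meta-learner learns how much to trust each stage rather than using a hand-tuned blend.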
Why this matters: mirrors the two-stage architecture used by production fraud teams at payments companies. Demonstrates that anomaly detection and classification solve different parts of the same problem and should be combined.
View repository: https://github.com/pranshu1921/fraud-detection-xgboost-autoencoder
Production-style RAG system using agentic reasoning over custom knowledge bases.
- LangGraph orchestration for multi-step reasoning and tool selection
- FAISS vector store for efficient semantic search
- ReAct-style agent with deterministic source citation and Wikipedia fallback
- Modular ingestion for URLs and local documents (PDF, TXT, DOCX)
- Includes real screenshots from working system runs
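As a rough illustration of the vector-search step, here is a brute-force cosine-similarity lookup standing in for the FAISS index (FAISS performs the same nearest-neighbor search, but efficiently at scale and over dense embeddings rather than these toy vectors):

```python
import math

def cosine(a, b) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec, doc_vecs, k=3):
    """Return indices of the k documents most similar to the query,
    ranked by cosine similarity (brute-force stand-in for FAISS)."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]
```

Returning document indices (not just text) is what enables deterministic source citation: the agent can always point back to the exact chunk it retrieved.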
Why this matters: mirrors how modern AI teams build trustworthy internal knowledge systems that can be inspected, debugged, and extended.
View repository: https://github.com/pranshu1921/Agentic-RAG-Document-Search-System
End-to-end ML system emphasizing reproducibility, maintainability, and production correctness.
- Data ingestion, feature engineering, training, evaluation, and inference in one pipeline
- sklearn ColumnTransformer pipelines eliminate train-serve skew
- Churn classification model: 0.81 AUC on held-out test set
- FastAPI inference service with health check and prediction endpoints
- Config-driven execution via YAML, artifacts persisted to predictable locations
- CI-ready with GitHub Actions linting and smoke tests
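The train-serve-skew point can be illustrated with a minimal fit/transform object in the spirit of sklearn's ColumnTransformer (the field names `age` and `plan` are hypothetical, not taken from the project):

```python
class FittedPreprocessor:
    """Minimal stand-in for an sklearn ColumnTransformer: statistics are
    fit once on training data and reused verbatim at serving time, so the
    identical transform runs in both places and skew cannot creep in."""

    def fit(self, rows):
        ages = [r["age"] for r in rows]
        self.age_mean = sum(ages) / len(ages)          # training-set mean
        self.plans = sorted({r["plan"] for r in rows})  # training vocabulary
        return self

    def transform(self, row):
        # Numeric: center with the *training* mean; categorical: one-hot
        # against the *training* vocabulary (unseen values map to all zeros).
        onehot = [1.0 if row["plan"] == p else 0.0 for p in self.plans]
        return [row["age"] - self.age_mean] + onehot
```

Because the fitted object is persisted as an artifact and loaded by the FastAPI service, training and inference cannot silently drift apart.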
Why this matters: shows how to take a model from notebook to production in a way teams can operate, extend, and trust.
View repository: https://github.com/pranshu1921/ml-production-pipeline
Business-first ML case studies that show how predictions become decisions.
- Churn modeling with cost-benefit thresholding: precision 0.74, recall 0.71 at selected operating point
- Fraud detection under class imbalance: threshold selected by expected dollar impact, preventing an estimated $18,400 in fraud loss per 10,000 transactions at a review cost of $620
- Precision-recall optimization prioritized over accuracy for imbalanced fraud data
- End-to-end pipelines producing reports, metrics, plots, and model artifacts from a single command
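Threshold selection by expected dollar impact can be sketched as a simple search over candidate thresholds (the cost figures below are illustrative placeholders, not the case study's numbers):

```python
def best_threshold(scores, labels, fraud_cost=120.0, review_cost=2.0):
    """Pick the decision threshold that maximizes expected dollar impact:
    each caught fraud saves fraud_cost; each flagged transaction costs
    review_cost to inspect manually."""
    best_t, best_value = 1.0, 0.0  # default: flag nothing, save nothing
    for t in sorted(set(scores)):
        flagged = [(s, y) for s, y in zip(scores, labels) if s >= t]
        caught = sum(y for _, y in flagged)  # true frauds above threshold
        value = caught * fraud_cost - len(flagged) * review_cost
        if value > best_value:
            best_t, best_value = t, value
    return best_t, best_value
```

This is the step that turns a probability score into a business decision: the operating point is chosen by net dollars saved, not by accuracy or a default 0.5 cutoff.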
Why this matters: demonstrates how to translate ambiguous business problems into ML systems that drive real decisions, not just maximize a metric.
View repository: https://github.com/pranshu1921/applied-ml-case-studies
Generative AI and LLM Systems: building production RAG and agentic workflows with an emphasis on evaluation, reliability, and source-grounded answers.
Applied Machine Learning: designing predictive models and experiments that solve concrete business problems, from churn and fraud to operational forecasting.
Production ML Systems: taking models from notebook to production through deployment, monitoring, CI/CD, and performance tuning.
Data Engineering for ML: designing SQL-first pipelines, feature stores, and data quality checks that make ML systems reliable at scale.
ML and AI: Scikit-learn, XGBoost, LightGBM, PyTorch, LangChain, LangGraph, OpenAI API, Qwen, FAISS, ChromaDB, RAGAS
Data and Analytics: Python, SQL, R, Pandas, NumPy, Polars, Great Expectations
MLOps and Infrastructure: AWS (SageMaker, Glue, Lambda, S3, Athena), Azure (OpenAI, AI Search, DevOps), Apache Airflow, Terraform, Docker, GitHub Actions
Visualization and BI: Tableau, Power BI, Streamlit, Plotly, Matplotlib, Seaborn
- Start with problem framing, not models
- Treat data pipelines as first-class systems
- Fix model errors deterministically in post-processing rather than hoping for better prompts
- Optimize for maintainability over cleverness
- Measure success using business impact
- Plan explicitly for failure modes
- Advanced RAG architectures: hybrid search, reranking, query decomposition
- LLM evaluation frameworks: RAGAS, domain-specific metrics
- Fine-tuning and adapting open-source models
- Production monitoring for LLM systems: cost, latency, hallucination detection
Seeking: Data Scientist, ML Engineer, and AI Engineer roles (Remote or Hybrid)
- LinkedIn: https://www.linkedin.com/in/pranshu-kumar
- GitHub: https://github.com/pranshu1921
- Email: pranshukumarpremi@gmail.com

