Hemant Kumar B K HemantBK

Hemant Kumar B K

ML Engineer building production-grade AI systems with safety at the core. Currently researching Multi-Agent RL for cybersecurity at the University of Arizona and co-authoring StepShield — a safety benchmark for autonomous code agents (submitted to ICML 2026). Previously built recommendation engines at Escape LLC (30% engagement lift) and agentic RAG chatbots at Omdena (95% reduction in harmful responses).

I don't treat AI safety as a checkbox — I treat it as an engineering discipline.

🔬 Research

🛡️ StepShield — Co-Author

First benchmark for evaluating when autonomous code agents go rogue — not just whether they do. Detects specification violations (data exfiltration, unauthorized access) in real-time across 9,213 agent trajectories. Early detection cuts monitoring costs by 75% (~$108M projected savings).

Python PyTorch LLM Safety Red-Teaming Autonomous Agents

🚀 Featured Projects

💰 Dynamic Pricing Engine

Production-grade ML pricing system

XGBoost demand forecasting + price elasticity estimation + scipy revenue optimization. FastAPI serving, Streamlit dashboard, MLflow tracking, Evidently drift monitoring.

Python XGBoost FastAPI MLflow Streamlit

🗣️ AI Voice Assistant

Full-stack speech pipeline: STT → LLM → TTS

End-to-end voice assistant with FastAPI backend, React frontend, and Docker containerization. Speech-to-Text, LLM reasoning, and Text-to-Speech in one pipeline.

JavaScript FastAPI React Docker LLM

🌐 Multilingual Sentiment & Emotion Engine

5 languages + Hindi-English code-switching

Multi-task XLM-RoBERTa with LoRA adapters, ONNX INT8 inference, and cross-lingual transfer. Production-grade multilingual NLP pipeline.

Python XLM-RoBERTa LoRA ONNX NLP

📈 AI-Driven Algorithmic Trading

Sentiment-aware stock prediction system

Combines NLP sentiment analysis on financial headlines with quantitative indicators. TimeGPT predictions + Power BI dashboard. 20% higher prediction accuracy.

Python NLP TimeGPT Sentiment Analysis