AI Engineer building autonomous AI systems, RAG evaluation tools, ML search platforms, and multimodal safety AI.
I build production-style AI systems with a focus on measurable performance, reproducibility, reliability, evaluation, and deployment.
Currently building Guardian Drive, a multimodal driver impairment intelligence system for autonomous vehicles using physiological signals, computer vision, BEV perception, CARLA simulation, CUDA acceleration, and real-time safety alerts.
- Autonomous driving perception and safety AI
- Multimodal driver impairment monitoring
- Agentic RAG and adversarial RAG evaluation
- ML search, recommendation, and learning-to-rank systems
- MLOps, observability, latency benchmarking, and reproducible evaluation
I'm actively seeking GPU compute support for open-source autonomous driving and safety AI research, especially for CARLA simulation, CUDA acceleration, BEV perception, and real-time multimodal inference.
Camera-only BEV occupancy prediction + GPT-style trajectory estimation + self-supervised camera trust scoring.
OpenDriveFM is a production-style autonomous driving perception system that predicts BEV occupancy, future ego trajectory, and per-camera reliability from multi-camera input. The key differentiator is a self-supervised camera trust scorer that detects degraded sensors without fault labels.
- Highlights: camera-only AV perception, BEV occupancy, trajectory forecasting, sensor trust scoring
- Performance: 317 FPS, p50 latency 3.15ms, p95 latency 3.22ms, ADE 2.457m, IoU 0.136
- Reliability: 100% fault detection across 7 fault types
- Stack: Python, PyTorch, Lightning, nuScenes, Gradio, C++, Apple Silicon/MPS
Repo: https://github.com/AKilalours/opendrivefm
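For readers unfamiliar with the ADE and IoU figures quoted above, here is a minimal sketch of how these two metrics are commonly computed. It is not taken from the OpenDriveFM repo: the function names, the 2D waypoint format, and the binary grid layout are assumptions for illustration, and the repo's actual evaluation code may differ (for example in averaging or thresholding).

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def average_displacement_error(pred: Sequence[Point], truth: Sequence[Point]) -> float:
    """Mean Euclidean distance (in metres) between predicted and true waypoints."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and the same length")
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(pred)


def occupancy_iou(pred: Sequence[Sequence[int]], truth: Sequence[Sequence[int]]) -> float:
    """Intersection over union of two binary BEV occupancy grids (hypothetical layout)."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty grids agree perfectly, so treat that case as IoU = 1.
    return intersection / union if union else 1.0
```

A lower ADE means the predicted ego trajectory stays closer to the ground truth, while a higher IoU means the predicted occupied cells overlap more with the labelled ones.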
Netflix-style two-stage search, ranking, recommendation, and GenAI explanation system.
StreamLens is a full ML search and recommendation platform covering ingestion, retrieval, reranking, online serving, multilingual explanations, feedback loops, observability, and model-quality gates.
- Highlights: learning-to-rank, dense retrieval, recommendation systems, multilingual GenAI explanations
- Performance: LTR nDCG@10 = 0.9300, p95 latency = 98ms, p99 latency = 142ms
- Scale: 33.8M ratings, 44 languages, 106 API endpoints, 21 ML algorithms
- MLOps: Airflow DAGs, Metaflow flows, quality gates, rollback strategy, Prometheus monitoring
- Stack: Python, FastAPI, Redis, Kafka, Kubernetes, PySpark, SQL, LightGBM, FAISS, RAGAS
Repo: https://github.com/AKilalours/streaming-canvas-search-ltr
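As context for the nDCG@10 figure above, the following is a minimal sketch of the metric using the common exponential-gain formulation. The function names are hypothetical, and which gain and discount variant StreamLens actually uses is an assumption here; library implementations differ on this point.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k results (exponential gain)."""
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)  # rank is 0-based, so position 1 uses log2(2)
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """DCG of the ranking divided by the DCG of the ideal (sorted) ranking."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

Here `relevances` lists the graded relevance labels of the returned items in ranked order, so a perfectly ordered list scores 1.0 and any misordering scores below that.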
Real-time driver monitoring and autonomous vehicle safety platform combining physiology, vision, perception, and dispatch automation.
Guardian Drive fuses ECG cardiac screening, physiological drowsiness detection, camera-based driver monitoring, BEV perception, visual odometry, SLAM-style mapping, GPS routing, hospital dispatch, Discord alerts, and voice warnings into one safety-critical AI system.
- Highlights: multimodal safety AI, physiological monitoring, driver state detection, emergency routing
- ML Components: WESAD drowsiness model, PTB-XL ECG parser, nuScenes BEV perception, GPT-2 waypoint transformer
- Deployment: FastAPI WebSocket server, CoreML conversion, ONNX export, real-time dashboard
- Stack: Python, FastAPI, PyTorch, CoreML, ONNX, OpenCV, MediaPipe, OpenStreetMap, Discord API
Repo: https://github.com/AKilalours/guardian-drive
Fully local speech-to-speech translator and document-grounded RAG assistant with zero cloud dependency.
This project demonstrates offline AI deployment with speech recognition, retrieval, reranking, local LLM inference, text-to-speech, security controls, RBAC enforcement, and latency benchmarking.
- Highlights: offline ASR + RAG + TTS, local LLM inference, document-grounded answers
- Performance: dense retrieval p95 ~10ms, dense + rerank p95 ~50ms, RAG LLM p95 ~4.5s
- Quality: Recall@k = 1.000 across evaluation queries
- Security: refusal rate 1.000, RBAC enforcement 1.000
- Cost: $0.00/request, fully local with no API calls
- Stack: Faster-Whisper, Ollama, Mistral, Coqui TTS, FAISS, Cross-Encoder, FastAPI, pytest
Repo: https://github.com/AKilalours/local-offline-voice-translator-rag-assistant
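The p95 latency and Recall@k numbers above can be reproduced from raw measurements with very little code. The sketch below is illustrative only: the helper names are hypothetical, the nearest-rank percentile method is an assumption (the repo's benchmark may interpolate instead), and the value of k is left to the caller because the source does not state it.

```python
import math
from typing import Iterable, Sequence


def percentile_nearest_rank(samples_ms: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of latency samples, e.g. pct=95 for p95."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples_ms)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank into the sorted samples
    return ordered[rank - 1]


def recall_at_k(retrieved_ids: Sequence[str], relevant_ids: Iterable[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k retrieved list."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("need at least one relevant document")
    hits = len(relevant.intersection(retrieved_ids[:k]))
    return hits / len(relevant)
```

Averaging `recall_at_k` over all evaluation queries gives a single score, which reaches 1.000 only if every relevant document is retrieved within the top k for every query.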
Compound AI learning assistant with LangGraph orchestration, semantic caching, RAG evaluation, and production monitoring.
NeuraPilot is an agentic RAG pipeline designed as a production-grade AI tutor. It combines query classification, HyDE rewriting, MMR retrieval, BM25 reranking, intent-routed generation, semantic caching, and observability dashboards.
- Highlights: agentic RAG, LangGraph orchestration, semantic cache, real-time observability
- Performance: p50 latency ~1.8s, p95 latency ~4.2s, p99 latency ~7.1s
- Quality: RAGAS faithfulness 0.81, answer relevance 0.78, hit@10 proxy 0.83
- Efficiency: ~38% cache hit rate, ~42% latency reduction through caching
- Stack: Python, LangGraph, FastAPI, Streamlit, Redis, SQLite, Ollama, RAGAS
Repo: https://github.com/AKilalours/neurapilot
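The MMR retrieval step mentioned above trades relevance against redundancy when picking context chunks. The following is a minimal, dependency-free sketch of maximal marginal relevance over embedding vectors. It is not NeuraPilot's code: the function names, the `lambda_mult` parameter, and the use of cosine similarity are assumptions for illustration, and the project may rely on a library retriever instead.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query_vec: Vector, doc_vecs: Sequence[Vector], k: int,
               lambda_mult: float = 0.5) -> List[int]:
    """Return indices of up to k documents chosen by maximal marginal relevance."""
    relevance = [cosine(query_vec, d) for d in doc_vecs]
    candidates = list(range(len(doc_vecs)))
    selected: List[int] = []

    def score(i: int) -> float:
        # Penalise similarity to the closest document that is already selected.
        redundancy = max((cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0)
        return lambda_mult * relevance[i] - (1 - lambda_mult) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With `lambda_mult=1.0` the selection reduces to plain relevance ranking; lower values push the selection toward chunks that differ from those already chosen, which helps avoid filling the prompt with near-duplicates.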
- Autonomous AI Systems: built camera-only BEV perception, trajectory prediction, sensor trust scoring, and multimodal safety pipelines.
- RAG / Retrieval: built evaluated RAG systems with FAISS, reranking, RAGAS, grounded citations, refusal behavior, and RBAC enforcement.
- Search / Ranking: implemented BM25, dense retrieval, hybrid search, LightGBM LambdaRank, cross-encoder reranking, and recommendation pipelines.
- Applied ML: trained and evaluated models across AV perception, recommendation systems, physiological signals, ranking, and NLP pipelines.
- MLOps / Reliability: designed systems with latency SLOs, quality gates, observability dashboards, rollback paths, API serving, and reproducible evaluation.
- Edge AI: deployed local/offline inference pipelines using Ollama, Apple Silicon/MPS, CoreML, ONNX, CPU retrieval, and zero-cloud-cost workflows.
- Production Mindset: projects include measurable metrics, failure-mode handling, security controls, postmortems, CI, and system-level documentation.

