Rotha-101/README.md

CHEA ROTHA


Building end-to-end intelligent systems, from raw data to real-world impact


LinkedIn · GitHub · Email · Location


🧠 About Me

Data Scientist & ML Engineer with an Engineering degree in Applied Mathematics and Statistics from the Institute of Technology of Cambodia (ITC). I specialize in end-to-end ML system development, from feature engineering to scalable model deployment, with real-world impact across energy systems, macroeconomic forecasting, NLP, and policy analytics.

  • 🔋 Currently engineering Battery Energy Storage Systems (BESS) data pipelines at SchniecTech Group
  • 📈 Built a hybrid inflation forecasting model (XGBoost + LSTM + SARIMAX) for the Cambodian Ministry of Planning
  • 🎓 Former Data Science Instructor: taught ML, EDA, and Power BI to the next generation of analysts
  • 🌏 Passionate about applying AI to Southeast Asian development challenges

🛠️ Tech Stack

💻 Programming & Tools

Python · SQL · R · Java · C# · React · MATLAB

🤖 Machine Learning & AI

TensorFlow · Keras · Scikit-learn · XGBoost · LightGBM

📊 Data & Visualization

Pandas · NumPy · Plotly · Matplotlib · Power BI · Tableau

🗣️ NLP

Hugging Face · SpaCy · NLTK

🗄️ Databases & DevOps

MySQL · PostgreSQL · MongoDB · Docker · Git · VS Code · Jupyter


🧩 Coding Habits & Work Style

| Area | Tools | Focus |
| --- | --- | --- |
| 🔬 Research & Experiments | Jupyter Notebook · Google Colab | EDA, model prototyping, thesis work |
| 🏭 Production Pipelines | Python · Docker · PostgreSQL | Data ingestion, transformation, ETL |
| 📊 Dashboards & Reports | Power BI · Plotly · Data Studio | Business insights, automated reporting |
| 🔋 Energy Data Systems | Python · EMS/SCADA · Pandas | Real-time grid analytics, SOC monitoring |
| 🗣️ NLP Experiments | Hugging Face · SpaCy · LoRA | Tokenization, fine-tuning, Khmer NLP |
| 🎓 Teaching & Mentoring | Python · Power BI · Slides | Curriculum design, student projects |





🗓️ My GitHub Journey

Mar 2023  ──●  First commit: started exploring data science on GitHub
Apr 2023  ──●  Uploaded early EDA notebooks on tourism & agriculture data
Jun 2023  ──●  Published Laptop Price Prediction project (SVR winner!)
Sep 2023  ──●  Hotel Reservation DB: first SQL schema design on GitHub
Jan 2024  ──●  Joined Sunrise Institute: started teaching & documenting
Mar 2024  ──●  Crop Yield Prediction: XGBoost + Random Forest pipeline
Jul 2024  ──●  Cambodia Tourism Forecasting: ARIMA / SARIMA / Prophet
Oct 2024  ──●  Deep dive into NLP: tokenization & Khmer language corpus
Feb 2025  ──●  Ministry of Planning: Inflation Forecasting Thesis begins
Apr 2025  ──●  Hybrid ML deployed: XGBoost + LSTM + SARIMAX live
Dec 2025  ──●  SchniecTech: EMS/SCADA real-time data engineering begins
Apr 2026  ──◉  Today: actively building, learning, and contributing 🚀

🔧 Tools & Environments I Love

Jupyter · VS Code · Google Colab · Anaconda · Git · GitHub · Docker · Power BI · Tableau · Looker Studio · Google Sheets · Excel


🌐 Languages

| Language | Level | Context |
| --- | --- | --- |
| 🇰🇭 Khmer | Native | Mother tongue |
| 🇬🇧 English | Professional | Academic, work, research writing |
| 🇫🇷 French | Basic | Aii Language Center (2019–2022) |

💼 Professional Experience

🔋 Battery Energy Storage System Engineer & Data Analyst · SchniecTech Group

Dec 2025 – Present

  • Analyzed high-frequency EMS/SCADA time-series data (Active Power, Frequency, SOC, Voltage, Reactive Power)
  • Built real-time interactive dashboards for grid and plant performance monitoring
  • Conducted anomaly detection for power fluctuations, voltage deviations, and system faults
  • Optimized battery charge/discharge cycles by monitoring State of Charge (SOC) patterns
  • Automated daily operational reports for engineering and management decision-making

📈 Data Scientist · Ministry of Planning, Cambodia

Feb 2025 – Oct 2025

  • Engineered a hybrid inflation forecasting system (XGBoost + LSTM + SARIMAX) to enhance national economic projections
  • Built end-to-end data pipelines for macroeconomic indicators, covering cleaning, feature engineering, and stationarity testing
  • Analyzed global commodity factors (oil, gold) and domestic sector drivers affecting Cambodian inflation
  • Delivered interactive dashboards and policy reports adopted in economic planning decisions

🎓 Instructor, Data Science · Sunrise Institute

Jan 2024 – Feb 2025

  • Taught EDA, statistics, ML, forecasting, and data visualization using Python and Power BI
  • Designed hands-on mini-projects bridging theory with real-world datasets
  • Mentored students on data storytelling and insight communication to technical and non-technical audiences

🚀 Featured Projects

🌑️ Inflation Forecasting in Cambodia (Thesis)

A hybrid ML system to forecast national inflation with improved accuracy over traditional models.

  • Approach: XGBoost + LSTM + SARIMAX ensemble, combining classical time series with deep learning
  • Data: Macroeconomic indicators, global oil & gold prices, domestic sector indices
  • Impact: Improved forecast accuracy for national economic planning at the Ministry of Planning
  • Tools: Python, TensorFlow, Statsmodels, XGBoost, Pandas, Plotly
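
The README does not say how the three models' outputs are combined. As one possible illustration only, the sketch below blends per-model forecasts with inverse-error weights, so that a model with a lower validation error gets a larger share. The function names, the weighting rule, and every number are assumptions made for this example; they are not taken from the thesis.

```python
def inverse_error_weights(val_errors):
    """Turn per-model validation errors into weights that sum to 1.

    Assumption: lower validation error means a larger weight (1 / error).
    """
    inverse = {name: 1.0 / err for name, err in val_errors.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def combine_forecasts(forecasts, weights):
    """Weighted average of aligned forecast lists, one list per model."""
    horizon = len(next(iter(forecasts.values())))
    return [
        sum(weights[name] * series[t] for name, series in forecasts.items())
        for t in range(horizon)
    ]


# Hypothetical numbers, for illustration only (not results from the thesis).
val_errors = {"xgboost": 0.5, "lstm": 1.0, "sarimax": 1.0}
forecasts = {
    "xgboost": [2.0, 2.2],
    "lstm": [3.0, 3.0],
    "sarimax": [1.0, 1.4],
}

weights = inverse_error_weights(val_errors)
blended = combine_forecasts(forecasts, weights)
print(weights)
print(blended)
```

Other combination schemes (stacking with a meta-learner, or fitting one model on another's residuals) are equally plausible readings of "hybrid"; the weighted average is simply the shortest to show.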

⚑ BESS / EMS Time-Series Analytics

Real-time analytics platform for Battery Energy Storage System operations.

  • Approach: High-frequency signal analysis (SOC, Frequency, Active Power) with anomaly detection
  • Impact: Enabled proactive grid instability detection and optimized charge/discharge efficiency
  • Tools: Python, Pandas, Power BI, EMS/SCADA data, Matplotlib
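
The anomaly-detection method is not specified above. A common baseline for signals such as frequency or SOC is a trailing-window z-score, sketched here with the standard library only. The function name, window size, threshold, and sample readings are all hypothetical.

```python
from statistics import mean, stdev


def rolling_zscore_anomalies(values, window=5, threshold=3.0):
    """Return indices whose value deviates from the trailing-window mean
    by more than `threshold` trailing standard deviations."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: any change at all is treated as a deviation.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical grid-frequency readings in Hz, for illustration only.
frequency_hz = [50.00, 50.01, 49.99, 50.02, 49.98, 50.00, 49.50, 50.01]
flagged = rolling_zscore_anomalies(frequency_hz, window=5, threshold=3.0)
print(flagged)
```

Because the window is trailing, a detected spike inflates the standard deviation for the next few samples; a production pipeline would likely exclude flagged points from the history or use a robust statistic such as the median absolute deviation.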

🌏 Tourism Forecasting in Cambodia

Time-series forecasting of tourist arrivals with post-COVID-19 recovery trend analysis.

  • Approach: Evaluated ARIMA, SARIMA, Prophet, and LSTM; selected best performer via MSE and residual diagnostics
  • Impact: Identified seasonal recovery patterns and delivered data-driven recommendations to policymakers
  • Tools: Python, ARIMA, SARIMA, Prophet, LSTM, Pandas, Matplotlib
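
The MSE half of the selection step described above can be sketched as follows: score each candidate's hold-out forecast and keep the lowest. The hold-out values and forecasts are made-up numbers, and the residual diagnostics mentioned in the approach are not covered here.

```python
def mean_squared_error(actual, predicted):
    """Average squared difference between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def select_best_model(actual, candidate_forecasts):
    """Return (best_name, scores), where best_name has the lowest MSE."""
    scores = {
        name: mean_squared_error(actual, preds)
        for name, preds in candidate_forecasts.items()
    }
    best_name = min(scores, key=scores.get)
    return best_name, scores


# Hypothetical hold-out arrivals (in thousands) and candidate forecasts.
actual = [100.0, 120.0, 150.0]
candidates = {
    "arima": [90.0, 110.0, 140.0],
    "sarima": [98.0, 121.0, 152.0],
    "prophet": [105.0, 125.0, 145.0],
    "lstm": [100.0, 110.0, 150.0],
}

best, scores = select_best_model(actual, candidates)
print(best, scores)
```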

🌾 Crop Yield Prediction & Recommendation System

ML system predicting agricultural yield and recommending optimal crop selection.

  • Approach: Random Forest & XGBoost on soil and weather features; recommendation engine built on top
  • Metrics: Evaluated using R², MAE, RMSE
  • Tools: Python, Scikit-learn, XGBoost, Pandas, NumPy, Seaborn
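
For reference, the three metrics named above can be computed from their textbook definitions. This is a dependency-free sketch with invented yield values; in practice `sklearn.metrics` provides `r2_score` and `mean_absolute_error` for the same purpose.

```python
from math import sqrt


def regression_metrics(actual, predicted):
    """Return R², MAE, and RMSE for two equal-length sequences."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    ss_res = sum(e * e for e in errors)           # residual sum of squares
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R² is undefined when all actual values are equal")
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": sqrt(ss_res / n),
    }


# Hypothetical yields (tonnes per hectare), for illustration only.
actual_yield = [2.0, 4.0, 6.0, 8.0]
predicted_yield = [3.0, 3.0, 7.0, 7.0]

metrics = regression_metrics(actual_yield, predicted_yield)
print(metrics)
```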

💻 Laptop Price Prediction

Regression pipeline to predict laptop prices from hardware specifications.

  • Approach: Compared Linear Regression, Random Forest, and SVR; SVR delivered the best accuracy
  • Pipeline: Web scraping → EDA → Feature Engineering → Model Training → Evaluation
  • Tools: Python, BeautifulSoup, Scikit-learn, Pandas, Matplotlib, Seaborn

🏨 Hotel Reservation System Database

Relational database system for end-to-end hotel operations management.

  • Scope: Room booking, client management, staff scheduling, payment confirmation
  • Design: ER diagrams, normalized relational schemas, primary/foreign key constraints
  • Tools: MySQL, SQL, ERD design, relational modeling
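
To make the primary/foreign key design concrete, here is a small sketch that uses Python's built-in `sqlite3` module in place of MySQL so it runs without a server. The table and column names are hypothetical and cover only the booking part of the scope; they are not the project's actual schema.

```python
import sqlite3

# Hypothetical three-table slice: clients, rooms, and the bookings linking them.
SCHEMA = """
CREATE TABLE client (
    client_id INTEGER PRIMARY KEY,
    full_name TEXT NOT NULL
);
CREATE TABLE room (
    room_id INTEGER PRIMARY KEY,
    room_number TEXT NOT NULL UNIQUE,
    nightly_rate REAL NOT NULL CHECK (nightly_rate > 0)
);
CREATE TABLE booking (
    booking_id INTEGER PRIMARY KEY,
    client_id INTEGER NOT NULL REFERENCES client(client_id),
    room_id INTEGER NOT NULL REFERENCES room(room_id),
    check_in TEXT NOT NULL,   -- ISO dates, so string comparison orders them
    check_out TEXT NOT NULL,
    CHECK (check_out > check_in)
);
"""


def open_database():
    """Create an in-memory database with foreign key enforcement enabled."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    return conn


conn = open_database()
conn.execute("INSERT INTO client (client_id, full_name) VALUES (1, 'Example Guest')")
conn.execute("INSERT INTO room (room_id, room_number, nightly_rate) VALUES (1, '101', 45.0)")
conn.execute(
    "INSERT INTO booking (client_id, room_id, check_in, check_out) "
    "VALUES (1, 1, '2024-01-10', '2024-01-12')"
)

# A booking that points at a non-existent client should be rejected.
try:
    conn.execute(
        "INSERT INTO booking (client_id, room_id, check_in, check_out) "
        "VALUES (99, 1, '2024-02-01', '2024-02-02')"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
```

In MySQL the same constraints would be declared with `FOREIGN KEY (...) REFERENCES ...` clauses on InnoDB tables, and no pragma is needed.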

🎓 Education

🏫 Institute of Technology of Cambodia (ITC) · 2020 – 2025

Engineering Degree in Data Science · Major: Applied Mathematics and Statistics
📄 Thesis: Analysis and Forecasting of Inflation in Cambodia

🌐 Aii Language Center · 2019 – 2022

English · French (Basic)


📬 Get In Touch



Gmail · LinkedIn · GitHub · WhatsApp · Location

Open to Work · Response Time







"Turning complex data into decisions that matter."



Pinned

  1. Data-Science-Project (Public)

    Machine Learning and Deep Learning

    Jupyter Notebook

  2. Complete-Data-Science-Roadmap (Public)

    Forked from Devparihar5/Complete-Data-Science-Roadmap

    Complete Roadmap For Data Science

    Jupyter Notebook

  3. Complete-Python-3-Bootcamp (Public)

    Jupyter Notebook

  4. Data-science-ML-and-DL-Resources (Public)

    HTML

  5. Machine-Learning-Complete- (Public)

  6. Machine-Learning-Projects (Public)