- 👋 Hi, I’m @thyphan2025
- 👀 I’m interested in AI & Machine Learning.
- 🌱 I’m currently pursuing a Master of Science in Data Analytics Engineering at George Mason University.
- 😄 Pronouns: she/her
- ⚡ Fun fact: I love exploring different cultures, especially their amazing foods.
- Motivational quote: "I have no special talents. I am only passionately curious." - Albert Einstein
- Data Analytics Project (Capstone)
- Building small passion projects to explore data workflows and new tools
- Reading Designing Machine Learning Systems by Chip Huyen
- Reading Machine Learning Systems by Prof. Vijay Janapa Reddi - Harvard University
- Reading Fairness and Machine Learning by Solon Barocas, Moritz Hardt, Arvind Narayanan
- Starting MLOps Zoomcamp course
Python, PySpark, Databricks
- Cleaned and reshaped a multi-state bridge dataset to examine material and design patterns and applied association rule mining to identify recurring relationships.
→ Bridge-Material-and-Design-Analysis
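
For readers new to association rule mining, here is a minimal, standard-library sketch of the idea: count how often pairs of attributes occur together, then keep the rules whose support and confidence clear a threshold. This is not the project's code (which ran on PySpark in Databricks); the function name `association_rules`, the thresholds, and the sample bridge records are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def association_rules(transactions, min_support=0.5, min_confidence=0.7):
    """Return (antecedent, consequent, support, confidence) for item pairs."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of records containing both items
        if support < min_support:
            continue
        # Check the rule in both directions: a -> b and b -> a.
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = count / item_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return sorted(rules)


# Hypothetical records: each set holds one bridge's material and design.
bridges = [
    {"steel", "truss"},
    {"steel", "truss"},
    {"concrete", "slab"},
    {"steel", "girder"},
]
rules = association_rules(bridges, min_support=0.5, min_confidence=0.6)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

Only pairs are considered here to keep the sketch short; a real run over many attributes would typically use an FP-Growth or Apriori implementation.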
Power BI
- Explored multi-season influenza data to monitor trends, subtype distribution, and outbreak severity through an interactive dashboard.
→ Influenza Surveillance Dashboard Chicago
R, Time-Series Analysis, Interactive Plot, Forecasting
- Cleaned and analyzed multi-year air quality data to examine environmental risk patterns and forecast ozone trends using an ARIMA model.
- Published interactive HTML report with code, Plotly visualizations, and a few static plots.
→ New York Air Quality Analysis
Python, Data Analysis, Machine Learning
- Analyzed electric vehicle adoption data to examine growth trends, geographic distribution, and vehicle characteristics across regions.
- Trained a decision tree model to classify vehicles as Battery Electric Vehicles (BEVs) or Plug-in Hybrid Electric Vehicles (PHEVs).
- Utilized Synthetic Minority Oversampling Technique (SMOTE) to address class imbalance.
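
As a rough illustration of what SMOTE does, the sketch below creates synthetic minority-class points by interpolating between a sample and one of its nearest minority neighbours. It is a simplified, standard-library version written for explanation only, not the project's code; the function name `smote_sketch`, the parameter defaults, and the two-feature sample points are hypothetical.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from a list of minority-class tuples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base, by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority-class samples with two numeric features each.
minority_points = [(1.0, 2.0), (1.5, 2.5), (2.0, 2.0)]
new_points = smote_sketch(minority_points, n_new=4)
print(new_points)
```

Oversampling of this kind is normally applied to the training split only, so that synthetic points never leak into the evaluation data.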
Python, SQL, R, NLP
- Cleaned and analyzed global incident data to identify geographic hotspots, severity patterns, and recurring risk signals affecting education infrastructure.
- Applied natural language processing (NLP) to extract sentiment and patterns from incident descriptions.
→ Education-in-Danger-Incidents
Python, PySpark, Spark MLlib, Databricks
- Contributed code to the PySpark modeling workflow in Databricks, including feature engineering and evaluation using Python, PySpark and Spark MLlib.