gogoharrison/README.md


Gogo Isaac Harrison

Data Scientist · Machine Learning Engineer · Analytics

LinkedIn Portfolio Medium Email


About Me

I'm a Data Scientist and Machine Learning Engineer focused on turning complex datasets into decisions that drive measurable business value. I build end-to-end ML systems — from raw data pipelines to deployed models — with a strong emphasis on analytical rigour and practical impact.

Currently deepening expertise in data engineering, MLOps, and scalable data pipelines.


Core Skills

| Area | Tools & Technologies |
| --- | --- |
| Languages | Python, SQL |
| Machine Learning | scikit-learn, XGBoost, K-Means, feature engineering, model evaluation |
| Deep Learning | TensorFlow / Keras, Neural Networks, NLP |
| Data Engineering | PySpark, Apache Airflow, ETL pipelines, Docker |
| Data & Analytics | Pandas, NumPy, statistical modelling, EDA |
| Visualisation | Matplotlib, Seaborn, Plotly, Metabase |
| Cloud & Infra | Azure, PostgreSQL, Docker Compose |
| Explainability | SHAP, model interpretability |

Featured Projects

⚙️ Spark ETL Movie Analytics Pipeline

PySpark · Apache Airflow · PostgreSQL · Docker · Python

Built a production-grade, end-to-end data engineering pipeline processing 33.8 million MovieLens ratings. Orchestrated a 15-task Airflow DAG with MD5 hash-based change detection that cut weekly run time from 90 minutes to 2 minutes on unchanged data. Outputs a PostgreSQL star schema powering Metabase dashboards — running on a fully automated weekly schedule at zero cloud cost.
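As a rough illustration of MD5 hash-based change detection, the sketch below hashes an input file and compares the digest with the one stored by the previous run. It is not taken from the repository: the function names, the JSON state file, and the sample CSV content are all hypothetical, and only the Python standard library is used.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Stream the file in chunks so large inputs are never fully loaded into memory."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def has_changed(data_path, state_path):
    """Return True (and store the new hash) when the content differs from the last run."""
    current = md5_of_file(data_path)
    state_path = Path(state_path)
    previous = None
    if state_path.exists():
        previous = json.loads(state_path.read_text()).get("md5")
    if current == previous:
        return False
    state_path.write_text(json.dumps({"md5": current}))
    return True


def demo():
    """Run three checks: first sight of the file, unchanged file, modified file."""
    with tempfile.TemporaryDirectory() as tmp:
        data = Path(tmp) / "ratings.csv"        # hypothetical input file
        state = Path(tmp) / "ratings.md5.json"  # hypothetical state file
        data.write_text("userId,movieId,rating\n1,10,4.0\n")
        first = has_changed(data, state)
        second = has_changed(data, state)
        data.write_text("userId,movieId,rating\n1,10,4.0\n2,11,3.5\n")
        third = has_changed(data, state)
        return [first, second, third]
```

In an Airflow DAG, a check like this could gate the expensive downstream tasks, for example through a `ShortCircuitOperator`. Whether the pipeline wires it that way is an assumption.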

→ View Repository


🧠 Sentiment Analysis with Neural Networks

TensorFlow · Keras · NLP · Python

Developed an end-to-end deep learning classifier to categorise e-commerce customer reviews as positive or negative. Achieved ~91% validation accuracy using a TextVectorization + Embedding + Dense architecture trained on 11,158 labelled reviews. Demonstrated full ML lifecycle from preprocessing through inference, with a foundation suitable for real-time customer feedback monitoring.
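A minimal sketch of a TextVectorization + Embedding + Dense classifier follows. It is illustrative only: the toy reviews, vocabulary size, sequence length, layer widths, and the `GlobalAveragePooling1D` layer between the embedding and the dense layers are assumptions, not the project's actual configuration.

```python
import numpy as np
import tensorflow as tf

# Toy reviews with binary labels (1 = positive, 0 = negative); purely illustrative data.
texts = [
    "great product works perfectly",
    "terrible quality broke quickly",
    "love it excellent value",
    "awful experience would not recommend",
]
labels = np.array([1.0, 0.0, 1.0, 0.0], dtype="float32")

# Map raw strings to fixed-length integer token sequences.
vectorizer = tf.keras.layers.TextVectorization(
    max_tokens=1000, output_sequence_length=16
)
vectorizer.adapt(tf.constant(texts))
token_ids = vectorizer(tf.constant(texts)).numpy()

# Embedding -> average over tokens (assumed pooling step) -> Dense classifier head.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=1000, output_dim=16),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(token_ids, labels, epochs=5, verbose=0)

# Inference on a new review: vectorise with the same layer, then predict a probability.
new_ids = vectorizer(tf.constant(["excellent product"])).numpy()
probs = model.predict(new_ids, verbose=0)
```

Here the vectoriser is applied outside the model so the network only sees integer inputs. Placing it inside the model is an equally common design.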

→ View Repository


👥 Customer Segmentation Using K-Means Clustering

scikit-learn · K-Means · Python · EDA

Applied unsupervised machine learning to segment retail customers by purchasing behaviour and demographics. Used the elbow method and silhouette scoring (0.55) to identify 5 well-defined customer clusters, delivering actionable targeting strategies for high-income/low-spend vs. low-income/high-spend segments.
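The elbow and silhouette checks can be sketched with scikit-learn as below. The data is synthetic (five well-separated blobs generated for the example), so the scores it produces say nothing about the project's real dataset or its 0.55 silhouette score. The helper name `scan_k` is hypothetical.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic 2-D data with five known cluster centres (an assumption for illustration).
centers = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 5)]
X, _ = make_blobs(n_samples=500, centers=centers, cluster_std=0.5, random_state=0)


def scan_k(X, k_values):
    """Fit K-Means for each k and record (inertia, silhouette score)."""
    results = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
        results[k] = (model.inertia_, silhouette_score(X, model.labels_))
    return results


results = scan_k(X, range(2, 9))

# Elbow method: inspect how inertia falls as k grows and look for the bend.
# Silhouette: pick the k with the highest mean silhouette score.
best_k = max(results, key=lambda k: results[k][1])
```

Inertia always tends to fall as k increases, which is why the silhouette score is a useful second opinion on where to stop.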

→ View Repository


🏗️ Building Energy Efficiency — Load Prediction & Explainability

scikit-learn · SHAP · Linear Regression · Python

Modelled heating and cooling energy demand from architectural design parameters. Achieved R² ~0.92 for heating load and ~0.89 for cooling load using interpretable linear regression. Applied SHAP analysis to confirm physical drivers (compactness, glazing area), producing stakeholder-ready, explainable outputs for early-stage design decisions.
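To show how R² and SHAP-style attributions relate for a linear model, here is a NumPy-only sketch on synthetic data. It does not use the project's dataset or the `shap` library. It relies on the known result that, for a linear model with features treated as independent, the SHAP value of feature i is its coefficient times the feature's deviation from its mean. The two features are hypothetical stand-ins for design parameters such as compactness and glazing area.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Two standardised synthetic features and a noisy linear target (assumed for illustration).
X = rng.normal(size=(n, 2))
true_coef = np.array([3.0, 1.5])
y = 10.0 + X @ true_coef + rng.normal(scale=0.5, size=n)

# Ordinary least squares with an explicit intercept column.
A = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
intercept, coef = beta[0], beta[1:]
pred = A @ beta

# Coefficient of determination: 1 - residual sum of squares / total sum of squares.
r2 = 1.0 - ((y - pred) ** 2).sum() / ((y - y.mean()) ** 2).sum()

# Linear SHAP under feature independence: coef_i * (x_i - mean(x_i)).
shap_values = (X - X.mean(axis=0)) * coef
base_value = pred.mean()

# Mean absolute attribution per feature gives a simple global importance ranking.
importance = np.abs(shap_values).mean(axis=0)
```

The attributions are additive: `base_value + shap_values.sum(axis=1)` reproduces each prediction, which is what makes them easy to present to non-technical stakeholders.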

→ View Repository


Open To

I'm actively looking for opportunities in:

  • Data Science — predictive modelling, experimentation, statistical analysis
  • Machine Learning Engineering — model development and deployment
  • Data Engineering — pipeline design, orchestration, scalable data systems
  • Analytics & Decision Science — business intelligence, insight generation

If you're building data-driven products and need someone who bridges strong analytical thinking with hands-on implementation, let's talk.

📧 gogoharrison66@gmail.com



Open to remote, hybrid, and on-site roles globally.

Pinned

  1. sentiment-analysis-system (Public)

    The aim of this project is to build an end-to-end sentiment analysis solution that can extract, preprocess, analyze, and visualize customer sentiment from textual reviews. This solution will provid…

    Jupyter Notebook

  2. airflow-survey-data-pipeline (Public)

    Containerized Apache Airflow ETL pipeline for ingesting survey responses from Excel, normalizing raw columns, transforming curated analytics-ready datasets, and loading bronze and silver layers int…

    Python

  3. spark-etl-movie-analytics (Public)

    Dockerized Spark ETL pipeline orchestrated with Airflow for scalable movie analytics. Ingests, transforms, and loads data using modular jobs and SQL, delivering reproducible workflows and insights …

    Python

  4. splendor-analytics-trial-activation (Public)

    End-to-end trial activation analysis for a workforce management SaaS platform. Defines activation via 5 in-app behavioural goals, builds a PostgreSQL marts layer (mart_trial_goals + mart_trial_acti…

    Jupyter Notebook

  5. volve-field-dca-forecasting (Public)

    End-to-end well production forecasting pipeline on Equinor's Volve North Sea dataset. Implements Arps decline curve analysis (Exponential, Hyperbolic, Harmonic) with AIC model selection, 5-year EUR…

    Jupyter Notebook

  6. sentiment-analysis-neural-network (Public)

    A deep learning–based sentiment analysis project that classifies customer reviews as positive or negative using neural networks and NLP techniques. The repository demonstrates end-to-end model deve…

    Jupyter Notebook