One of my first data science projects used several machine learning models to predict heart disease from patient data.
About This Project: This was one of my first data science projects, built around machine learning models based on patient data. It was originally completed on 12/10/2024 for the course "Introduction to Mathematics for Machine Learning." After much consideration, I have decided to upload the models alongside the rest of the project.
To run these models, you need a working Anaconda or Miniconda installation with Python 3.10. See environment.yml for the required packages and their dependencies.
The models here use five machine learning algorithms:
- Naive Bayes
- Extreme Gradient Boosting (XGBoost)
- Support Vector Machines (SVM)
- Logistic Regression
- K-Nearest Neighbors (KNN)
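For readers unfamiliar with these algorithms, the following is a minimal sketch of how the five listed above might be trained and compared on a binary classification task. It is not the code from this project: the data is synthetic (generated with scikit-learn's make_classification, with an arbitrary number of features) rather than the patient data, all hyperparameters are library defaults or arbitrary choices, and the sketch assumes scikit-learn is installed. If xgboost is not available, it falls back to scikit-learn's GradientBoostingClassifier, which is a related but different gradient boosting implementation.

```python
# Illustrative sketch only: synthetic data stands in for the patient dataset,
# and every hyperparameter below is an assumption, not the project's setting.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

try:
    from xgboost import XGBClassifier
    boosted = XGBClassifier(n_estimators=100, random_state=0)
except Exception:  # xgboost missing or its native library failed to load
    # Stand-in: scikit-learn's gradient boosting, not XGBoost itself.
    from sklearn.ensemble import GradientBoostingClassifier
    boosted = GradientBoostingClassifier(random_state=0)

# Synthetic binary classification data with an arbitrary shape.
X, y = make_classification(
    n_samples=500, n_features=13, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# SVM, logistic regression, and KNN are sensitive to feature scale,
# so each is wrapped in a pipeline that standardizes the features first.
models = {
    "Naive Bayes": GaussianNB(),
    "XGBoost": boosted,
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

# Fit each model and record its accuracy on the held-out test split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```

Accuracy on a single train/test split is only a rough comparison. Cross-validation and metrics such as recall may matter more for a medical prediction task, where a missed positive case is costly.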
At the time of writing, I know significantly more about machine learning algorithms and the surrounding mathematics than I did when I built these models. Many of them could be significantly improved, and some produce erroneous results. I have chosen to upload them as they are so that this can serve as a personal archive, allowing me to reference my first models as I construct more advanced and accurate ones.
Quick note on the PowerPoint: The attached PowerPoint file supported a presentation that relied heavily on storytelling. I delivered it orally, moving between technical and non-technical language related to medicine and mathematics. The slides contain relatively little information because much of it was presented verbally. I apologize if they appear confusing.
Note: I am not a mathematics student. At the time of writing, I am an undergraduate Computer Science student who pursues an interest in Data Science in my leisure time.