This project demonstrates a complete exploratory data analysis (EDA) workflow using the classic Iris dataset. The goal is to show how Python tools can be used to load, inspect, visualize, and engineer features from data, and to draw meaningful insights about the relationships between variables and species.
- Dataset: Iris flower dataset (150 samples, 4 features, 3 species)
- Tools Used: Python, pandas, seaborn, matplotlib
- Notebook: All analysis is documented in `TestDrive.ipynb`.
- Data Loading: Imported the Iris dataset using seaborn and loaded it into a pandas DataFrame.
- Data Inspection: Explored the structure, types, and summary statistics of the data.
- Visualization: Created histograms, pairplots, and scatter plots to visualize distributions and relationships.
- Feature Engineering: Created a new feature (Sepal Area) to explore additional relationships.
- Analysis: Compared species using visualizations and statistics to identify which features best separate them.
- Insights: Summarized findings and highlighted the most predictive features for species classification.
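The loading, inspection, visualization, and feature engineering steps above can be sketched roughly as follows. This is a minimal sketch, not the notebook's actual code: the function names are hypothetical, Sepal Area is assumed to be sepal length times sepal width, and the column names are the ones used by seaborn's bundled Iris dataset.

```python
import pandas as pd


def load_iris() -> pd.DataFrame:
    """Load the Iris dataset through seaborn."""
    # Imported lazily; seaborn fetches and caches the file on first use,
    # so the first call needs network access.
    import seaborn as sns

    return sns.load_dataset("iris")


def inspect(df: pd.DataFrame) -> None:
    """Print the structure, column types, and summary statistics."""
    print(df.shape)
    print(df.dtypes)
    print(df.describe())


def add_sepal_area(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with a sepal_area column (assumed: length * width)."""
    out = df.copy()
    out["sepal_area"] = out["sepal_length"] * out["sepal_width"]
    return out


def plot_overview(df: pd.DataFrame) -> None:
    """Draw a pairplot plus a histogram and scatter plot of petal features."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    # pairplot creates its own figure with one panel per feature pair.
    sns.pairplot(df, hue="species")

    # A second figure for the single-feature and two-feature views.
    fig, (ax_hist, ax_scatter) = plt.subplots(1, 2, figsize=(10, 4))
    sns.histplot(data=df, x="petal_length", hue="species", ax=ax_hist)
    sns.scatterplot(
        data=df, x="petal_length", y="petal_width", hue="species", ax=ax_scatter
    )
    plt.show()


def main() -> None:
    iris = add_sepal_area(load_iris())
    inspect(iris)
    plot_overview(iris)
```

Calling `main()` runs the whole sequence. `add_sepal_area` returns a copy so the original DataFrame stays unchanged between notebook cells.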
- Petal measurements (length and width) are the most effective for distinguishing species, especially Setosa.
- Sepal measurements and engineered features like Sepal Area provide additional, but less powerful, separation.
- The pairplots and scatter plots make these patterns visible: Setosa forms its own cluster, while Versicolor and Virginica partially overlap.
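One simple way to check the first finding numerically is to compare the range each species covers for a given feature. The helpers below are hypothetical and assume a DataFrame with a `species` column, such as the one returned by `sns.load_dataset("iris")`. A non-overlapping range is only a rough indicator of separability, not a classifier.

```python
import pandas as pd


def species_ranges(df: pd.DataFrame, feature: str) -> pd.DataFrame:
    """Minimum and maximum of one feature for each species."""
    return df.groupby("species")[feature].agg(["min", "max"])


def is_separated(df: pd.DataFrame, feature: str, species: str) -> bool:
    """True if `species` covers a value range that no other species touches."""
    ranges = species_ranges(df, feature)
    lo = ranges.loc[species, "min"]
    hi = ranges.loc[species, "max"]

    # Two closed intervals overlap when each one starts before the other ends.
    others = ranges.drop(index=species)
    overlaps = (others["min"] <= hi) & (others["max"] >= lo)
    return not bool(overlaps.any())
```

On the full dataset, `is_separated(iris, "petal_length", "setosa")` returns `True`: Setosa petal lengths stay below 2 cm, while the other two species start at 3 cm.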
- Clone the repository:

      git clone https://github.com/KHenn22/datafun-04-eda.git
      cd datafun-04-eda

- Create and activate a virtual environment:

      python3 -m venv .venv
      source .venv/bin/activate

- Install dependencies:

      pip install -r requirements.txt

- Open `TestDrive.ipynb` in Jupyter or VS Code and run the cells to reproduce the analysis.
- Python 3.8 or higher
- See `requirements.txt` for the package list
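Before opening the notebook, it can help to confirm that the environment meets these requirements. The check below is a small sketch using only the standard library. The package names are an assumption taken from the tools listed earlier, and `requirements.txt` may list more.

```python
import sys
from importlib.util import find_spec

# Assumed package list; requirements.txt remains the authoritative source.
REQUIRED_PACKAGES = ["pandas", "seaborn", "matplotlib"]


def python_ok(minimum=(3, 8)) -> bool:
    """True if the running interpreter is at least the given version."""
    return sys.version_info[:2] >= minimum


def missing_packages(names=REQUIRED_PACKAGES) -> list:
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


def report() -> None:
    """Print a short summary of the environment check."""
    print("Python version OK:", python_ok())
    missing = missing_packages()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

Running `report()` inside the activated virtual environment should list no missing packages once `pip install -r requirements.txt` has completed.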
Complete as of 9/10/2025.
- Python 3.8+
- Recommended: Use a virtual environment