tree decision model by EJARA WORKU by ejara42 · Pull Request #46 · softwareWCU/Data-Preprocessing-for-ML-using-Titanic-Dataset

ejara42 · 2025-11-26T19:17:12Z

EDA: shows the dataset's structure, missing values, and basic statistics so you understand the data before modeling.
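A minimal sketch of this kind of EDA pass with pandas. The small DataFrame is invented for illustration and only mimics a few Titanic-style columns; it is not the real dataset or the notebook's code.

```python
import numpy as np
import pandas as pd

# Tiny Titanic-like frame, made up for illustration (assumption).
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0],
    "Age": [22.0, 38.0, np.nan, 35.0, np.nan, 54.0],
    "Sex": ["male", "female", "female", "male", "female", "male"],
    "Embarked": ["S", "C", "S", np.nan, "Q", "S"],
})

print(df.shape)                      # rows and columns
print(df.dtypes)                     # column types
missing = df.isna().sum()            # missing values per column
print(missing)
print(df.describe(include="all"))    # basic stats for numeric and categorical columns
```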

Target selection: automatically chooses a target column, but you can override the choice.
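The description does not say how the target is chosen, so the following is only one possible heuristic. The function name `choose_target`, its `override` parameter, and the candidate names are all hypothetical.

```python
import pandas as pd

def choose_target(df, override=None, candidates=("Survived", "target", "label")):
    """Return a target column name (hypothetical heuristic, not the PR's code)."""
    if override is not None:
        # An explicit choice always wins, but it must exist in the frame.
        if override not in df.columns:
            raise KeyError(f"Column {override!r} not found")
        return override
    # Otherwise try a few well-known names, then fall back to the last column.
    for name in candidates:
        if name in df.columns:
            return name
    return df.columns[-1]

df = pd.DataFrame({"Pclass": [1, 3], "Survived": [1, 0], "Fare": [71.3, 7.25]})
print(choose_target(df))
print(choose_target(df, override="Pclass"))
```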

Preprocessing: median imputation for numeric columns; imputation plus one-hot encoding for categorical columns.
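One common way to express this in scikit-learn is a `ColumnTransformer`. This is a sketch under assumptions: the column names and sample values are invented, and `most_frequent` is assumed as the categorical imputation strategy because the description does not name one.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Illustrative data only; object dtype keeps the categorical columns plain strings + NaN.
X = pd.DataFrame({
    "Age": [22.0, np.nan, 35.0, 54.0],
    "Fare": [7.25, 71.28, 8.05, 51.86],
    "Sex": pd.Series(["male", "female", np.nan, "female"], dtype=object),
    "Embarked": pd.Series(["S", "C", "S", np.nan], dtype=object),
})
numeric_cols = ["Age", "Fare"]
categorical_cols = ["Sex", "Embarked"]

preprocess = ColumnTransformer([
    # Numeric columns: fill gaps with the column median.
    ("num", SimpleImputer(strategy="median"), numeric_cols),
    # Categorical columns: fill gaps with the mode (assumption), then one-hot encode.
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]), categorical_cols),
])

Xt = preprocess.fit_transform(X)
# The result may be sparse or dense depending on density; normalize to a dense array.
Xt = Xt.toarray() if hasattr(Xt, "toarray") else np.asarray(Xt)
print(Xt.shape)
```

Setting `handle_unknown="ignore"` keeps the encoder from failing if the test set contains a category that was absent during training.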

Train/test split: holds out 20% of the data for final evaluation, stratified by the target when the task is classification.
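A sketch of the 80/20 split with `train_test_split`. The toy arrays, the `is_classification` flag, and `random_state=42` are assumptions for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 100 samples with a 70/30 class balance (illustrative only).
X = np.arange(100).reshape(-1, 1)
y = np.array([0] * 70 + [1] * 30)

is_classification = True  # hypothetical flag; a regression target would not be stratified
X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.2,
    stratify=y if is_classification else None,
    random_state=42,
)
print(len(X_train), len(X_test))
print(y_test.mean())  # class-1 share in the test set matches the full data
```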

Baseline model: a Decision Tree pipeline trained with default parameters.
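A compact sketch of such a baseline, assuming a classification task. It uses synthetic data from `make_classification` and a numeric-only pipeline to stay short; the step names `impute` and `tree` are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real data, with some missing values injected.
X, y = make_classification(n_samples=200, n_features=6, random_state=0)
X[::10, 0] = np.nan

baseline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    # Default hyperparameters; random_state only makes the run repeatable.
    ("tree", DecisionTreeClassifier(random_state=0)),
])
baseline.fit(X, y)

# A default tree grows until its leaves are pure, so the training score is optimistic.
train_acc = baseline.score(X, y)
print(train_acc)
```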

Evaluation: accuracy for classification or MSE for regression, plus a confusion matrix and classification report for classifiers.
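The metrics named above map directly onto functions in `sklearn.metrics`. The labels and predictions below are hand-written toy values so the numbers are easy to check by eye.

```python
from sklearn.metrics import (
    accuracy_score,
    classification_report,
    confusion_matrix,
    mean_squared_error,
)

# Classification: toy labels and predictions (illustrative only).
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]

acc = accuracy_score(y_true, y_pred)      # 6 of 8 correct -> 0.75
cm = confusion_matrix(y_true, y_pred)     # rows are true classes, columns are predictions
report = classification_report(y_true, y_pred)
print(acc)
print(cm)
print(report)

# Regression: mean squared error on toy values.
mse = mean_squared_error([3.0, 5.0, 2.0], [2.5, 5.0, 3.0])
print(mse)
```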

Cross-validation: quick 5-fold check.
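A 5-fold check can be a single call to `cross_val_score`. This sketch assumes a classifier and uses synthetic data in place of the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption).
X, y = make_classification(n_samples=200, n_features=6, random_state=0)

# cv=5 gives five folds; for classifiers scikit-learn stratifies them by default.
scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5)
print(scores)
print(scores.mean(), scores.std())
```

The spread across folds gives a rough sense of how stable the baseline is before any tuning.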

Hyperparameter search: GridSearchCV to improve the tree.
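A sketch of a grid search over a tree pipeline. The parameter grid here is an assumption chosen for illustration; the description does not list the values actually searched. Parameters of a pipeline step are addressed as `<step name>__<parameter>`.

```python
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption).
X, y = make_classification(n_samples=200, n_features=6, random_state=0)

pipe = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("tree", DecisionTreeClassifier(random_state=0)),
])

# Hypothetical grid: limit depth and leaf size to curb overfitting.
param_grid = {
    "tree__max_depth": [2, 4, 6, None],
    "tree__min_samples_leaf": [1, 5, 10],
}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X, y)
print(search.best_params_)
print(search.best_score_)
```

After fitting, `search.best_estimator_` is the pipeline refit on all of the data passed to `fit` with the best parameter combination.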

Visualization: plots the top levels of the tree and prints the most important features.
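The sketch below swaps in a text rendering, `export_text`, so that it runs without a plotting backend; the graphical counterpart in scikit-learn is `sklearn.tree.plot_tree`, which also accepts `max_depth` to draw only the top levels. The iris dataset is used purely as a stand-in.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()  # stand-in dataset bundled with scikit-learn (assumption)
tree = DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)

# Show only the top levels of the fitted tree.
text = export_text(tree, feature_names=iris.feature_names, max_depth=2)
print(text)

# Rank features by impurity-based importance, highest first.
importances = pd.Series(
    tree.feature_importances_, index=iris.feature_names
).sort_values(ascending=False)
print(importances.head(3))
```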

Save: persists the model and evaluation results to model_artifacts, and optionally copies them to Google Drive.
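A sketch of the save step using `joblib` and a JSON metrics file. The file names `decision_tree.joblib` and `metrics.json` are hypothetical, and the Drive copy assumes Drive is already mounted at Colab's usual `/content/drive/MyDrive` path; outside Colab that branch is simply skipped.

```python
import json
import shutil
from pathlib import Path

import joblib
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Stand-in model and metrics (assumption).
X, y = load_iris(return_X_y=True)
model = DecisionTreeClassifier(random_state=0).fit(X, y)
metrics = {"train_accuracy": float(model.score(X, y))}

out_dir = Path("model_artifacts")
out_dir.mkdir(exist_ok=True)
joblib.dump(model, out_dir / "decision_tree.joblib")           # hypothetical file name
(out_dir / "metrics.json").write_text(json.dumps(metrics, indent=2))

# Optional: copy the folder to Google Drive if it is mounted (assumed mount path).
drive_root = Path("/content/drive/MyDrive")
if drive_root.is_dir():
    shutil.copytree(out_dir, drive_root / "model_artifacts", dirs_exist_ok=True)

# Reload to confirm the saved model round-trips.
reloaded = joblib.load(out_dir / "decision_tree.joblib")
```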

ejara42 added 4 commits November 20, 2025 13:22

Created using Colab

3176260

Created using Colab

cbb2b58

Created using Colab

66ea7d4

Created using Colab

9a32d25
