This project focuses on cleaning and exploring the EA Sports FIFA 21 player dataset I found on Kaggle. The goal is to preprocess and transform the data so that undervalued players with high potential and performance can be identified. This involves cleaning the dataset, normalizing data formats, and making it suitable for in-depth analysis.
- Clean and transform the FIFA 21 dataset to ensure consistency.
- Identify valuable players who are underpaid based on various metrics.
- Handled Missing Values: Replaced missing values in the `Hits` column with the mean of non-missing values.
- Normalized Numeric Values: Converted string representations of numbers (e.g., "1.6K") to integers for ease of computation.
- Standardized Data Types: Ensured all values in the `Hits` column were of integer type.
- Height and Weight Adjustments:
  - Converted height measurements from feet/inches to centimeters.
  - Converted weight measurements from pounds to kilograms.
  - Updated data types for `Height` and `Weight` to integers, enabling easier analysis.
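
The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper names (`parse_count`, `height_to_cm`, `weight_to_kg`), the sample rows, and the raw formats (`5'7"` for height, `159lbs` for weight) are assumptions. It also parses `Hits` into numbers before filling gaps, since a mean can only be computed on numeric values.

```python
import numpy as np
import pandas as pd


def parse_count(value):
    """Turn strings such as "1.6K" or "372" into a float; missing values become NaN."""
    if pd.isna(value):
        return np.nan
    text = str(value).strip().upper()
    if text.endswith("K"):
        return float(text[:-1]) * 1000
    return float(text)


def height_to_cm(value):
    """Convert a height written like 5'11" to whole centimeters (assumed format)."""
    feet, inches = str(value).replace('"', "").split("'")
    return round(int(feet) * 30.48 + int(inches) * 2.54)


def weight_to_kg(value):
    """Convert a weight written like "159lbs" to whole kilograms (assumed format)."""
    pounds = float(str(value).lower().replace("lbs", ""))
    return round(pounds * 0.45359237)


# Tiny made-up sample standing in for the Kaggle file.
df = pd.DataFrame({
    "Hits": ["1.6K", "372", None],
    "Height": ["5'7\"", "6'2\"", "5'11\""],
    "Weight": ["159lbs", "183lbs", "172lbs"],
})

# Parse first, then fill gaps with the mean of the non-missing values.
hits = df["Hits"].map(parse_count)
df["Hits"] = hits.fillna(hits.mean()).round().astype(int)

# Convert imperial measurements to metric integers.
df["Height"] = df["Height"].map(height_to_cm).astype(int)
df["Weight"] = df["Weight"].map(weight_to_kg).astype(int)

print(df)
```

Rounding before the integer cast avoids truncating values such as a fractional mean.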
The project utilizes the following libraries:
- `pandas`: for data manipulation and cleaning.
- `numpy`: for numerical operations and transformations.
- `matplotlib` and `seaborn`: for data visualization and insights.
- Clone the Repository: Clone this repository to your local machine.
- Open the Notebook: Launch the `fifa21.ipynb` notebook file in Jupyter Notebook or JupyterLab.
- Run the Cells: Execute each cell sequentially to perform data cleaning, transformation, and exploration.
After data processing, the analysis focuses on identifying players with valuable performance attributes who may be underpaid. The visualizations help assess how players are valued relative to their performance.
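
One simple way to shortlist potentially underpaid players is sketched below. It is only an illustrative heuristic, not necessarily the method used in the notebook: the column names (`Name`, `Overall`, `Wage`), the sample rows, and the median-based cutoffs are all assumptions.

```python
import pandas as pd

# Made-up rows; the column names are assumptions, not the dataset's real headers.
players = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C", "Player D", "Player E"],
    "Overall": [88, 72, 85, 79, 90],
    "Wage": [250_000, 20_000, 40_000, 35_000, 300_000],
})

# Hypothetical rule: rated at or above the median, paid at or below the median.
rating_cutoff = players["Overall"].median()
wage_cutoff = players["Wage"].median()

undervalued = players[
    (players["Overall"] >= rating_cutoff) & (players["Wage"] <= wage_cutoff)
]

print(undervalued)
```

On real data the cutoffs would likely need tuning, for example by comparing players within the same position or age group.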