A collection of machine learning and data science notebooks developed during an internship training program. The notebooks cover foundational and intermediate ML algorithms, applied to real datasets with hands-on implementations in Python.
| Notebook | Topic |
|---|---|
| Linear regression_1 (1).ipynb | Linear Regression |
| Logistic Regression.ipynb | Logistic Regression |
| Decision Trees.ipynb | Decision Trees |
| RandomForest.ipynb | Random Forest |
| K-Nearest Neighbors.ipynb | K-Nearest Neighbors (KNN) |
| K-Means Clustering - Few Examples.ipynb | K-Means Clustering |
| Linear Discriminant Analysis.ipynb | Linear Discriminant Analysis (LDA) |
| Principle Component Analysis - Few Examples.ipynb | Principal Component Analysis (PCA) |
| Face recognition_Data CollectionCode.ipynb | Face Recognition (Data Collection) |
| COVID-19 visualizations.ipynb | COVID-19 Data Visualization |
| DMG-1 Assignment | Assignment / Exercise |
Supervised Learning
- Linear Regression — predicting continuous outcomes
- Logistic Regression — binary classification
- Decision Trees — interpretable rule-based models
- Random Forest — ensemble learning for improved accuracy
- K-Nearest Neighbors — distance-based classification
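
As a rough illustration of how the classifiers listed above are typically fitted and compared in scikit-learn, here is a minimal sketch. It is not taken from the notebooks: the Iris dataset, the split ratio, and all hyperparameters are assumptions chosen for illustration. Iris has three classes, so Logistic Regression runs in its multiclass form here, and Linear Regression is left out because it predicts continuous values instead of classes.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Assumption: the bundled Iris dataset stands in for the notebooks' own data.
X, y = load_iris(return_X_y=True)

# Hold out 25% of the rows for evaluation, keeping class proportions equal.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Hyperparameters below are illustrative choices, not tuned values.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "K-Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # score() returns mean accuracy on the held-out rows.
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: {scores[name]:.3f}")
```

Because every estimator shares the same `fit` / `score` interface, swapping in another model only requires adding an entry to the `models` dictionary.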
Unsupervised Learning
- K-Means Clustering — grouping unlabeled data
- Principal Component Analysis (PCA) — dimensionality reduction
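
The two unsupervised techniques above can be sketched together as follows. This is an illustrative example under stated assumptions, not code from the notebooks: the data is synthetic (generated with `make_blobs`), and the cluster count and number of components are picked by hand.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA

# Assumption: synthetic 5-dimensional blobs stand in for real unlabeled data.
X, _ = make_blobs(n_samples=300, centers=3, n_features=5, random_state=0)

# K-Means: assign each row to one of three clusters.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

# PCA: project the 5 features onto the 2 directions of greatest variance,
# which is convenient for plotting the clusters.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print("First ten cluster labels:", labels[:10])
print("Projected shape:", X_2d.shape)
print("Explained variance ratio:", pca.explained_variance_ratio_)
```

The `explained_variance_ratio_` attribute reports how much of the total variance each retained component captures, which helps judge whether two components are enough.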
Dimensionality Reduction & Discrimination
- Linear Discriminant Analysis (LDA) — class-separating projections
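
Unlike PCA, LDA uses the class labels to find its projection. A minimal sketch, assuming the Iris dataset as stand-in data (the notebook's own dataset may differ):

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Assumption: Iris (3 classes, 4 features) is used purely for illustration.
X, y = load_iris(return_X_y=True)

# LDA can keep at most (n_classes - 1) components, so 2 is the maximum here.
lda = LinearDiscriminantAnalysis(n_components=2)
X_lda = lda.fit_transform(X, y)  # supervised: labels y are required

print("Projected shape:", X_lda.shape)
print("Training accuracy:", round(lda.score(X, y), 3))
```

The same fitted object also works as a classifier through `predict` and `score`, which is why LDA sits between dimensionality reduction and classification.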
Computer Vision
- Face Recognition — data collection pipeline using OpenCV
Data Visualization
- COVID-19 Visualizations — trend analysis and charting of pandemic data
- Python 3.x
- Jupyter Notebook
- pandas, numpy
- scikit-learn
- matplotlib, seaborn
- OpenCV (for face recognition)
- Clone the repository: `git clone https://github.com/janmejoykar1807/Internship_Data_Science.git`
- Install dependencies: `pip install pandas numpy scikit-learn matplotlib seaborn opencv-python jupyter`
- Launch Jupyter Notebook: `jupyter notebook`
- Open any `.ipynb` file to explore the topic.
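
If a notebook fails on import, a quick check like the one below can show which package from the install step is missing. This is a hypothetical helper, not part of the repository; the `REQUIRED` mapping and `missing_packages` function are names introduced here for illustration. It uses only the standard library, and it skips Jupyter itself, since launching the notebook server already confirms that install.

```python
from importlib.util import find_spec

# Maps each pip package name to the module name used in `import` statements.
# Note that scikit-learn imports as "sklearn" and opencv-python as "cv2".
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "scikit-learn": "sklearn",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "opencv-python": "cv2",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose import module cannot be located."""
    return [
        pip_name
        for pip_name, module in required.items()
        if find_spec(module) is None  # None means the module is not installed
    ]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed dependencies were found.")
```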
Janmejoy Kar

Data Science learner applying Python, R, and SQL for data analysis and predictive modeling.

GitHub Profile