Mariam-iftikhar/R-Coding-And-Markdown-Files

📊 R Programming & Statistical Analysis Portfolio

R RMarkdown RStudio Tidyverse


Overview

Welcome to my R Programming & Statistical Analysis Portfolio! This repository showcases my expertise in R programming, R Markdown, and statistical modeling through a comprehensive collection of academic and professional projects. Each project demonstrates my proficiency in data manipulation, statistical analysis, data visualization, and reproducible research practices using R.

As a Business Analytics graduate student at Roosevelt University, these projects reflect my strong foundation in statistical methods, predictive analytics, and data-driven decision-making, all essential skills for Data Analyst and Business Intelligence roles.

🎯 Repository Highlights

  • 6 Comprehensive R Projects covering statistical analysis and predictive analytics
  • R Markdown Documents for reproducible, professional analysis reports
  • Curated R Cheatsheets (data.table, xts) for quick reference and learning
  • Real-World Datasets including Houston flights data and statistical modeling scenarios
  • Academic Excellence demonstrating strong statistical theory and practical application
  • Clean, Documented Code following R best practices and coding standards

Projects

✈️ 1. Houston Flights Dataset Analysis

Folder: Houston_Flights_Dataset_Analysis_Assignment

Description:
Comprehensive analysis of Houston airport flight data examining patterns, delays, and operational insights. This project demonstrates data wrangling, exploratory analysis, and visualization skills using real aviation datasets.

Key Learnings:

  • Working with large-scale time series flight data
  • Analyzing flight delay patterns and causes
  • Identifying peak travel times and seasonal trends
  • Data cleaning and preprocessing for aviation datasets
  • Creating insightful visualizations for operational insights
  • Statistical testing for flight performance metrics

Technologies: R, R Markdown, dplyr, ggplot2, lubridate, tidyr

Skills Demonstrated: Data Wrangling, Time Series Analysis, Exploratory Data Analysis, Data Visualization, Statistical Testing
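
The cleaning and delay-summary steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `flights` data frame and its columns (`month`, `carrier`, `dep_delay`) are hypothetical stand-ins for the real Houston dataset, and base R's `aggregate` is used in place of dplyr so the snippet runs without extra packages.

```r
# Hypothetical flight records standing in for the real Houston flights data.
flights <- data.frame(
  month     = c(1, 1, 2, 2, 2, 3),
  carrier   = c("AA", "WN", "AA", "WN", "WN", "AA"),
  dep_delay = c(5, NA, 20, 0, 10, 35)   # minutes; NA marks a missing record
)

# Data cleaning: drop rows with a missing departure delay.
clean <- flights[!is.na(flights$dep_delay), ]

# Seasonal view: mean departure delay per month.
monthly_delay <- aggregate(dep_delay ~ month, data = clean, FUN = mean)

# Operational view: share of each carrier's flights delayed more than 15 minutes.
clean$late <- clean$dep_delay > 15
late_share <- aggregate(late ~ carrier, data = clean, FUN = mean)

print(monthly_delay)
print(late_share)
```

The same grouping logic maps directly onto `dplyr::group_by()` and `summarise()` when working with the tidyverse stack named above.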


🎯 2. Sampling and Group Analysis

Folder: Sampling_and_Group_Analysis_Assignment

Description:
Statistical analysis project focusing on sampling techniques, group comparisons, and hypothesis testing. This assignment demonstrates fundamental statistical concepts and their practical applications.

Key Learnings:

  • Understanding various sampling methods (random, stratified, systematic)
  • Conducting group comparisons using t-tests and ANOVA
  • Hypothesis testing and p-value interpretation
  • Confidence interval construction and interpretation
  • Sample size determination and power analysis
  • Statistical significance vs. practical significance

Technologies: R, R Markdown, Statistical Testing Packages

Skills Demonstrated: Statistical Inference, Hypothesis Testing, Sampling Theory, Group Analysis, Research Methodology
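
As a rough illustration of the sampling methods and group comparisons named above, the sketch below uses R's built-in `iris` data rather than the assignment's own dataset, which is an assumption made purely so the example is self-contained. Sample sizes and the sampling interval are arbitrary choices.

```r
set.seed(42)  # make the random draws reproducible

# Simple random sample: 10 rows drawn without replacement.
srs <- iris[sample(nrow(iris), 10), ]

# Stratified sample: 5 rows from each species.
strata <- split(iris, iris$Species)
stratified <- do.call(rbind, lapply(strata, function(d) d[sample(nrow(d), 5), ]))

# Systematic sample: every 15th row, starting from a random row in 1..15.
start <- sample(15, 1)
systematic <- iris[seq(start, nrow(iris), by = 15), ]

# Welch two-sample t-test on sepal length for two species.
# droplevels() removes the unused third level so the grouping factor has exactly two.
two_groups <- droplevels(subset(iris, Species != "virginica"))
t_res <- t.test(Sepal.Length ~ Species, data = two_groups)

# One-way ANOVA across all three species.
aov_res <- summary(aov(Sepal.Length ~ Species, data = iris))

print(t_res$p.value)   # p-value of the two-group comparison
print(t_res$conf.int)  # 95% confidence interval for the difference in means
print(aov_res)
```

A small p-value here indicates statistical significance only; whether the difference matters in practice depends on the effect size shown by the confidence interval.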


🔮 3. Predictive Analytics Assignment 1

Folder: Predictive_Analytics_Assignment_1

Description:
Introduction to predictive modeling focusing on linear regression analysis, model evaluation, and interpretation. This project builds the foundation for advanced predictive analytics techniques.

Key Learnings:

  • Building and interpreting linear regression models
  • Model assumptions testing (normality, homoscedasticity, linearity)
  • Feature selection and variable importance
  • Model evaluation using R-squared, RMSE, and MAE
  • Residual analysis and diagnostics
  • Generating predictions with confidence intervals

Technologies: R, R Markdown, Statistical Modeling, caret

Skills Demonstrated: Linear Regression, Model Diagnostics, Predictive Modeling, Statistical Inference
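
The regression workflow described above (fit, evaluate, diagnose, predict) might look something like the following sketch. It assumes the built-in `mtcars` data and a single predictor for brevity; the assignment's actual data and model are not shown in this README. The metrics are computed in-sample with base R rather than caret.

```r
# Fit a simple linear regression: fuel economy as a function of weight.
fit <- lm(mpg ~ wt, data = mtcars)

# In-sample evaluation metrics.
res  <- residuals(fit)
rmse <- sqrt(mean(res^2))          # root mean squared error
mae  <- mean(abs(res))             # mean absolute error
r2   <- summary(fit)$r.squared     # proportion of variance explained

# One assumption check: Shapiro-Wilk test on the residuals (normality).
normality <- shapiro.test(res)

# Prediction for a car with wt = 3 (weight in 1000 lbs),
# with a 95% confidence interval for the mean response.
pred <- predict(fit, newdata = data.frame(wt = 3), interval = "confidence")

print(c(rmse = rmse, mae = mae, r2 = r2))
print(normality$p.value)
print(pred)
```

Calling `plot(fit)` afterwards produces the standard residual diagnostic plots, which cover the linearity and homoscedasticity checks visually.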


📈 4. Predictive Analytics Assignment 2

Folder: Predictive_Analytics_Assignment_2

Description:
Advanced regression techniques including multiple regression, polynomial regression, and regularization methods. This project expands predictive modeling skills with more complex scenarios.

Key Learnings:

  • Multiple linear regression with several predictors
  • Handling multicollinearity (VIF analysis)
  • Polynomial and interaction terms
  • Ridge and Lasso regularization
  • Cross-validation techniques
  • Feature engineering and transformation

Technologies: R, R Markdown, glmnet, caret, Statistical Modeling

Skills Demonstrated: Multiple Regression, Regularization, Feature Engineering, Model Selection, Cross-Validation
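
Two of the ideas above, VIF analysis and cross-validation, can be sketched in base R as shown below. This is an illustrative assumption-laden example: it uses `mtcars`, an arbitrary choice of predictors, and a hypothetical helper `cv_rmse` written for this sketch. Ridge and Lasso are not shown because they rely on the glmnet package.

```r
# VIF for each predictor: 1 / (1 - R^2), where R^2 comes from
# regressing that predictor on the remaining predictors.
predictors <- c("wt", "hp", "disp")
vif <- sapply(predictors, function(p) {
  others <- setdiff(predictors, p)
  r2 <- summary(lm(reformulate(others, response = p), data = mtcars))$r.squared
  1 / (1 - r2)
})

# Hypothetical helper: k-fold cross-validated RMSE for a model formula.
cv_rmse <- function(formula, data, k = 5) {
  resp  <- all.vars(formula)[1]                               # response column name
  folds <- sample(rep(seq_len(k), length.out = nrow(data)))   # random fold labels
  sq_err <- unlist(lapply(seq_len(k), function(i) {
    fit  <- lm(formula, data = data[folds != i, ])            # train on k-1 folds
    test <- data[folds == i, ]                                # hold out one fold
    (test[[resp]] - predict(fit, newdata = test))^2
  }))
  sqrt(mean(sq_err))
}

set.seed(1)
rmse_linear <- cv_rmse(mpg ~ wt + hp, mtcars)           # main effects only
rmse_poly   <- cv_rmse(mpg ~ poly(wt, 2) + hp, mtcars)  # quadratic term in wt
rmse_inter  <- cv_rmse(mpg ~ wt * hp, mtcars)           # interaction term

print(vif)
print(c(linear = rmse_linear, poly = rmse_poly, interaction = rmse_inter))
```

A common rule of thumb treats VIF values above roughly 5 to 10 as a sign of problematic multicollinearity, and the cross-validated RMSE gives a fairer basis for comparing the three specifications than in-sample fit.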


🤖 5. Predictive Analytics Assignment 3

Folder: Predictive_Analytics_Assignment_3

Description:
Classification modeling project focusing on logistic regression and model evaluation for categorical outcomes. This assignment demonstrates skills in binary and multinomial classification.

Key Learnings:

  • Logistic regression for binary classification
  • Odds ratios and probability interpretation
  • Classification metrics (accuracy, precision, recall, F1-score)
  • ROC curves and AUC analysis
  • Confusion matrix interpretation
  • Threshold optimization for classification

Technologies: R, R Markdown, Logistic Regression, pROC, caret

Skills Demonstrated: Classification Modeling, Logistic Regression, Model Evaluation, ROC Analysis, Probability Modeling
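
The classification metrics above can be illustrated with a small base R sketch. It assumes the built-in `mtcars` data (predicting transmission type `am` from weight), a 0.5 threshold, and in-sample evaluation. AUC is computed by hand through the rank-sum (Mann-Whitney) identity instead of pROC, so no extra packages are needed.

```r
# Logistic regression: probability of a manual transmission given weight.
fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Odds ratios: multiplicative change in odds per unit increase in a predictor.
odds_ratio <- exp(coef(fit))

# Predicted probabilities and class labels at a 0.5 threshold.
prob   <- predict(fit, type = "response")
pred   <- as.integer(prob >= 0.5)
actual <- mtcars$am

# Confusion matrix with fixed levels so both classes always appear.
conf_mat <- table(Predicted = factor(pred, levels = 0:1),
                  Actual    = factor(actual, levels = 0:1))
tp <- conf_mat["1", "1"]; fp <- conf_mat["1", "0"]; fn <- conf_mat["0", "1"]

accuracy  <- mean(pred == actual)
precision <- tp / (tp + fp)
recall    <- tp / (tp + fn)
f1        <- 2 * precision * recall / (precision + recall)

# AUC via ranks: probability that a random positive scores above a random negative.
n1  <- sum(actual == 1)
n0  <- sum(actual == 0)
auc <- (sum(rank(prob)[actual == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

print(conf_mat)
print(c(accuracy = accuracy, precision = precision, recall = recall, f1 = f1, auc = auc))
```

Moving the threshold away from 0.5 trades precision against recall, which is the basis of the threshold optimization mentioned in the list.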


🎯 6. Predictive Analytics Assignment 4

Folder: Predictive_Analytics_Assignment_4

Description:
Advanced machine learning techniques including decision trees, ensemble methods, and model comparison. This project showcases expertise in modern predictive analytics approaches.

Key Learnings:

  • Decision trees and tree-based models
  • Random forests and ensemble methods
  • Model comparison and selection strategies
  • Handling imbalanced datasets
  • Feature importance from tree-based models
  • Advanced model evaluation techniques

Technologies: R, R Markdown, randomForest, rpart, caret, Machine Learning

Skills Demonstrated: Tree-Based Models, Ensemble Learning, Random Forests, Model Comparison, Advanced ML Techniques
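
A compact sketch of a single decision tree compared with a simple ensemble is given below. It assumes the built-in `iris` data, an arbitrary 100/50 train/test split, and 25 bootstrap trees. The ensemble here is plain bagging of rpart trees, which is a simpler relative of a random forest (it does not subsample features at each split); a real random forest would use the randomForest package listed above.

```r
library(rpart)  # recommended package shipped with standard R installations

set.seed(7)
idx   <- sample(nrow(iris), 100)
train <- iris[idx, ]
test  <- iris[-idx, ]

# Single classification tree.
tree      <- rpart(Species ~ ., data = train, method = "class")
tree_pred <- predict(tree, newdata = test, type = "class")
tree_acc  <- mean(tree_pred == test$Species)

# Bagging: fit one tree per bootstrap resample and collect its test predictions.
n_trees <- 25
votes <- sapply(seq_len(n_trees), function(i) {
  boot <- train[sample(nrow(train), replace = TRUE), ]
  fit  <- rpart(Species ~ ., data = boot, method = "class")
  as.character(predict(fit, newdata = test, type = "class"))
})

# Majority vote across trees for each test row.
bag_pred <- apply(votes, 1, function(v) names(which.max(table(v))))
bag_acc  <- mean(bag_pred == test$Species)

# Feature importance as reported by the single tree.
importance <- tree$variable.importance

print(c(tree = tree_acc, bagged = bag_acc))
print(importance)
```

Comparing `tree_acc` and `bag_acc` on the same held-out rows is one simple form of the model comparison described in the list.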


📚 R Cheatsheets & Resources

This repository includes carefully curated reference materials to support R programming workflow:

data_table_cheat_sheet.pdf

Comprehensive guide to the data.table package for high-performance data manipulation

  • Fast data aggregation and summarization
  • Efficient joins and reshaping operations
  • Memory-efficient data processing
  • Advanced data.table syntax and operations

xts_Cheat_Sheet.pdf

Extensible Time Series (xts) package reference for time series data manipulation

  • Time series object creation and manipulation
  • Date/time indexing and subsetting
  • Time-based aggregations
  • Time series plotting and analysis

Purpose: These cheatsheets serve as quick references to enhance workflow efficiency and support continuous learning in R programming, benefiting both learners and professionals.


🛠️ Technical Skills Demonstrated

Programming & Tools

  • R Programming: Advanced R syntax, functions, packages, and best practices
  • R Markdown: Reproducible research, literate programming, professional reports
  • RStudio: Integrated development environment proficiency
  • Version Control: Git/GitHub for code management and collaboration

Statistical Analysis

  • Descriptive Statistics: Summary statistics, distributions, data exploration
  • Inferential Statistics: Hypothesis testing, confidence intervals, p-values
  • Regression Analysis: Linear, multiple, polynomial, logistic regression
  • Predictive Modeling: Machine learning algorithms, model evaluation, validation
  • Time Series Analysis: Temporal patterns, seasonality, trend analysis
  • Statistical Testing: t-tests, ANOVA, chi-square, correlation tests

Data Manipulation & Visualization

  • Data Wrangling: dplyr, tidyr, data.table for efficient data manipulation
  • Data Visualization: ggplot2 for professional, publication-quality graphics
  • Data Cleaning: Handling missing values, outliers, data quality issues
  • Feature Engineering: Creating derived variables, transformations, encoding

Predictive Analytics & Machine Learning

  • Supervised Learning: Regression and classification models
  • Model Evaluation: Cross-validation, performance metrics, ROC analysis
  • Regularization: Ridge, Lasso for overfitting prevention
  • Ensemble Methods: Random forests, boosting, bagging
  • Feature Selection: Variable importance, stepwise selection

Research & Documentation

  • Reproducible Research: R Markdown for transparent, repeatable analysis
  • Technical Writing: Clear documentation, code comments, professional reports
  • Statistical Reporting: Communicating findings to technical and non-technical audiences
  • Data Storytelling: Creating narratives from statistical insights

💼 Business Impact

These R programming projects demonstrate my ability to:

✅ Conduct Rigorous Statistical Analysis: Apply appropriate statistical methods to answer business questions
✅ Build Predictive Models: Create accurate models to forecast outcomes and support decision-making
✅ Extract Insights from Data: Transform raw data into actionable intelligence through statistical analysis
✅ Ensure Reproducibility: Document analysis workflows for transparency and repeatability
✅ Communicate Complex Results: Present statistical findings clearly to diverse stakeholders
✅ Apply Academic Excellence: Demonstrate strong theoretical foundation combined with practical skills


🔗 Why R Programming?

R is one of the most widely used languages for statistical computing and data science, offering:

  • Comprehensive Statistical Capabilities: Industry-leading statistical packages and methods
  • Data Visualization Excellence: ggplot2 and other libraries for stunning, informative graphics
  • Reproducible Research: R Markdown enables transparent, repeatable analysis workflows
  • Active Community: Extensive package ecosystem (CRAN) with 18,000+ packages
  • Academic & Industry Adoption: Widely used in research, healthcare, finance, and tech
  • Open Source: Free, community-driven, continuously evolving language

📂 Related Portfolios

Explore my other professional work:


💪 Core Competencies

  • Programming: R (Advanced), Python, SQL
  • Statistical Analysis: Regression, Hypothesis Testing, Predictive Modeling, Time Series
  • Data Science: Machine Learning, Statistical Modeling, Data Mining
  • Business Intelligence: Power BI, Data Visualization, Dashboard Development
  • Tools: RStudio, Jupyter Notebooks, Git/GitHub, Excel
  • Business Skills: Analytical Thinking, Problem-solving, Communication, Research

📫 Contact

I'm always open to connecting with fellow data professionals, discussing opportunities, or collaborating on interesting projects!


⚖️ License

This repository is for educational and portfolio purposes. The code and analysis are available for learning and reference. Please feel free to explore and reach out with any questions!


📚 Additional Resources

Interested in learning R? Here are some valuable resources:

  • R for Data Science by Hadley Wickham & Garrett Grolemund
  • CRAN: The Comprehensive R Archive Network
  • RStudio Cheatsheets: Quick references for popular R packages
  • R-bloggers: Community-driven R news and tutorials

⭐ If you find these R projects helpful or interesting, please consider starring this repository!

Last Updated: December 2025

About

A collection of R projects showcasing my skills in data analysis, visualization, and statistical modeling. These projects highlight my ability to clean, analyze, and interpret data while applying efficient coding practices.