nnttvy/BigData-Project

This project collects large-scale YouTube video data and analyzes it with two goals: predicting video topics and building a content recommendation system.

We gathered the data through YouTube's APIs, including video titles, descriptions, tags, view counts, and like counts, among other fields. After preprocessing and exploratory data analysis (EDA) to gain initial insights, my team and I applied machine learning, deep learning, and natural language processing (NLP) techniques to predict video topics. Specifically, we used Count Vectorizer, Word2Vec, Naive Bayes, Linear Regression, and LSTM to determine each video's main topic from its title and description.
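To illustrate the topic prediction step, here is a minimal, dependency-free sketch of bag-of-words counts feeding a multinomial Naive Bayes classifier. It is not the project's actual code: the class name `NaiveBayesTopicClassifier`, the `tokenize` helper, and the sample titles and topic labels are all hypothetical, and a real pipeline would more likely rely on a library such as scikit-learn.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesTopicClassifier:
    """Multinomial Naive Bayes over word counts, with Laplace (add-one) smoothing."""

    def fit(self, texts, topics):
        self.word_counts = defaultdict(Counter)  # topic -> word -> count
        self.topic_counts = Counter(topics)      # topic -> number of documents
        self.vocabulary = set()
        for text, topic in zip(texts, topics):
            tokens = tokenize(text)
            self.word_counts[topic].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        # Words never seen during training carry no information, so skip them.
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        total_docs = sum(self.topic_counts.values())
        vocab_size = len(self.vocabulary)
        best_topic, best_score = None, -math.inf
        for topic in sorted(self.topic_counts):
            counts = self.word_counts[topic]
            total_words = sum(counts.values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.topic_counts[topic] / total_docs)
            for token in tokens:
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_topic, best_score = topic, score
        return best_topic


# Made-up sample data standing in for scraped titles and topic labels.
train_texts = [
    "Easy pasta recipe for dinner",
    "How to bake chocolate cake at home",
    "Top 10 football goals of the season",
    "Basketball highlights and best dunks",
]
train_topics = ["cooking", "cooking", "sports", "sports"]

model = NaiveBayesTopicClassifier().fit(train_texts, train_topics)
prediction = model.predict("Quick chocolate pasta recipe")
print(prediction)
```

Working in log space avoids numeric underflow when many word probabilities are multiplied, and add-one smoothing keeps a word that is missing from one topic from zeroing out that topic's score.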

Finally, we developed a recommendation system that uses TF-IDF and Logistic Regression, together with a collaborative filtering approach, to suggest relevant videos based on user preferences and past behavior.
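As a rough illustration of the TF-IDF part only, the sketch below ranks videos by cosine similarity between TF-IDF vectors of their titles. This is a simplified content-based similarity example, not the collaborative filtering or Logistic Regression steps described above, and not the project's actual code. The function names (`tfidf_vectors`, `cosine_similarity`, `recommend`) and the sample titles are hypothetical, and the smoothed IDF formula is one common variant chosen for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document, using a smoothed IDF."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in term_freq.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(titles, watched_index, top_n=2):
    """Return the top_n titles most similar to the watched one."""
    vectors = tfidf_vectors(titles)
    watched = vectors[watched_index]
    scored = [
        (cosine_similarity(watched, vec), i)
        for i, vec in enumerate(vectors)
        if i != watched_index
    ]
    # Highest similarity first; ties fall back to the original order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [titles[i] for _, i in scored[:top_n]]


# Made-up sample titles standing in for the scraped catalogue.
titles = [
    "Python tutorial for beginners",
    "Advanced Python tutorial: decorators",
    "Easy pasta recipe",
    "Learn Python in one hour",
]
recommendations = recommend(titles, watched_index=0, top_n=2)
print(recommendations)
```

Terms that appear in many titles receive a lower IDF weight, so shared rare words contribute more to the similarity score than shared common ones.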

Note: This README is under revision and will be updated soon.

About

A Big Data and Applications course project within the curriculum of the Data Science specialization at UEH University.
