sonalgaud12/Quora_QuestionPair

Quora_QuestionPair

• Prepared the dataset (404,290 rows) with Natural Language Processing techniques: removed stopwords, punctuation, and hyperlinks, then applied tokenization and stemming.
• Extracted basic features and advanced features, including fuzzy-matching (Fuzz) features, and examined feature importance.
• Converted the text to numerical vectors with a TF-IDF vectorizer, fitted Logistic Regression and XGBoost models, and tuned the XGBoost hyperparameters, reaching an AUC of 0.91 and an accuracy of 83%.
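
As a rough illustration of the cleaning and feature-extraction steps described above, here is a minimal sketch using only the Python standard library. It is not the repository's code: the stopword list, the `crude_stem` helper, the feature names, and the use of `difflib.SequenceMatcher` as a stand-in for a fuzzy-matching library are all assumptions made for the example.

```python
import re
import string
from difflib import SequenceMatcher

# Tiny illustrative stopword list (assumption; a real pipeline would use a fuller list).
STOPWORDS = {"a", "an", "the", "is", "are", "do", "does", "how", "what",
             "i", "to", "of", "in", "can"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def crude_stem(token):
    """Very rough suffix stripping; a stand-in for a real stemmer such as Porter."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, drop hyperlinks and punctuation, tokenize, remove stopwords, stem."""
    text = URL_RE.sub(" ", text.lower())
    text = text.translate(PUNCT_TABLE)
    tokens = text.split()  # whitespace tokenization
    return [crude_stem(t) for t in tokens if t not in STOPWORDS]


def pair_features(q1, q2):
    """Compute a few hypothetical basic and fuzzy features for one question pair."""
    t1, t2 = preprocess(q1), preprocess(q2)
    s1, s2 = set(t1), set(t2)
    common = len(s1 & s2)
    total = len(s1) + len(s2)
    return {
        "q1_len": len(t1),
        "q2_len": len(t2),
        "common_words": common,
        "word_share": common / total if total else 0.0,
        # Character-level similarity in [0, 1]; stands in for a fuzz ratio.
        "fuzz_ratio": SequenceMatcher(None, " ".join(t1), " ".join(t2)).ratio(),
    }


if __name__ == "__main__":
    print(pair_features("What is the best way to learn Python?",
                        "How can I learn Python quickly?"))
```

Feature dictionaries like these could be stacked alongside TF-IDF vectors before fitting a classifier; the exact feature set used in the project is not specified here.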

