Waheed-6907/CodeAlpha_Web_Scraping

Web Scraping Project - CodeAlpha Internship

📌 Objective

This project extracts data from two websites with Python and performs basic analysis on the results.

🌐 Websites Used

  1. http://quotes.toscrape.com
  2. https://realpython.github.io/fake-jobs/

🛠️ Technologies Used

  • Python
  • BeautifulSoup
  • Requests
  • Pandas

🔍 Features

  • Scrapes quotes (text and author) from the Quotes to Scrape website
  • Scrapes job data (title, company, location) from the Fake Jobs website
  • Stores the data in a structured format using pandas
  • Exports the datasets to Excel files
  • Performs filtering and basic analysis
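The scraping features above can be sketched roughly as follows. This is a minimal illustration, not the code in scraper1.py or scraper2.py: the function names (`fetch_html`, `parse_quotes`, `parse_jobs`) are hypothetical, and the CSS selectors are assumptions about the markup of the two sites that may need adjusting if the pages change.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url: str) -> str:
    """Download a page and return its HTML, raising on HTTP errors."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def parse_quotes(html: str) -> list[dict]:
    """Extract quote text and author from a Quotes to Scrape page.

    Assumed markup: <div class="quote"> containing <span class="text">
    and <small class="author">.
    """
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for block in soup.select("div.quote"):
        text = block.select_one("span.text")
        author = block.select_one("small.author")
        # Skip blocks that are missing either field.
        if text is not None and author is not None:
            rows.append({
                "text": text.get_text(strip=True),
                "author": author.get_text(strip=True),
            })
    return rows


def parse_jobs(html: str) -> list[dict]:
    """Extract title, company and location from a Fake Jobs page.

    Assumed markup: <div class="card-content"> containing <h2 class="title">,
    <h3 class="company"> and <p class="location">.
    """
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.select("div.card-content"):
        title = card.select_one("h2.title")
        company = card.select_one("h3.company")
        location = card.select_one("p.location")
        # Skip cards that are missing any of the three fields.
        if title is not None and company is not None and location is not None:
            rows.append({
                "title": title.get_text(strip=True),
                "company": company.get_text(strip=True),
                "location": location.get_text(strip=True),
            })
    return rows
```

Calling `parse_quotes(fetch_html("http://quotes.toscrape.com"))` would return a list of dictionaries that can be passed straight to `pandas.DataFrame`. Keeping the download and the parsing in separate functions makes the parsing easy to check against a saved HTML snippet without touching the network.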

📊 Analysis Performed

  • Filtered jobs based on location
  • Filtered jobs based on domain (Python, Teaching)
  • Found most common job locations
  • Basic exploration of the quotes dataset
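The job analysis steps listed above could look something like this in pandas. It is a sketch under stated assumptions: the column names `title` and `location` and the helper names are hypothetical, and the project's scripts may implement the filters differently.

```python
import pandas as pd


def filter_by_location(jobs: pd.DataFrame, location: str) -> pd.DataFrame:
    """Return jobs whose location contains the given text (case-insensitive)."""
    mask = jobs["location"].str.contains(location, case=False, na=False, regex=False)
    return jobs[mask]


def filter_by_keyword(jobs: pd.DataFrame, keyword: str) -> pd.DataFrame:
    """Return jobs whose title contains the keyword, e.g. "python" or "teach"."""
    mask = jobs["title"].str.contains(keyword, case=False, na=False, regex=False)
    return jobs[mask]


def top_locations(jobs: pd.DataFrame, n: int = 5) -> pd.Series:
    """Return the n most frequent locations with their job counts."""
    return jobs["location"].value_counts().head(n)
```

A plain substring match (`regex=False`) is used so that characters such as `+` or `.` in a search term are taken literally. A filtered frame can then be written out with `DataFrame.to_excel("fake_jobs.xlsx", index=False)`, which relies on the openpyxl package being installed.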

📁 Output Files

  • fake_jobs.xlsx → job dataset
  • quotes.xlsx → quotes dataset

🚀 How to Run

  1. Install the required libraries (pandas needs openpyxl to write .xlsx files):

    pip install requests beautifulsoup4 pandas openpyxl
    
  2. Run the scripts:

    python scraper1.py
    python scraper2.py
    

📌 Author

Waheed Mujtaba

