RedScraperPro: a powerful, open-source, cross-platform, feature-rich CLI tool for crawling and scraping Reddit data, including real-time data, fully automated, with export to multiple formats, sentiment analysis, and SWOT features. Built with coffee and Python.

🔴 RedScraperPro

██████╗ ███████╗██████╗ ███████╗ ██████╗██████╗  █████╗ ██████╗ ███████╗██████╗ ██████╗ ██████╗  ██████╗
██╔══██╗██╔════╝██╔══██╗██╔════╝██╔════╝██╔══██╗██╔══██╗██╔══██╗██╔════╝██╔══██╗██╔══██╗██╔══██╗██╔═══██╗
██████╔╝█████╗  ██║  ██║███████╗██║     ██████╔╝███████║██████╔╝█████╗  ██████╔╝██████╔╝██████╔╝██║   ██║
██╔══██╗██╔══╝  ██║  ██║╚════██║██║     ██╔══██╗██╔══██║██╔═══╝ ██╔══╝  ██╔══██╗██╔═══╝ ██╔══██╗██║   ██║
██║  ██║███████╗██████╔╝███████║╚██████╗██║  ██║██║  ██║██║     ███████╗██║  ██║██║     ██║  ██║╚██████╔╝
╚═╝  ╚═╝╚══════╝╚═════╝ ╚══════╝ ╚═════╝╚═╝  ╚═╝╚═╝  ╚═╝╚═╝     ╚══════╝╚═╝  ╚═╝╚═╝     ╚═╝  ╚═╝ ╚═════╝

    🩸 The Ultimate Reddit Scraping CLI Tool 🩸
    "In the darkness of data, we find the light of knowledge"

⚠️ EDUCATIONAL PURPOSE ONLY
This tool is designed for educational purposes, research, and legitimate data analysis only. The author is not responsible for any misuse. Please ensure you comply with Reddit's Terms of Service, API guidelines, and respect rate limits. Always use this tool responsibly and ethically.



🎯 Features

Core Scraping Capabilities

  • ✅ Posts & Comments Scraping - Extract both posts and their comments
  • ✅ Multiple Scraping Modes:
    • 🔍 Keyword-based scraping
    • 🏘️ Subreddit scraping
    • 👤 User profile scraping
    • 📝 Individual post scraping
  • ✅ Real-time Scraping - Live data extraction as it happens
  • ✅ Resume Interrupted Scraping - Continue where you left off
  • ✅ Configurable Depth Limits - Control how deep to scrape
  • ✅ Native Command-Line Access - Run the tool from anywhere on your system with simple aliases like rsp or redscraperpro.
  • ✅ Robust & Standardized Installation - Uses modern Python packaging to create reliable, cross-platform launchers for Windows, macOS, and Linux.
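
One way to picture the resume feature is a small checkpoint file that records which post IDs have already been processed. The sketch below is only an illustration under that assumption: the file layout and the helpers `load_checkpoint`, `save_checkpoint`, and `scrape_with_resume` are hypothetical and are not taken from RedScraperPro's source.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the set of already-processed IDs, or an empty set if no checkpoint exists."""
    if not os.path.exists(path):
        return set()
    with open(path, "r", encoding="utf-8") as fh:
        return set(json.load(fh)["seen_ids"])


def save_checkpoint(path, seen_ids):
    """Write to a temporary file first, then rename, so an interruption cannot leave a half-written checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump({"seen_ids": sorted(seen_ids)}, fh)
    os.replace(tmp_path, path)


def scrape_with_resume(post_ids, path, process):
    """Process each ID once, skipping IDs recorded by an earlier run."""
    seen = load_checkpoint(path)
    for post_id in post_ids:
        if post_id in seen:
            continue
        process(post_id)
        seen.add(post_id)
        save_checkpoint(path, seen)
    return seen
```

Saving after every item keeps the sketch simple; a real scraper would more likely save every N items to reduce disk writes.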

Export & Data Management

  • 📊 Multiple Export Formats: CSV, XLSX, JSON, TXT
  • 🧹 Duplicate Detection & Removal - Clean your data automatically
  • 📈 Optional Sentiment Analysis - Lightweight sentiment scoring
  • 📋 Comprehensive Data Fields - Title, author, score, timestamp, awards, and more
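
As a rough illustration of duplicate removal followed by export, here is a minimal standard-library sketch. It assumes each scraped record is a dictionary with an `id` key; the function names `drop_duplicates` and `export_records` are hypothetical, and only CSV and JSON are covered (XLSX and TXT are left out).

```python
import csv
import json


def drop_duplicates(records, key="id"):
    """Keep the first record for each key value, preserving the original order."""
    seen = set()
    unique = []
    for record in records:
        if record[key] not in seen:
            seen.add(record[key])
            unique.append(record)
    return unique


def export_records(records, path, fmt):
    """Write records as CSV or JSON; other formats are outside this sketch."""
    if fmt == "json":
        with open(path, "w", encoding="utf-8") as fh:
            json.dump(records, fh, ensure_ascii=False, indent=2)
    elif fmt == "csv":
        # Column order follows the keys of the first record.
        fieldnames = list(records[0].keys()) if records else []
        with open(path, "w", encoding="utf-8", newline="") as fh:
            writer = csv.DictWriter(fh, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(records)
    else:
        raise ValueError(f"unsupported format: {fmt}")
```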

User Experience

  • 🎨 Beautiful ASCII Art - Horror/Itachi Uchiha themed interface
  • 📊 Real-time Progress Tracking - See your scraping progress live
  • 🔧 Configuration Wizard - Easy first-time setup
  • 📱 Cross-platform - Works on Windows, macOS, and Linux
  • 🆘 Comprehensive Help System - Built-in documentation and examples
  • 💭 Inspirational Quotes - Stoic, Kafka, Dostoevsky, and Itachi-themed quotes

🚀 The RedOcean Ecosystem

RedScraperPro is a core component of the RedOcean Ecosystem, an all-in-one suite of tools designed to provide an end-to-end workflow for market intelligence, from data collection to strategic action.

| Tool | Purpose | Status |
|---|---|---|
| 🔴 RedScraperPro | Data Collection | ✅ Live |
| 🔵 RedOceanRadar | Strategic Analysis | 🧪 Beta (Not Stable) "Next Month" |
| ⚫ RedNexusPro | Contact & Lead Generation | ✅ Live "Next Month" |
| 🟡 CryptoSleuth | Cryptocurrency Intelligence | ✅ Live "Next Month" |

🚀 Quick Start

Prerequisites

  • Python 3.8 or higher
  • Reddit API credentials (PRAW)

Installation

A simple installation script is provided to set up the tool and its dependencies.

# Clone the repository
git clone https://github.com/yomazini/RedScraperPro.git
cd RedScraperPro

# Run the installation script
chmod +x install.sh && ./install.sh

# Activate the virtual environment
source venv/bin/activate

After installation, you can run the tool using redscraperpro, rsp, or python3 src/redscraperpro/main.py.


Getting Reddit API Credentials

📖 Detailed Guide: How to Get PRAW API Credentials

Quick steps:

  1. Go to Reddit Apps
  2. Click "Create App" or "Create Another App"
  3. Choose "script" as the app type
  4. Note down your client_id and client_secret, and set your user_agent. You can find your user agent here: What is my User Agent?
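
Once you have the three values, one common pattern is to keep them out of source code and read them from environment variables. The sketch below assumes hypothetical variable names (`REDDIT_CLIENT_ID`, `REDDIT_CLIENT_SECRET`, `REDDIT_USER_AGENT`) and a hypothetical helper `load_reddit_credentials`; RedScraperPro's own wizard may store credentials differently.

```python
import os

REQUIRED_KEYS = ("client_id", "client_secret", "user_agent")


def load_reddit_credentials(env=None):
    """Collect the three PRAW settings from environment variables (names are assumptions)."""
    env = os.environ if env is None else env
    creds = {key: env.get(f"REDDIT_{key.upper()}", "").strip() for key in REQUIRED_KEYS}
    missing = [key for key, value in creds.items() if not value]
    if missing:
        raise ValueError("missing Reddit credentials: " + ", ".join(missing))
    return creds


# With PRAW installed, the dictionary can be passed straight to the client:
#   import praw
#   reddit = praw.Reddit(**load_reddit_credentials())
```

Failing early with a clear message is usually friendlier than letting the first API call fail with an authentication error.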

🎮 Usage

Interactive Mode

# Run using any of the aliases
rsp
# or
redscraperpro

Command Line Mode

# Scrape by keyword
rsp --mode keyword --query "python programming" --limit 100

# Scrape subreddit
redscraperpro --mode subreddit --target "programming" --limit 50

# Scrape user posts
rsp --mode user --target "username" --limit 25

# Export to different formats
rsp --mode keyword --query "AI" --export xlsx --output "ai_posts"
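
For readers curious how flags like these are typically wired up, here is a minimal `argparse` sketch that accepts the options shown above. It is not RedScraperPro's actual parser: the default values, the `post` mode name, and the validation rules are assumptions made for illustration.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="rsp", description="Reddit scraping CLI (sketch)")
    # "post" is an assumed mode name; the examples above only show keyword, subreddit, and user.
    parser.add_argument("--mode", choices=["keyword", "subreddit", "user", "post"], required=True)
    parser.add_argument("--query", help="search terms, used with --mode keyword")
    parser.add_argument("--target", help="subreddit, username, or post, used with the other modes")
    parser.add_argument("--limit", type=int, default=100)  # default is an assumption
    parser.add_argument("--export", choices=["csv", "xlsx", "json", "txt"], default="csv")  # assumption
    parser.add_argument("--output", help="output file name without extension")
    return parser


def parse_args(argv=None):
    """Parse arguments and check that the chosen mode has the option it needs."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.mode == "keyword" and not args.query:
        parser.error("--mode keyword requires --query")
    if args.mode != "keyword" and not args.target:
        parser.error(f"--mode {args.mode} requires --target")
    return args
```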

Configuration

The first time you run the tool, a configuration wizard will launch to help you set up:

  • Reddit API credentials
  • Default export settings
  • Scraping preferences
  • Output directories
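
A configuration wizard usually ends by writing a settings file that later runs read back. The sketch below shows one plausible shape for that file, merged over built-in defaults. The section names, keys, default values, and the helpers `load_config` and `save_config` are all assumptions for illustration, not the tool's real schema.

```python
import json
from pathlib import Path

# Hypothetical defaults mirroring the four areas the wizard asks about.
DEFAULTS = {
    "credentials": {"client_id": "", "client_secret": "", "user_agent": ""},
    "export": {"format": "csv", "directory": "exports"},
    "scraping": {"limit": 100, "sentiment": False},
}


def load_config(path):
    """Merge a user's JSON config over the defaults, one section at a time."""
    config = {section: dict(values) for section, values in DEFAULTS.items()}
    file = Path(path)
    if file.exists():
        user = json.loads(file.read_text(encoding="utf-8"))
        for section, values in user.items():
            config.setdefault(section, {}).update(values)
    return config


def save_config(path, config):
    """Write the config as JSON, creating the parent directory if needed."""
    file = Path(path)
    file.parent.mkdir(parents=True, exist_ok=True)
    file.write_text(json.dumps(config, indent=2), encoding="utf-8")
```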

๐Ÿ“ Project Structure

.
โ”œโ”€โ”€ RedScraperPro/
โ”‚   โ”œโ”€โ”€ assets/
โ”‚   โ”‚   โ”œโ”€โ”€ ascii_art.txt
โ”‚   โ”‚   โ””โ”€โ”€ quotes.json
โ”‚   โ”œโ”€โ”€ config/
โ”‚   โ”‚   โ””โ”€โ”€ readme.md
โ”‚   โ”œโ”€โ”€ docs/
โ”‚   โ”‚   โ”œโ”€โ”€ installation.md
โ”‚   โ”‚   โ”œโ”€โ”€ praw-setup.md
โ”‚   โ”‚   โ”œโ”€โ”€ sentiment_analysis.md
โ”‚   โ”‚   โ”œโ”€โ”€ troubleshooting.md
โ”‚   โ”‚   โ””โ”€โ”€ usage-examples.md
โ”‚   โ”œโ”€โ”€ examples/
โ”‚   โ”‚   โ””โ”€โ”€ basic_scraping.py
โ”‚   โ”œโ”€โ”€ exports/
โ”‚   โ”‚   โ””โ”€โ”€ README.md
โ”‚   โ”œโ”€โ”€ logs/
โ”‚   โ”‚   โ””โ”€โ”€ README.md
โ”‚   โ”œโ”€โ”€ src/
โ”‚   โ”‚   โ””โ”€โ”€ redscraperpro/
โ”‚   โ”‚       โ”œโ”€โ”€ cli/
โ”‚   โ”‚       โ”‚   โ”œโ”€โ”€ __init__.py
โ”‚   โ”‚       โ”‚   โ”œโ”€โ”€ interface.py
โ”‚   โ”‚       โ”‚   โ””โ”€โ”€ wizard.py
โ”‚   โ”‚       โ”œโ”€โ”€ exporters/
โ”‚   โ”‚       โ”‚   โ”œโ”€โ”€ __init__.py
โ”‚   โ”‚       โ”‚   โ”œโ”€โ”€ csv_exporter.py
โ”‚   โ”‚       โ”‚   โ”œโ”€โ”€ json_exporter.py
โ”‚   โ”‚       โ”‚   โ”œโ”€โ”€ txt_exporter.py
โ”‚   โ”‚       โ”‚   โ””โ”€โ”€ xlsx_exporter.py
โ”‚   โ”‚       โ”œโ”€โ”€ scraper/
โ”‚   โ”‚       โ”‚   โ”œโ”€โ”€ __init__.py
โ”‚   โ”‚       โ”‚   โ”œโ”€โ”€ comment_scraper.py
โ”‚   โ”‚       โ”‚   โ”œโ”€โ”€ post_scraper.py
โ”‚   โ”‚       โ”‚   โ”œโ”€โ”€ reddit_scraper.py
โ”‚   โ”‚       โ”‚   โ””โ”€โ”€ user_scraper.py
โ”‚   โ”‚       โ”œโ”€โ”€ utils/
โ”‚   โ”‚       โ”‚   โ”œโ”€โ”€ __init__.py
โ”‚   โ”‚       โ”‚   โ”œโ”€โ”€ ascii_art.py
โ”‚   โ”‚       โ”‚   โ”œโ”€โ”€ config.py
โ”‚   โ”‚       โ”‚   โ”œโ”€โ”€ logger.py
โ”‚   โ”‚       โ”‚   โ”œโ”€โ”€ progress.py
โ”‚   โ”‚       โ”‚   โ”œโ”€โ”€ quotes.py
โ”‚   โ”‚       โ”‚   โ””โ”€โ”€ sentiment.py
โ”‚   โ”‚       โ”œโ”€โ”€ __init__.py
โ”‚   โ”‚       โ””โ”€โ”€ main.py
โ”‚   โ”œโ”€โ”€ tests/
โ”‚   โ”œโ”€โ”€ DOCUMENTATION.md
โ”‚   โ”œโ”€โ”€ FINAL_SUMMARY.md
โ”‚   โ”œโ”€โ”€ LICENSE
โ”‚   โ”œโ”€โ”€ NOTICE
โ”‚   โ”œโ”€โ”€ PROJECT_SUMMARY.md
โ”‚   โ”œโ”€โ”€ install.sh
โ”‚   โ”œโ”€โ”€ notes_after_somefixed.md
โ”‚   โ”œโ”€โ”€ requirements.txt
โ”‚   โ””โ”€โ”€ setup.py
โ”œโ”€โ”€ README.md
โ””โ”€โ”€ logo.png

💡 Best Usage / Monetization "MUST READ"

A detailed article covers the best usage and monetization strategies for RedScraperPro:


🎨 Themes & Aesthetics

RedScraperPro features a unique Horror/Itachi Uchiha aesthetic with:

  • 🔴 Red color scheme throughout the interface
  • 🩸 Dark, mysterious ASCII art
  • 💭 Philosophical quotes from Stoic philosophers, Kafka, Dostoevsky
  • ⚡ Itachi Uchiha-inspired themes and quotes
  • 🌙 Dark terminal-friendly design

📊 Data Fields Extracted

Posts

  • Title, Author, Score (upvotes/downvotes)
  • Creation timestamp, URL, Flair
  • Number of comments, Awards
  • Subreddit, Post ID, Permalink
  • Content/Selftext, Media URLs
  • Optional: Sentiment score

Comments

  • Comment body, Author, Score
  • Creation timestamp, Comment ID
  • Parent comment ID, Depth level
  • Awards, Controversiality
  • Optional: Sentiment score
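
To make the "Parent comment ID" and "Depth level" fields concrete, here is a small sketch that derives each comment's depth from its chain of parents and then applies a depth limit. The `CommentRecord` class and its field names are hypothetical and cover only a subset of the fields listed above; the sketch also assumes the parent links form a tree with no cycles.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CommentRecord:
    comment_id: str
    parent_id: Optional[str]  # None for a top-level comment
    author: str
    body: str
    score: int
    depth: int = 0
    sentiment: Optional[float] = None  # filled only when sentiment analysis is enabled


def assign_depths(comments):
    """Set each comment's depth from its chain of parents (top level = 0)."""
    by_id = {c.comment_id: c for c in comments}

    def depth_of(comment):
        depth = 0
        parent = by_id.get(comment.parent_id)
        while parent is not None:
            depth += 1
            parent = by_id.get(parent.parent_id)
        return depth

    for comment in comments:
        comment.depth = depth_of(comment)
    return comments


def within_depth(comments, max_depth):
    """Keep only comments whose depth does not exceed the configured limit."""
    return [c for c in comments if c.depth <= max_depth]
```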

🔧 Configuration Options

  • API Credentials: Reddit API setup
  • Export Settings: Default formats and locations
  • Scraping Limits: Posts/comments per session
  • Sentiment Analysis: Enable/disable sentiment scoring
  • Logging Level: Control verbosity
  • Theme Settings: ASCII art and color preferences
  • Resume Settings: Auto-save progress for resuming

🚨 Rate Limiting & Best Practices

  • Respect Reddit's API limits - The tool provides warnings but does not enforce limits
  • Use reasonable delays between requests
  • Monitor your API usage through Reddit's developer dashboard
  • Be respectful of communities and users
  • Follow Reddit's ToS and community guidelines
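
Since the tool warns about limits but does not enforce them, you may want a delay of your own when scripting around it. The sketch below is one simple approach, a minimum interval between consecutive requests. The class name `MinIntervalLimiter` is hypothetical, and the right interval depends on Reddit's current API rules, so no specific value is suggested here.

```python
import time


class MinIntervalLimiter:
    """Ensure at least `min_interval` seconds pass between consecutive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock  # injectable so the limiter can be tested without real waiting
        self._sleep = sleep
        self._last = None

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Call `limiter.wait()` immediately before each request; the first call returns at once, and later calls pause only as long as needed.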

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


๐Ÿ™ Acknowledgments

  • Reddit API (PRAW) - For providing excellent API access
  • Stoic Philosophers - For timeless wisdom
  • Franz Kafka - For existential insights
  • Fyodor Dostoevsky - For psychological depth
  • Itachi Uchiha - For the aesthetic inspiration


🩸 "Those who cannot acknowledge themselves will eventually fail." - Itachi Uchiha 🩸

RedScraperPro acknowledges itself as the ultimate Reddit scraping tool, and therefore it will never fail. How? With your support 🌟.


📞 Support & Contact


"In the world of data, we are all just shadows seeking light." - RedScraperPro Philosophy
