
Rarchive

A Python tool to archive posts from any public subreddit using Reddit's API via PRAW.


Features

  • Download Reddit posts with rich metadata
  • Organised JSON output with timestamps
  • Environment-based configuration
  • Modular architecture

Installation

git clone https://github.com/Binz120/rarchive.git
cd rarchive
pip install -r requirements.txt

Configuration

  1. Create a Reddit app.

  2. Copy .env.example to .env and fill in:

    REDDIT_CLIENT_ID=your_client_id
    REDDIT_CLIENT_SECRET=your_client_secret
    REDDIT_USER_AGENT=script:rarchive:v2.0 (by /u/yourusername)
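A minimal sketch of how these three settings could be read and validated at startup. It assumes the values from .env are already present in the process environment (for example loaded by a dotenv helper, which is an assumption about this project). `RedditConfig` and `load_config` are hypothetical names, not necessarily what config.py defines.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# The three settings listed in .env.example.
REQUIRED_KEYS = ("REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET", "REDDIT_USER_AGENT")


@dataclass(frozen=True)
class RedditConfig:
    """Hypothetical container for the Reddit credentials."""
    client_id: str
    client_secret: str
    user_agent: str


def load_config(env: Mapping[str, str] = os.environ) -> RedditConfig:
    """Build a RedditConfig from environment variables, failing fast on gaps."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise ValueError("Missing required settings: " + ", ".join(missing))
    return RedditConfig(
        client_id=env["REDDIT_CLIENT_ID"],
        client_secret=env["REDDIT_CLIENT_SECRET"],
        user_agent=env["REDDIT_USER_AGENT"],
    )
```

The resulting fields map directly onto the `client_id`, `client_secret`, and `user_agent` keyword arguments of `praw.Reddit`.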

Usage

python -m src

Or run directly:

python src/__main__.py

Output

Posts are saved to the output/ directory as JSON files:

{
  "subreddit": "python",
  "sort_type": "new",
  "fetched_at": "2024-01-15T10:30:00",
  "post_count": 100,
  "posts": [...]
}
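The structure above could be produced along these lines. This is a sketch only: `build_archive`, `save_archive`, and the file naming scheme are assumptions for illustration, and the actual formatter.py may differ.

```python
import json
from datetime import datetime
from pathlib import Path


def build_archive(subreddit, sort_type, posts, fetched_at=None):
    """Wrap a list of post dicts in the top-level structure shown above."""
    fetched_at = fetched_at or datetime.now()
    return {
        "subreddit": subreddit,
        "sort_type": sort_type,
        # Second precision, e.g. "2024-01-15T10:30:00".
        "fetched_at": fetched_at.replace(microsecond=0).isoformat(),
        "post_count": len(posts),
        "posts": list(posts),
    }


def save_archive(archive, output_dir="output"):
    """Write the archive as JSON and return the path (file name is hypothetical)."""
    directory = Path(output_dir)
    directory.mkdir(parents=True, exist_ok=True)
    # Colons are replaced so the name is also valid on Windows.
    stamp = archive["fetched_at"].replace(":", "-")
    path = directory / f"{archive['subreddit']}_{archive['sort_type']}_{stamp}.json"
    path.write_text(json.dumps(archive, indent=2, ensure_ascii=False), encoding="utf-8")
    return path
```

Deriving `post_count` from the list itself keeps the count and the `posts` array from drifting apart.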

Project Structure

src/
├── __init__.py      # Package init
├── __main__.py      # Entry point
├── config.py        # Configuration management
├── reddit_client.py # Reddit API wrapper
├── fetcher.py       # Post fetching logic
├── formatter.py     # JSON output formatting
└── models.py        # Data classes
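To illustrate how the data classes and the JSON formatting might fit together, here is a hedged sketch. The `Post` class and its fields are assumptions (chosen to mirror common PRAW submission attributes), not the contents of models.py, and `post_to_record` is a hypothetical helper.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class Post:
    """Hypothetical post model; the real models.py may use other fields."""
    id: str
    title: str
    author: str
    score: int
    num_comments: int
    url: str
    created_utc: float  # Unix timestamp, as Reddit reports it


def post_to_record(post: Post) -> dict:
    """Convert a Post into a JSON-ready dict with a readable UTC timestamp."""
    record = asdict(post)
    record["created_at"] = datetime.fromtimestamp(
        post.created_utc, tz=timezone.utc
    ).isoformat()
    return record
```

Keeping the model as a plain data class means the formatter only deals with dicts and never touches API objects directly.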

API Compliance

  • Respects Reddit's rate limits (60 requests per minute)
  • Do not collect personal or private data
  • Follow Reddit's API Terms
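As an illustration of what a limit of 60 requests per minute means in practice, the sketch below spaces calls at least one second apart. PRAW generally manages rate limiting on its own, so this is not a description of how Rarchive enforces the limit. `MinIntervalThrottle` is a hypothetical helper.

```python
import time


class MinIntervalThrottle:
    """Enforce a minimum gap between calls (hypothetical helper)."""

    def __init__(self, max_per_minute=60, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

Calling `throttle.wait()` before each request keeps a simple loop under the stated limit. The clock and sleep functions are injectable so the behaviour can be checked without real delays.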
