

📚 Research Digest

Automated daily research paper digest from arXiv with smart filtering, mobile-friendly interface, and AI-powered summaries.

Fetch, filter, and browse the latest research papers tailored to your interests. Desktop grid view for deep reading, mobile feed for quick scrolling.


✨ Features

  • 🎯 Smart Filtering - Keyword-based relevance scoring across custom research interests
  • πŸ“± Mobile Feed - Swipeable, full-screen card interface optimized for phones
  • πŸ–₯️ Desktop Grid - Multi-column layout with rich metadata and difficulty badges
  • 🧠 AI Summaries - Auto-generated layman explanations using transformers
  • πŸ”„ Deduplication - Never see the same paper twice with built-in tracking
  • βš™οΈ Configurable - JSON-based settings for interests, filters, and preferences
  • πŸ“¦ Archive - Auto-saves daily digests with browsable index
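
As a rough illustration of keyword-based relevance scoring, the sketch below counts whole-word keyword hits in a paper's title and abstract and keeps the top matches. The function names, the `title`/`summary` dictionary keys, and the scoring rule are assumptions made for this example; the actual logic in main.py may differ.

```python
import re
from typing import Iterable

def relevance_score(text: str, keywords: Iterable[str]) -> int:
    """Count case-insensitive whole-word keyword hits (hypothetical scoring rule)."""
    lowered = text.lower()
    score = 0
    for kw in keywords:
        # \b anchors keep "gan" from matching inside "organ".
        pattern = r"\b" + re.escape(kw.lower()) + r"\b"
        score += len(re.findall(pattern, lowered))
    return score

def rank_papers(papers, keywords, limit):
    """Sort papers by descending score and keep the top `limit` with a nonzero score."""
    scored = [
        (relevance_score(p["title"] + " " + p["summary"], keywords), p)
        for p in papers
    ]
    scored = [sp for sp in scored if sp[0] > 0]
    scored.sort(key=lambda sp: sp[0], reverse=True)
    return [p for _, p in scored[:limit]]
```

Papers with no keyword hits are dropped entirely here; a real implementation might instead weight title hits more heavily or keep low-scoring papers as filler.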

🖼️ Screenshots

Desktop View

[Screenshot: Desktop Demo]

Mobile Feed

[Screenshot: Mobile Demo]


🚀 Quick Start

Windows

  1. Clone & Run

    git clone https://github.com/usr-wwelsh/research-digest.git
    cd research-digest
    run_digest.bat
  2. First run automatically:

    • Creates virtual environment
    • Installs dependencies
    • Fetches papers
    • Generates HTML digests
  3. Open in browser:

    • latest.html - Most recent digest
    • index.html - Browse all archives
    • tiktok_feed.html - Mobile-optimized feed

Linux/macOS

git clone https://github.com/usr-wwelsh/research-digest.git
cd research-digest
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
python main.py
python generate_index.py

⚙️ Configuration

Edit config.json to customize:

{
  "interests": {
    "Your Research Area": {
      "query": "cat:cs.LG OR cat:cs.AI",
      "keywords": ["keyword1", "keyword2", "keyword3"]
    }
  },
  "settings": {
    "papers_per_interest": 10,
    "recent_days": 7,
    "summary_max_length": 160
  }
}
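
For readers scripting against this file, here is a minimal sketch of loading config.json and filling in missing settings. `load_config` and `DEFAULT_SETTINGS` are hypothetical names for this example, and the validation shown is an assumption rather than what main.py necessarily does; the default values are the ones documented in this README.

```python
import json
from pathlib import Path

# Defaults as documented in this README's "Available Settings" section.
DEFAULT_SETTINGS = {
    "papers_per_interest": 10,
    "recent_days": 7,
    "fallback_days": 90,
    "summary_max_length": 160,
    "fetch_multiplier": 5,
}

def load_config(path="config.json"):
    """Return (interests, settings), filling in defaults for missing settings."""
    raw = json.loads(Path(path).read_text(encoding="utf-8"))
    interests = raw.get("interests", {})
    for name, spec in interests.items():
        # Each interest needs an arXiv query string and a list of keywords.
        if not isinstance(spec.get("query"), str) or not isinstance(spec.get("keywords"), list):
            raise ValueError(f"interest {name!r} needs 'query' (str) and 'keywords' (list)")
    # User-supplied settings override the defaults key by key.
    settings = {**DEFAULT_SETTINGS, **raw.get("settings", {})}
    return interests, settings
```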

Available Settings

| Setting | Default | Description |
|---|---|---|
| papers_per_interest | 10 | Papers to fetch per category |
| recent_days | 7 | Look back window (0 = all time) |
| fallback_days | 90 | Extended search if few results |
| summary_max_length | 160 | Max characters for summaries |
| fetch_multiplier | 5 | Over-fetch for better filtering |
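
One way these settings could interact is sketched below: over-fetch by `fetch_multiplier`, keep papers inside the `recent_days` window, and widen to `fallback_days` when too few remain. The helper names, the `published` key, and the exact fallback trigger (fewer results than requested) are assumptions for illustration only.

```python
from datetime import datetime, timedelta, timezone

def fetch_count(papers_per_interest=10, fetch_multiplier=5):
    """Number of results to request so that filtering still leaves enough papers."""
    return papers_per_interest * fetch_multiplier

def within_days(published, now, days):
    """True if `published` is inside the look-back window; days == 0 means no limit."""
    return days == 0 or now - published <= timedelta(days=days)

def select_recent(papers, now, wanted, recent_days=7, fallback_days=90):
    """Keep papers from the recent window, widening to fallback_days if too few are found."""
    recent = [p for p in papers if within_days(p["published"], now, recent_days)]
    if len(recent) < wanted:
        # Assumed trigger: fall back only when the recent window comes up short.
        recent = [p for p in papers if within_days(p["published"], now, fallback_days)]
    return recent[:wanted]
```

With the defaults, each interest would request 50 candidates and keep at most 10 after filtering.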

📖 arXiv Query Syntax

Use arXiv category codes in queries:

  • cat:cs.LG - Machine Learning
  • cat:cs.CV - Computer Vision and Pattern Recognition
  • cat:cs.CL - Computation and Language (NLP)
  • cat:cs.AI - Artificial Intelligence
  • cat:cs.CR - Cryptography and Security
  • cat:cs.DC - Distributed, Parallel, and Cluster Computing

Combine with OR/AND: cat:cs.LG OR cat:cs.AI
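
To see how such a query string maps onto a request, the sketch below builds a URL for the public arXiv API endpoint using its documented `search_query`, `start`, `max_results`, `sortBy`, and `sortOrder` parameters. `build_query_url` is a hypothetical helper, and whether main.py calls the API this way is an assumption.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"

def build_query_url(query, max_results=50):
    """Build an arXiv API URL that returns the newest submissions matching `query`."""
    params = {
        "search_query": query,        # e.g. "cat:cs.LG OR cat:cs.AI"
        "start": 0,                   # offset of the first result
        "max_results": max_results,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
    }
    # urlencode percent-escapes ":" and turns spaces into "+".
    return ARXIV_API + "?" + urlencode(params)
```

The response is an Atom feed, so the caller would still need to parse the XML entries before scoring them.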

For the full category list, see the arXiv category taxonomy (https://arxiv.org/category_taxonomy).


🔧 Advanced Usage

Proxmox LXC Deployment (One-Liner)

Want a self-hosted, always-on instance with Cloudflare Tunnel?

From your Proxmox host:

bash <(curl -sL https://raw.githubusercontent.com/usr-wwelsh/Research-Digest/main/create-lxc.sh)

This automatically:

  • Creates a Debian 12 LXC container (4GB RAM, 4 cores, 20GB disk)
  • Installs Python, Caddy web server, and cloudflared
  • Sets up the venv and all dependencies
  • Configures a weekly cron job (Monday 8am) with CPU/memory limits
  • Starts Caddy to serve digests on port 8080

After the script finishes:

  1. Enter the container: pct enter <CTID>
  2. Edit /opt/research-digest/config.json with your research interests
  3. Set up Cloudflare Tunnel to expose it publicly
  4. Run a test: /opt/research-digest/run.sh

Idle footprint is ~50-80MB RAM (Caddy + cloudflared). The weekly digest run spikes to ~4GB briefly for torch inference, then drops back down.

Automated Daily Digests & Mobile Sync

Want automatic daily updates synced to your phone? See the 📱 Complete Setup Guide for:

  • Windows Task Scheduler configuration
  • Linux/macOS cron jobs
  • Syncthing mobile sync setup
  • Troubleshooting tips

Reset Seen Papers

python reset_seen_papers.py
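
Conceptually, deduplication needs only a persistent set of paper IDs, and a reset simply empties it. The sketch below assumes seen_papers.json holds a JSON list of ID strings and that papers carry an `id` key; both the file format and the function names are hypothetical and may not match what the project's scripts do.

```python
import json
from pathlib import Path

def load_seen(path):
    """Return the set of paper IDs already shown; empty if the file is missing."""
    p = Path(path)
    if not p.exists():
        return set()
    return set(json.loads(p.read_text(encoding="utf-8")))

def filter_unseen(papers, path):
    """Drop papers whose ID was seen before, then record the new IDs."""
    seen = load_seen(path)
    fresh = [p for p in papers if p["id"] not in seen]
    seen.update(p["id"] for p in fresh)
    Path(path).write_text(json.dumps(sorted(seen)), encoding="utf-8")
    return fresh

def reset_seen(path):
    """Forget all tracked IDs so every paper counts as new again."""
    Path(path).write_text("[]", encoding="utf-8")
```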

📂 Project Structure

research-digest/
├── config.json              # Configuration (edit this!)
├── main.py                  # Core paper fetcher
├── generate_index.py        # Archive browser generator
├── generate_tiktok_feed.py  # Mobile feed generator
├── run_digest.bat           # Windows launcher
├── run.sh                   # Linux pipeline runner (used by cron)
├── requirements.txt         # Python dependencies
├── create-lxc.sh            # Proxmox LXC creator (run on host)
├── setup.sh                 # Container bootstrap script
├── Caddyfile                # Caddy web server config
├── research-digest-caddy.service  # Systemd unit for Caddy
├── latest.html              # Latest digest (auto-generated)
├── index.html               # Archive browser (auto-generated)
├── tiktok_feed.html         # Mobile feed (auto-generated)
├── seen_papers.json         # Deduplication tracker
└── arxiv_archive/           # Daily archives
    ├── arxiv_digest_20251101.html
    └── ...

🛠️ Requirements

  • Python 3.8+
  • Dependencies: transformers, torch, requests
  • Disk Space: ~2GB for model, ~10MB per digest
  • Internet: Required for arXiv API and first-time model download

📝 License

MIT License - see LICENSE file for details


🀝 Contributing

Contributions welcome! Ideas:

  • Additional paper sources (bioRxiv, SSRN, etc.)
  • Browser extension for direct syncing
  • Custom ML models for better summaries
  • Export to Notion/Obsidian/Roam

🙏 Acknowledgments

  • arXiv for the open research repository
  • Hugging Face for transformer models
  • Inspired by modern feed UIs and research workflows

Built with ❤️ for researchers who want to stay current without drowning in papers
