PhdBooster πŸŽ“πŸš€

Better breaks = Better research

You focus on changing the world. We'll find you the good videos. ✨


English | δΈ­ζ–‡


πŸ€” What is this?

PhD life is stressful. You open Douyin or Xiaohongshu to relax, but all you get is ads and news.

PhdBooster is an AI-powered browsing assistant built on OpenClaw β€” it scrolls through Douyin / Xiaohongshu while you write papers, uses vision models to actually "see" every video, and automatically likes & bookmarks content that matches your taste, training the recommendation algorithm to serve you better.

You're writing a paper πŸ“
  β†’ PhdBooster is browsing videos for you πŸ“±
    β†’ AI "sees" each video πŸ‘οΈ
      β†’ Matches your taste? Auto like & bookmark ❀️
        β†’ Platform algorithm learns your preferences 🧠
          β†’ You put down the paper, open your phone β€” feed is perfect 😏

The more you use, the more you save (break time, that is)


πŸ’‘ Why PhdBooster?

  • πŸ”₯ Just got roasted in the group meeting β€” need some eye candy to recover
  • πŸ’₯ Experiment crashed β€” need something nice to calm down
  • πŸ“„ Paper rejected β€” need a morale boost
  • πŸŒ™ Pulling an all-nighter before the deadline β€” need fuel to survive

But your feed is full of ads, news, and paid courses... your 5-minute break is wasted πŸ˜‘

PhdBooster's philosophy: 10 minutes of quality break > 30 minutes of junk scrolling. Time saved = more papers read = PhD boosted 🀷


πŸ—οΈ Architecture

```mermaid
graph TD
    A[🌐 Open Browser] --> B[πŸ”„ Main Loop]
    B --> C[πŸ“Š Single Evaluate β€” Extract Metadata]
    C --> D{πŸŽ™οΈ Livestream?}
    D -->|Yes| E[⏭️ Skip]
    D -->|No| F[πŸ“ Text Pre-filter]
    F -->|Non-target| E
    F -->|Potential match| G[πŸ“Έ Screenshot]
    G --> H[🧠 Vision Model Analysis]
    H --> I{❀️ Matches Preference?}
    I -->|Yes| J[πŸ‘ Like + Verify]
    I -->|No| E
    E --> K[⬇️ Scroll to Next]
    K --> B
    B --> L{🎯 Target Count Reached?}
    L -->|Yes| M[βœ… Done]
    L -->|No| B
```

Two-stage filtering funnel β€” accurate and efficient:

  1. Text quick-filter 🏷️ β€” Parse title, hashtags, and author info. Skip videos containing non-target keywords (gaming, sports, news, etc.), saving ~60% of screenshot overhead.
  2. Visual deep-filter πŸ‘οΈ β€” For potential matches, take a screenshot and send it to the vision model with your preference policy for analysis.
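The funnel above can be sketched in a few lines of Python. Function names, metadata fields, and the verdict shape are illustrative assumptions, not the project's actual API:

```python
# Sketch of the two-stage filtering funnel. Stage 1 is a cheap keyword
# check; only survivors pay for a screenshot and a vision-model call.
NON_TARGET_KEYWORDS = {"gaming", "sports", "news", "ad"}

def text_prefilter(metadata: dict) -> bool:
    """Stage 1: keyword check on title / hashtags / author.
    Returns True if the video is worth a screenshot."""
    haystack = " ".join([
        metadata.get("title", ""),
        " ".join(metadata.get("hashtags", [])),
        metadata.get("author", ""),
    ]).lower()
    return not any(kw in haystack for kw in NON_TARGET_KEYWORDS)

def visual_deepfilter(screenshot: bytes, policy: str, vision_model) -> bool:
    """Stage 2: the expensive call, reached only by potential matches.
    `vision_model` is assumed to return e.g. {"match": True, "reason": "..."}."""
    verdict = vision_model(screenshot, policy)
    return bool(verdict.get("match"))
```

Because stage 1 rejects most videos with pure string matching, the vision model only ever sees the minority of candidates that survive it.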

πŸ“± Live Demo


Left: AI analyzes a video and decides "non-target β†’ skip" | Right: AI compares screenshots of different videos

The AI provides detailed descriptions of video content, then gives a judgment with reasoning based on your preference policy β€” not blind liking, but genuinely thinking before acting 🧠


πŸ› οΈ Tech Stack

Browser Automation

Built on OpenClaw Browser (Chrome CDP under the hood). Real browser operations, behavior identical to a human user. A single evaluate call extracts all page metadata (author, title, hashtags, like button position) to minimize CDP round-trips.
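A minimal sketch of what a single batched evaluate might look like. The DOM selectors and the `evaluate` wrapper are hypothetical; the point is that one page-side function returns every field in one round-trip:

```python
import json

# Hypothetical page-side script: all selectors are illustrative,
# not Douyin's real DOM attributes.
EXTRACT_JS = """
(() => {
  const like = document.querySelector('[data-e2e="like-icon"]');
  const r = like ? like.getBoundingClientRect() : null;
  return JSON.stringify({
    author:   document.querySelector('[data-e2e="author"]')?.textContent ?? "",
    title:    document.querySelector('[data-e2e="video-desc"]')?.textContent ?? "",
    hashtags: [...document.querySelectorAll('a[href*="hashtag"]')].map(a => a.textContent),
    likeButton: r && {x: r.x + r.width / 2, y: r.y + r.height / 2},
  });
})()
"""

def extract_metadata(evaluate) -> dict:
    """`evaluate` stands in for whatever Runtime.evaluate wrapper the
    driver exposes. One call, one CDP round-trip, all fields."""
    return json.loads(evaluate(EXTRACT_JS))
```

Collapsing author, title, hashtags, and the like-button coordinates into one JSON payload is what keeps the per-video CDP cost to a single evaluate instead of four or five.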

Primary LLM

Uses step-3.5-flash:free on OpenRouter as the main reasoning model πŸš€ β€” blazing fast token generation. Grab it while it's still free!

Since this model doesn't support multimodal input, visual analysis is delegated to dedicated vision models via tool calls πŸ‘‡
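One plausible shape for that delegation is an OpenAI-style tool definition handed to the text model, which it can invoke whenever a video needs visual inspection. The schema below is an assumption about the wiring, not necessarily what the project ships:

```python
# Hypothetical tool schema for delegating visual analysis. When the
# text-only model calls it, the harness takes the screenshot and routes
# it to the vision model, returning the verdict as the tool result.
VISION_TOOL = {
    "type": "function",
    "function": {
        "name": "analyze_screenshot",
        "description": (
            "Capture the current video screenshot, send it to a vision "
            "model with the preference policy, and return its verdict."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "reason": {
                    "type": "string",
                    "description": "Why this video needs visual inspection.",
                },
            },
            "required": ["reason"],
        },
    },
}
```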

Vision Models (Dual Fallback)

Primary β€” Kimi 2.5 (Moonshot AI)

  • Endpoint: https://api.moonshot.cn/v1/chat/completions
  • Model: kimi-k2.5
  • Pros: Fast, high quality, excellent policy comprehension

Fallback β€” Local Ollama

  • Model: Based on LEONW24/Qwen3.5-9B-Uncensored πŸ€—
  • Pros: Completely free, works offline, no content filtering (you know why 😏)
  • Uploaded to HuggingFace, ready to use out of the box

Fallback logic: Kimi first β†’ on failure / timeout / parse error β†’ auto-switch to Ollama.
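A minimal sketch of that fallback chain, assuming both clients return a JSON verdict string; see `analyze_edge.py` for the project's actual implementation:

```python
import json

def parse_verdict(reply: str) -> dict:
    """Expect a JSON object like {"match": true, "reason": "..."}."""
    verdict = json.loads(reply)  # raises on unparseable replies
    if "match" not in verdict:
        raise ValueError("missing 'match' field")
    return verdict

def analyze_with_fallback(image_b64: str, policy: str,
                          kimi_call, ollama_call, timeout: float = 30.0) -> dict:
    """Try Kimi first; on any failure, timeout, or unparseable reply,
    fall back to the local Ollama model."""
    try:
        return parse_verdict(kimi_call(image_b64, policy, timeout=timeout))
    except Exception:
        return parse_verdict(ollama_call(image_b64, policy))
```

Catching every exception on the primary path is deliberate here: a network error, a timeout, and a garbled reply all mean the same thing operationally, namely "use the local model this round".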

Preference Policy

Your "aesthetic standards" are defined in a simple Markdown file (edge_policy.md). Edits take effect instantly, with zero code changes. Multiple policy files are supported; switch between them anytime πŸ”„
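The instant-effect behavior falls out naturally if the policy file is re-read on every cycle rather than cached at startup. A sketch (the path argument is how switching between policy files would work):

```python
from pathlib import Path

def load_policy(path: str = "edge_policy.md") -> str:
    """Read the preference policy fresh each browsing cycle, so an edit
    takes effect on the very next video without a restart."""
    return Path(path).read_text(encoding="utf-8")
```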


πŸš€ Quick Start

Prerequisites

1. Start OpenClaw

```bash
openclaw status
openclaw gateway start   # if not running
```

2. Open Target Platform

```bash
# Douyin
openclaw browser open https://www.douyin.com/?recommend=1

# Xiaohongshu
openclaw browser open https://www.xiaohongshu.com/explore
```

3. Configure Preference Policy

Tell OpenClaw to generate a preference policy file (*_policy.md) defining what you like and dislike:

```markdown
# Preference Policy

**Target content (like if any match):**
- Type A you like
- Type B you like
- ...

**Non-target content (skip):**
- Gaming, sports, news
- ...

**Threshold: Loose**
```

4. Start Browsing

```bash
openclaw skill douyin-edge browse --target-count 20
```

Go back to your paper β˜• β€” 20 likes will be done by the time you return.


πŸ“ Project Structure

```text
PhdBooster/
β”œβ”€β”€ πŸ“‚ assets/
β”‚   β”œβ”€β”€ logo.png                   # Project logo (Dr. Lobster)
β”‚   β”œβ”€β”€ banner.png                 # Brand banner
β”‚   β”œβ”€β”€ cabian_definition.png      # Edge content definition reference
β”‚   β”œβ”€β”€ demo-analysis.jpg          # Demo: AI analysis result
β”‚   └── demo-screenshot.jpg        # Demo: visual comparison
β”œβ”€β”€ πŸ“„ edge_policy.md              # Preference policy β€” Chinese
β”œβ”€β”€ πŸ“„ edge_policy_en.md           # Preference policy β€” English
β”œβ”€β”€ πŸ“„ douoyin_edge_workflow.md    # Detailed workflow β€” Chinese
β”œβ”€β”€ πŸ“„ douyin_edge_workflow_en.md  # Detailed workflow β€” English
β”œβ”€β”€ 🐍 analyze_edge.py             # Vision analysis wrapper (Kimi β†’ Ollama fallback)
β”œβ”€β”€ πŸ”§ kimi_query.py               # Kimi 2.5 API client
β”œβ”€β”€ πŸ”§ ollama_query.py             # Ollama API client
β”œβ”€β”€ πŸ“‹ SKILL.md                    # OpenClaw skill definition
β”œβ”€β”€ πŸ“‹ TOOLS.md                    # Tools reference
β”œβ”€β”€ πŸ“‹ README.md                   # This file (English)
└── πŸ“‹ README_CN.md                # Chinese version
```


⚠️ Known Limitations & Roadmap

Current pain points:

  • 🐒 OpenClaw throughput is slow β€” even with step-3.5-flash, speed is mediocre
  • ⏱️ OpenClaw Browser CDP operations have ~10x latency vs native (snapshot / screenshot / evaluate average 10s each)

Future plans:

  • πŸ”Œ Playwright direct connection β€” bypass OpenClaw Gateway for much lower latency
  • ⚑ Parallel vision analysis β€” queue multiple screenshots and send concurrently
  • 🧠 Adaptive scroll waiting β€” detect new content loading instead of fixed delays
  • πŸ“Š Policy self-learning β€” collect false positive/negative feedback to auto-optimize preferences
  • 🌐 More platforms (Bilibili, Weibo, ...)
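The adaptive scroll waiting item above could replace fixed post-scroll sleeps with a cheap poll. A sketch, where `evaluate` and the page-side expression are illustrative stand-ins:

```python
import time

def wait_for_new_content(evaluate, prev_key: str,
                         timeout: float = 8.0, poll: float = 0.3) -> bool:
    """Poll a cheap page-side key (e.g. the current video's src) after a
    scroll and return as soon as it changes, instead of sleeping a fixed
    duration. Returns False if nothing new loads before the timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        key = evaluate("document.querySelector('video')?.src ?? ''")
        if key and key != prev_key:
            return True
        time.sleep(poll)
    return False
```

Fast pages return in one poll interval instead of a worst-case sleep, while slow loads still get the full timeout.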

πŸ€— Open Source Model

The fallback vision model is uploaded to HuggingFace:

πŸ‘‰ LEONW24/Qwen3.5-9B-Uncensored

An uncensored version of Qwen3.5-9B with content safety restrictions removed, suitable for this project's visual analysis tasks. Deploy with Ollama for completely local, free, unlimited usage.


πŸ“œ Disclaimer

This project is for educational and research purposes only. Using automation tools on third-party platforms may violate their Terms of Service. Please assess risks on your own β€” use responsibly.


PhdBooster πŸŽ“πŸš€

Make every break count

Made with ❀️ by a stressed PhD student
