📚 Scholar — Goal-Driven AI Study System

Upload your own textbooks, set a real deadline, and let Scholar build a personalized study plan, with AI notes, grounded chat, quizzes, and adaptive learning, all answered only from your books.

Scholar turns a pile of PDFs into a structured, deadline-aware course. Every note, every chat answer, and every quiz question is grounded in — and limited to — the specific sources you provide. No hallucinated facts from the open internet.

✨ What Scholar Does

A student uploads their course material, says "I want to master this by the 30th," and Scholar:

1. Indexes the books into a dual retrieval engine (structural + semantic).
2. Builds a study plan: a sequence of sessions sized to the deadline.
3. Teaches each session with streamed, cited notes and a grounded Q&A chat.
4. Tests understanding with quizzes, and adapts when the student struggles.
5. Certifies mastery with a cumulative final test that marks the goal complete.
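
To make "sized to the deadline" concrete, here is a minimal sketch of one way sessions could be spread across the available days. The function name `schedule_sessions` and the even-spacing rule are assumptions for illustration only; Scholar's actual planner is LLM-driven and may size sessions differently.

```python
from datetime import date, timedelta


def schedule_sessions(topics: list[str], start: date, deadline: date) -> list[tuple[date, str]]:
    """Spread topics evenly over the days from start to deadline (inclusive).

    Hypothetical helper: illustrates deadline-aware sizing, not Scholar's real planner.
    """
    if deadline < start:
        raise ValueError("deadline must not be before start")
    if not topics:
        return []

    total_days = (deadline - start).days + 1
    plan = []
    for i, topic in enumerate(topics):
        # Place session i proportionally within the window; the offset is
        # always smaller than total_days, so no session lands past the deadline.
        offset = (i * total_days) // len(topics)
        plan.append((start + timedelta(days=offset), topic))
    return plan
```

With more topics than days, several sessions share a day, which is one simple way a tight deadline could translate into a denser plan.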

🎯 Capabilities

| Capability | What it means for the student |
| --- | --- |
| Bring your own books | Upload PDFs or paste a URL; extraction and indexing are automatic. |
| Goal-driven plans | Set a topic, level, and deadline; Scholar generates a session-by-session plan. |
| AI study notes | Each session opens with concise, streamed markdown notes, with every claim cited to a page. |
| Grounded chat | Ask anything; answers come only from your books, with source citations. |
| Quizzes + adaptive learning | Auto-generated quizzes; a low score inserts a targeted remedial session automatically. |
| Final cumulative test | A cross-session exam that, when passed, marks the whole goal complete. |
| Super Agent | One chat that reasons across all your indexed books at once. |
| Multimodal (optional) | Opt-in understanding of diagrams and figures via a vision model. |
| Notion export | Push your plan and notes to a Notion page in one click. |
| Model-agnostic | Runs on OpenAI, Anthropic, Google, or any OpenAI-compatible model, all via `.env`. |
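
The adaptive-learning row can be sketched as a small plan transformation. Everything named here (`Session`, `adapt_plan`, and the 0.6 pass threshold) is hypothetical and chosen only to illustrate the idea of inserting a remedial session after a low quiz score; the real threshold and data model are not stated in this README.

```python
from dataclasses import dataclass

PASS_THRESHOLD = 0.6  # assumed cut-off, not taken from Scholar's code


@dataclass
class Session:
    title: str
    remedial: bool = False


def adapt_plan(plan: list[Session], index: int, score: float,
               threshold: float = PASS_THRESHOLD) -> list[Session]:
    """Return a new plan, adding a remedial session after plan[index] on a low score."""
    if score >= threshold:
        return list(plan)  # score is fine: plan is unchanged (copied)
    review = Session(title=f"Review: {plan[index].title}", remedial=True)
    # Insert the review right after the session the student struggled with.
    return plan[: index + 1] + [review] + plan[index + 1:]
```

Returning a new list rather than mutating the plan keeps the original schedule available, for example to show the student what changed.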

🔍 How It Works

flowchart LR
    A[📄 Upload books<br/>PDF / URL] --> B[Dual indexing<br/>PageIndex tree + vector embeddings]
    B --> C[🎯 Set a goal<br/>topic · level · deadline]
    C --> D[🗂️ Study plan<br/>N sessions]
    D --> E[📝 Notes + 💬 Chat + ❓ Quiz<br/>grounded & cited]
    E -->|low score| F[➕ Adaptive<br/>remedial session]
    E --> G[🏁 Final test<br/>→ goal complete]
    B --> H[🤖 Super Agent<br/>chat across all books]

At the core is a dual retrieval engine with two complementary strategies, selectable per query or pinned via config:

- PageIndex (structural): an LLM navigates a hierarchical tree of the document and returns whole sections.
- Vector (semantic): cosine similarity over embeddings returns the most relevant chunks.
- Hybrid: runs both and merges the results.
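
As a rough illustration of the vector and hybrid strategies above, the sketch below ranks chunks by cosine similarity in plain Python and then merges two rankings with reciprocal rank fusion (RRF). RRF is an assumption here: this README only says Hybrid "merges" the two result sets, and the real merge may work differently. The function names and toy IDs are likewise hypothetical, and in the actual stack the similarity search presumably runs inside pgvector rather than in application code.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def vector_rank(query_vec: list[float], chunk_vecs: dict[str, list[float]], k: int = 3) -> list[str]:
    """Return the IDs of the k chunks most similar to the query, best first."""
    ranked = sorted(chunk_vecs, key=lambda cid: cosine(query_vec, chunk_vecs[cid]), reverse=True)
    return ranked[:k]


def hybrid_merge(structural: list[str], semantic: list[str], k: int = 60) -> list[str]:
    """Merge two ranked ID lists with reciprocal rank fusion (assumed merge rule)."""
    scores: dict[str, float] = {}
    for ranking in (structural, semantic):
        for rank, item_id in enumerate(ranking, start=1):
            # Items ranked highly by either strategy accumulate a larger score.
            scores[item_id] = scores.get(item_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score, breaking ties by ID for a deterministic order.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))
```

An item that both strategies rank near the top ends up first in the merged list, which is the intuition behind combining structural and semantic retrieval.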

→ Design details in docs/retrieval.md.

📊 Does it actually work?

The retrieval strategies are benchmarked head-to-head with RAGAS on a 30-question golden set, using a bundled Environmental Science textbook as the evaluation book.

| Strategy | Faithfulness | Answer Relevancy | Context Precision |
| --- | --- | --- | --- |
| PageIndex | 0.94 | 0.81 | 0.77 |
| Vector | 0.85 | 0.91 | 0.92 |
| Hybrid | 0.94 | 0.90 | 0.90 |

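The results show a real trade-off: PageIndex leads on faithfulness, while Vector leads on answer relevancy and context precision. Hybrid gets the best of both: it keeps PageIndex's grounding (faithfulness 0.94) while nearly matching Vector's answer relevancy (0.90 vs. 0.91) and context precision (0.90 vs. 0.92). Full methodology, the evaluation book, and the complete comparison: docs/evaluation.md.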
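
One way to read the table is to measure how far each strategy falls behind the best score on every metric. The snippet below does that arithmetic on the numbers reported above; the `gap_to_best` helper is a hypothetical illustration and is not part of the evaluation code.

```python
# Scores copied from the results table above.
SCORES = {
    "PageIndex": {"faithfulness": 0.94, "answer_relevancy": 0.81, "context_precision": 0.77},
    "Vector": {"faithfulness": 0.85, "answer_relevancy": 0.91, "context_precision": 0.92},
    "Hybrid": {"faithfulness": 0.94, "answer_relevancy": 0.90, "context_precision": 0.90},
}


def gap_to_best(scores: dict[str, dict[str, float]], strategy: str) -> dict[str, float]:
    """For each metric, how far `strategy` trails the best-scoring strategy."""
    gaps = {}
    for metric, value in scores[strategy].items():
        best = max(per_strategy[metric] for per_strategy in scores.values())
        gaps[metric] = round(best - value, 2)  # round away float noise
    return gaps


for name in SCORES:
    print(name, gap_to_best(SCORES, name))
```

On these numbers Hybrid never trails the best strategy by more than 0.02, whereas PageIndex trails by up to 0.15 and Vector by up to 0.09.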

📖 Documentation

Start here → docs/README.md — the documentation hub.

| Doc | Contents |
| --- | --- |
| Architecture | Components, agent layer, storage model, request map |
| Flows | Diagrams: ingestion, retrieval, sessions, adaptive, final test, super agent |
| Retrieval | The dual engine: PageIndex vs. vector, router, hybrid merge |
| Evaluation | RAGAS methodology, the eval book, results, reproduction |
| Configuration | Full environment reference: running model-agnostically and selecting a strategy |

🧱 Tech at a Glance

- Backend: FastAPI · LangChain/LangGraph · PostgreSQL + pgvector · PageIndex (open-source, local) · RAGAS
- Frontend: Next.js 14 · TypeScript · Tailwind · native SSE streaming
- Models: Provider-agnostic via a model factory (OpenAI / Anthropic / Google / any OpenAI-compatible endpoint)
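
To give a feel for what "provider-agnostic via a model factory" can look like, here is a minimal sketch that resolves a model choice from environment variables. The variable names `LLM_PROVIDER`, `LLM_MODEL`, and `LLM_BASE_URL`, the `ModelConfig` type, and the provider labels are all assumptions for illustration; the actual keys are listed in `.env.example` and the Configuration doc.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumed provider labels; the real factory may name them differently.
SUPPORTED_PROVIDERS = {"openai", "anthropic", "google", "openai-compatible"}


@dataclass(frozen=True)
class ModelConfig:
    provider: str
    model: str
    base_url: Optional[str] = None


def load_model_config(env: Optional[Mapping[str, str]] = None) -> ModelConfig:
    """Build a ModelConfig from environment-style settings (hypothetical keys)."""
    env = os.environ if env is None else env
    provider = env.get("LLM_PROVIDER", "openai").lower()
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported provider: {provider}")

    model = env.get("LLM_MODEL")
    if not model:
        raise ValueError("LLM_MODEL must be set")

    base_url = env.get("LLM_BASE_URL") or None
    if provider == "openai-compatible" and base_url is None:
        # A self-hosted or third-party endpoint needs an explicit URL.
        raise ValueError("LLM_BASE_URL is required for openai-compatible providers")
    return ModelConfig(provider=provider, model=model, base_url=base_url)
```

Passing a plain dictionary instead of `os.environ` makes the factory easy to test, and the rest of the application only ever sees a `ModelConfig`, so switching providers would be a configuration change rather than a code change.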

A goal-driven study system with adaptive learning, multimodal ingestion, a cumulative final test, a cross-book Super Agent, and Notion export — backed by a benchmarked dual retrieval engine.
