marktisham/GraphReel


GraphReel

Turn Google Drive files into explainer videos, powered by LangGraph and Vertex AI.

Paste Drive file or folder URLs → the app walks the hierarchy, extracts text, summarizes each file in parallel, then synthesizes everything into:

  • Narrative Briefing — flowing prose designed for reading aloud as a video script
  • Structured Digest — organized headers and bullets for quick reference
  • Explainer Video — short video with AI-generated imagery, TTS audio, and dynamic overlays.

You can also paste in public URLs for additional context.


Overview

GraphReel is a production-ready pipeline that transforms collections of Drive documents into cohesive narratives. It's built on LangGraph (for orchestration), Google Vertex AI (for summarization and content generation), and Streamlit (for the web UI).

The pipeline intelligently:

  1. Resolves Drive URLs into files (recursively traversing folders)
  2. Extracts text from Docs, PDFs, and Markdown
  3. Summarizes each file in parallel using LLMs
  4. Synthesizes summaries into narrative prose and structured bullets
  5. Researches topics mentioned in the narrative (optional web search)
  6. Augments with web findings for completeness
  7. Generates video with AI images, narration, and overlays (optional)
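As a rough illustration of step 1, the sketch below pulls a file or folder ID out of common Drive URL shapes and walks a folder tree recursively. The names `parse_drive_url`, `walk`, and `list_children` are hypothetical and not taken from GraphReel's `drive/resolver.py`; `list_children` stands in for whatever call lists a folder's contents.

```python
import re
from typing import Callable, Iterator, List, Optional, Tuple

# Common Drive/Docs URL shapes: .../folders/<id>, .../d/<id>/..., and ...?id=<id>.
_PATTERNS = [
    ("folder", re.compile(r"/folders/([A-Za-z0-9_-]+)")),
    ("file", re.compile(r"/d/([A-Za-z0-9_-]+)")),
    ("file", re.compile(r"[?&]id=([A-Za-z0-9_-]+)")),
]


def parse_drive_url(url: str) -> Optional[Tuple[str, str]]:
    """Return (kind, id) for a Drive URL, or None if no ID is found."""
    for kind, pattern in _PATTERNS:
        match = pattern.search(url)
        if match:
            return kind, match.group(1)
    return None


def walk(
    folder_id: str,
    list_children: Callable[[str], List[Tuple[str, str]]],
) -> Iterator[str]:
    """Yield every file ID beneath folder_id, descending into subfolders.

    list_children is a hypothetical callable returning (kind, id) pairs.
    """
    for kind, child_id in list_children(folder_id):
        if kind == "folder":
            yield from walk(child_id, list_children)
        else:
            yield child_id
```

Passing the listing function in as a parameter keeps the traversal logic testable without network access.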


Watch the Demo

Watch this video on YouTube to see a demo of GraphReel and learn more about the architecture.

Watch on YouTube


Quick Start

Prerequisites

  • Python 3.11+
  • Google Cloud project with billing enabled
  • gcloud CLI installed and authenticated:
    gcloud auth application-default login
    

1. Enable Google Cloud APIs

In the GCP Console, enable:

  • Google Drive API
  • Google Docs API
  • Vertex AI API
  • Cloud Text-to-Speech API (for video generation)
  • Vertex AI Imagen API (for AI image generation)

2. Create OAuth 2.0 Credentials (For Drive Access)

  1. Go to GCP Console → APIs & Services → Credentials
  2. Click Create Credentials → OAuth 2.0 Client ID
  3. Application type: Desktop app
  4. Name: GraphReel Local
  5. Download JSON and save as credentials.json in the project root

3. Configure OAuth Consent Screen

  1. Go to OAuth consent screen
  2. User type: External
  3. Fill app name and contact email
  4. Add scope: https://www.googleapis.com/auth/drive.readonly
  5. Add yourself as a test user
  6. Save (no need to publish)

4. Configure Environment

cp .env.example .env

Edit .env:

GCP_PROJECT=your-project-id
GCP_LOCATION=us-central1

Note: Always use us-central1 (some Gemini models return 404 from other regions).
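A minimal sketch of how these two settings might be read and validated at startup. It assumes the values from .env have already been loaded into the process environment; `GcpConfig` and `load_gcp_config` are hypothetical names, not GraphReel's actual code.

```python
import os
from typing import Mapping, NamedTuple


class GcpConfig(NamedTuple):
    project: str
    location: str


def load_gcp_config(env: Mapping[str, str] = os.environ) -> GcpConfig:
    """Read GCP_PROJECT and GCP_LOCATION, defaulting the region to us-central1."""
    project = env.get("GCP_PROJECT", "").strip()
    # Reject both a missing value and the untouched placeholder from .env.example.
    if not project or project == "your-project-id":
        raise ValueError("Set GCP_PROJECT in .env to your Google Cloud project ID")
    location = env.get("GCP_LOCATION", "").strip() or "us-central1"
    return GcpConfig(project, location)
```

Failing fast on the placeholder project ID surfaces a misconfigured .env before any Vertex AI call is attempted.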

5. Install & Run

macOS / Linux:

python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -r requirements.txt
python3 -m streamlit run app.py

Windows (PowerShell):

py -3 -m venv .venv
.\.venv\Scripts\Activate.ps1
py -3 -m pip install -r requirements.txt
py -3 -m streamlit run app.py

On first run, click Connect Google Drive and approve access in your browser; the app then refreshes automatically. Credentials are saved to token.json.
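The first-run flow resembles the standard installed-app OAuth pattern from Google's Python client libraries. The sketch below follows that pattern and is an assumption about how `drive/auth.py` could work, not a copy of it; `load_credentials` is a hypothetical name, and the code requires the `google-auth` and `google-auth-oauthlib` packages at call time.

```python
from pathlib import Path

# Read-only Drive scope, matching the consent screen configuration above.
SCOPES = ["https://www.googleapis.com/auth/drive.readonly"]


def load_credentials(token_path="token.json", secrets_path="credentials.json"):
    """Return valid Drive credentials, refreshing or re-authorizing as needed."""
    # Imported lazily so this module loads even without the Google libraries.
    from google.auth.transport.requests import Request
    from google.oauth2.credentials import Credentials
    from google_auth_oauthlib.flow import InstalledAppFlow

    creds = None
    if Path(token_path).exists():
        creds = Credentials.from_authorized_user_file(token_path, SCOPES)
    if creds and creds.valid:
        return creds
    if creds and creds.expired and creds.refresh_token:
        # Silent refresh: no browser interaction required.
        creds.refresh(Request())
    else:
        # First run (or revoked token): open the browser consent flow.
        flow = InstalledAppFlow.from_client_secrets_file(secrets_path, SCOPES)
        creds = flow.run_local_server(port=0)
    Path(token_path).write_text(creds.to_json())
    return creds
```

Deleting token.json forces the browser consent flow to run again on the next call.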


Supported File Types

Type                    | Extraction Method
Google Docs             | Exported as plain text via Drive API
PDF                     | Parsed with pypdf
Plain text / Markdown   | Downloaded directly

Google Sheets, Slides, and unsupported binary files are skipped with warnings.
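One way to express the table above is a lookup keyed on the Drive MIME type, with a warning for anything unsupported. This is a sketch only: `EXTRACTORS` and `pick_extractor` are hypothetical names, and the string values are placeholders for the real extraction functions.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)

# Drive MIME type -> extraction strategy (placeholder labels, not real functions).
EXTRACTORS = {
    "application/vnd.google-apps.document": "export_as_text",
    "application/pdf": "parse_with_pypdf",
    "text/plain": "download",
    "text/markdown": "download",
}


def pick_extractor(name: str, mime_type: str) -> Optional[str]:
    """Return the extraction strategy for a file, or None if it should be skipped."""
    method = EXTRACTORS.get(mime_type)
    if method is None:
        # Sheets, Slides, and binary files land here and are skipped.
        logger.warning("Skipping %s: unsupported type %s", name, mime_type)
    return method
```

For example, a Google Sheet (`application/vnd.google-apps.spreadsheet`) has no entry in the mapping, so it is logged and skipped rather than raising an error.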


Documentation

  • Architecture — Detailed overview of the LangGraph pipeline, state management, and technology stack

Project Structure

GraphReel/
├── app.py              # Streamlit UI (thin layer)
├── pipeline/
│   ├── state.py        # LangGraph TypedDict state definitions
│   ├── prompts.py      # All LLM prompts
│   ├── nodes.py        # LangGraph node functions
│   └── graph.py        # Graph assembly + stream_graph()
├── drive/
│   ├── auth.py         # OAuth flow + token management
│   ├── resolver.py     # URL parsing + recursive folder listing
│   └── extractor.py    # File content extraction
├── credentials.json    # OAuth client secret (git-ignored, you create this)
├── token.json          # OAuth access token (git-ignored, auto-created)
├── .env                # GCP config (git-ignored, you create this)
└── requirements.txt

LangGraph Pipeline

resolve_urls → fetch_files → [Send ×N parallel] → summarize_file(s) → synthesize → END

Each file is summarized in parallel using LangGraph's Send API (map-reduce pattern). Large files (>6,000 tokens) are automatically chunked before summarization.
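The chunk-then-summarize idea can be sketched with the standard library alone. Note the substitutions: a thread pool stands in for LangGraph's Send fan-out, whitespace-separated words approximate tokens, and `summarize` is a hypothetical callable in place of the Vertex AI call. None of the function names below come from GraphReel's source.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

MAX_TOKENS = 6000  # chunking threshold; words are a crude stand-in for tokens


def chunk_text(text: str, max_tokens: int = MAX_TOKENS) -> List[str]:
    """Split text into pieces of at most max_tokens whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + max_tokens]) for i in range(0, len(words), max_tokens)]


def summarize_file(text: str, summarize: Callable[[str], str]) -> str:
    """Summarize one file, chunking it first if it exceeds the threshold."""
    chunks = chunk_text(text)
    if len(chunks) <= 1:
        return summarize(text)
    # Summarize each chunk, then summarize the combined partial summaries.
    partials = [summarize(chunk) for chunk in chunks]
    return summarize("\n".join(partials))


def summarize_all(files: List[str], summarize: Callable[[str], str]) -> List[str]:
    """Map step: summarize every file concurrently, preserving input order."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda text: summarize_file(text, summarize), files))
```

The list returned by `summarize_all` corresponds to the input that a synthesize step would reduce into the final narrative.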


Security

  • Credentials are local-only: credentials.json and token.json are git-ignored and never transmitted
  • Read-only access: The app only requests drive.readonly scope — no ability to modify or delete files
  • Token refresh: OAuth tokens auto-refresh; re-authenticate only if you revoke access or delete token.json

License

This project is licensed under the MIT License. See LICENSE for details.

Contributing

Contributions are welcome! Please feel free to submit issues and pull requests.

Acknowledgments

About

GraphReel is an open source tool to generate explainer videos from Google Drive folders and linked web content. GraphReel uses LangGraph to orchestrate multi-agent AI workflows for complex content summarization and video generation.
