Turn Google Drive files into explainer videos, powered by LangGraph and Vertex AI.
Paste Drive file or folder URLs → the app walks the hierarchy, extracts text, summarizes each file in parallel, then synthesizes everything into:
- Narrative Briefing — flowing prose designed for reading aloud as a video script
- Structured Digest — organized headers and bullets for quick reference
- Explainer Video — short video with AI-generated imagery, TTS audio, and dynamic overlays.
You can also paste in public URLs for additional context.
GraphReel is a production-ready pipeline that transforms collections of Drive documents into cohesive narratives. It's built on LangGraph (for orchestration), Google Vertex AI (for summarization and content generation), and Streamlit (for the web UI).
The pipeline:
- Resolves Drive URLs into files (recursively traversing folders)
- Extracts text from Docs, PDFs, and Markdown
- Summarizes each file in parallel using LLMs
- Synthesizes summaries into narrative prose and structured bullets
- Researches topics mentioned in the narrative (optional web search)
- Augments with web findings for completeness
- Generates video with AI images, narration, and overlays (optional)
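The first step, resolving Drive URLs, comes down to pulling a file or folder ID out of a share link. The snippet below is a minimal sketch of that idea using only the standard library. `parse_drive_url` and the pattern list are hypothetical names for illustration and may differ from what `drive/resolver.py` does; the URL shapes covered are common Drive and Docs share links, not an exhaustive list.

```python
import re
from urllib.parse import urlparse, parse_qs

# Path patterns for common Drive/Docs share links (not exhaustive).
_PATH_PATTERNS = [
    ("folder", re.compile(r"/drive/(?:u/\d+/)?folders/([\w-]+)")),
    ("file", re.compile(r"/file/d/([\w-]+)")),
    ("file", re.compile(r"/document/d/([\w-]+)")),
]


def parse_drive_url(url):
    """Return (kind, id) for a Drive URL, or None if it is not recognized.

    Hypothetical helper for illustration; not the project's actual resolver.
    """
    parsed = urlparse(url)
    if parsed.netloc not in ("drive.google.com", "docs.google.com"):
        return None
    for kind, pattern in _PATH_PATTERNS:
        match = pattern.search(parsed.path)
        if match:
            return kind, match.group(1)
    # Older links carry the ID in a query parameter: /open?id=<ID>.
    ids = parse_qs(parsed.query).get("id")
    if ids:
        return "unknown", ids[0]
    return None
```

A link of the `open?id=` form does not reveal whether the ID is a file or a folder, so the sketch returns `"unknown"` and leaves that question to a later Drive API metadata lookup.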
Watch this video on YouTube to see a demo of GraphReel and learn more about the architecture.
- Python 3.11+
- Google Cloud project with billing enabled
- gcloud CLI installed and authenticated:
gcloud auth application-default login
In the GCP Console, enable:
- Google Drive API
- Google Docs API
- Vertex AI API
- Cloud Text-to-Speech API (for video generation)
- Vertex AI Imagen API (for AI image generation)
- Go to GCP Console → APIs & Services → Credentials
- Click Create Credentials → OAuth 2.0 Client ID
- Application type: Desktop app
- Name: GraphReel Local
- Download JSON and save as credentials.json in the project root
- Go to OAuth consent screen
- User type: External
- Fill app name and contact email
- Add scope: https://www.googleapis.com/auth/drive.readonly
- Add yourself as a test user
- Save (no need to publish)
cp .env.example .env
Edit .env:
GCP_PROJECT=your-project-id
GCP_LOCATION=us-central1
Note: Always use us-central1 (some Gemini models return 404 from other regions).
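As a rough illustration of how these two settings might be read and validated at startup, here is a small sketch. It assumes the values from `.env` have already been loaded into the process environment (how the app does that is not shown here), and `load_gcp_config` is a hypothetical helper, not a function from the project.

```python
import os

DEFAULT_LOCATION = "us-central1"  # region recommended in the note above


def load_gcp_config(env=None):
    """Read GCP settings from an environment mapping (hypothetical helper)."""
    env = os.environ if env is None else env
    project = env.get("GCP_PROJECT", "").strip()
    # Reject a missing value and the untouched placeholder from the example.
    if not project or project == "your-project-id":
        raise ValueError("Set GCP_PROJECT in .env to your real project ID")
    # Fall back to the recommended region when GCP_LOCATION is unset or blank.
    location = env.get("GCP_LOCATION", "").strip() or DEFAULT_LOCATION
    return {"project": project, "location": location}
```

Failing fast on the placeholder project ID surfaces a configuration mistake before any Vertex AI call is attempted.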
macOS / Linux:
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -r requirements.txt
python3 -m streamlit run app.py
Windows (PowerShell):
py -3 -m venv .venv
.\.venv\Scripts\Activate.ps1
py -3 -m pip install -r requirements.txt
py -3 -m streamlit run app.py
On first run, click Connect Google Drive, approve access in your browser, and the app auto-refreshes. Credentials are saved to token.json.
| Type | Extraction Method |
|---|---|
| Google Docs | Exported as plain text via Drive API |
| PDF | Parsed with pypdf |
| Plain text / Markdown | Downloaded directly |
Google Sheets, Slides, and unsupported binary files are skipped with warnings.
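The table above amounts to a lookup from MIME type to extraction method, with everything else skipped and reported. The sketch below shows one way to express that. It assumes file metadata dicts with the Drive API's `name` and `mimeType` fields; `EXTRACTORS`, `plan_extraction`, and the method labels are hypothetical names for illustration and are not taken from `drive/extractor.py`.

```python
# MIME types as reported in Drive API file metadata.
EXTRACTORS = {
    "application/vnd.google-apps.document": "export_as_text",
    "application/pdf": "parse_with_pypdf",
    "text/plain": "download",
    "text/markdown": "download",
}


def plan_extraction(files):
    """Split file metadata dicts into (to_extract, warnings).

    Hypothetical helper: it only decides what to do with each file and
    performs no network calls.
    """
    to_extract, warnings = [], []
    for f in files:
        method = EXTRACTORS.get(f["mimeType"])
        if method is None:
            # Sheets, Slides, and other unsupported types end up here.
            warnings.append(
                f"Skipped {f['name']}: unsupported type {f['mimeType']}"
            )
        else:
            to_extract.append((f["name"], method))
    return to_extract, warnings
```

Keeping the decision separate from the download makes the skip-with-warning behavior easy to test without Drive credentials.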
- Architecture — Detailed overview of the LangGraph pipeline, state management, and technology stack
GraphReel/
├── app.py # Streamlit UI (thin layer)
├── pipeline/
│ ├── state.py # LangGraph TypedDict state definitions
│ ├── prompts.py # All LLM prompts
│ ├── nodes.py # LangGraph node functions
│ └── graph.py # Graph assembly + stream_graph()
├── drive/
│ ├── auth.py # OAuth flow + token management
│ ├── resolver.py # URL parsing + recursive folder listing
│ └── extractor.py # File content extraction
├── credentials.json # OAuth client secret (git-ignored, you create this)
├── token.json # OAuth access token (git-ignored, auto-created)
├── .env # GCP config (git-ignored, you create this)
└── requirements.txt
resolve_urls → fetch_files → [Send ×N parallel] → summarize_file(s) → synthesize → END
Each file is summarized in parallel using LangGraph's Send API (map-reduce pattern). Large files (>6,000 tokens) are automatically chunked before summarization.
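To make the chunk-then-summarize flow concrete, here is a standard-library sketch. It deliberately uses a thread pool as a stand-in for LangGraph's Send API, so it shows the shape of the map step rather than the project's graph code. Counting whitespace-separated words is only a rough proxy for model tokens, and `chunk_text`, `summarize_file`, `summarize_all`, and the `summarize_chunk` callable are all hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_TOKENS = 6000  # chunking threshold mentioned above


def chunk_text(text, max_tokens=MAX_TOKENS):
    """Split text into chunks of at most max_tokens whitespace-separated words.

    Word count is only a rough proxy for model tokens (an assumption here).
    """
    words = text.split()
    return [
        " ".join(words[i:i + max_tokens])
        for i in range(0, len(words), max_tokens)
    ]


def summarize_file(text, summarize_chunk):
    """Summarize each chunk of one file, then join the partial summaries."""
    partials = [summarize_chunk(chunk) for chunk in chunk_text(text)]
    return " ".join(partials)


def summarize_all(texts, summarize_chunk, max_workers=4):
    """Map step: summarize every file concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(
            pool.map(lambda t: summarize_file(t, summarize_chunk), texts)
        )
```

In this sketch `summarize_chunk` would wrap the LLM call; the ordered list returned by `summarize_all` is what a reduce step such as `synthesize` would then consume.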
- Credentials are local-only: credentials.json and token.json are git-ignored and never transmitted
- Read-only access: The app requests only the drive.readonly scope, so it cannot modify or delete files
- Token refresh: OAuth tokens auto-refresh; re-authenticate only if you revoke access or delete token.json
This project is licensed under the MIT License. See LICENSE for details.
Contributions are welcome! Please feel free to submit issues and pull requests.
- Built with LangGraph for orchestration
- Powered by Google Vertex AI for LLMs and image generation
- UI built with Streamlit
- Video generation with moviepy and Pillow



