DeckSpec JSON is the source of truth. The .pptx file is a rendered artifact that can be regenerated after edits.
This project implements a Python-based LangGraph agent that:
- creates a presentation from a natural-language request;
- stores an editable deck.spec.json;
- accepts follow-up edit requests;
- plans visual layouts, callouts, cards, and decorative accents;
- searches licensed image sources through a provider abstraction;
- downloads and caches local image assets for image-based slides;
- re-renders a new .pptx version without overwriting previous ones.
The implementation keeps the same core principle: the LLM plans content and visuals, and deterministic Python code assembles the .pptx.
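The "spec is the source of truth" idea can be illustrated with a minimal sketch. The project itself uses Pydantic models; this version uses only standard-library dataclasses to stay dependency-free, and the field names (`deck_id`, `theme`, `layout`) and the `render_outline` helper are hypothetical stand-ins, not the project's actual schema or renderer.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SlideSpec:
    title: str
    bullets: list[str] = field(default_factory=list)
    layout: str = "title_and_bullets"  # hypothetical layout name


@dataclass
class DeckSpec:
    deck_id: str
    theme: str
    slides: list[SlideSpec] = field(default_factory=list)

    def to_json(self) -> str:
        # ensure_ascii=False keeps Cyrillic slide text readable in the file
        return json.dumps(asdict(self), ensure_ascii=False, indent=2)

    @classmethod
    def from_json(cls, raw: str) -> "DeckSpec":
        data = json.loads(raw)
        slides = [SlideSpec(**item) for item in data.pop("slides", [])]
        return cls(slides=slides, **data)


def render_outline(spec: DeckSpec) -> list[str]:
    """Stand-in for the deterministic renderer: the same spec always
    produces the same output, so the artifact can be regenerated."""
    lines = []
    for number, slide in enumerate(spec.slides, start=1):
        lines.append(f"{number}. {slide.title} [{slide.layout}]")
        lines.extend(f"   - {bullet}" for bullet in slide.bullets)
    return lines


spec = DeckSpec(
    deck_id="demo",
    theme="investor_modern",
    slides=[SlideSpec("Market overview", ["Agents automate workflows"])],
)
restored = DeckSpec.from_json(spec.to_json())
```

Because the spec survives a JSON round trip unchanged, any edit can be applied to the stored spec and the output rebuilt from it, which is the property the real renderer relies on.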
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
Copy .env.example to .env and set:
OPENROUTER_API_KEY=
MODEL_NAME=openai/gpt-oss-120b
OPENROUTER_BASE_URL=https://openrouter.ai/api/v1
IMAGE_PROVIDER=wikimedia
PEXELS_API_KEY=
UNSPLASH_ACCESS_KEY=
PIXABAY_API_KEY=
RESEARCH_PROVIDER=wikipedia
TAVILY_API_KEY=
PPTX_TEMPLATE_PATH=templates/default_template.pptx
If OPENROUTER_API_KEY is missing, the agent still works with deterministic heuristic fallbacks for create/edit flows.
The scriptwriter sub-agent researches the topic on the web to ground slide content in real facts. RESEARCH_PROVIDER=wikipedia works with no key; set RESEARCH_PROVIDER=tavily and TAVILY_API_KEY=... for full web search (it falls back to Wikipedia if the key is missing).
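The fallback rules described above can be sketched as two small functions. This is an illustration under stated assumptions, not the project's actual configuration code: the function names are hypothetical, and defaulting to Wikipedia when RESEARCH_PROVIDER is unset is an assumption.

```python
import os
from typing import Mapping


def resolve_research_provider(env: Mapping[str, str]) -> str:
    """Pick the research backend; Tavily needs a key, Wikipedia does not."""
    # Assumption: an unset or empty value means "wikipedia".
    provider = env.get("RESEARCH_PROVIDER", "").strip().lower() or "wikipedia"
    if provider == "tavily" and not env.get("TAVILY_API_KEY", "").strip():
        return "wikipedia"  # documented fallback when the key is missing
    return provider


def resolve_planner_mode(env: Mapping[str, str]) -> str:
    """Return "llm" when an OpenRouter key is set, otherwise "heuristic"."""
    return "llm" if env.get("OPENROUTER_API_KEY", "").strip() else "heuristic"


if __name__ == "__main__":
    # Inspect what the current shell environment would select.
    print(resolve_research_provider(os.environ), resolve_planner_mode(os.environ))
```

Passing the environment in as a mapping, rather than reading `os.environ` inside the functions, keeps the rules easy to test with plain dictionaries.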
Interactive stage-by-stage mode (the agent walks you through script → style → visuals → packaging → refinement, pausing for review at each stage):
python -m app.cli chat
# or seed the topic up front:
python -m app.cli chat "Презентация на 8 слайдов про рынок AI-агентов для инвесторов"
At each stage you can edit the brief fields (by number or name, e.g. аудитория: инвесторы, meaning "audience: investors"), rename slides and rewrite bullets, pick a theme, and toggle images. After the first render you can keep requesting free-form edits such as "добавь слайд про риски" ("add a slide about risks") or "сделай 3 слайд короче" ("make slide 3 shorter"); each edit produces a new version without overwriting the previous .pptx.
To improve an existing presentation instead of starting from scratch, import a .pptx:
python -m app.cli chat --pptx path/to/existing.pptx
The deck is parsed into an editable DeckSpec (titles, bullets, tables, embedded images, and the source theme/colors), then you go straight to the style → visuals → refine stages.
Create a deck:
python -m app.cli create \
"Сделай презентацию на 8 слайдов про рынок AI-агентов для инвесторов" \
--visual-rich \
--theme investor_modern \
--image-provider wikimedia
Edit an existing deck:
python -m app.cli edit \
--spec output/<deck_id>/v001/deck.spec.json \
"Сделай презентацию визуально богаче, добавь картинки и карточки"
Useful flags:
- --visual-rich: prefer richer visual layouts
- --no-images: disable remote image fetch and rely on decorative fallbacks
- --image-provider wikimedia|pexels: choose the image source
- --theme investor_modern|startup_dark|consulting_clean|minimal_blue|corporate_light|tech_gradient
- --template templates/default_template.pptx: use an optional PowerPoint template
- app/deck/models.py and app/deck/visual_models.py: Pydantic models for DeckSpec, themes, visual layouts, and edits.
- app/deck/validator.py: schema-adjacent business validation.
- app/deck/visual_validator.py: visual quality rules and repair triggers.
- app/deck/renderer.py: layout-based PptxAssembler with cards, decorations, and image slots.
- app/deck/patcher.py: safe structured edit application.
- app/assets/*: image provider abstraction, search, download, and cache metadata.
- app/design/*: layout constants, themes, decorations, and visual rules.
- app/storage/files.py: versioned output directories and spec persistence.
- app/agents/*: LangGraph state, prompts, nodes, and graph assembly.
- app/cli.py: local create/edit commands.
- app/api/*: optional FastAPI wrapper.
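The versioned output layout (for example output/<deck_id>/v001/deck.spec.json) can be sketched with the standard library. This is a minimal illustration of the "never overwrite a previous version" rule; the helper name `next_version_dir` and the zero-padded naming logic are assumptions inferred from the v001 path above, not the actual contents of app/storage/files.py.

```python
import re
import tempfile
from pathlib import Path

# Matches directory names such as v001, v002, v010 (assumed naming scheme).
_VERSION_RE = re.compile(r"^v(\d{3,})$")


def next_version_dir(deck_root: Path) -> Path:
    """Create and return the next vNNN directory under deck_root."""
    deck_root.mkdir(parents=True, exist_ok=True)
    versions = []
    for child in deck_root.iterdir():
        match = _VERSION_RE.match(child.name)
        if child.is_dir() and match:
            versions.append(int(match.group(1)))
    new_dir = deck_root / f"v{max(versions, default=0) + 1:03d}"
    # mkdir without exist_ok fails loudly instead of reusing a version
    new_dir.mkdir()
    return new_dir


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "output" / "demo_deck"
    first = next_version_dir(root)
    second = next_version_dir(root)
    print(first.name, second.name)  # v001 v002
```

Each render would write its deck.spec.json and .pptx into the freshly created directory, so earlier versions stay untouched on disk.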
- Layouts are deterministic by design; this keeps rendering reliable but not free-form.
- External .pptx inspection recovers titles (from title placeholders), bullets, tables, embedded images, and the source theme colors/fonts; it does not reconstruct original per-shape positioning or animations.
- Edit interpretation is strongest for concise Russian requests and can fall back to safe minimal operations.
- Wikimedia image search works without an API key, but query quality still affects relevance and runtime.
- Remote image fetching is sequential right now, so image-rich deck generation can be noticeably slower.
- If no API key is configured, the LLM steps use heuristics instead of model generation.
- richer layout engine;
- PDF and image preview export;
- CSV/Excel-to-chart pipeline;
- stronger edit planning for compound requests;
- approval workflow for destructive edits;
- corporate templates and brand kits.