TaleWeaver

TaleWeaver turns everyday objects into magical stories.

A voice-first interactive storytelling app for kids, powered by Google Gemini Live and Nano Banana 2 🍌.

Children simply pick a storyteller, hold up a toy or draw an idea, and begin a real-time conversation where the AI and child co-create a story together.

Using Gemini Live, the storyteller speaks, listens, adapts to interruptions, and generates illustrations as the adventure unfolds.

Kids don't just listen to a story — they shape it.

▶ Watch the demo

Meet TaleWeaver

The Storytellers

TaleWeaver is powered by Gemini Live's native audio model — which means each character has a real voice, speaks fluently in their own language, and holds a genuine back-and-forth conversation with the child. Not text-to-speech. Not a chatbot. A living storyteller.

5 English storytellers. 5 world-language storytellers.

Note: World-language characters are experimental. Gemini Live narrates fluently in each language, but image generation requires translating the scene description to English internally — which can break the seamlessness of the experience. English storytellers are the primary, fully-tested experience.

Character	Language	Style
Wizard Wally	English	Magical adventures
Fairy Flora	English	Enchanted fairy tales
Captain Coco	English	Pirate adventures
Robo Ricky	English	Sci-fi robot stories
Rajkumari Meera	English (Indian accent)	Indian folk tales
Dadi Maa	Hindi	Traditional bedtime stories
Raja Vikram	Tamil	Legendary Tamil tales
Yé Ye	Mandarin	Wise storytelling
Abuelo Miguel	Spanish	Warm family stories
Mamie Claire	French	Cozy storybook adventures

The Experience

Children start a story in three magical ways:

1. Pick a Theme

Choose from adventure themes or life-skills topics

Or type anything their imagination invents.

If a custom theme isn't appropriate for children, a friendly message blocks it before the story starts.

2. Magic Camera

Hold up any toy or object.

The AI will:

Recognise the object
Transform it into a storybook character
Turn it into the hero of the story

Examples:

Stuffed penguin   → A fluffy penguin pal
LEGO rocket    → Galactic rescue pilot

If the object shown is inappropriate, the safety filter blocks it before the story starts.

3. Sketch a Theme

Kids can draw anything on a canvas.

The AI turns the drawing into a storybook illustration and starts a story around it.

Draw mountains, a house, a robot, a dragon, a castle, a flying whale — and watch it come to life. If the drawing is inappropriate, the safety filter blocks it before the story starts.

A Living, Illustrated Story

Unlike traditional story generators, TaleWeaver is fully conversational. Children can interrupt the storyteller, change the story direction, add characters, and invent new twists at any moment.

AI:    The ant, the ladybug, and the fairy were tired after their long journey...

Child: Make a cloud their bed!

AI:    And just like that, a soft fluffy cloud floated down,
       curling around them like the cosiest bed in the world!

Barge-in is native — Gemini detects when the child starts speaking, stops the current narration, and weaves their words into the next story beat.

As the story unfolds, illustrations appear automatically. Gemini decides the right visual moment — a new location, character reveal, or dramatic transformation — and generates an image from its own scene description, so it always matches what was just narrated. Each new image receives the previous image as context, keeping characters and art style consistent across every scene.

Creativity Rewards

TaleWeaver recognises and celebrates when a child contributes something genuinely imaginative.

When a child suggests a wild idea, invents a new character name, or takes the story in an unexpected direction, Gemini awards them a creativity badge on the spot.

The badge appears in the centre of the screen and auto-dismisses after a few seconds — a small moment of delight that tells the child their imagination matters.

Badges are saved with the story and shown in the Story Recap and Past Adventures gallery.

Story Recap

When the adventure ends, the app generates a storybook recap.

All session images are sent to Gemini, which generates a storybook title and a narration for each scene. Original session images are reused — no new images generated during recap.

Children get a scrollable storybook with title, illustrated scenes, narration captions, and creativity badges. All saved to the Past Adventures gallery.

All completed stories are saved locally and accessible from the landing page. Tap any card to re-read the full storybook.

Key Features

Barge-in voice conversation — Gemini Live listens while it speaks; kids interrupt naturally mid-sentence
Autonomous illustrations — the agent decides the right visual moment and writes its own scene description
Visual continuity — each image is generated with the previous as context; characters stay consistent
Magic Camera — hold up any toy; Gemini names it, draws it, makes it the hero
Sketch-to-story — draw anything on the canvas; it becomes the opening illustration
10 storyteller characters — 5 English + 5 world-language (Hindi, Tamil, Mandarin, Spanish, French), each language-locked
Creativity badges — the agent awards a badge when a child contributes something genuinely imaginative
Story recap storybook — illustrated scenes + narrations generated from the session; saved to Past Adventures
Kid-safe content moderation — themes, camera props, and sketches safety-checked before the story starts
Pause / resume — session and WebSocket stay alive while paused

Architecture

Four Gemini models, each with a distinct role:

Model	Role
`gemini-live-2.5-flash-native-audio`	The Agent — real-time voice, barge-in, autonomous tool calls
`gemini-3.1-flash-image-preview`	Scene illustration generation
`gemini-2.5-flash-lite`	Content moderation + story recap
`gemini-2.5-flash-preview-tts`	Character voice for prop/sketch labels

The Agent

gemini-live-2.5-flash-native-audio drives the entire experience from within a single real-time audio session. It holds two tools and calls them autonomously mid-narration:

generate_illustration — decides the right visual moment, writes its own scene description, triggers image generation. No external timer or trigger.
award_badge — recognises genuine creative contributions from the child and silently awards a badge.

The Backend Proxy

The browser can't connect to Gemini Live directly — Vertex AI requires server-side GCP auth and session orchestration. The FastAPI backend handles authentication, injects character system prompts, routes tool calls, and sends the "Begin!" trigger only after both proxy tasks are running — ensuring no audio is dropped at session start.

Audio Pipeline

Capture: Mic → 16kHz PCM → WebSocket → Gemini Live
Playback: Gemini → 24kHz PCM → precisely scheduled AudioWorklet → speakers

Deployment

Single Cloud Run service — Node 22 builds the React frontend, Python 3.13 serves both API and static files. Push to main → Cloud Build → Artifact Registry → Cloud Run.

For a detailed step-by-step flow of how a session works end-to-end, see HOW_IT_WORKS.md.

Built With

AI Models & APIs

Model	Role
`gemini-live-2.5-flash-native-audio`	The Agent — real-time voice conversation, barge-in, and autonomous tool calls (`generate_illustration`, `award_badge`) via Gemini Live API (Vertex AI)
`gemini-3.1-flash-image-preview`	Storybook illustration generation from scene descriptions
`gemini-2.5-flash-lite`	Content moderation (themes, sketches, camera props) + story recap titles and narrations
`gemini-2.5-flash-preview-tts`	Character TTS — speaks prop/sketch label in the character's voice on theme select

SDKs & Frameworks

SDK / Framework	Usage
Google GenAI Python SDK (`google-genai`)	Image generation, scene extraction, story recap, TTS — all Gemini API calls on the backend
Gemini Live API (Vertex AI WebSocket)	Bidirectional real-time audio streaming for voice storytelling sessions
FastAPI	Python backend — WebSocket proxy, REST endpoints, SPA serving
React 18 + Vite + TypeScript	Frontend SPA
TailwindCSS v3	Styling
Framer Motion	Character and UI animations
Web Audio API + AudioWorklet	16kHz mic capture and 24kHz PCM playback in the browser

Google Cloud Services

Service	Role
Vertex AI	Hosts Gemini Live API and Flash Lite model endpoints
Cloud Run	Single-service deployment — serves both the FastAPI backend and compiled React frontend
Cloud Build	CI/CD — auto-deploys on every push to `main` via `cloudbuild.yaml`
Artifact Registry	Stores Docker container images between builds
Secret Manager	Stores `GEMINI_API_KEY` securely, injected at runtime
Application Default Credentials (ADC)	Authenticates backend to Vertex AI without embedded keys

Languages & Tools

Python 3.13 — backend runtime
Node 22 — frontend build (multi-stage Docker)
Docker — containerisation
WebSockets — browser ↔ backend and backend ↔ Gemini Live bidirectional streaming

Try It Out

Both deployments run on GCP free trial accounts and may be unavailable if the trials have expired:

https://taleweaver.online (also at https://taleweaver-950758825854.us-central1.run.app) — primary (no account, no setup required)
https://taleweaver-319719246868.us-central1.run.app — backup (teammate's account)

Note: If the deployed versions are not working at the time of testing, try running locally.

Open the app on a device with a microphone
Click Begin Your Adventure → pick a storyteller → choose a theme, use Magic Camera, or draw a Sketch
Allow microphone access — the story starts immediately
Speak to redirect the story, interrupt mid-sentence, or suggest ideas

Running Locally

Prerequisites: Python 3.12+, Node.js 18+, uv, a Google Cloud project with Vertex AI enabled, and a Gemini API key.

First time? See QUICKSTART.md for step-by-step GCP setup (account, Vertex AI, gcloud auth).

1. Clone and install

git clone https://github.com/padmanabhan-r/TaleWeaver.git
cd TaleWeaver
uv sync && source .venv/bin/activate

2. Configure environment

cp backend/.env.example backend/.env

Fill in backend/.env:

GOOGLE_CLOUD_PROJECT=your-project-id
GOOGLE_CLOUD_LOCATION=us-central1
GOOGLE_GENAI_USE_VERTEXAI=true
GEMINI_API_KEY=your-gemini-api-key        # from Google AI Studio
IMAGE_MODEL=gemini-3.1-flash-image-preview
IMAGE_LOCATION=global

3. Authenticate with Google Cloud

gcloud auth application-default login

4. Run

One command (recommended):

./start.sh

Starts backend on port 8000 and frontend on port 8080. Press Ctrl+C to stop both.

Or run separately:

# Terminal 1 — backend
cd backend && uvicorn main:app --reload --port 8000

# Terminal 2 — frontend
cd frontend && npm install && cp .env.example .env.local && npm run dev

	URL
App	http://localhost:8080
Backend	http://localhost:8000
Health	http://localhost:8000/api/health

Deploying to Google Cloud

The app deploys as a single Cloud Run service via Cloud Build CI/CD. For full instructions — Secret Manager setup, Cloud Build trigger, and manual deploy — see DEPLOYMENT_INSTRUCTIONS.md.

Roadmap

The immediate focus is polish — fine-tuning prompts, hardening edge cases, and perfecting the end-to-end experience. World-language characters in particular need more testing and refinement before they're ready for everyday use.

Native iOS and Android apps are the natural next step to put TaleWeaver where children actually are, with proper mobile audio handling and offline resilience.

Feature	Notes
Live Camera in Story Mode	Prototyped but pulled — camera active during narration caused Gemini to break story focus and acknowledge the camera directly. Needs a more seamless integration.
Rive Animated Characters	Replace Framer Motion portraits with Rive state machine animations — real lip-sync tied to audio amplitude. Blocked on Rive asset creation for all 10 characters.
Learning Mode	Storyteller weaves curriculum goals (phonics, counting, colours) into the narrative without the child realising they're learning.
Cloud Storage for Past Adventures	Move from `localStorage` to GCS — images persist across devices and sessions indefinitely, no 20-story cap.
Parent Dashboard	Session summaries, badge history, themes explored — a window into how your child's imagination works.

Name		Name	Last commit message	Last commit date
Latest commit History 253 Commits
backend		backend
frontend		frontend
images		images
miscellaneous		miscellaneous
.dockerignore		.dockerignore
.gitignore		.gitignore
.python-version		.python-version
CHANGELOG.md		CHANGELOG.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
DEPLOYMENT_INSTRUCTIONS.md		DEPLOYMENT_INSTRUCTIONS.md
Dockerfile		Dockerfile
HOW_IT_WORKS.md		HOW_IT_WORKS.md
LICENSE		LICENSE
QUICKSTART.md		QUICKSTART.md
README.md		README.md
SECURITY.md		SECURITY.md
architecture-v1.gif		architecture-v1.gif
architecture-v1.svg		architecture-v1.svg
cloudbuild.yaml		cloudbuild.yaml
pyproject.toml		pyproject.toml
start.sh		start.sh
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

TaleWeaver

Contents

Meet TaleWeaver

The Storytellers

The Experience

1. Pick a Theme

2. Magic Camera

3. Sketch a Theme

A Living, Illustrated Story

Creativity Rewards

Story Recap

Key Features

Architecture

The Agent

The Backend Proxy

Audio Pipeline

Deployment

Built With

AI Models & APIs

SDKs & Frameworks

Google Cloud Services

Languages & Tools

Try It Out

Running Locally

1. Clone and install

2. Configure environment

3. Authenticate with Google Cloud

4. Run

Deploying to Google Cloud

Roadmap

About

Uh oh!

Releases 2

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

TaleWeaver

Contents

Meet TaleWeaver

The Storytellers

The Experience

1. Pick a Theme

2. Magic Camera

3. Sketch a Theme

A Living, Illustrated Story

Creativity Rewards

Story Recap

Key Features

Architecture

The Agent

The Backend Proxy

Audio Pipeline

Deployment

Built With

AI Models & APIs

SDKs & Frameworks

Google Cloud Services

Languages & Tools

Try It Out

Running Locally

1. Clone and install

2. Configure environment

3. Authenticate with Google Cloud

4. Run

Deploying to Google Cloud

Roadmap

About

Topics

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 2

Contributors

Uh oh!

Languages