diff --git a/docs/cycles/cycle-17-developer-page.md b/docs/cycles/cycle-17-developer-page.md
new file mode 100644
index 0000000..1361858
--- /dev/null
+++ b/docs/cycles/cycle-17-developer-page.md
@@ -0,0 +1,116 @@
+# Cycle 17 — developer documentation pages
+
+## Goal
+Fill the five `docs/developers/*` stubs at their **canonical paths** (no renames) with accurate,
+verified content so a competent developer can understand the architecture, embed vivify in their own
+app, and contribute. The README (the "For developers" line) and the docs landing
+(`docs/README.md`) already link to all five, so the links resolve today but lead to "🚧 Coming soon"
+placeholders. This cycle makes them real.
+
+The **audience flips** here. Every other docs page targets the zero-assumptions nostalgia visitor;
+these target a developer. Per the parked `vision-and-docs-spec.md`, the tone stays warm,
+second-person, present-tense, and quietly funny — but technical density is fine, and any nostalgia
+wink lives only in a clearly-set-off aside that's deletable without losing instruction. The spec's
+`developers/*` map is the page contract:
+
+- `overview.md` — the openness pitch: framework-agnostic, embed in anything (the heavy lift).
+- `quickstart.md` — install → ~10-line embed, copy-paste, browser fallback voice by default.
+- `api.md` — engine API reference.
+- `providers.md` — writing a custom `TtsProvider`; fallback vs authentic.
+- `bundles.md` — `acs2bundle`; hosting characters on a CDN.
+
+**Docs only — no code; CI stays green.**
+
+## Verified facts the pages use (no guessing — cross-checked against the repo)
+
+**Packages / data flow** (link, don't duplicate, [`docs/architecture.md`](../architecture.md)):
+- `@vivify/types` — shared contracts: `CharacterModel` (the superset IR), `TtsProvider`, `TtsResult`,
+  `MouthEvent` (`packages/types/src/index.ts`).
+- `@vivify/core` — framework-agnostic engine; load → render/composite → animate → speak.
+- `@vivify/acs` — `.acs` parser + `acs2bundle` CLI; same module in Node and the browser.
+- `@vivify/voice-truvoice` — `TruVoiceProvider` (authentic) + `WebSpeechProvider` (browser fallback).
+- `services/voice-server` (`@vivify/voice-server`) — Dockerized Wine + SAPI4 + TruVoice; `POST /tts`.
+- `apps/mash` (package name `mash`) — the showcase/dogfood, built only on the public API.
+
+**Public API** (`@vivify/core`, verified — `packages/core/src/{agent,types,index}.ts`):
+- `createAgent(source: ArrayBuffer | CharacterBundleRef, mount?, opts?): Promise`
+- `createAgentFromModel(model: CharacterModel, mount?, opts?): Agent`
+- `CreateAgentOptions { clock?, provider?, rng?, audio? }` — **`provider` defaults to the silent
+  `StubTtsProvider`.**
+- `Agent { show(); hide(); play(name); animations(): string[]; speak(text, opts?); moveTo(x, y, opts?);
+  gestureAt(x, y); stopCurrent(); stop(); on(event, handler); dispose() }` — every action enqueues
+  and runs in order.
+- `SpeakOptions { hold?, provider? }`, `MoveOptions { speed? }`, `AgentEvent` union,
+  `CharacterBundleRef { manifestUrl }`. Also exported: `ActionQueue`, `Playback`, `WebAudioSink`,
+  `StubTtsProvider`, lip-sync helpers; the `@vivify/types` contracts are re-exported.
+
+**TtsProvider seam** (`packages/types/src/index.ts`):
+- `interface TtsProvider { speak(text, voice: VoiceConfig, signal?: AbortSignal): Promise }`
+- `TtsResult { audio: ArrayBuffer; mouthTimeline: MouthEvent[] }`
+- Implementations: `StubTtsProvider` (silent default, in core), `WebSpeechProvider` (browser fallback,
+  in voice-truvoice), `TruVoiceProvider` (authentic, POSTs `${url}/tts`).
+
+**Commands (only these exist — verified against `package.json`):**
+- Node `>=20`, `pnpm@9.15.0`; workspaces `packages/*`, `services/*`, `apps/*`.
+- Root: `pnpm -r typecheck`, `pnpm -r test`, `pnpm lint` (`eslint .`), `pnpm format`
+  (`prettier --check .`), `pnpm format:write`.
+- MASH dev: `pnpm --filter mash dev` (Vite).
+- `acs2bundle`: bin in `@vivify/acs` → `pnpm --filter @vivify/acs exec acs2bundle `
+  (usage string in `cli.ts`: `acs2bundle `).
+- Voice server (Node): `pnpm --filter @vivify/voice-server build && … start`. Docker: `docker compose
+  up` (compose service `voice` :8080, `mash` :8090).
+
+## Honesty call — packages are not published to npm (handled, not papered over)
+Every `@vivify/*` package is `"private": true` at `"version": "0.0.0"` — **none are on npm**. The
+spec's "`npm i` → talking character" is the destination, not today's reality. Per
+[ADR-0025](../decisions/0025-repo-front-door.md) ("ship only what's real, signpost the rest, no
+polish-theater") and the CLAUDE.md honesty rule, `quickstart.md`:
+- leads with the **working** path today — clone, `pnpm install`, use the workspace packages / run MASH
+  (`apps/mash/src/app.ts` is itself a live, ~10-line use of the public API);
+- shows the canonical embed snippet (real, correct API) as the shape consumers will use;
+- frames the `npm i …` line as **"once published,"** not a command that works today.
+
+The other four pages have no such gap — the API, types, CLI, and providers all exist and are verified.
+
+## Pages written
+- `docs/developers/overview.md` (heavy lift): what it is architecturally (packages table + data-flow,
+  linking `architecture.md`); the runs-anywhere pitch (framework-agnostic core, the `TtsProvider`
+  seam, browser voice needs nothing) with the real public-API list; get-running-fast from source; how
+  to extend; contributing/conventions (summarized from CLAUDE.md). One set-off "screenshots coming"
+  note; nav footer.
+- `docs/developers/quickstart.md`: honest install path → the ~10-line embed using
+  `WebSpeechProvider` for an audible zero-backend default (with the silent-default caveat explained).
+- `docs/developers/api.md`: the verified surface, signatures copied from source; action-queue
+  semantics (`stopCurrent` vs `stop`).
+- `docs/developers/providers.md`: the `TtsProvider` interface + `TtsResult`/`MouthEvent`; the three
+  implementations; how to write your own; fallback vs authentic.
+- `docs/developers/bundles.md`: what a bundle is (`sheet.png` + `manifest.json` + `audio/`), the
+  `acs2bundle` invocation, raw-`.acs`-in-browser vs ahead-of-time bundle, hosting via
+  `CharacterBundleRef { manifestUrl }`.
+
+**ADRs referenced:** 0001 (monorepo), 0003 (superset IR/bundle), 0004 (voice needs a backend), 0005
+(pluggable provider), 0006 (MIT, no bundled IP), 0007 (framework-agnostic core), 0012 (core depends on
+acs for the raw-`.acs` path), 0013 (MASH vanilla TS).
+
+## Acceptance check
+- All five `docs/developers/*` pages contain real content (no "🚧 Coming soon" placeholder) and read
+  in the developer voice.
+- Every documented API signature matches `packages/core/src/{agent,types}.ts` /
+  `packages/types/src/index.ts`; every command exists in the relevant `package.json`.
+- No `npm i` line is presented as working today; the not-yet-published status is stated plainly.
+- Every relative link in the new pages resolves (architecture, ADRs, CLAUDE.md, sibling dev pages,
+  `voice/overview.md`, docs home, main README).
+
+## Verification
+- **CI (this repo):** `pnpm -r typecheck && pnpm -r test && pnpm lint && pnpm format` stays green.
+  Docs only, and Markdown is prettier-ignored, so the change touches no code path.
+- **Reviewer (`code-reviewer`):** verifies every signature against source, every command against
+  `package.json`, every link resolves, and the not-published framing is honest — trusting nothing from
+  the draft.
+- **Operator/PO:** read `docs/developers/overview.md` on GitHub for voice + correctness; follow
+  `quickstart.md` against a local clone.
+
+## Non-goals
+Screenshots / GIFs (next cycle — signposted, not faked). No code changes. No package publishing / no
+flipping `private: false`. No edits to `architecture.md`, the ADRs, or `CLAUDE.md` (link only).
+No merge — open a PR (base `main`) and stop.
diff --git a/docs/developers/api.md b/docs/developers/api.md
index 9861af9..23a1aa7 100644
--- a/docs/developers/api.md
+++ b/docs/developers/api.md
@@ -1,14 +1,149 @@
 # Engine API reference
 
-> 🚧 **Coming soon.** This page lands in **Cycle 16**. It's a placeholder for now, so links
-> pointing here already work — no dead ends.
+The public surface of `@vivify/core`. Signatures here are copied from the source
+(`packages/core/src/agent.ts` and `types.ts`); if you find drift, the source wins — please flag it.
 
-**What it'll cover:** the full `@vivify/core` API — every method on the Agent control.
+New here? The **[quickstart](quickstart.md)** shows this API in action in about ten lines.
 
-In the meantime:
+## Creating an agent
 
-- New to all of this? Start with **[What is this?](../what-is-this.md)**.
-- Want to try it right now? See the **[main README](../../README.md)**.
+```ts
+function createAgent(
+  source: ArrayBuffer | CharacterBundleRef,
+  mount?: HTMLElement,
+  opts?: CreateAgentOptions,
+): Promise;
+
+function createAgentFromModel(
+  model: CharacterModel,
+  mount?: HTMLElement,
+  opts?: CreateAgentOptions,
+): Agent;
+```
+
+- **`createAgent`** is the usual entry point. `source` is either a raw `.acs` as an `ArrayBuffer` or a
+  `CharacterBundleRef` pointing at a prebuilt bundle (see **[bundles](bundles.md)**). It loads the
+  character, then mounts. Resolves once the character is ready.
+- **`createAgentFromModel`** skips the loader when you already have a decoded `CharacterModel` (for
+  example, straight from `@vivify/acs`'s `parseAcs`). `createAgent` calls this for you after loading.
+- **`mount`** is the host element the engine renders into. Omit it and the engine falls back to
+  `document.body`.
+
+### `CreateAgentOptions`
+
+```ts
+interface CreateAgentOptions {
+  clock?: Clock; // Injectable clock (default: real timers).
+  provider?: TtsProvider; // Default TTS provider (default: silent StubTtsProvider).
+  rng?: Rng; // RNG for branch/state selection (default: Math.random).
+  audio?: AudioSink; // Audio playback sink (default: Web Audio; tests inject a fake).
+}
+```
+
+The default `provider` is the **silent** `StubTtsProvider` — pass a real one (e.g.
+`WebSpeechProvider` or `TruVoiceProvider`) for audible speech. See **[TTS providers](providers.md)**.
+
+### `CharacterBundleRef`
+
+```ts
+interface CharacterBundleRef {
+  manifestUrl: string; // URL of the bundle's manifest.json.
+}
+```
+
+## The `Agent` control
+
+Returned by `createAgent` / `createAgentFromModel`. It mirrors the classic Microsoft Agent control:
+**every action enqueues and runs in order.**
+
+```ts
+interface Agent {
+  show(): Promise;
+  hide(): Promise;
+  play(animationName: string): Promise;
+  animations(): string[]; // Names of the character's available animations.
+  speak(text: string, opts?: SpeakOptions): Promise;
+  moveTo(x: number, y: number, opts?: MoveOptions): Promise;
+  gestureAt(x: number, y: number): Promise;
+  stopCurrent(): void; // Drop the currently-running queued action.
+  stop(): void; // Clear the queue and return to idle.
+  on(event: AgentEvent, handler: (...a: unknown[]) => void): void;
+  dispose(): void;
+}
+```
+
+### The action queue
+
+The methods that return `Promise` (`show`, `hide`, `play`, `speak`, `moveTo`, `gestureAt`)
+**enqueue** work. Calls play strictly in order, so you can fire a sequence without chaining promises
+yourself:
+
+```ts
+agent.play('Wave'); // runs first
+agent.speak('Hi there!'); // then this
+agent.moveTo(300, 200); // then this
+```
+
+`await` a returned promise when you need to know a specific action finished. Two ways to interrupt:
+
+- **`stopCurrent()`** drops only the action running right now; the rest of the queue continues.
+- **`stop()`** clears the whole queue and returns the character to idle.
+
+`animations()` returns the available animation names (synchronously) — handy for building a UI or for
+validating a name before you `play` it. `dispose()` tears the agent down and releases its host
+element; call it when you're done (e.g. a React effect cleanup).
+
+### `SpeakOptions`
+
+```ts
+interface SpeakOptions {
+  hold?: boolean; // Keep the balloon up after speaking instead of auto-hiding.
+  provider?: TtsProvider; // Override the TTS provider for this one utterance.
+}
+```
+
+`provider` here overrides the agent's default provider for a single `speak` call — useful to mix
+silent and authentic speech, or to point one line at a different voice.
+
+### `MoveOptions`
+
+```ts
+interface MoveOptions {
+  speed?: number; // Movement speed (pixels/second); the engine picks a default if omitted.
+}
+```
+
+### Events
+
+```ts
+type AgentEvent =
+  | 'show'
+  | 'hide'
+  | 'play'
+  | 'speak'
+  | 'move'
+  | 'gesture'
+  | 'idle'
+  | 'command'
+  | 'error';
+```
+
+Register handlers with `agent.on(event, handler)` to react as actions run (for example, re-enabling a
+button on `'idle'`, or surfacing an `'error'`).
+
+## Also exported
+
+`@vivify/core` re-exports the shared contracts from `@vivify/types` for convenience
+(`CharacterModel`, `VoiceConfig`, `TtsProvider`, `TtsResult`, `MouthEvent`, and the model types), and
+exposes the engine building blocks for advanced use: `StubTtsProvider`, `ActionQueue`, `Playback`,
+`WebAudioSink`, and the lip-sync helpers. The everyday path needs none of these — `createAgent` and the
+`Agent` methods above are the whole story.
+
+## Where to next
+
+- **[TTS providers](providers.md)** — the `TtsProvider` seam in detail.
+- **[Character bundles](bundles.md)** — what a `CharacterBundleRef` points at, and how to build one.
+- **[Quickstart](quickstart.md)** — the API in a runnable snippet.
 
 ---
diff --git a/docs/developers/bundles.md b/docs/developers/bundles.md
index 6c19d97..1ed3d76 100644
--- a/docs/developers/bundles.md
+++ b/docs/developers/bundles.md
@@ -1,14 +1,87 @@
 # Character bundles
 
-> 🚧 **Coming soon.** This page lands in **Cycle 16**. It's a placeholder for now, so links
-> pointing here already work — no dead ends.
+A character can run two ways. The browser path is the simplest: hand `@vivify/core` a raw `.acs` as an
+`ArrayBuffer` and it parses and plays it in memory ([ADR-0012](../decisions/0012-core-depends-on-acs.md)).
+But parsing a `.acs` on every page load means shipping the whole binary and decoding it client-side. A
+**bundle** is the ahead-of-time alternative: convert once, serve static web assets, load fast — and
+host them on a CDN.
 
-**What it'll cover:** the `acs2bundle` CLI and hosting prebuilt characters on a CDN.
+## What a bundle is
 
-In the meantime:
+Run the `acs2bundle` CLI (from `@vivify/acs`) and you get a folder with three things:
 
-- New to all of this? Start with **[What is this?](../what-is-this.md)**.
-- Want to try it right now? See the **[main README](../../README.md)**.
+```
+/
+  sheet.png      a packed, transparent sprite-sheet of every unique image in the character
+  manifest.json  the full CharacterModel minus the pixels and WAVs (zod-validated)
+  audio/000.wav  each embedded sound, extracted (only present if the character has sounds)
+  audio/001.wav
+  ...
+```
+
+The `manifest.json` is the serialized superset IR — animations, frame branching, balloon and voice
+config, the mouth-overlay data, the state map, plus an atlas mapping each sprite to its rectangle in
+`sheet.png`. It's the same `CharacterModel` the in-browser parser produces, just split from its binary
+assets so the browser can fetch them as ordinary files. (Why a superset and not a clippy-compatible
+format? [ADR-0003](../decisions/0003-superset-bundle-format.md).)
+
+## Converting a `.acs`
+
+`acs2bundle` is the CLI shipped by `@vivify/acs`.
+From the repo:
+
+```bash
+pnpm --filter @vivify/acs exec acs2bundle
+```
+
+For example:
+
+```bash
+pnpm --filter @vivify/acs exec acs2bundle ./Genie.acs ./public/characters/genie
+```
+
+It prints a summary (image / animation / sound counts and the sheet dimensions) and writes the three
+outputs into ``. It's Node-only — it reads the `.acs` from disk and writes PNG/JSON/WAV files —
+so it's an ahead-of-time build step, not something you run in the browser.
+
+> You supply the `.acs` file. vivify never ships character files or engine binaries — they're
+> gitignored, and you bring your own. See **[Legal & assets](../legal-and-assets.md)**.
+
+## Hosting on a CDN
+
+Serve the `` as static files (any web host or CDN), then point the engine at the manifest with
+a `CharacterBundleRef`:
+
+```ts
+import { createAgent } from '@vivify/core';
+
+const agent = await createAgent(
+  { manifestUrl: 'https://cdn.example.com/characters/genie/manifest.json' },
+  document.getElementById('stage')!,
+);
+await agent.show();
+```
+
+`createAgent` accepts **either** a raw `.acs` `ArrayBuffer` **or** a `CharacterBundleRef` — the rest of
+the API is identical from there. The engine fetches the manifest, sprite sheet, and audio from the URL
+you gave it. Keep the `sheet.png` and `audio/` files next to the `manifest.json` (the manifest
+references them by relative name), and you're serving characters straight from the edge.
+
+## Raw `.acs` vs bundle — which to use
+
+| | Raw `.acs` `ArrayBuffer` | Prebuilt bundle (`CharacterBundleRef`) |
+| --- | --- | --- |
+| Setup | none — just fetch the file | one `acs2bundle` build step |
+| Client work | parses the binary in the browser | fetches ready-made static assets |
+| Best for | quick experiments, user-uploaded files | production, CDN delivery, many page loads |
+
+Both go through the exact same engine and the exact same `Agent` API — the only difference is where the
+parsing happens.
+
+## Where to next
+
+- **[API reference](api.md)** — `createAgent` and `CharacterBundleRef`.
+- **[Quickstart](quickstart.md)** — the raw-`.acs` path in a runnable snippet.
+- **[Architecture](../architecture.md)** — where the bundle sits in the data flow.
 
 ---
diff --git a/docs/developers/overview.md b/docs/developers/overview.md
index 56025bf..162c624 100644
--- a/docs/developers/overview.md
+++ b/docs/developers/overview.md
@@ -1,14 +1,152 @@
 # For developers — overview
 
-> 🚧 **Coming soon.** This page lands in **Cycle 16**. It's a placeholder for now, so links
-> pointing here already work — no dead ends.
+So you want to put a talking, gesturing Microsoft Agent character into your own app. Good news: that's
+exactly what vivify is built for. The character engine is a plain TypeScript library with **no
+framework dependency** — drop it into React, Vue, Svelte, or a single `