116 changes: 116 additions & 0 deletions docs/cycles/cycle-17-developer-page.md
@@ -0,0 +1,116 @@
# Cycle 17 — developer documentation pages

## Goal
Fill the five `docs/developers/*` stubs at their **canonical paths** (no renames) with accurate,
verified content so a competent developer can understand the architecture, embed vivify in their own
app, and contribute. The README (the "For developers" line) and the docs landing
(`docs/README.md`) already link to all five, so the links resolve today but lead to "🚧 Coming soon"
placeholders. This cycle makes them real.

The **audience flips** here. Every other docs page targets the zero-assumptions nostalgia visitor;
these target a developer. Per the parked `vision-and-docs-spec.md`, the tone stays warm,
second-person, present-tense, and quietly funny — but technical density is fine, and any nostalgia
wink lives only in a clearly-set-off aside that's deletable without losing instruction. The spec's
`developers/*` map is the page contract:

- `overview.md` — the openness pitch: framework-agnostic, embed in anything (the heavy lift).
- `quickstart.md` — install → ~10-line embed, copy-paste, browser fallback voice by default.
- `api.md` — engine API reference.
- `providers.md` — writing a custom `TtsProvider`; fallback vs authentic.
- `bundles.md` — `acs2bundle`; hosting characters on a CDN.

**Docs only — no code; CI stays green.**

## Verified facts the pages use (no guessing — cross-checked against the repo)

**Packages / data flow** (link, don't duplicate, [`docs/architecture.md`](../architecture.md)):
- `@vivify/types` — shared contracts: `CharacterModel` (the superset IR), `TtsProvider`, `TtsResult`,
`MouthEvent` (`packages/types/src/index.ts`).
- `@vivify/core` — framework-agnostic engine; load → render/composite → animate → speak.
- `@vivify/acs` — `.acs` parser + `acs2bundle` CLI; same module in Node and the browser.
- `@vivify/voice-truvoice` — `TruVoiceProvider` (authentic) + `WebSpeechProvider` (browser fallback).
- `services/voice-server` (`@vivify/voice-server`) — Dockerized Wine + SAPI4 + TruVoice; `POST /tts`.
- `apps/mash` (package name `mash`) — the showcase/dogfood, built only on the public API.

**Public API** (`@vivify/core`, verified — `packages/core/src/{agent,types,index}.ts`):
- `createAgent(source: ArrayBuffer | CharacterBundleRef, mount?, opts?): Promise<Agent>`
- `createAgentFromModel(model: CharacterModel, mount?, opts?): Agent`
- `CreateAgentOptions { clock?, provider?, rng?, audio? }` — **`provider` defaults to the silent
`StubTtsProvider`.**
- `Agent { show(); hide(); play(name); animations(): string[]; speak(text, opts?); moveTo(x, y, opts?);
gestureAt(x, y); stopCurrent(); stop(); on(event, handler); dispose() }` — every action enqueues
and runs in order.
- `SpeakOptions { hold?, provider? }`, `MoveOptions { speed? }`, `AgentEvent` union,
`CharacterBundleRef { manifestUrl }`. Also exported: `ActionQueue`, `Playback`, `WebAudioSink`,
`StubTtsProvider`, lip-sync helpers; the `@vivify/types` contracts are re-exported.

**TtsProvider seam** (`packages/types/src/index.ts`):
- `interface TtsProvider { speak(text, voice: VoiceConfig, signal?: AbortSignal): Promise<TtsResult> }`
- `TtsResult { audio: ArrayBuffer; mouthTimeline: MouthEvent[] }`
- Implementations: `StubTtsProvider` (silent default, in core), `WebSpeechProvider` (browser fallback,
in voice-truvoice), `TruVoiceProvider` (authentic, POSTs `${url}/tts`).

**Commands (only these exist — verified against `package.json`):**
- Node `>=20`, `pnpm@9.15.0`; workspaces `packages/*`, `services/*`, `apps/*`.
- Root: `pnpm -r typecheck`, `pnpm -r test`, `pnpm lint` (`eslint .`), `pnpm format`
(`prettier --check .`), `pnpm format:write`.
- MASH dev: `pnpm --filter mash dev` (Vite).
- `acs2bundle`: bin in `@vivify/acs` → `pnpm --filter @vivify/acs exec acs2bundle <input.acs> <outDir>`
(usage string in `cli.ts`: `acs2bundle <input.acs> <outDir>`).
- Voice server (Node): `pnpm --filter @vivify/voice-server build && … start`. Docker: `docker compose
up` (compose service `voice` :8080, `mash` :8090).

## Honesty call — packages are not published to npm (handled, not papered over)
Every `@vivify/*` package is `"private": true` at `"version": "0.0.0"` — **none are on npm**. The
spec's "`npm i` → talking character" is the destination, not today's reality. Per
[ADR-0025](../decisions/0025-repo-front-door.md) ("ship only what's real, signpost the rest, no
polish-theater") and the CLAUDE.md honesty rule, `quickstart.md`:
- leads with the **working** path today — clone, `pnpm install`, use the workspace packages / run MASH
(`apps/mash/src/app.ts` is itself a live, ~10-line use of the public API);
- shows the canonical embed snippet (real, correct API) as the shape consumers will use;
- frames the `npm i …` line as **"once published,"** not a command that works today.

The other four pages have no such gap — the API, types, CLI, and providers all exist and are verified.

## Pages written
- `docs/developers/overview.md` (heavy lift): what it is architecturally (packages table + data-flow,
linking `architecture.md`); the runs-anywhere pitch (framework-agnostic core, the `TtsProvider`
seam, browser voice needs nothing) with the real public-API list; get-running-fast from source; how
to extend; contributing/conventions (summarized from CLAUDE.md). One set-off "screenshots coming"
note; nav footer.
- `docs/developers/quickstart.md`: honest install path → the ~10-line embed using
`WebSpeechProvider` for an audible zero-backend default (with the silent-default caveat explained).
- `docs/developers/api.md`: the verified surface, signatures copied from source; action-queue
semantics (`stopCurrent` vs `stop`).
- `docs/developers/providers.md`: the `TtsProvider` interface + `TtsResult`/`MouthEvent`; the three
implementations; how to write your own; fallback vs authentic.
- `docs/developers/bundles.md`: what a bundle is (`sheet.png` + `manifest.json` + `audio/`), the
`acs2bundle` invocation, raw-`.acs`-in-browser vs ahead-of-time bundle, hosting via
`CharacterBundleRef { manifestUrl }`.

**ADRs referenced:** 0001 (monorepo), 0003 (superset IR/bundle), 0004 (voice needs a backend), 0005
(pluggable provider), 0006 (MIT, no bundled IP), 0007 (framework-agnostic core), 0012 (core depends on
acs for the raw-`.acs` path), 0013 (MASH vanilla TS).

## Acceptance check
- All five `docs/developers/*` pages contain real content (no "🚧 Coming soon" placeholder) and read
in the developer voice.
- Every documented API signature matches `packages/core/src/{agent,types}.ts` /
`packages/types/src/index.ts`; every command exists in the relevant `package.json`.
- No `npm i` line is presented as working today; the not-yet-published status is stated plainly.
- Every relative link in the new pages resolves (architecture, ADRs, CLAUDE.md, sibling dev pages,
`voice/overview.md`, docs home, main README).

## Verification
- **CI (this repo):** `pnpm -r typecheck && pnpm -r test && pnpm lint && pnpm format` stays green.
Docs only, and Markdown is prettier-ignored, so the change touches no code path.
- **Reviewer (`code-reviewer`):** verifies every signature against source, every command against
`package.json`, every link resolves, and the not-published framing is honest — trusting nothing from
the draft.
- **Operator/PO:** read `docs/developers/overview.md` on GitHub for voice + correctness; follow
`quickstart.md` against a local clone.

## Non-goals
Screenshots / GIFs (next cycle — signposted, not faked). No code changes. No package publishing / no
flipping `private: false`. No edits to `architecture.md`, the ADRs, or `CLAUDE.md` (link only). No
merge — open a PR (base `main`) and stop.
147 changes: 141 additions & 6 deletions docs/developers/api.md
@@ -1,14 +1,149 @@
# Engine API reference

The public surface of `@vivify/core`. Signatures here are copied from the source
(`packages/core/src/agent.ts` and `types.ts`); if you find drift, the source wins — please flag it.

New here? The **[quickstart](quickstart.md)** shows this API in action in about ten lines.

## Creating an agent

```ts
function createAgent(
  source: ArrayBuffer | CharacterBundleRef,
  mount?: HTMLElement,
  opts?: CreateAgentOptions,
): Promise<Agent>;

function createAgentFromModel(
  model: CharacterModel,
  mount?: HTMLElement,
  opts?: CreateAgentOptions,
): Agent;
```

- **`createAgent`** is the usual entry point. `source` is either a raw `.acs` as an `ArrayBuffer` or a
`CharacterBundleRef` pointing at a prebuilt bundle (see **[bundles](bundles.md)**). It loads the
character, then mounts. Resolves once the character is ready.
- **`createAgentFromModel`** skips the loader when you already have a decoded `CharacterModel` (for
example, straight from `@vivify/acs`'s `parseAcs`). `createAgent` calls this for you after loading.
- **`mount`** is the host element the engine renders into. Omit it and the engine falls back to
`document.body`.

### `CreateAgentOptions`

```ts
interface CreateAgentOptions {
  clock?: Clock; // Injectable clock (default: real timers).
  provider?: TtsProvider; // Default TTS provider (default: silent StubTtsProvider).
  rng?: Rng; // RNG for branch/state selection (default: Math.random).
  audio?: AudioSink; // Audio playback sink (default: Web Audio; tests inject a fake).
}
```

The default `provider` is the **silent** `StubTtsProvider` — pass a real one (e.g.
`WebSpeechProvider` or `TruVoiceProvider`) for audible speech. See **[TTS providers](providers.md)**.
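
As a rough illustration of what a provider looks like from the engine's side, here is a minimal sketch. The `TtsProvider` and `TtsResult` shapes follow the signatures quoted in this PR; `text` is assumed to be a `string`, `VoiceConfig` and `MouthEvent` are left opaque because their fields are not shown here, and `RecordingProvider` is a hypothetical name. In a real project the contracts would be imported rather than redeclared.

```ts
// Local stand-ins so the sketch compiles on its own. In a real project, import
// these contracts instead of redeclaring them.
type VoiceConfig = unknown; // assumption: fields not shown in this document
type MouthEvent = unknown; // assumption: fields not shown in this document

interface TtsResult {
  audio: ArrayBuffer;
  mouthTimeline: MouthEvent[];
}

interface TtsProvider {
  speak(text: string, voice: VoiceConfig, signal?: AbortSignal): Promise<TtsResult>;
}

// Hypothetical provider: records what it was asked to say and returns empty
// audio with an empty mouth timeline, so it is still silent.
class RecordingProvider implements TtsProvider {
  readonly spoken: string[] = [];

  async speak(text: string, _voice: VoiceConfig, signal?: AbortSignal): Promise<TtsResult> {
    // Honor cancellation before doing any work.
    if (signal?.aborted) {
      throw new Error('speak() aborted before it started');
    }
    this.spoken.push(text);
    return { audio: new ArrayBuffer(0), mouthTimeline: [] };
  }
}
```

An instance like this could be passed as `provider` in `CreateAgentOptions`, for example to check in a test which lines were spoken. A provider meant for audible speech would return encoded audio and a populated mouth timeline instead.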

### `CharacterBundleRef`

```ts
interface CharacterBundleRef {
  manifestUrl: string; // URL of the bundle's manifest.json.
}
```

## The `Agent` control

Returned by `createAgent` / `createAgentFromModel`. It mirrors the classic Microsoft Agent control:
**every action enqueues and runs in order.**

```ts
interface Agent {
  show(): Promise<void>;
  hide(): Promise<void>;
  play(animationName: string): Promise<void>;
  animations(): string[]; // Names of the character's available animations.
  speak(text: string, opts?: SpeakOptions): Promise<void>;
  moveTo(x: number, y: number, opts?: MoveOptions): Promise<void>;
  gestureAt(x: number, y: number): Promise<void>;
  stopCurrent(): void; // Drop the currently-running queued action.
  stop(): void; // Clear the queue and return to idle.
  on(event: AgentEvent, handler: (...a: unknown[]) => void): void;
  dispose(): void;
}
```

### The action queue

The methods that return `Promise<void>` (`show`, `hide`, `play`, `speak`, `moveTo`, `gestureAt`)
**enqueue** work. Queued actions run strictly in call order, so you can fire a sequence without
chaining promises yourself:

```ts
agent.play('Wave'); // runs first
agent.speak('Hi there!'); // then this
agent.moveTo(300, 200); // then this
```

`await` a returned promise when you need to know a specific action finished. Two ways to interrupt:

- **`stopCurrent()`** drops only the action running right now; the rest of the queue continues.
- **`stop()`** clears the whole queue and returns the character to idle.
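
The difference between the two interrupts can be modelled with a small toy queue. This is only a sketch of the semantics described above, not the engine's exported `ActionQueue`; `ToyQueue` and its methods are hypothetical, and actions are plain labels that "finish" when `complete()` is called.

```ts
// Toy model of the queue semantics described above. This is NOT the engine's
// exported ActionQueue; names and structure here are illustrative only.
class ToyQueue {
  private pending: string[] = [];
  private current: string | undefined;
  readonly finished: string[] = [];

  // Add an action; it starts immediately only if nothing is running.
  enqueue(name: string): void {
    this.pending.push(name);
    if (this.current === undefined) this.advance();
  }

  // Simulate the running action completing normally.
  complete(): void {
    if (this.current !== undefined) this.finished.push(this.current);
    this.advance();
  }

  // Drop only the running action; the next queued action starts.
  stopCurrent(): void {
    this.advance();
  }

  // Clear everything and go idle.
  stop(): void {
    this.pending = [];
    this.current = undefined;
  }

  running(): string | undefined {
    return this.current;
  }

  private advance(): void {
    this.current = this.pending.shift();
  }
}

const q = new ToyQueue();
q.enqueue('play Wave');
q.enqueue('speak Hi there!');
q.enqueue('moveTo 300,200');
q.stopCurrent(); // 'play Wave' is dropped; 'speak Hi there!' is now running
q.complete(); // 'speak Hi there!' finishes; 'moveTo 300,200' is now running
q.stop(); // queue cleared, nothing running
```

In this model only the `speak` action ever reaches `finished`: the wave was dropped by `stopCurrent()`, and the move was discarded by `stop()`.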

`animations()` returns the available animation names (synchronously) — handy for building a UI or for
validating a name before you `play` it. `dispose()` tears the agent down and releases its host
element; call it when you're done (e.g. a React effect cleanup).
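
One way to validate a name before playing it is a small lookup helper. The sketch below is self-contained: `AnimationSource` is a local structural slice of `Agent`, and `pickAnimation` is a hypothetical helper, not part of `@vivify/core`. It uses exact string matching, since the engine's own name-matching rules are not described here.

```ts
// Structural slice of the Agent interface: only what this helper needs.
interface AnimationSource {
  animations(): string[];
}

// Hypothetical helper: return the first candidate the character actually has,
// or undefined when none of them exist. Matching is exact (an assumption).
function pickAnimation(agent: AnimationSource, ...candidates: string[]): string | undefined {
  const available = agent.animations();
  for (const candidate of candidates) {
    if (available.includes(candidate)) return candidate;
  }
  return undefined;
}
```

With a real agent, the result could be passed to `play` when it is defined, e.g. `const name = pickAnimation(agent, 'Greet', 'Wave'); if (name) agent.play(name);`.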

### `SpeakOptions`

```ts
interface SpeakOptions {
  hold?: boolean; // Keep the balloon up after speaking instead of auto-hiding.
  provider?: TtsProvider; // Override the TTS provider for this one utterance.
}
```

`provider` here overrides the agent's default provider for a single `speak` call — useful to mix
silent and authentic speech, or to point one line at a different voice.

### `MoveOptions`

```ts
interface MoveOptions {
  speed?: number; // Movement speed (pixels/second); the engine picks a default if omitted.
}
```

### Events

```ts
type AgentEvent =
  | 'show'
  | 'hide'
  | 'play'
  | 'speak'
  | 'move'
  | 'gesture'
  | 'idle'
  | 'command'
  | 'error';
```

Register handlers with `agent.on(event, handler)` to react as actions run (for example, re-enabling a
button on `'idle'`, or surfacing an `'error'`).
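
The button case might look roughly like the sketch below. It assumes that the action events fire when an action starts and that `'idle'` fires once the queue has drained; that timing is an assumption, not something this page specifies. `EventfulAgent` is a local structural slice of `Agent`, and `trackBusy` is a hypothetical helper.

```ts
// Local copy of the event union from this page, plus the slice of Agent used here.
type AgentEvent =
  | 'show'
  | 'hide'
  | 'play'
  | 'speak'
  | 'move'
  | 'gesture'
  | 'idle'
  | 'command'
  | 'error';

interface EventfulAgent {
  on(event: AgentEvent, handler: (...a: unknown[]) => void): void;
}

// Hypothetical helper: reports busy=true when any action event fires and
// busy=false on 'idle'. Event timing is assumed, as noted above.
function trackBusy(agent: EventfulAgent, onChange: (busy: boolean) => void): void {
  const actionEvents: AgentEvent[] = ['show', 'hide', 'play', 'speak', 'move', 'gesture'];
  for (const event of actionEvents) {
    agent.on(event, () => onChange(true));
  }
  agent.on('idle', () => onChange(false));
}
```

In a page, `onChange` could simply set a button's `disabled` property. Errors are left out on purpose here; whether `'idle'` follows an `'error'` is not stated, so a real integration would want its own `'error'` handler.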

## Also exported

`@vivify/core` re-exports the shared contracts from `@vivify/types` for convenience
(`CharacterModel`, `VoiceConfig`, `TtsProvider`, `TtsResult`, `MouthEvent`, and the model types), and
exposes the engine building blocks for advanced use: `StubTtsProvider`, `ActionQueue`, `Playback`,
`WebAudioSink`, and the lip-sync helpers. The everyday path needs none of these — `createAgent` and the
`Agent` methods above are the whole story.

## Where to next

- **[TTS providers](providers.md)** — the `TtsProvider` seam in detail.
- **[Character bundles](bundles.md)** — what a `CharacterBundleRef` points at, and how to build one.
- **[Quickstart](quickstart.md)** — the API in a runnable snippet.

---

85 changes: 79 additions & 6 deletions docs/developers/bundles.md
@@ -1,14 +1,87 @@
# Character bundles

A character can run two ways. The browser path is the simplest: hand `@vivify/core` a raw `.acs` as an
`ArrayBuffer` and it parses and plays it in memory ([ADR-0012](../decisions/0012-core-depends-on-acs.md)).
But parsing a `.acs` on every page load means shipping the whole binary and decoding it client-side. A
**bundle** is the ahead-of-time alternative: convert once, serve static web assets, load fast — and
host them on a CDN.

## What a bundle is

Run the `acs2bundle` CLI (from `@vivify/acs`) and you get a folder with up to three kinds of output:

```
<outDir>/
  sheet.png       a packed, transparent sprite-sheet of every unique image in the character
  manifest.json   the full CharacterModel minus the pixels and WAVs (zod-validated)
  audio/000.wav   each embedded sound, extracted (only present if the character has sounds)
  audio/001.wav
  ...
```

The `manifest.json` is the serialized superset IR — animations, frame branching, balloon and voice
config, the mouth-overlay data, the state map, plus an atlas mapping each sprite to its rectangle in
`sheet.png`. It's the same `CharacterModel` the in-browser parser produces, just split from its binary
assets so the browser can fetch them as ordinary files. (Why a superset and not a clippy-compatible
format? [ADR-0003](../decisions/0003-superset-bundle-format.md).)

## Converting a `.acs`

`acs2bundle` is the CLI shipped by `@vivify/acs`. From the repo:

```bash
pnpm --filter @vivify/acs exec acs2bundle <input.acs> <outDir>
```

For example:

```bash
pnpm --filter @vivify/acs exec acs2bundle ./Genie.acs ./public/characters/genie
```

It prints a summary (image / animation / sound counts and the sheet dimensions) and writes the three
outputs into `<outDir>`. It's Node-only — it reads the `.acs` from disk and writes PNG/JSON/WAV files —
so it's an ahead-of-time build step, not something you run in the browser.

> You supply the `.acs` file. vivify never ships character files or engine binaries — they're
> gitignored, and you bring your own. See **[Legal & assets](../legal-and-assets.md)**.

## Hosting on a CDN

Serve the `<outDir>` as static files (any web host or CDN), then point the engine at the manifest with
a `CharacterBundleRef`:

```ts
import { createAgent } from '@vivify/core';

const agent = await createAgent(
  { manifestUrl: 'https://cdn.example.com/characters/genie/manifest.json' },
  document.getElementById('stage')!,
);
await agent.show();
```

`createAgent` accepts **either** a raw `.acs` `ArrayBuffer` **or** a `CharacterBundleRef`; the rest of
the API is identical from there. Given a `CharacterBundleRef`, the engine fetches the manifest from that
URL, plus the sprite sheet and audio it references. The manifest references those files by relative
name, so keep `sheet.png` and `audio/` next to `manifest.json` and you're serving characters straight
from the edge.
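
To see what "relative name" means in practice, the sketch below resolves asset names against a manifest URL with the standard `URL` constructor. `resolveBundleAsset` is a hypothetical helper (useful, say, for preloading or for checking a CDN layout); how the engine resolves assets internally is not shown in this document.

```ts
// Resolve a bundle asset relative to its manifest URL using standard URL
// resolution. Hypothetical helper; not part of @vivify/core.
function resolveBundleAsset(manifestUrl: string, relativeName: string): string {
  return new URL(relativeName, manifestUrl).href;
}

const manifestUrl = 'https://cdn.example.com/characters/genie/manifest.json';
const sheetUrl = resolveBundleAsset(manifestUrl, 'sheet.png');
// sheetUrl is 'https://cdn.example.com/characters/genie/sheet.png'
const firstSoundUrl = resolveBundleAsset(manifestUrl, 'audio/000.wav');
// firstSoundUrl is 'https://cdn.example.com/characters/genie/audio/000.wav'
```

If the files were moved into a different folder from the manifest, these resolved URLs would no longer point at them, which is why the layout written by `acs2bundle` should be uploaded as-is.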

## Raw `.acs` vs bundle — which to use

| | Raw `.acs` `ArrayBuffer` | Prebuilt bundle (`CharacterBundleRef`) |
| --- | --- | --- |
| Setup | none — just fetch the file | one `acs2bundle` build step |
| Client work | parses the binary in the browser | fetches ready-made static assets |
| Best for | quick experiments, user-uploaded files | production, CDN delivery, many page loads |

Both go through the exact same engine and the exact same `Agent` API — the only difference is where the
parsing happens.
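
Because both inputs share one parameter, application code sometimes needs to tell them apart, for logging or caching for instance. A minimal sketch, assuming only the `CharacterBundleRef` shape from the API reference; the `CharacterSource` alias, `isBundleRef`, and `describeSource` are hypothetical names introduced here.

```ts
// Local copy of the reference type documented in the API reference.
interface CharacterBundleRef {
  manifestUrl: string;
}

// Hypothetical alias for the union that createAgent accepts as `source`.
type CharacterSource = ArrayBuffer | CharacterBundleRef;

// Type guard: a bundle ref is the member of the union that carries manifestUrl.
function isBundleRef(source: CharacterSource): source is CharacterBundleRef {
  return 'manifestUrl' in source;
}

// Produce a short human-readable description of either kind of source.
function describeSource(source: CharacterSource): string {
  return isBundleRef(source)
    ? `bundle at ${source.manifestUrl}`
    : `raw .acs, ${source.byteLength} bytes`;
}
```

For example, `describeSource({ manifestUrl: '/characters/genie/manifest.json' })` yields a string naming the manifest, while an `ArrayBuffer` yields its byte length.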

## Where to next

- **[API reference](api.md)** — `createAgent` and `CharacterBundleRef`.
- **[Quickstart](quickstart.md)** — the raw-`.acs` path in a runnable snippet.
- **[Architecture](../architecture.md)** — where the bundle sits in the data flow.

---
