Testing

ICE has several categories of tests, from in-process unit tests to real-cloud round-trips. This page covers how to run each, what each protects, and when to write which.

Quick reference

pnpm test:unit          # Vitest unit tests - no DB, no network
pnpm test:int           # Integration tests - hits a local SQLite DB
pnpm test:e2e           # Playwright E2E against a running web app
pnpm test:gcp           # GCP integration tests - requires real GCP creds
pnpm test:scenarios     # Declarative-YAML deployment scenarios - requires real GCP creds
pnpm test:dashboard     # Interactive GCP test dashboard (http://localhost:15200)
pnpm verify:sdk         # L1-L4 SDK verifier - static analysis, no creds
pnpm test:live:<p> <s>  # Per-handler real-cloud round-trip - requires <p> creds
pnpm typecheck          # TypeScript across all packages
pnpm lint:check         # ESLint, errors only
pnpm format:check       # Prettier

CI runs typecheck, lint:check, format:check, test:unit, and a web build on every PR. E2E runs separately against a Postgres+Redis service container - see .github/workflows/e2e.yml.

Unit tests (Vitest)

Fast, in-process, no DB. They live next to the code they test, typically in a sibling __tests__/ directory, as *.test.ts / *.test.tsx.

pnpm test:unit                        # all
pnpm --filter @ice/core test          # one package
pnpm --filter @ice/ui test -- --watch # watch mode for one package

Write unit tests for: pure logic (graph algorithms, schema validation, translators, Redux slice reducers, utility helpers).

Don't write unit tests for: anything that needs a real DB or a real cloud API - use integration or E2E.

Key suites worth reading to learn the conventions:

packages/core/src/__tests__/card-translator.test.ts - translation table tests.
packages/ui/src/store/slices/__tests__/cards-slice-group.test.ts - slice-level behaviour.
packages/ui/src/config/__tests__/containment-rules.test.ts - block containment rules.

Integration tests

Integration tests use a real Prisma SQLite DB but still run in-process. They're named *.int.test.ts so they're excluded from the default unit run.
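The suffix conventions on this page (*.test.ts for unit tests, *.int.test.ts for integration tests, and *.live.test.ts for the live tests described later) can be summarised as a small classifier. This is only an illustrative sketch: `classifyTestFile` and `TestKind` are hypothetical names, and the real include/exclude rules live in the Vitest config, not in a helper like this.

```typescript
// Hypothetical helper (not part of the repo): classify a test file by the
// suffix conventions described on this page.
type TestKind = "unit" | "integration" | "live" | "not-a-test";

function classifyTestFile(path: string): TestKind {
  // Check the most specific suffixes first, because "foo.int.test.ts"
  // also ends with ".test.ts".
  if (/\.live\.test\.tsx?$/.test(path)) return "live";
  if (/\.int\.test\.tsx?$/.test(path)) return "integration";
  if (/\.test\.tsx?$/.test(path)) return "unit";
  return "not-a-test";
}
```

The ordering matters: a plain "ends with .test.ts" check would also match integration and live files, which is why those suffixes have to be excluded from the default unit run.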

pnpm dev:setup        # create the dev DB once
pnpm test:int         # run integration tests

Write integration tests for: Prisma queries, multi-record invariants, RBAC-adjacent code (org isolation), anything that depends on DB transactions.

Example suites:

services/canvas/src/__tests__/org-isolation.int.test.ts - verifies that projects are scoped to their org.
services/canvas/src/__tests__/rbac.int.test.ts - role-based access checks (mostly skipped today because Community Edition is single-user; see architecture overview).

End-to-end tests (Playwright)

pnpm test:e2e               # headless against running app
pnpm test:e2e -- --headed   # see the browser

E2E tests expect a running gateway + web app. In CI, the workflow boots Postgres + Redis + gateway service containers before running the suite (.github/workflows/e2e.yml). Locally, run pnpm dev:all in another terminal first.

Write E2E tests for: full-stack flows the user cares about - "drag block, configure, save, reload, still there." Keep them few and precious.

Suites live under e2e/ in the repo root.

GCP integration tests

These are the opt-in heavy tests - they spin up real GCP resources and tear them down, template by template.

pnpm test:gcp                    # headless CLI
pnpm test:dashboard              # interactive dashboard UI at :15200

The dashboard (e2e/dashboard/server.ts) provides: template checkboxes, GCP/GitHub configuration, test-repo creation, run/stop controls, live progress, and HTML report generation.

Requirements:

GCP_SERVICE_ACCOUNT_JSON (or a credential file path) - with full admin on a disposable project.
A GitHub personal access token if testing templates that expect a connected repo.
A project budget ceiling - the tests spin up real infra.

Env vars (contributor-only, set in your shell or .env):

| Variable | Required | Default | Notes |
| --- | --- | --- | --- |
| `ICE_TEST_GCP_PROJECT` | yes | - | Disposable GCP project ID the tests provision in |
| `ICE_TEST_SA_KEY_PATH` | yes | - | Absolute path to a service-account JSON key with project admin |
| `ICE_TEST_GCP_REGION` | no | `us-central1` | Region for regional resources |
| `ICE_TEST_GITHUB_TOKEN` | scenario-dependent | - | PAT with `repo` scope, for scenarios that pull a source repo |
| `ICE_TEST_DOMAIN` | scenario-dependent | - | Apex domain used by the static-site-with-domain scenario |

These never get baked into the app - end users running ICE don't need them. They're read by the Playwright runner only.
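As a rough illustration of how the variables above fit together, the sketch below validates the two required ones and applies the documented default region. `loadGcpTestEnv` and `GcpTestEnv` are hypothetical names invented for this example; the actual runner may read the variables differently.

```typescript
// Shape of the contributor-only settings listed in the env var table.
interface GcpTestEnv {
  project: string;
  saKeyPath: string;
  region: string;
  githubToken?: string; // only needed by scenarios that pull a source repo
  domain?: string;      // only needed by the static-site-with-domain scenario
}

// Hypothetical loader (an assumption, not the repo's code): fails fast on
// missing required variables and falls back to the documented default region.
function loadGcpTestEnv(env: Record<string, string | undefined>): GcpTestEnv {
  const required = ["ICE_TEST_GCP_PROJECT", "ICE_TEST_SA_KEY_PATH"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required env vars: ${missing.join(", ")}`);
  }
  return {
    project: env["ICE_TEST_GCP_PROJECT"] as string,
    saKeyPath: env["ICE_TEST_SA_KEY_PATH"] as string,
    region: env["ICE_TEST_GCP_REGION"] || "us-central1",
    githubToken: env["ICE_TEST_GITHUB_TOKEN"],
    domain: env["ICE_TEST_DOMAIN"],
  };
}
```

In a Node process this would be called as `loadGcpTestEnv(process.env)`; taking the environment as a parameter keeps the function easy to unit test.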

What it protects: the entire template library. If "SaaS Starter" stops deploying on GCP, this is the suite that catches it. Running it per-commit would be expensive; it's run on demand and before tagged releases.

Deployment-test scenarios

Complementary to the template suite above. Where test:gcp exercises pre-built templates end-to-end, test:scenarios builds projects from scratch - described in YAML, placed block-by-block via the UI, with per-step JSONL logging and recipe-based recovery for known errors.

pnpm test:scenarios                                    # all scenarios
ICE_SCENARIO_ID=static-site pnpm test:scenarios        # filter by id substring

Credentials (ICE_TEST_GCP_PROJECT, ICE_TEST_SA_KEY_PATH, optional ICE_TEST_GITHUB_TOKEN, ICE_TEST_DOMAIN) are read from the repo-root .env automatically. See the GCP integration tests section above for the canonical list; same vars work for test:gcp and test:scenarios.

Scenarios live in e2e/deployment-tests/scenarios/ as YAML files. Each run writes to test-results/runs/<ts>-<scenarioId>/ with events.jsonl, summary.json, description.md, screenshots, and a self-contained index.html timeline.
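When inspecting a run by hand, events.jsonl can be read one JSON object per line. The sketch below assumes nothing about the event fields (the log schema is documented in e2e/deployment-tests/README.md); `parseJsonl` and `JsonlEvent` are hypothetical names used only for illustration.

```typescript
// One parsed line of events.jsonl. This sketch only assumes that each
// non-empty line is a JSON object; it does not model the real event schema.
type JsonlEvent = Record<string, unknown>;

function parseJsonl(text: string): JsonlEvent[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0) // tolerate a trailing newline or blank lines
    .map((line, index) => {
      try {
        return JSON.parse(line) as JsonlEvent;
      } catch {
        // index counts non-empty lines only, starting from 0
        throw new Error(`Invalid JSON on non-empty line ${index + 1}`);
      }
    });
}
```

Feeding it the contents of test-results/runs/<ts>-<scenarioId>/events.jsonl would yield an array that can be filtered or grouped per step.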

Write a scenario when: you want to lock in a specific multi-block configuration end-to-end (e.g. "static site + custom domain on GCP must produce a forwarding rule"), or you want to reproduce a deployment-level bug as a regression test.

Don't use this for: template smoke tests (use test:gcp), unit-level logic (use test:unit).

Full reference - env vars, YAML schema, recipe model, log schema, troubleshooting - is in e2e/deployment-tests/README.md.

Typecheck

pnpm typecheck

Runs tsc --noEmit in every workspace package in dependency order. CI treats any TypeScript error as a blocker. If packages/core fails, downstream packages won't be typechecked - fix the root cause, then re-run.

Lint and format

pnpm lint:check          # errors block; warnings allowed
pnpm lint                # auto-fix what it can
pnpm format:check        # prettier, block on mismatch
pnpm format              # write prettier formatting

ESLint config: eslint.config.js. Prettier config: picked up from .prettierrc / defaults.

Dependency audit

pnpm audit --prod --audit-level=high

Runs in CI with || true so it doesn't block - security advisories surface but don't fail the build. Triage them via the roadmap.

Writing style: what a good test looks like

Name it after the behaviour, not the method. it('rejects a canvas with a cycle') beats it('validate()').
Arrange, Act, Assert - one flow per test.
Fail loudly on the happy path's first surprise. Don't paper over with expect.anything().
No snapshots for large blobs - they drift and nobody reviews the diffs.
Put the setup that matters in the test. Long beforeEach pyramids hide intent.
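
The guidelines above might look like this in practice. The sketch is dependency-free so it runs on its own: `hasCycle` and the `Canvas` adjacency-list type are hypothetical stand-ins for real ICE code, and in a Vitest file the test body would sit inside it('rejects a canvas with a cycle', ...) with an exact expect(...) in place of the plain check.

```typescript
// Hypothetical pure function under test: a canvas modelled as an adjacency list.
type Canvas = Record<string, string[]>;

function hasCycle(canvas: Canvas): boolean {
  const visiting = new Set<string>(); // nodes on the current DFS path
  const done = new Set<string>();     // nodes already fully explored

  const visit = (node: string): boolean => {
    if (visiting.has(node)) return true; // back edge: the path loops onto itself
    if (done.has(node)) return false;
    visiting.add(node);
    for (const next of canvas[node] ?? []) {
      if (visit(next)) return true;
    }
    visiting.delete(node);
    done.add(node);
    return false;
  };

  return Object.keys(canvas).some((node) => visit(node));
}

// Arrange: the setup that matters is visible in the test itself.
const cyclic: Canvas = { a: ["b"], b: ["c"], c: ["a"] };
// Act: one flow per test.
const rejected = hasCycle(cyclic);
// Assert: one exact expectation, no wildcard matcher.
if (rejected !== true) throw new Error("expected the cyclic canvas to be rejected");
```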

Frontend component tests

300+ .test.tsx files live under packages/ui/src/**/__tests__/. The dominant pattern is a hand-rolled "tree-walker" - see packages/ui/src/shared/components/__tests__/app-bar.test.tsx for the canonical shape:

Unwrap React.memo via .type to invoke the inner FC directly.
Mock all sub-components and hooks via vi.hoisted({...}).
Render via React's TestRenderer rather than @testing-library/react.

This keeps tests fast and pure (no DOM, no act warnings) at the cost of more boilerplate. We've also got jsdom configured (package.json) and a smaller set of hook tests using a thin custom harness - see packages/ui/src/features/canvas/hooks/__tests__/use-canvas-drop.test.tsx. Both approaches are accepted; pick whichever fits the component under test.

SDK verifier (L1-L4) and live tests

The cardinal rule for any deployer change: a handler is only "done" once a successful real-cloud deploy round-trip is observed. Static checks make sure the SDK call shapes are correct before a real call is made; live tests prove the round-trip.

Static SDK verification

pnpm verify:sdk                  # all providers, all layers
node scripts/verify-sdk-coverage.mjs --provider aws    # L1-L3 only
node scripts/verify-sdk-commands.mjs --provider azure  # L4 only

| Layer | What it checks |
| --- | --- |
| L1 | Every handler is reachable from the provider's `HANDLER_REGISTRY` |
| L2 | Every handler imports a real SDK client (no stub clients) |
| L3 | Every block category mapped to the provider has at least one handler |
| L4 | Each handler's create / update / delete input bodies match the SDK's expected command-input shape - catches PascalCase / camelCase drift, path-param vs body-param mistakes, missing required fields, union-type variants |

The verifiers run in CI (scripts/verify-sdk-*.mjs) and exit non-zero on uncovered handlers or SDK-shape mismatches. They run against all eight providers: GCP, AWS, Azure, Alibaba, OCI, DigitalOcean, IBM, and Kubernetes. Five of the eight providers report 100% coverage; the open gaps are tracked in the verifier output.
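
To make the L4 idea concrete, here is a small runtime analogue of two of the mismatches it catches: casing drift and missing required fields. The real verifier (scripts/verify-sdk-commands.mjs) is a static analysis and is not reproduced here; `checkInputShape`, `CommandShape`, and `ShapeIssue` are hypothetical names for this sketch only.

```typescript
// Hypothetical description of an SDK command input: exact field names,
// split into required and optional.
interface CommandShape {
  required: string[];
  optional: string[];
}

interface ShapeIssue {
  kind: "missing-required" | "casing-drift" | "unknown-field";
  field: string;
  expected?: string; // set for casing-drift: the name the SDK expects
}

function checkInputShape(body: Record<string, unknown>, shape: CommandShape): ShapeIssue[] {
  const known = [...shape.required, ...shape.optional];
  const byLower = new Map<string, string>();
  for (const name of known) byLower.set(name.toLowerCase(), name);

  const issues: ShapeIssue[] = [];
  const drifted = new Set<string>(); // expected names already reported as drift

  for (const field of Object.keys(body)) {
    if (known.includes(field)) continue;
    const expected = byLower.get(field.toLowerCase());
    if (expected !== undefined) {
      // Same letters, different casing, e.g. "bucket" where "Bucket" is expected.
      issues.push({ kind: "casing-drift", field, expected });
      drifted.add(expected);
    } else {
      issues.push({ kind: "unknown-field", field });
    }
  }
  for (const field of shape.required) {
    // Avoid double-reporting a field that is present under the wrong casing.
    if (!(field in body) && !drifted.has(field)) {
      issues.push({ kind: "missing-required", field });
    }
  }
  return issues;
}
```

For example, checking `{ bucket: "x" }` against a shape that requires `Bucket` reports a single casing-drift issue instead of a separate missing-field error.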

Live tests (per-handler real-cloud round-trip)

Every handler under packages/core/src/deploy/providers/<p>/handlers/ has a paired live test at:

packages/core/src/deploy/providers/__tests__/live/<p>-<service>.live.test.ts

These are excluded from pnpm test:unit (the Vitest config lists '**/*.live.test.{ts,tsx}' in its exclude patterns). Run them explicitly:

pnpm test:live:aws s3               # single service
pnpm test:live:aws s3 sqs lambda    # multiple services
pnpm test:live:azure storage-account
pnpm test:live:alibaba              # all alibaba handlers
pnpm test:live:oci
pnpm test:live:digitalocean
pnpm test:live:ibm
pnpm test:live:kubernetes

Without credentials, every test in the suite skips with a one-line banner explaining which env vars to export. With credentials, each test runs a create + delete round-trip against your own account, tagging every resource with ice:test-run-id=<runId> and appending JSONL events to e2e/<provider>-deployment-tests/runs/<runId>.jsonl.
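
The skip-or-run behaviour and the tagging described above could be sketched roughly as follows. All helper names (`liveTestGate`, `runTag`, `toJsonlLine`) are hypothetical, the list of required variables is supplied by the caller because it differs per provider, and the event fields are illustrative and not the real log schema.

```typescript
// Hypothetical gate: decide whether a live suite should skip, and with what banner.
function liveTestGate(
  requiredVars: string[],
  env: Record<string, string | undefined>,
): { skip: boolean; banner?: string } {
  const missing = requiredVars.filter((name) => !env[name]);
  if (missing.length === 0) return { skip: false };
  return {
    skip: true,
    banner: `Skipping live tests: export ${missing.join(", ")} to run them.`,
  };
}

// The tag stated in the text: ice:test-run-id=<runId>, applied to every resource.
function runTag(runId: string): { key: string; value: string } {
  return { key: "ice:test-run-id", value: runId };
}

// One JSON object per line, ready to append to runs/<runId>.jsonl.
function toJsonlLine(event: Record<string, unknown>): string {
  return JSON.stringify(event) + "\n";
}
```

Tagging every resource with the run id is what makes later cleanup possible: anything carrying the tag was created by a test run and can be swept safely.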

Cleanup of leaked resources - if a test crashes mid-run, run:

tsx e2e/<provider>-deployment-tests/cleanup-orphans.ts

The cleanup script sweeps for resources tagged ice:test-run-id=* that are older than ORPHAN_AGE_HOURS (currently 24) and deletes them. AWS and Azure share the same tag key and age threshold (asserted by cleanup-orphans.test.ts).
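
The selection rule can be expressed as a pure filter. This is a sketch under assumptions, not the contents of cleanup-orphans.ts: `TaggedResource` and `findOrphans` are hypothetical, and a real script would list resources through each provider's SDK before applying a rule like this.

```typescript
// Hypothetical, provider-neutral view of a cloud resource.
interface TaggedResource {
  id: string;
  tags: Record<string, string>;
  createdAt: Date;
}

const TAG_KEY = "ice:test-run-id";
const ORPHAN_AGE_HOURS = 24; // value stated in the text above

// Select resources that carry the test-run tag and are older than the threshold.
function findOrphans(resources: TaggedResource[], now: Date): TaggedResource[] {
  const cutoffMs = now.getTime() - ORPHAN_AGE_HOURS * 60 * 60 * 1000;
  return resources.filter(
    (r) => TAG_KEY in r.tags && r.createdAt.getTime() < cutoffMs,
  );
}
```

The age threshold keeps the sweep from deleting resources that belong to a run still in progress; untagged resources are never selected.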

These do not run in CI - they touch real cloud, cost real money, and need provider-specific credentials. They're a developer self-serve tool. The deploy gate is satisfied when a developer or contributor runs the live test on their own account and pastes the JSONL summary into the PR.

Known gaps

No AWS / Azure / preview-tier integration tests analogous to the GCP dashboard - live tests cover round-trips but not full scenarios. See ROADMAP.md.
E2E coverage of the AI chat flow is thin - the SSE stream is mocked in tests.
