diff --git a/skills/argos-cli/SKILL.md b/skills/argos-cli/SKILL.md index e8be1dca..33cf7b24 100644 --- a/skills/argos-cli/SKILL.md +++ b/skills/argos-cli/SKILL.md @@ -2,9 +2,9 @@ name: argos-cli description: > Operate the Argos visual regression platform from the terminal — fetch build - metadata, review snapshot diffs, upload screenshots, and manage CI builds via - the `argos` CLI. Use when the user wants to run Argos commands in the shell, - scripts, or CI/CD pipelines, or when reviewing visual regression builds. + metadata, inspect snapshot diffs, upload screenshots, and manage CI builds + via the `argos` CLI. Use when the user wants to run Argos commands in the + shell, scripts, or CI/CD pipelines. Always load this skill before running `argos` commands — it contains the flag contract and output shapes that prevent silent failures. license: MIT @@ -19,14 +19,8 @@ argument-hint: Requires `ARGOS_TOKEN` or an explicit `--token` when running auth ## Agent Protocol -The CLI writes errors to stderr. Some commands support both human-readable text -and JSON output. - -**Rules for agents:** - - Supply `--token` or set `ARGOS_TOKEN`. The CLI exits with code 1 if no token is found. -- Exit `0` = success, `1` = error. -- All errors go to stderr: `Error: ` +- Exit `0` = success, `1` = error. All errors go to stderr: `Error: ` - Use `--json` whenever stdout will be parsed by a script or agent. ## Authentication @@ -43,19 +37,41 @@ Auth resolves: `--token` flag > `ARGOS_TOKEN` env var. | `finalize` | Finalize a parallel build | | `skip` | Mark a build as skipped | -Read the matching reference file for detailed flags and output shapes. +## Build Status Reference -## Common Patterns +| Status | Terminal? | Meaning | +| ------------------ | --------- | ----------------------------------- | +| `no-changes` | ✅ | All snapshots identical to baseline | +| `accepted` | ✅ | All changes approved | +| `rejected` | ✅ | Changes explicitly rejected | +| `changes-detected` | ✅ | Diffs found, awaiting decision | +| `error` | ✅ | Build processing failed | +| `aborted` | ✅ | Build was cancelled | +| `expired` | ✅ | Build expired before completion | +| `pending` | ⏳ | Waiting for screenshots to arrive | +| `progress` | ⏳ | Screenshots are being compared | -**Review a build (fetch metadata first, then diffs that need review):** +Only `no-changes` and `accepted` are all-clear terminal states. For review +decision-making, keep the PR-review triage logic in the dedicated Argos review +skill rather than here. + +## Snapshot Inspection + +Use the CLI to fetch the build and the raw diffs that need attention: ```bash -argos build get 72652 -argos build snapshots 72652 --needs-review -argos build get 72652 --json -argos build snapshots 72652 --needs-review --json +argos build get --json +argos build snapshots --needs-review --json ``` +This skill is the command-and-output reference for those operations. If you need +the full "review an Argos build as part of a PR review" workflow, keep that in +the dedicated public Argos review skill rather than here. + +--- + +## Common Patterns + **Upload screenshots in CI:** ```bash @@ -69,6 +85,8 @@ argos upload ./screenshots --parallel-nonce $CI_PIPELINE_ID --parallel-index $CI argos finalize --parallel-nonce $CI_PIPELINE_ID ``` +--- + ## When to Load References - **Fetching build data or reviewing snapshots** → [references/build.md](references/build.md) diff --git a/skills/argos-cli/agents/openai.yaml b/skills/argos-cli/agents/openai.yaml new file mode 100644 index 00000000..9ab1e6aa --- /dev/null +++ b/skills/argos-cli/agents/openai.yaml @@ -0,0 +1,4 @@ +interface: + display_name: "Argos CLI" + short_description: "Use the Argos CLI safely and correctly" + default_prompt: "Use $argos-cli to run or interpret Argos CLI commands in this repository." diff --git a/skills/argos-pr-review/SKILL.md b/skills/argos-pr-review/SKILL.md new file mode 100644 index 00000000..bfdb1706 --- /dev/null +++ b/skills/argos-pr-review/SKILL.md @@ -0,0 +1,173 @@ +--- +name: argos-pr-review +description: > + Review Argos visual regression builds as part of a pull request review. Use + this skill when a PR includes an Argos build URL, an Argos status check, or a + bot comment linking to an Argos build, and you need to decide whether the + visual diffs are intentional changes, regressions, or flaky captures before + approving the PR. +--- + +# Argos PR Review + +Treat the Argos build as part of the PR review, not as a separate task. + +## When To Use This Skill + +- A PR includes an Argos build link. +- A PR has an Argos check with `changes-detected`. +- An Argos bot comment says screenshots are waiting for review. + +## Review Workflow + +### 1. Inspect the build status + +Use the Argos CLI with the build URL or build number. + +```bash +argos build get --json +``` + +Use the build status to decide the next review step: + +| Status | Review meaning | Next step | +| ------------------ | ------------------------------------------- | ---------------------------------------------- | +| `accepted` | Diffs already approved | No visual review needed | +| `no-changes` | No visual diff against baseline | No visual review needed | +| `pending` | Build is not ready yet | Wait and check again | +| `progress` | Comparison is still running | Wait and check again | +| `changes-detected` | Screenshots need a decision | Fetch snapshots and inspect them | +| `rejected` | A prior decision already rejected the build | Do not approve without understanding why | +| `error` | Build processing failed | Do not approve until the failure is understood | +| `aborted` | Build did not complete normally | Do not approve until the reason is understood | +| `expired` | Build was not completed in time | Do not approve until a valid build exists | + +### 2. Fetch only snapshots that need review + +```bash +argos build snapshots --needs-review --json +``` + +For each diff, inspect: + +- the diff mask `url` +- the previous screenshot `base.url` +- the new screenshot `head.url` +- metadata in `head.metadata` + +### 3. Decide whether the change is intentional, broken, or flaky + +- Intentional change: the new screenshot matches the code change and the UI is stable. +- Regression: layout break, wrong state, missing content, wrong theme, clipped content, incorrect data, or an unexpected removed snapshot. +- Flaky capture: loading indicator, partially rendered content, animation frame, dynamic content drift, or the same transient state captured in multiple browsers. + +### 4. Always check for flake signals + +Pay special attention to: + +- `head.metadata.test.retry > 0` +- `head.metadata.test.retries` only as context for how many retries the test is allowed to use +- a visible loader or spinner in the screenshot +- partially loaded data +- identical scores across browsers +- identical `head.url` values across browsers + +If the page is clearly captured mid-load, report it as flaky even if the branch name sounds related to loading or data changes. + +### 5. Report the result in the PR review + +- If everything looks intentional, say the Argos build looks good. +- If you find a regression, call it out with the Argos build URL and the affected snapshot names. +- If you find flakiness, explain the signal and recommend a stabilization fix before approval. + +Use this format for flaky captures: + +> `snapshot-name` looks flaky. +> Test: `test.title` in `test.location.file:test.location.line` +> Signal: loader visible / retry > 0 / identical head image across browsers / ... +> Fix: wait for the page to finish loading. If the SDK uses `waitForAriaBusy`, mark loading UI with `aria-busy="true"` so Argos waits for it to clear +> Review: `` + +## What To Inspect In Snapshot Data + +For each snapshot diff, inspect: + +- `name`: human-readable snapshot identifier +- `status`: usually `changed`, `added`, or `removed` +- `score`: rough diff magnitude +- `url`: diff mask / review overlay +- `base.url`: previous screenshot +- `head.url`: new screenshot +- `head.metadata.test`: test file, line, title, `retry`, `retries` +- `head.metadata.browser`: browser information + +## Review Heuristics + +### Expected changes + +These are usually safe when they match the PR: + +- intentional content updates +- deliberate theme or layout redesigns +- newly added screens +- removed screenshots that correspond to intentionally deleted pages or tests + +### Regressions + +Flag the review when you see: + +- missing content +- wrong empty state +- unexpected loader +- broken layout or overlap +- text clipping +- wrong route or wrong page captured +- mismatched theme or styling +- removed snapshots without an intentional test removal + +### Flaky captures + +Treat the snapshot as likely flaky when you see: + +- a spinner, skeleton, or loading state +- async content not yet rendered +- animation mid-transition +- dynamic values drifting between runs +- `retry > 0` +- identical `score` across multiple browsers +- identical `head.url` across multiple browsers + +Identical `head.url` values across browsers are a particularly strong signal that both browsers captured the same transient broken state. +`retry` is the current retry count for this run. `retries` is only the configured +retry budget, so do not treat `retries > 0` alone as evidence of flakiness. + +## Typical Fixes + +- Loading UI visible: + wait for settled content before capture; if the SDK uses `waitForAriaBusy`, add `aria-busy="true"` to the loading container so Argos waits for it to clear +- Test captures too early: + wait for content that depends on the async data before calling the screenshot helper +- Dynamic values: + freeze the data or mask the unstable element +- JS animation: + stabilize or remove the animated element from the capture + +## Suggested PR Review Wording + +Use short, actionable language. Example: + +```text +The Argos build shows a flaky capture rather than an intentional visual change. +The screenshot is taken before async data has loaded, so the page is captured in +an intermediate state (spinner visible / empty values / incomplete content). +Please wait for settled content before taking the screenshot, or mark the +loading UI with aria-busy="true" if your SDK is configured to wait for it. +``` + +## Tooling + +- Prefer the Argos CLI for build metadata and snapshot enumeration. +- Use `--json` when parsing CLI output. +- Use the PR diff and test code to confirm whether the visual change matches the code. +- Load [references/baseline.md](references/baseline.md) when the review depends on baseline selection or orphan build semantics. +- Load [references/flaky-fixes.md](references/flaky-fixes.md) when you need concrete remediation examples for flaky captures. diff --git a/skills/argos-pr-review/agents/openai.yaml b/skills/argos-pr-review/agents/openai.yaml new file mode 100644 index 00000000..10494206 --- /dev/null +++ b/skills/argos-pr-review/agents/openai.yaml @@ -0,0 +1,4 @@ +interface: + display_name: "Argos PR Review" + short_description: "Review Argos builds as part of PR review" + default_prompt: "Use $argos-pr-review to review this pull request when an Argos build is available." diff --git a/skills/argos-pr-review/references/baseline.md b/skills/argos-pr-review/references/baseline.md new file mode 100644 index 00000000..823295f9 --- /dev/null +++ b/skills/argos-pr-review/references/baseline.md @@ -0,0 +1,73 @@ +# Baseline And Orphan Build Notes + +Load this file when an Argos PR review depends on baseline selection, orphan +build semantics, or how approval affects future comparisons. + +--- + +## Baseline Build + +The baseline is the screenshot Argos compares the new build against. + +**Approval vs. baseline:** approving a build makes it a candidate for future +baseline selection. It does not directly overwrite the current baseline. The +baseline is selected dynamically each time a new build runs. + +Do not say: + +```text +Approving this build will update the baseline. +``` + +Say instead: + +```text +Approving this build would make it eligible to be selected as a baseline for future builds on this branch. +``` + +## Baseline Selection Rules + +Argos picks the most recent build satisfying all of: + +1. Same build name as the triggered build +2. All framework tests passed +3. Not a subset build +4. Status is auto-approved, manually approved, or orphan +5. Its commit is an ancestor of the merge base between the new build's commit + and the baseline branch + +## Baseline Branch + +- Pull requests: the PR base branch +- Push events: the project's configured default baseline branch, usually `main` + +## Baseline Overrides + +- `ARGOS_REFERENCE_BRANCH`: force a specific branch as baseline +- `ARGOS_REFERENCE_COMMIT`: pin to a specific commit's build + +## CI Caveat + +If CI does not run on the default branch, the baseline becomes stale and diffs +accumulate. Keep Argos running on the default branch too. + +## Orphan Builds + +An orphan build has no prior build to compare against, so all snapshots appear +as `added` with `base: null`. + +This is expected for: + +- the first build in a project +- a branch that has no approved ancestor build yet + +The absence of a baseline is not itself a regression signal. + +## Review Guidance + +When reviewing an orphan build: + +- inspect screenshots for obvious layout or rendering problems +- do not treat every `added` snapshot as suspicious just because `base` is null +- approve the build if the screenshots look correct and it is intended to seed a + future baseline diff --git a/skills/argos-pr-review/references/flaky-fixes.md b/skills/argos-pr-review/references/flaky-fixes.md new file mode 100644 index 00000000..94ccc373 --- /dev/null +++ b/skills/argos-pr-review/references/flaky-fixes.md @@ -0,0 +1,125 @@ +# Flaky Fix Code Examples + +Code examples for each flakiness fix. Load this file when you need to give a developer the exact implementation for a fix identified during build review. + +--- + +## 1. Loaders / async data — `aria-busy` + +For SDKs that enable `waitForAriaBusy`, `argosScreenshot()` waits until no +element with `aria-busy` is present or truthy before capturing. Apply it to +loader components in those integrations. + +```jsx +// while data is loading + + +// once data is ready, remove the attribute or set to false + +``` + +Plain HTML: + +```html +
+``` + +--- + +## 2. Dynamic dates and times + +**Option A — hide the element (preserves layout):** + +```html + +``` + +**Option B — freeze dates in test seeds:** +Run a normalization script before the E2E suite that sets all date fields to a fixed value. This is preferable when the date drives layout (e.g. affects column widths or sort order). + +--- + +## 3. Dynamic content — `data-visual-test` attributes + +| Attribute | CSS effect | When to use | +| -------------------------------- | ----------------------- | -------------------------------------------------------------------------- | +| `data-visual-test="transparent"` | `visibility: hidden` | Hide content, keep layout space | +| `data-visual-test="removed"` | `display: none` | Hide content and collapse layout space | +| `data-visual-test="blackout"` | Black mask overlay | Obscure sensitive or dynamic content (Playwright, Cypress, Storybook only) | +| `data-visual-test-no-radius` | Removes `border-radius` | Fix cross-browser rounding glitches | + +```html + + + + +
+ + + + + + +``` + +--- + +## 4. Animations and transitions + +The Argos SDK automatically stabilizes CSS animations (pauses them at a consistent frame before capture). For JS-driven animations that are still causing flakiness, hide the element: + +```html +
+``` + +Or disable the animation in test via CSS injection: + +```ts +await argosScreenshot(page, "my-page", { + argosCSS: + "*, *::before, *::after { animation: none !important; transition: none !important; }", +}); +``` + +--- + +## 5. Border-radius glitches + +Cross-browser and cross-OS rendering can produce 1px differences around rounded corners. Strip the radius for the screenshot: + +```html + +``` + +--- + +## 6. `argosScreenshot` stabilization options (Playwright) + +The stabilization behavior can be customized per screenshot: + +```ts +await argosScreenshot(page, "name", { + stabilize: { + waitForAriaBusy: true, // wait for aria-busy to clear (default: true) + waitForFonts: true, // wait for fonts to load (default: true) + waitForImages: true, // wait for images to load (default: true) + hideScrollbars: true, // hide scrollbars (default: true) + hideCarets: true, // hide text carets (default: true) + fontAntialiasing: true, // force consistent font antialiasing (default: true) + stabilizeSticky: true, // make sticky/fixed elements position:absolute (default: true) + }, +}); +``` + +To debug a specific test by running it multiple times to surface inconsistencies: + +```sh +npx playwright test --repeat-each 5 +``` + +## 7. Framework caveat + +`aria-busy` is not a universal fix across every Argos integration. For example, +some integrations may disable `waitForAriaBusy` by default, so waiting for +settled content explicitly can be a more portable recommendation than relying +only on `aria-busy`.