feat(ship): stack-aware test execution by inchwormz · Pull Request #842 · garrytan/gstack

inchwormz · 2026-04-05T22:08:33Z

TL;DR

/ship only works on Rails projects today. On anything else (Next.js, Python, Go, Rust, PHP, Elixir, plain Node) it tells Claude to run bin/test-lane, the file does not exist, and the ship stops. This PR makes /ship detect the project at skill-run time and emit the right test instructions for that project. Rails projects see no change. Every other project goes from broken to working.

The problem, concretely

ship/SKILL.md.tmpl hard-codes the Rails test flow inside Step 3 "Run tests". The template references bin/test-lane, db:test:prepare, app/services/*_prompt_builder.rb, and EVAL_JUDGE_TIER=full, all of which are Rails-specific commands, paths, and settings. A user running /ship on a Next.js app still gets the same generated text, because the template has no branching. Claude then tries bin/test-lane on a repo where that file does not exist, and the ship halts.

You can verify this on main:

grep -n "bin/test-lane" ship/SKILL.md.tmpl
# 10 hits, all inside Step 3
grep -n "{{.*STACK.*}}\|SHIP_STACK" ship/SKILL.md.tmpl
# 0 hits

There is no stack awareness. The template assumes Rails.

The fix

A new resolver function, generateShipTestExecution in scripts/resolvers/testing.ts. It returns the full "Step 3: Run tests" block, with stack detection bash at the top and two branches below.
A new placeholder, {{SHIP_TEST_EXECUTION}}, registered in scripts/resolvers/index.ts.
ship/SKILL.md.tmpl now contains a single {{SHIP_TEST_EXECUTION}} line where the 86 Rails-specific lines used to live.
The generator (bun run gen:skill-docs) expands the placeholder into the same Rails block as before plus a new generic-fallback block plus stack detection bash.

Detection happens at skill-run time, not generator time. The generated ship/SKILL.md contains bash that sniffs the current repo and prints SHIP_STACK: rails or SHIP_STACK: generic. That means one generated skill works across every project, without per-project regeneration.
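As a rough illustration of that run-time sniffing, here is a minimal sketch. The function name `detect_ship_stack` and its directory argument are hypothetical, added only so the snippet can be tried outside a real repo; the exact bash emitted by the resolver may differ.

```shell
# Sketch of run-time stack detection (assumed shape, not the generated text).
# detect_ship_stack is a hypothetical name; the directory defaults to ".".
detect_ship_stack() {
  local dir="${1:-.}"
  local _STACK="generic"   # fallback when no dedicated branch matches
  # Rails: a Gemfile exists and mentions rails
  [ -f "$dir/Gemfile" ] && grep -q "rails" "$dir/Gemfile" 2>/dev/null && _STACK="rails"
  echo "SHIP_STACK: $_STACK"
}
```

Calling `detect_ship_stack` from the root of a repo prints one of the two markers, which is what the rest of Step 3 branches on.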

Rails projects: byte-for-byte the same

The Rails block is moved, not rewritten. Every line that used to live at the top of Step 3 now lives inside ### If SHIP_STACK: rails. That includes:

bin/test-lane 2>&1 | tee /tmp/ship_tests.txt (plus the parallel npm run test line)
The RAILS_ENV=test bin/rails db:migrate warning about corrupting structure.sql
The db:test:prepare explanation
The Step 3.25 Eval Suites section (app/services/*_prompt_builder.rb, *_generation_service.rb, *_evaluator.rb, config/system_prompts/*.txt, test/evals/**/* globs)
The PROMPT_SOURCE_FILES grep pattern
EVAL_JUDGE_TIER=full EVAL_VERBOSE=1 bin/test-lane --eval and the fast/standard/full tier reference table
The "Pre-merge gate uses full tier" rule

You can verify the Rails content survived intact:

# Count Rails markers in committed main
git show main:ship/SKILL.md | grep -c "bin/test-lane"
# 4

# Count on this branch
git show HEAD:ship/SKILL.md | grep -c "bin/test-lane"
# 4

The counts also match (main:branch) for PROMPT_SOURCE_FILES (1:1), db:test:prepare (1:1), and EVAL_JUDGE_TIER=full (2:2).

Detection uses the same pattern (grep the Gemfile for rails) that generateTestBootstrap already uses at scripts/resolvers/testing.ts:20. No new convention.

Generic branch: what happens on a non-Rails project

The generic branch runs when Gemfile is missing, or Gemfile exists but does not contain rails. It walks a detection order:

1. package.json scripts. If present, run the first match of test:ci, test:all, or test.
2. Makefile target. If present with a test or check target, run make test or make check.
3. Language default. go test ./... for Go, cargo test for Rust, pytest for Python, bundle exec rspec for non-Rails Ruby, and so on. Only reached if steps 1 and 2 found nothing.

If nothing matches, /ship uses AskUserQuestion instead of guessing. Options: run a specific command, skip tests this ship (with a loud warning in the PR body), or type a custom command.
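The fall-through order above can be sketched as a small function. This is an approximation under stated assumptions, not the generated instructions: `pick_test_command` is a hypothetical name, the package.json check is a crude grep rather than a JSON parse, `npm run` is assumed as the runner, and the marker files used for the language defaults (go.mod, Cargo.toml, pyproject.toml, pytest.ini, Gemfile) are guesses.

```shell
# pick_test_command DIR: prints a test command for a non-Rails repo,
# or returns 1 when nothing is detectable. Hypothetical helper.
pick_test_command() {
  local dir="${1:-.}"
  local name

  # 1. package.json scripts: first match of test:ci, test:all, test
  if [ -f "$dir/package.json" ]; then
    for name in test:ci test:all test; do
      # crude match on the script key; a real check would parse the JSON
      if grep -q "\"$name\"[[:space:]]*:" "$dir/package.json"; then
        echo "npm run $name"   # npm as the runner is an assumption
        return 0
      fi
    done
  fi

  # 2. Makefile target: test first, then check
  if [ -f "$dir/Makefile" ]; then
    for name in test check; do
      if grep -qE "^$name:" "$dir/Makefile"; then
        echo "make $name"
        return 0
      fi
    done
  fi

  # 3. Language defaults (marker files are assumptions)
  if [ -f "$dir/go.mod" ]; then echo "go test ./..."; return 0; fi
  if [ -f "$dir/Cargo.toml" ]; then echo "cargo test"; return 0; fi
  if [ -f "$dir/pyproject.toml" ] || [ -f "$dir/pytest.ini" ]; then echo "pytest"; return 0; fi
  if [ -f "$dir/Gemfile" ]; then echo "bundle exec rspec"; return 0; fi

  # Nothing detectable: the caller should ask the user instead of guessing
  return 1
}
```

A non-zero return corresponds to the AskUserQuestion path: the sketch deliberately prints nothing rather than inventing a command.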

Proof: same command, two projects, two outputs

Rails repo:

$ head -1 Gemfile
source 'https://rubygems.org'
$ grep -c rails Gemfile
3
# /ship emits the Rails branch:
#   bin/test-lane 2>&1 | tee /tmp/ship_tests.txt
#   EVAL_JUDGE_TIER=full EVAL_VERBOSE=1 bin/test-lane --eval test/evals/<suite>_eval_test.rb
#   "Rails tests pass (N runs, 0 failures)"

Next.js repo (same /ship command):

$ ls Gemfile 2>/dev/null
$ node -e "console.log(require('./package.json').scripts.test)"
next test
# /ship emits the generic branch:
#   next test 2>&1 | tee /tmp/ship_tests.txt
#   (no bin/test-lane, no eval suites)

Before this change, the Next.js run followed Rails-only instructions and errored on a missing bin/test-lane.

Files changed

| File | Change |
| --- | --- |
| `scripts/resolvers/testing.ts` | New `generateShipTestExecution` function (about 150 lines). Imports `generateTestFailureTriage` from `./preamble` and calls it once at the bottom so the triage appears after both branches. |
| `scripts/resolvers/index.ts` | Registers `SHIP_TEST_EXECUTION: generateShipTestExecution` in the RESOLVERS record. |
| `ship/SKILL.md.tmpl` | 86 lines of Rails-specific bash and prose replaced with a single `{{SHIP_TEST_EXECUTION}}` line. |
| `ship/SKILL.md` | Regenerated via `bun run gen:skill-docs`. 184 lines changed. |
| `README.md` | Short "Stack-aware skills" section near the end, listing the two branches and the detection logic. Trim further or drop if you prefer. |
| `CHANGELOG.md` | Unreleased entry with a "Rails projects: no behavior change" callout. |
| `CONTRIBUTING.md` | Short note under "Editing SKILL.md files" explaining how to add a new stack branch. |

Adding a new stack later

Example: adding a dedicated Next.js branch.

Add a detection line to the _STACK="generic" bash block in generateShipTestExecution:

[ -f package.json ] && grep -q '"next"' package.json 2>/dev/null && _STACK="nextjs"

Add a matching ### If SHIP_STACK: nextjs section inside the template literal with Next.js-specific instructions.
Leave the generic fallback as the last branch.

CONTRIBUTING.md has a short version of this recipe.
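One way to sanity-check a new detection line before wiring it into the resolver is to run it against throwaway fixture directories. The sketch below is only an illustration: `detect_stack` is a hypothetical wrapper, the fixture contents are made up, and placing the Rails check last (so a Gemfile wins when both files exist) is an assumption about ordering, not something the resolver is known to do.

```shell
# detect_stack DIR: detection block with a hypothetical nextjs line added.
detect_stack() {
  local dir="${1:-.}"
  local _STACK="generic"
  [ -f "$dir/package.json" ] && grep -q '"next"' "$dir/package.json" 2>/dev/null && _STACK="nextjs"
  # Rails is checked last here so a Gemfile wins when both files exist (assumption)
  [ -f "$dir/Gemfile" ] && grep -q "rails" "$dir/Gemfile" 2>/dev/null && _STACK="rails"
  echo "SHIP_STACK: $_STACK"
}

# Throwaway fixtures: one directory per expected outcome.
fixtures=$(mktemp -d)
mkdir "$fixtures/next" "$fixtures/rails" "$fixtures/plain"
echo '{"dependencies":{"next":"*"}}' > "$fixtures/next/package.json"
echo "gem 'rails'" > "$fixtures/rails/Gemfile"
echo '{"scripts":{"test":"vitest"}}' > "$fixtures/plain/package.json"

for name in next rails plain; do
  printf '%s -> %s\n' "$name" "$(detect_stack "$fixtures/$name")"
done
rm -rf "$fixtures"
```

With these fixtures the loop reports nextjs, rails, and generic respectively, which confirms the new line does not steal matches from the existing branches.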

Not touched

To keep the review surface small, this PR does not touch any of the following:

Hooks
Config schema (.gstack/config.yaml)
Generator core (scripts/gen-skill-docs.ts, scripts/host-config.ts)
Any skill other than /ship
Telemetry
CI

The only runtime behavior change is inside Step 3 of /ship. Everything else in ship/SKILL.md (preamble, Step 0 through Step 2, Step 3.4 onward) is untouched.

Testing done locally

bun run gen:skill-docs runs green. No unresolved placeholders in the output.
ship/SKILL.md regenerated to 2290 lines, down from 2396 after removing a duplicated ## Test Failure Ownership Triage section that was accidentally emitted twice in an earlier draft.
Rails content verified byte-parity against origin/main for every marker (bin/test-lane, PROMPT_SOURCE_FILES, db:test:prepare, EVAL_JUDGE_TIER=full, tier reference table).
Stack detection bash tested against both a Rails Gemfile (matches) and a Next.js package.json with no Gemfile (falls through to generic).

Test plan for reviewer

bun run gen:skill-docs rebuilds ship/SKILL.md with no unresolved placeholders.
grep -n "## Test Failure Ownership Triage" ship/SKILL.md shows exactly one match.
Rails repo: running /ship prints the bin/test-lane and eval-suite block unchanged.
Next.js repo: running /ship prints the generic branch and runs next test (or whatever package.json scripts.test contains).
Project with no Gemfile, no package.json, and no Makefile: /ship falls through to AskUserQuestion instead of running a Rails command.
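The first two checks in this plan lend themselves to a quick script. The following is a minimal sketch, assuming unresolved placeholders always look like `{{UPPER_SNAKE_CASE}}` (inferred from `{{SHIP_TEST_EXECUTION}}`) and that the triage heading starts at the beginning of a line; `check_skill_doc` is a hypothetical helper, not part of the repo.

```shell
# check_skill_doc FILE: returns 0 when FILE has no unresolved {{PLACEHOLDER}}
# tokens and exactly one "Test Failure Ownership Triage" heading.
check_skill_doc() {
  local file="$1"
  local count
  # Assumed placeholder shape: double braces around upper-case letters/underscores
  if grep -qE '\{\{[A-Z_]+\}\}' "$file"; then
    echo "unresolved placeholder in $file"
    return 1
  fi
  # grep -c exits non-zero on zero matches, so guard it with || true
  count=$(grep -c "^## Test Failure Ownership Triage" "$file" || true)
  if [ "$count" -ne 1 ]; then
    echo "expected 1 triage heading, found $count"
    return 1
  fi
  echo "ok"
}
```

After `bun run gen:skill-docs`, running `check_skill_doc ship/SKILL.md` would cover both checks in one pass; the Rails and Next.js runs still need a real /ship invocation.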

Open questions for the maintainer

Dedicated Next.js branch in a follow-up? The generic branch handles Next.js via package.json scripts.test, which is fine. A dedicated branch could add Playwright detection and next build as a pre-test gate. Happy to ship that as PR 2 if you want it.
README section placement. The "Stack-aware skills" blurb lives between "Docs" and "Privacy & Telemetry". If you prefer it somewhere else, or prefer it removed entirely and only mentioned in CHANGELOG, say the word.
The retro/review templates. Two files (retro/SKILL.md.tmpl, review/SKILL.md.tmpl) contain Rails references in example output only, not in runtime instructions. I left them alone in this PR. Flag if you want a follow-up sweep.

/ship test execution now detects the project stack at skill-run time and emits instructions that fit. Rails projects (Gemfile contains "rails") get the same bin/test-lane, db:test:prepare, and app/services/*_prompt_builder.rb eval-suite flow as before. Any other project gets a generic path that finds the project's own test command via package.json scripts, a Makefile target, or language defaults, then falls back to AskUserQuestion when nothing is detectable. Previously, every project running /ship followed Rails-only instructions and errored on a missing bin/test-lane.

The Rails block is moved, not rewritten. bin/test-lane, the db:test:prepare warning, the eval-suite runner pattern, and the EVAL_JUDGE_TIER=full tier reference all sit byte-for-byte inside the SHIP_STACK: rails branch. Detection uses the same Gemfile grep that the existing generateTestBootstrap resolver uses.

Files:
- scripts/resolvers/testing.ts: new generateShipTestExecution resolver with Rails + generic branches
- scripts/resolvers/index.ts: register SHIP_TEST_EXECUTION
- ship/SKILL.md.tmpl: replace 86 lines of Rails-specific content with {{SHIP_TEST_EXECUTION}}
- ship/SKILL.md: regenerate
- README.md: stack-aware skills section with Rails vs Next.js proof
- CHANGELOG.md: Unreleased entry
- CONTRIBUTING.md: note on adding a new stack branch

inchwormz wants to merge 1 commit into garrytan:main from inchwormz:feat/stack-aware-ship-test-execution
