Deep analysis of what would need to change in this repository to fully adopt
gh-aw (GitHub Agentic Workflows) as the execution substrate for the AI agent, in place of the current hand-rolled Bun + `pi-coding-agent` + YAML stack.

Sources: `githubnext/gh-aw`, `github-agentic-workflows.md`, and the published docs at https://github.github.com/gh-aw/. This repository is read at v1.0.5 (see `.github-gstack-intelligence/VERSION`).

gh-aw and github-gstack-intelligence solve overlapping problems —
"run an LLM agent inside GitHub Actions in response to repo events" —
but they are architecturally inverted:
| Dimension | This repo (today) | gh-aw |
|---|---|---|
| Authoring surface | Hand-written YAML workflow + 917-line `agent.ts` orchestrator + 372-line `router.ts` | Markdown file with YAML frontmatter; YAML workflow is a generated `.lock.yml` artifact |
| Agent runtime | `@earendil-works/pi-coding-agent` (single CLI) | Pluggable `engine:` (Copilot, Claude, Codex, Gemini, OpenCode) |
| Write surface | `contents: write`, `issues: write`, `pull-requests: write` on the main job | Read-only main job; writes go through a separate `safe-outputs` job with a typed schema |
| Routing | TypeScript `router.ts` switches on event/label/slash command | Native `on:` triggers including `on.command:` and `on.reaction:` per workflow file |
| Tools | Implicit: whatever `pi` exposes + Playwright we install | Declarative `tools:` block (`github`, `bash`, `edit`, `web-fetch`, `playwright`, MCP servers) with allow-listing |
| Network | Unrestricted egress | `network:` allow-list (`defaults`, `github`, custom domains) enforced at workflow level |
| Memory | Git-committed JSONL session files under `state/` | `cache-memory:` / `memory:` primitives (file-based, scoped per workflow) |
| Compilation | None: YAML is the source of truth | `gh aw compile` produces `.lock.yml`; lock file is what Actions runs |
| Auth | Inline shell calling `gh api …/collaborators/<actor>/permission` | `roles:` frontmatter (`admin`, `maintain`, `write`) enforced by the compiler |
| Bot-loop prevention | Hidden HTML signature + skip in `router.ts` | Built-in; agent comments are tagged and filtered automatically |

Pivoting fully to gh-aw means the current YAML workflow disappears,
lifecycle/agent.ts and lifecycle/router.ts cease to exist as code,
and each of the 26 gstack skills becomes a standalone .github/workflows/<skill>.md file
(or a small number of multi-trigger workflow files). The skill prompt content survives
almost verbatim; the orchestration code around it is deleted.
The trade-off: we lose several distinctive properties of the current design
(single drop-in YAML, multi-provider key support, JSONL transcripts visible in git,
session resume across runs, the self-installer, the garrytan/gstack refresh pipeline).
We gain read-only-by-default security, declarative routing, sanitized inputs,
network egress controls, MCP allow-listing, and a much smaller code surface to maintain.
A staged migration (Section 7) is feasible; a clean-slate rewrite is also feasible and probably faster than retrofitting.
Reference points:
- `.github/workflows/github-gstack-intelligence-agent.yml`: single 955-line workflow with 4 jobs:
  - `run-install`: self-installer/upgrader pulling from this template repo
  - `run-refresh-gstack`: vendoring pipeline from `garrytan/gstack`
  - `run-agent`: the actual AI agent
  - `run-gitpages`: publishes `public-site/`
- `.github-gstack-intelligence/lifecycle/`
  - `agent.ts` (917 lines): fetches event, resumes session, spawns `pi`, posts comment, commits state, retries on conflict
  - `router.ts` (372 lines): maps `(event_name, comment_body, labels, schedule)` → skill name
  - `browser.ts` (434 lines): Playwright wrapper for the 8 browser-using skills
  - `refresh.ts`: vendors upstream gstack prompts
- `.github-gstack-intelligence/skills/*.md`: 26 markdown skill prompts (e.g. `review.md`, `cso.md`, `qa.md`) imported from `garrytan/gstack`, each with their own frontmatter (`allowed-tools`, `triggers`, etc.)
- `.github-gstack-intelligence/config.json`: per-skill enabled/disabled flags + trigger metadata
- `.github-gstack-intelligence/.pi/settings.json`: LLM provider and model (per the stored memory, this is the source of truth, not `config.json.defaults`)
- `.github-gstack-intelligence/state/`: git-committed `sessions/*.jsonl` + `issues/<n>.json` mapping files
- `.github-gstack-intelligence/help/README.md`: slash-command reference

Key behaviours worth naming because gh-aw handles them differently:
- One workflow, many skills. A single YAML file is the entry point; all routing is in TS.
- Self-installer. Manual `workflow_dispatch` run copies the agent folder into the user's repo.
- Vendored prompts. `refresh.ts` pulls upstream skills with SHA tracking in `skills/source.json`.
- Provider-agnostic. Seven LLM API key env vars are passed through; `pi` chooses based on `.pi/settings.json`.
- Cross-run session continuity. `state/sessions/<ts>.jsonl` is committed and replayed via `pi --session <path>`.
- Append-only memory. `memory.log merge=union` git attribute lets parallel runs append without conflict.
- Inline auth. Shell step queries `repos/.../collaborators/<actor>/permission` and 👎-reacts on rejection.
- Reaction lifecycle. 🚀 added by Authorize step, swapped to 👍/👎 by `agent.ts` on exit.
- Push-retry loop. Up to 10 attempts with `pull --rebase -X theirs` between tries.
- Conditional Playwright. Only installed when the routed skill is in the browser set.
- GitHub Pages publishing. `public-site/` deployed on every run.
- Sliding-window upgrade. File-by-file copy with ALWAYS / NEVER / DEFAULT categories preserves user edits.

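
The push-retry behaviour in the list above can be sketched as a small bounded loop. This is an illustration of the control flow only, not the code in `agent.ts`: `pushWithRetry`, `tryPush` and `rebase` are hypothetical names, and the git operations are injected as callbacks so the loop can be read (and tested) without a repository.

```typescript
// Hypothetical sketch of a bounded push-retry loop (not the actual agent.ts code).
// tryPush returns true when the push succeeded; rebase stands in for
// `git pull --rebase -X theirs` between attempts.
function pushWithRetry(
  tryPush: () => boolean,
  rebase: () => void,
  maxAttempts: number = 10,
): number {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (tryPush()) {
      return attempt; // how many attempts it took
    }
    if (attempt < maxAttempts) {
      rebase(); // pick up the competing commit before trying again
    }
  }
  throw new Error("push failed after " + maxAttempts + " attempts");
}
```

Injecting the two operations keeps the retry policy separate from the git plumbing, which is the part a safe-outputs job would make unnecessary.
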
(Distilled from the upstream reference doc.)
Each workflow is a markdown file at .github/workflows/<name>.md with a YAML frontmatter:
on:— standard GH Actions triggers pluson.command:(slash commands),on.reaction:(emoji triggers),on.stop-after:deadlines,forks:controls.engine:—copilot|claude|codex|gemini|opencode(andcustom:for arbitrary scripts).permissions:— same syntax as Actions, but conventions strongly discourage any write permission on the main job.network:— egress allow-list (defaults,github,[domain.com]).tools:— declarative; includesgithub(withmode: gh-proxyorlocal, plustoolsets:like[default, issues, pull_requests]),bash:(with a narrow allow-list of binaries),edit:,web-fetch:,playwright:, and arbitrarymcps:blocks (with SHA-pinned docker images, allowed-tools, env mapping).safe-outputs:— typed declarations of the only writes the agent is permitted to perform:add-comment,create-issue,update-issue,add-labels,create-pull-request,create-pull-request-review-comment,create-discussion,assign-to-agent,upload-artifact, etc. Each has parameters (max:,title-prefix:,labels:,allowed:,skip-if-match:).roles:/cache-memory:/timeout-minutes:/strict: true/imports:(for shared frontmatter snippets).
The markdown body is the natural-language prompt, interpolating Actions
context (${{ github.event.issue.number }}) and a sanitized event payload
(${{ steps.sanitized.outputs.text }} — @-mentions defanged, bot-trigger
phrases neutralised, length-capped).
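
To make the three sanitization behaviours concrete (defanged @-mentions, neutralised trigger phrases, length cap), here is a minimal sketch. It is not gh-aw's implementation, which is generated by the compiler and not reproduced here; `sanitizeUntrustedText`, its parameters and the `[truncated]` marker are all hypothetical.

```typescript
// Illustrative only: mirrors the behaviours named in the text, not gh-aw's code.
function sanitizeUntrustedText(
  input: string,
  triggerPhrases: string[],
  maxLength: number,
): string {
  // Wrap @-mentions in backticks so they render as code and notify no one.
  let text = input.replace(/(^|[^\w`])@([A-Za-z0-9][A-Za-z0-9-]*)/g, "$1`@$2`");
  // Do the same for phrases that could re-trigger a workflow.
  for (const phrase of triggerPhrases) {
    if (phrase !== "") {
      text = text.split(phrase).join("`" + phrase + "`");
    }
  }
  // Cap the length so an oversized payload cannot crowd out the prompt.
  if (text.length > maxLength) {
    text = text.slice(0, maxLength) + "\n[truncated]";
  }
  return text;
}
```
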
gh aw compile produces a sibling <name>.lock.yml which is the actual
Actions workflow that runs. Lock files:
- Are committed and reviewable in PRs (auditable codegen).
- SHA-pin every action and container image (supply-chain hardening).
- Split the run into a read-only "agent job" and a separate "safe-outputs job" with narrowly-scoped writes.
- Inject the sanitization step, network jail (squid proxy or similar), MCP server containers, bot-loop signatures, and role checks.
Editing frontmatter requires recompiling; editing the markdown body does not.
- Default-deny writes. Main agent job runs `contents: read` only.
- Safe outputs are a contract. Agent emits a JSON envelope; the safe-outputs job validates against schema (title prefixes, label allow-lists, max counts, body length) and then performs the write.
- Network jail. Outbound traffic restricted by allow-list, defaulting to a small "defaults" set.
- Tool jail. `bash:` defaults to nothing; commands must be listed.
- Role jail. `roles: [admin, maintain]` short-circuits the run if the actor lacks the role.
- Input sanitization. Untrusted markdown (issue body, PR description, comments) is HTML-escaped and stripped of `@mentions` / instructions like `Ignore previous`.
- Bot-loop prevention. Built in: comments authored by the workflow's own bot identity are filtered automatically.

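
The "safe outputs are a contract" idea above amounts to validating requested writes against a policy before any write happens. A minimal sketch follows, under stated assumptions: the request and policy shapes (`SafeOutputRequest`, `SafeOutputPolicy`) and the function `validateEnvelope` are hypothetical and do not reproduce gh-aw's real envelope or schema.

```typescript
// Hypothetical request and policy shapes; the point is "validate first, write second".
type SafeOutputRequest =
  | { type: "add-comment"; body: string }
  | { type: "create-issue"; title: string; labels: string[] };

interface SafeOutputPolicy {
  max: { "add-comment": number; "create-issue": number };
  titlePrefix: string;
  allowedLabels: string[];
  maxBodyLength: number;
}

function validateEnvelope(requests: SafeOutputRequest[], policy: SafeOutputPolicy): string[] {
  const errors: string[] = [];
  const counts = { "add-comment": 0, "create-issue": 0 };
  for (const request of requests) {
    counts[request.type] += 1;
    if (counts[request.type] > policy.max[request.type]) {
      errors.push("too many " + request.type + " requests");
    }
    if (request.type === "add-comment") {
      if (request.body.length > policy.maxBodyLength) {
        errors.push("comment body exceeds the length cap");
      }
    } else {
      if (request.title.indexOf(policy.titlePrefix) !== 0) {
        errors.push("issue title is missing the required prefix");
      }
      for (const label of request.labels) {
        if (policy.allowedLabels.indexOf(label) === -1) {
          errors.push("label " + label + " is not allow-listed");
        }
      }
    }
  }
  return errors; // an empty list means the writes may proceed
}
```
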
- It does not provide a "session resume across runs" primitive comparable to our committed JSONL transcripts. `cache-memory:` is per-workflow file-based memory, not multi-turn chat history.
- It does not auto-vendor upstream prompts. The `gh aw` CLI has `gh aw install <pkg>` for workflows, not for arbitrary prompt files like ours.
- It does not publish GitHub Pages (`run-gitpages` is out of scope; that's a normal Actions job).
- It does not allow truly arbitrary multi-provider key passing on a single workflow; one `engine:` per workflow (though `custom:` provides escape hatches).
- It does not ship a "single drop-in YAML to copy" deployment story; users `gh extension install githubnext/gh-aw` and then `gh aw add` workflows.

Disposition: deleted, replaced by N markdown workflow files.
The current ~955-line YAML covers:
- 8 trigger classes (issues, issue_comment, pull_request, push, workflow_dispatch, schedule×2, release, deployment_status)
- 4 jobs (install, refresh, agent, gitpages)
- inline shell for: actor authorization, reactions lifecycle, semver upgrade check, file-by-file overlay, .gitignore/.gitattributes maintenance, Playwright conditional install, push retry
In gh-aw:
- Each skill becomes its own `.md` in `.github/workflows/` (or skills sharing a single trigger can share a file).
  - `review.md` → `on.pull_request.types: [opened, synchronize]`
  - `cso.md` → `on.pull_request:` + `labels: [security-audit]`
  - `qa.md` → `on.command: { name: qa }`
  - `retro.md` → `on.schedule: "weekly on friday at 17:00 UTC"`
  - `document-release.md` → `on.release.types: [published]`
  - …and so on for the other 21.
- The router YAML guards (`if: github.actor != 'github-actions[bot]' && …`) are replaced by `roles:` + automatic bot-loop filtering.
- Concurrency stanza per workflow (gh-aw supports the same `concurrency:` block in frontmatter; the elaborate expression we use can be expressed per-workflow).
- The 🚀 / 👍 / 👎 reaction lifecycle is partially lost. gh-aw shows status via the Actions UI and adds bot reactions on the triggering comment, but it does not expose the granular 3-stage reaction handshake. Acceptable.
- The push-retry loop is gone. gh-aw writes go through the safe-outputs job, which does not race against itself the same way.
- The Pages job: keep as a separate hand-written `pages.yml` (gh-aw doesn't replace this).

Disposition: deleted.
The 372 lines of routing logic that translate (event, comment, labels, config) to a skill name disappear, because gh-aw lets each workflow file declare its own trigger. The `parseSlashCommand` function and `VALID_COMMANDS` set are replaced by `on.command:` per workflow.
The one piece worth porting: the label-to-skill mapping (investigate, office-hours, design-consultation) — these become on.issues.types: [labeled] + labels: filters in the respective workflow files.
config.json per-skill enable/disable becomes "delete the workflow file" or "rename .md.disabled". The agentics-maintenance.yml workflow that gh-aw ships supports disable / enable as workflow_dispatch operations, restoring this capability at the meta level.
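
For orientation, this is the kind of logic that per-workflow `on.command:` triggers make unnecessary. The names `parseSlashCommand` and `VALID_COMMANDS` come from the text above, but the bodies are assumptions rather than the contents of `router.ts`, and the command set is a small sample of the 26 skills.

```typescript
// Sketch only: an assumed shape for the slash-command routing in router.ts.
const VALID_COMMANDS: ReadonlySet<string> = new Set(["qa", "ship", "autoplan", "health"]);

function parseSlashCommand(commentBody: string): { name: string; args: string } | null {
  // Only the first line of the comment is considered a command line.
  const firstLine = commentBody.trim().split("\n")[0] ?? "";
  const match = /^\/([a-z][a-z0-9-]*)(?:\s+(.*))?$/.exec(firstLine);
  if (match === null) {
    return null;
  }
  const name = match[1] ?? "";
  if (!VALID_COMMANDS.has(name)) {
    return null; // unknown commands are ignored rather than routed
  }
  return { name, args: (match[2] ?? "").trim() };
}
```

In the gh-aw model each accepted name would instead live in its own workflow file, so the allow-list is implied by which files exist.
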
Disposition: deleted, except for two pieces.
917 lines doing: gh event fetch → session resolve → prompt build → spawn pi → parse JSONL → comment post → git commit/push → reaction swap. gh-aw subsumes all of it through the compile pipeline (engine adapter + safe-outputs envelope + sanitization).
What we lose without a direct replacement:
- Multi-turn session resume across runs. No `--session <path>` equivalent. We have to either:
  - (a) accept stateless skills (gh-aw's normal mode: agents see the full thread on each run because comments are passed as context),
  - (b) implement memory via `cache-memory:` (a workflow-scoped file in a cache key, sufficient for `/learn`, `/health`, `/benchmark` tracking), or
  - (c) keep our own state directory and reference it from the markdown body. This works but is awkward: writing to it requires `safe-outputs.upload-artifact` rather than `git commit`.

What is worth porting into a tiny lifecycle/helpers/ shared by the markdown bodies:
- `extractDiffStat()` style helpers, though gh-aw's `tools.github.toolsets: [pull_requests]` exposes diff data directly.
- The 60 000-character comment cap, though gh-aw's `safe-outputs.add-comment` enforces a similar cap natively.

Disposition: deleted.
gh-aw supports Playwright as a first-class tool: tools.playwright: { allowed-domains: […], version: "1.x" }. The conditional install step in our YAML and the entire 434-line wrapper module are unnecessary. The 8 browser skills simply add playwright: to their tools: block.
Disposition: kept as-is, repurposed.
gh-aw has no upstream-prompt-vendoring story. The pipeline that pulls from garrytan/gstack, removes AskUserQuestion, swaps the browse daemon for Playwright, stamps <!-- GSTACK-INTELLIGENCE: GENERATED FILE -->, and writes verified outputs is our differentiator and should stay.
Only the output target changes: instead of writing to .github-gstack-intelligence/skills/<name>.md (consumed by our agent.ts), it writes to .github/workflows/<name>.md with the appropriate gh-aw frontmatter prepended. The post-write validator (Python step) should be updated to recognise gh-aw frontmatter keys.
Practically, refresh.ts will need a template-mapper: each upstream gstack <skill>.md is rendered into a gh-aw workflow file by combining the gstack body with a hand-curated frontmatter template for that skill (capturing its trigger, safe-outputs, tools, roles). That template set is the new long-lived asset.
Disposition: deleted with the source they test. Replaced by:
- `gh aw compile` exit code as the "lint" for every workflow.
- `gh aw compile --actionlint --zizmor --poutine` for security linting.
- `gh aw status` and `gh aw run --dry-run` for local smoke checks.

Disposition: largely deleted, with one carve-out.
- `defaults.provider` / `defaults.model` → moved into each workflow's `engine:` frontmatter. (Loss: no central place to swap model.)
- `defaults.maxCommentLength` → handled by `safe-outputs.add-comment.max-length` per workflow.
- `access.allowedPermissions` → `roles:` per workflow.
- `access.botLoopPrevention` → built-in.
- `access.prefixGating` / `prefixes` → `on.command:` per workflow.
- `skills.<name>.enabled` → presence/absence of the workflow file (or use `agentics-maintenance.yml`'s disable/enable).

Carve-out: central model selection is worth keeping. Implement as a tiny imports: snippet (.github/workflows/_shared/engine.md) that every workflow imports — change the model once, recompile all.
Disposition: deleted. Provider/model now lives in engine: frontmatter. The corresponding stored memory ("config.json.defaults is not used for model selection") becomes obsolete and should be downvoted post-migration.
Disposition: re-modelled.
- `state/sessions/*.jsonl` + `state/issues/<n>.json`: gone. gh-aw does not "resume" the way `pi --session` does. The agent re-derives context from the issue thread on each invocation. For skills where multi-run state is genuinely needed (e.g. `/learn` accumulating lessons), use `cache-memory:` or `safe-outputs.push-to-pull-request-branch` to write a `LESSONS.md`.
- `state/memory.log` (union-merge append-only): port to `cache-memory:` with the same union-merge semantics, or convert to a committed `MEMORY.md` updated via `safe-outputs.create-pull-request`.
- `.gitattributes` union-merge entry: keep, in case we retain a `MEMORY.md`.

Disposition: replaced by gh aw CLI workflow.
gh-aw's installation flow is: gh extension install githubnext/gh-aw then gh aw add <package>/<workflow>. To preserve our "copy one YAML file and the agent appears" ergonomic, we need a small bootstrap workflow that:
- Installs the `gh-aw` extension on the runner.
- Runs `gh aw add japer-technology/github-gstack-intelligence/<skills>` (assuming we publish ourselves as a gh-aw package; this is supported via `gh aw package` per upstream docs).
- Commits the generated `.md` + `.lock.yml` files.

This is shorter than the current 250-line install job but changes the UX: the user must enable the gh-aw extension (or rely on actions/setup-gh + gh extension install). The "single secret + one file = working agent" promise weakens slightly; in exchange, every future upgrade is gh aw upgrade instead of a custom semver diff loop.
Disposition: keep as a standalone classic .github/workflows/pages.yml. Out of scope for gh-aw.
Disposition: replaced.
- The `gh api collaborators/<actor>/permission` shell → `roles: [admin, maintain, write]` in frontmatter.
- The 🚀 → 👍/👎 reaction lifecycle → gh-aw posts a status comment / reaction automatically. We lose the exact 3-state UX but get equivalent feedback from the platform.
- The 👎-on-rejection unauthorized handling is built in.

The current workflow comment block calls out that no rate limiting / model tiering is implemented (v1.0.5). gh-aw provides:
- `on.stop-after: 30d` budgets per workflow.
- `timeout-minutes:` per workflow.
- `safe-outputs.<type>.max:` to bound write fan-out.
- `cache-memory:` to dedupe (skip if we've already commented on this PR head SHA).
- Native concurrency groups (we have these too).

These directly address an acknowledged gap in our current README.
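
The dedupe idea in the list above (skip when this PR head SHA was already handled) can be sketched as a pure function over a small persisted list. Everything here is an assumption for illustration: `checkAndRecord` is a hypothetical helper, and the JSON string stands in for whatever file a workflow-scoped memory such as `cache-memory:` would hold between runs.

```typescript
// Hypothetical dedupe helper: memoryJson is a JSON array of head SHAs already handled.
function checkAndRecord(
  memoryJson: string,
  headSha: string,
): { alreadySeen: boolean; memoryJson: string } {
  const seen = new Set<string>(JSON.parse(memoryJson || "[]") as string[]);
  if (seen.has(headSha)) {
    // Nothing to record; the caller can skip commenting again.
    return { alreadySeen: true, memoryJson };
  }
  seen.add(headSha);
  return { alreadySeen: false, memoryJson: JSON.stringify(Array.from(seen)) };
}
```
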
| Skill | New trigger (`on:`) | `safe-outputs` | `tools` notes |
|---|---|---|---|
| `review` | `pull_request.types: [opened, synchronize]` | `add-comment`, `create-pull-request-review-comment` | `github: { toolsets: [default, pull_requests] }`, `bash: [diff, grep, jq]` |
| `cso` | `pull_request` + `labels: [security-audit]` | `add-comment`, `create-issue` (for follow-ups) | same as `review` + `web-fetch` for CVE lookups |
| `design-review` | `pull_request` + `labels: [design-review]` | `add-comment` | `playwright` |
| `qa` | `on.command: qa` | `add-comment`, `upload-artifact` (screenshots) | `playwright` |
| `qa-only` | `on.command: qa-only` | `add-comment` | `playwright` |
| `investigate` | `issues.types: [labeled]` + `labels: [investigate]` | `add-comment` | `github`, `web-fetch` |
| `office-hours` | `issues.types: [labeled]` + `labels: [office-hours]` | `add-comment` | `github` |
| `design-consultation` | same, with `[design-consultation]` | `add-comment` | `github` |
| `ship` | `on.command: ship` | `add-comment`, `create-pull-request` | `github`, `edit` |
| `autoplan` | `on.command: autoplan` | `add-comment`, `create-issue` (per stage) | chained (see §6.3) |
| `plan-{ceo,eng,design,devex}-review` | `on.command: plan-*-review` | `add-comment` | `github` |
| `retro` | `on.schedule: weekly on friday` | `create-discussion` | `github` |
| `benchmark` | `on.schedule: daily` | `create-issue` (regression only, `skip-if-match`) | `github`, `bash: [time, jq]` |
| `document-release` | `on.release.types: [published]` | `create-pull-request` (docs update) | `edit` |
| `canary` | `on.deployment_status` (state: success) | `add-comment`, `create-issue` (alert) | `playwright` |
| `careful` | `on.command: careful` | `add-comment` | `github` |
| `design-html` | `on.command: design-html` | `upload-artifact`, `add-comment` | `playwright`, `edit` |
| `design-shotgun` | `on.command: design-shotgun` | `upload-artifact`, `add-comment` | `playwright` |
| `devex-review` | `on.command: devex-review` | `add-comment` | `playwright` |
| `guard` | `on.command: guard` | `add-comment`, `add-labels` | `github` |
| `health` | `on.command: health` | `add-comment` | `github` |
| `land-and-deploy` | `on.command: land-and-deploy` | `add-comment`, `create-pull-request-review` | `playwright`, `github` |
| `learn` | `on.command: learn` | `create-pull-request` (`LESSONS.md`) | `edit`, `cache-memory` |

26 skills → 26 markdown files. Boilerplate can be reduced via imports: of a shared _engine.md and _security.md.
This is the biggest semantic loss. Today, asking the same issue a follow-up
question resumes the pi session and the agent recalls the prior exchange in
detail. In gh-aw, the agent gets ${{ steps.sanitized.outputs.text }} for the
new comment plus whatever the prompt body asks it to fetch (github.issues.list-comments).
Mitigations:
- Always pass the issue thread in the prompt body via the `github` tool.
- For skills where reasoning state matters (`/learn`, `/autoplan`), persist a structured artefact (`STATE.json`) via `safe-outputs.upload-artifact` or `create-pull-request`, and read it on the next run.
- Use `cache-memory:` with the issue number as the cache key for skills where a transient hash-keyed cache is good enough.

This is a real downgrade for chatty skills (/office-hours,
/design-consultation). Document it as a known regression in any migration.
We currently pass seven LLM provider keys and let pi pick. gh-aw enforces
one engine: per workflow. To preserve user choice:
- Decide on a default at install (publish two flavours: `gstack-openai` and `gstack-claude`).
- Or use `engine: custom:` with a shell script that picks based on which secret is present. This is supported, but defeats the purpose of `gh-aw`'s validation.
- Or accept the trade and pick one engine. Cleanest, and aligned with gh-aw's philosophy.

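
The "pick based on which secret is present" option can be sketched as a priority-ordered lookup. The text describes a shell script; this TypeScript version only illustrates the selection rule. The secret names, the engine identifiers they map to, and the priority order are all assumptions, and `pickEngine` is a hypothetical name.

```typescript
// Hypothetical secret names and engine identifiers, listed in priority order.
const ENGINE_BY_SECRET: ReadonlyArray<readonly [string, string]> = [
  ["ANTHROPIC_API_KEY", "claude"],
  ["OPENAI_API_KEY", "codex"],
  ["GEMINI_API_KEY", "gemini"],
];

function pickEngine(env: Record<string, string | undefined>): string | null {
  for (const [secretName, engine] of ENGINE_BY_SECRET) {
    const value = env[secretName];
    // Treat empty or whitespace-only secrets as absent.
    if (value !== undefined && value.trim() !== "") {
      return engine;
    }
  }
  return null; // no usable key: the caller should fail loudly
}
```

Note that a runtime choice like this is exactly what a compile-time `engine:` declaration cannot validate, which is the cost the second option above refers to.
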
Today, /autoplan runs 4 skills sequentially in one pi invocation. In gh-aw:
- Option A: one workflow with a long markdown body that runs all 4 prompts back-to-back in a single agent turn. Simple, but loses the per-stage review artefacts.
- Option B: chained workflows. `autoplan.md` emits `safe-outputs.create-issue` with label `plan-ceo`, which triggers `plan-ceo-review.md`, etc. This is the idiomatic gh-aw pattern (workflow dispatch via labels).

Option B preserves auditability and matches the platform's grain.
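
The hand-off in Option B reduces to "given the stage that just finished, which label should the next issue carry?". A minimal sketch, assuming the four stages run in the order ceo, eng, design, devex: only the `plan-ceo` label appears in the text, so the other label names, the order, and `nextStageLabel` itself are hypothetical.

```typescript
// Assumed stage order; only "plan-ceo" is named in the text above.
const STAGE_LABELS: readonly string[] = ["plan-ceo", "plan-eng", "plan-design", "plan-devex"];

function nextStageLabel(currentLabel: string): string | null {
  const index = STAGE_LABELS.indexOf(currentLabel);
  if (index === -1 || index === STAGE_LABELS.length - 1) {
    return null; // unknown label, or the chain is finished
  }
  return STAGE_LABELS[index + 1] ?? null;
}
```
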
Native safe-outputs.add-comment.max-length (or similar) covers this. Skills
that exceed it today should be re-engineered to emit a short comment +
upload-artifact for the long form. This is a good forced refactor.
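
The "short comment plus artifact" refactor can be sketched as a pure split function. This is an illustration under assumptions: `splitForComment` and `ARTIFACT_NOTICE` are hypothetical names, and the cap is passed in by the caller rather than read from any gh-aw setting.

```typescript
// Hypothetical helper: keep the comment under the cap, send the full text elsewhere.
const ARTIFACT_NOTICE = "\n\n_Full output attached as an artifact._";

function splitForComment(
  output: string,
  maxCommentLength: number,
): { comment: string; artifact: string | null } {
  if (output.length <= maxCommentLength) {
    return { comment: output, artifact: null };
  }
  // Reserve room for the notice, then clamp in case the cap is very small.
  const headLength = Math.max(0, maxCommentLength - ARTIFACT_NOTICE.length);
  const comment = (output.slice(0, headLength) + ARTIFACT_NOTICE).slice(0, maxCommentLength);
  return { comment, artifact: output };
}
```
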
Our AGENT_SIGNATURE HTML comment + router check → built-in to gh-aw.
Verify that gh-aw's filter triggers on the same bot identity our workflow runs as
(github-actions[bot]); if we change identity (e.g. PAT), we must reconfirm.
Worth preserving as-is. Only refresh.ts's output stage changes: it must
now emit gh-aw-flavoured markdown (with frontmatter), not raw skill bodies.
Concretely: maintain .github-gstack-intelligence/templates/<skill>.frontmatter.yml
files. refresh.ts reads the upstream garrytan/gstack/<skill>.md, applies the
existing sanitisations (drop AskUserQuestion, swap browse daemon → Playwright),
concatenates templates/<skill>.frontmatter.yml + body, and writes
.github/workflows/<skill>.md. Then runs gh aw compile to regenerate
.lock.yml files, and commits both.
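
The concatenation step described above can be sketched as a pure function, leaving file I/O and the `gh aw compile` call to the caller. This is not the existing `refresh.ts`: `renderWorkflow` is a hypothetical name, dropping the upstream skill's own frontmatter is an assumption, and the line filter is a crude stand-in for the real sanitisation pass.

```typescript
// Sketch of the template-mapper: curated frontmatter + sanitised upstream body.
function renderWorkflow(frontmatterYaml: string, upstreamSkill: string): string {
  // Assumption: the upstream skill's own frontmatter block is discarded.
  const withoutFrontmatter = upstreamSkill.replace(/^---\n[\s\S]*?\n---\n?/, "");
  // Crude stand-in for the existing sanitisations (here: drop AskUserQuestion lines).
  const body = withoutFrontmatter
    .split("\n")
    .filter((line) => line.indexOf("AskUserQuestion") === -1)
    .join("\n")
    .trim();
  return ["---", frontmatterYaml.trim(), "---", "", body, ""].join("\n");
}
```

Keeping this step pure means the per-skill templates can be unit-tested without touching the network or the workflow directory.
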
Becomes gh aw upgrade (provided by the upstream agentics-maintenance.yml).
Our 3-category file overlay (ALWAYS / NEVER / DEFAULT) is no longer needed
because (a) config.json, .pi/settings.json, AGENTS.md, state/ no longer
exist, and (b) workflow files are themselves the unit of versioning — users
edit the markdown body, the lock file is regenerated.
AGENTS.md (the agent identity) lives on as a top-level AGENTS.md referenced
from each workflow body via imports: [../AGENTS.md].
Untouched. Move run-gitpages into its own pages.yml classic workflow.
- `.github-gstack-intelligence/help/README.md` (slash command source-of-truth per stored memory): keep, but mark each command's "implementation" as `.github/workflows/<skill>.md` instead of `skills/<skill>.md`.
- `README.md`, `PACKAGES.md`, `ETHOS.md`: minor edits.
- Comment-block disclaimers in `.github/workflows/github-gstack-intelligence-agent.yml`: re-home into a top-level `INSTALL.md` (the YAML file is gone).

Three viable paths, in increasing order of fidelity:

Path A:
- Leave the existing workflow untouched.
- Pick one well-bounded skill (`/review` is ideal: single trigger, single output, no session state) and re-implement it as `.github/workflows/review-gh-aw.md`.
- In `config.json`, set `skills.review.enabled: false` so the old path stops handling PR reviews.
- Measure: latency, cost, quality, false-positive rate vs the old `/review`.
- Decide whether to continue.

Pros: zero blast radius, fast feedback. Cons: doubles operational surface during the pilot.

Path B:
- Pilot `/review` as above.
- Add gh-aw workflows for the remaining 25 skills in priority order (pull_request → command → schedule → label → release/deployment).
- As each one ships, flip `config.json.skills.<name>.enabled` to `false` in the old router so the old agent no-ops.
- When all 26 are migrated, delete:
  - `.github/workflows/github-gstack-intelligence-agent.yml`
  - `.github-gstack-intelligence/lifecycle/{agent,router,browser}*.ts`
  - `.github-gstack-intelligence/config.json` (or shrink to just the refresh metadata)
  - `.github-gstack-intelligence/.pi/`
  - `.github-gstack-intelligence/state/` (with a one-time export to artefacts for any chatty skills that need history)
- Repurpose `refresh.ts` to emit gh-aw workflow files (Section 6.6).
- Add the gh-aw `agentics-maintenance.yml` for `disable` / `enable` / `upgrade` ops.
- Add a small bootstrap workflow (Section 4.10) for installation parity.

Pros: every step is shippable; no big-bang. Cons: long; for several months users see two systems.

Path C:
- Branch the repo. Delete `agent.ts`, `router.ts`, `browser.ts`, `config.json`, `state/`, `.pi/`, the big YAML.
- Generate 26 workflow `.md` files from a template + the existing skill bodies.
- Generate hand-curated frontmatter per skill (the table in §5 is the spec).
- Update `refresh.ts` output target.
- Ship as v2.0.0, explicitly a breaking change.

Pros: smallest end-state, cleanest mental model, fastest end-to-end. Cons: existing installations need a clean re-install; the chatty-skill regression hits everyone at once; less opportunity to learn before committing.
Recommendation: Path B, with the explicit caveat that the multi-turn session
loss (§6.1) be designed-for from the start — i.e. agree which skills are
"stateless OK" and which need a cache-memory: or artefact-based memory
shim — so we never ship a skill that quietly regresses on session continuity.
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Multi-turn skills regress on continuity | High | Medium-High | Audit per-skill in advance; use cache-memory: / artefact persistence for skills that need it; document the change. |
| Provider lock-in (one engine per workflow) | Certain | Medium | Pick one engine; publish parallel flavours if needed. |
| Loss of "single drop-in YAML" install UX | Certain | Low-Medium | Provide a bootstrap workflow + clear gh extension install docs. |
| garrytan/gstack refresh requires per-skill frontmatter templates we now own | Certain | Low | One-time cost; templates are short. |
| gh-aw is pre-1.0 and changing | High | Medium | Pin a gh-aw version; track upstream; the lock files insulate us from CLI changes. |
| Lock files inflate PRs | Certain | Low | Reviewers learn to ignore; can be gated by codeowners. |
| Lose the JSONL-in-git audit trail | Certain | Medium | gh-aw provides Actions logs; if stronger audit is needed, use safe-outputs.upload-artifact to dump per-run JSON. |
| Existing users on v1.0.5 face a breaking upgrade | Certain | Medium | Bump to v2.0.0; ship a migration script that runs once via workflow_dispatch. |
| Reaction lifecycle (🚀/👍/👎) UX regression | Likely | Low | Document; users see equivalent feedback via Actions UI + bot reactions. |
| The 4-hour timeout disappearance (gh-aw default 20m) | Likely | Low-Medium | Set timeout-minutes: explicitly per skill; revisit /autoplan's chained model. |
| Security posture change (read-only by default) breaks a skill that secretly relied on contents: write | Medium | Medium | Manual audit per skill; safe-outputs typically covers the legitimate cases. |

- Massively smaller code surface. ~2 700 lines of TS + YAML → ~1 200 lines of markdown frontmatter + 1 small `refresh.ts`. Less to maintain, less to bug-fix, less to upgrade.
- Read-only-by-default security model. Eliminates a class of supply-chain risks (an LLM hallucinating a `rm -rf` no longer has the permissions to act on it from the main job).
- Input sanitization for free. Today, our prompts include raw issue bodies, a textbook prompt-injection vector. gh-aw's `${{ steps.sanitized.outputs.text }}` fixes this across all 26 skills in one move.
- Network egress allow-list. Currently unconstrained; one of the easier wins.
- Tool allow-lists. `bash: [cat, grep, jq]` instead of "whatever pi exposes".
- MCP server first-class support. If/when we want to wire in GitHub MCP, Sentry MCP, etc., we declare them and gh-aw handles auth + sandboxing.
- Compile-time validation. `gh aw compile --actionlint --zizmor --poutine` catches misconfigurations before they reach production.
- Reviewable codegen. Lock files in PRs show exactly what changed.
- `agentics-maintenance.yml` for free. Disable/enable/upgrade/replay capabilities we currently lack.
- Reduced cost-control gap. `on.stop-after:` budgets, `safe-outputs.<x>.max:` caps, and `cache-memory:` dedup all address acknowledged gaps in our v1.0.5 README.

- Multi-turn session resume via `pi --session`: the single biggest behavioural regression.
- Multi-provider key passthrough in one workflow.
- Single-file drop-in install ergonomic.
- Git-committed JSONL transcripts for auditing.
- 3-state reaction handshake UX.
- Tight control over the agent loop in TS. gh-aw is more opinionated; if we ever need behaviour outside its model, we revert to `engine: custom:` (an escape hatch, but not a pleasant one).
- The 4-hour timeout for genuinely long pipelines, which must be re-justified per workflow.

Proceed with Path B (skill-by-skill strangler), starting with /review.
Concretely, the first PR after agreeing this PIVOT should:
- Add `.github/workflows/_shared/engine.md` (shared frontmatter snippet).
- Add `.github/workflows/review.md` (gh-aw version of `/review`, importing the shared engine).
- Flip `config.json.skills.review.enabled` to `false`.
- Add a short note to `.github-gstack-intelligence/help/README.md` explaining that `/review` is now served by gh-aw.
- Add a CI step to run `gh aw compile` and `gh aw compile --actionlint --zizmor --poutine` on PRs.

Each subsequent PR migrates one skill. The final PR deletes the old workflow,
agent.ts, router.ts, browser.ts, config.json, .pi/, and the bulk of
state/. v2.0.0 is cut at that point.
The garrytan/gstack refresh pipeline is preserved and re-targeted; it remains
the project's distinguishing asset.
- Engine choice. If we must pick one, is the default OpenAI (today's default per `.pi/settings.json` conventions) or Copilot (gh-aw's first-supported engine)?
- Session continuity. Which skills are we comfortable making stateless? `/office-hours` and `/design-consultation` are the obvious risks.
- Installer UX. Are we willing to ship a "two-step install" (extension + `gh aw add`) in exchange for the security wins? Or do we want to invest in a one-click bootstrap workflow that does both?
- Versioning. Cut v2.0.0 at end of Path B, or earlier as a `2.0.0-beta` when the first skill ships in gh-aw mode?
- `refresh.ts` ownership. If we now own per-skill frontmatter templates, does that template directory live in this repo, or in a new `japer-technology/gh-aw-templates` companion repo?
- Memory log. Keep `MEMORY.md`-as-PR pattern, or fully delegate to `cache-memory:` per workflow?
- gh-aw version pin. Pre-1.0; do we vendor a `.gh-aw-version` file and upgrade quarterly, or always-latest?

Answering these unblocks Path B's first PR.