Stripe Startup Kit v0.1 — 5 skills + head-to-head eval (SOL-88)#21
Open
solidstatecc wants to merge 1 commit into
AUTHOR phase of the Skill Production pipeline. Five dogfood-critical-path skills that compose the Stripe MCP and enforce the four-part safety doctrine the MCP only recommends (test-mode-first, least-privilege rk_ keys, human-confirm on money, idempotent writes):

stripe-stand-up · stripe-product-to-price · stripe-tax-ready · stripe-deliver · stripe-revenue-read

Each bundle is SKILL.md + entry.py (a deterministic JSON-in/JSON-out guard, pure stdlib, zero Stripe network calls) + a copy of the shared rails.py. The guard plans a confirm-gated, idempotent MCP call; the agent executes it via the Stripe MCP. Compose, never wrap.

Evals (gating proof): head-to-head rails vs raw MCP scores 6/6 vs 1/6 (delta +5, idempotency deterministic). The rails meaningfully beat raw MCP, so this is not a wrapper. Per-skill suite 22/22 (normal/edge/refusal).

TEST MODE only. Listing/packaging and the real-money dogfood stay council-gated downstream. Next gates: AUDIT, then TEST.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
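
The "idempotent writes" rail above relies on a key that comes out the same every time the same write is planned. As a rough illustration only, a guard could derive such a key by hashing a canonical form of its input. The helper name `idempotency_key`, the operation name `create_price`, the parameter fields, and the key format are all assumptions made for this sketch; none of it is taken from the kit's rails.py.

```python
import hashlib
import json


def idempotency_key(skill: str, operation: str, params: dict) -> str:
    """Derive a stable key from the skill name, operation, and parameters.

    Hypothetical helper: the name, the hashed fields, and the key format are
    assumptions for illustration, not the kit's actual rails.py.
    """
    # Canonical JSON (sorted keys, fixed separators) so that logically equal
    # inputs always serialize to the same bytes.
    canonical = json.dumps(
        {"skill": skill, "operation": operation, "params": params},
        sort_keys=True,
        separators=(",", ":"),
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    # Prefix with the skill name for readability; truncate the hex digest.
    return f"{skill}-{digest[:32]}"


# The same logical request, with parameters in a different order.
key_a = idempotency_key(
    "stripe-product-to-price", "create_price",
    {"currency": "usd", "unit_amount": 1900},
)
key_b = idempotency_key(
    "stripe-product-to-price", "create_price",
    {"unit_amount": 1900, "currency": "usd"},
)
assert key_a == key_b  # parameter order does not change the key
```

Because the key depends only on the request content, replaying the same plan would reuse the same key, while any change to the parameters yields a different one.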
Stripe Startup Kit v0.1 — AUTHOR phase (SOL-88)
Five dogfood-critical-path skills for the founder's sell-a-thing loop. They compose the Stripe MCP and never wrap it, enforcing the four-part safety doctrine the MCP only recommends in prose.
The doctrine (the moat — enforced, not recommended)
- Test-mode-first
- Least-privilege rk_ keys, scoped per skill; sk_ flagged over-privileged
- Human-confirm on money
- Idempotent writes (Idempotency-Key)

Architecture
Each bundle is SKILL.md + entry.py + a copy of the shared rails.py. entry.py is a deterministic guard (JSON-in → guarded plan out, pure stdlib, zero Stripe network calls). It returns the exact, idempotent, confirm-gated MCP call to make; the agent executes it through the Stripe MCP only when the gate clears (GO/CONFIRM/BLOCK). That deterministic enforcement layer is what makes the doctrine testable.

The skills
stripe-stand-up → stripe-product-to-price → stripe-tax-ready → stripe-deliver → stripe-revenue-read

(Deferred to v0.2: subscription-designer, recover.)

Evals: the gating proof
- eval/head_to_head.py (doctrine rails vs raw MCP): rails 6/6, raw MCP 1/6, delta +5, idempotency deterministic → rails beat raw MCP, not a wrapper. Report in eval/RESULTS.md. The raw-MCP baseline is modeled faithfully from the SOL-86 RESEARCH gate (MCP runs live writes unblocked; doctrine is prose-only).
- eval/skills_eval.py (per-skill): 22/22 across normal / edge / out-of-scope refusal cases.

Boundaries
TEST MODE only. Listing/packaging and the real-money dogfood are downstream council-gated steps, not in this PR. Next pipeline gates: AUDIT (auditor), then TEST (tester).
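
To make the test-mode boundary and the GO/CONFIRM/BLOCK gate concrete, here is a minimal sketch of what a guard of this shape might look like. It is not the kit's entry.py: the function name `guard`, the request fields (`api_key`, `operation`, `moves_money`, `confirmed`), the operation name in the usage lines, and the exact rules are assumptions for illustration. The only external facts it leans on are Stripe's key prefixes (`rk_test_`, `sk_test_` for test mode).

```python
import json


def guard(request: dict) -> dict:
    """Plan a call and attach a gate; never touches the network.

    Hypothetical sketch: the request fields and the rules below are
    assumptions, not the kit's actual entry.py.
    """
    key = request.get("api_key", "")

    # Test-mode-first: anything that is not a test-mode key is blocked.
    if not key.startswith(("rk_test_", "sk_test_")):
        return {"gate": "BLOCK", "reason": "test-mode key required"}

    # Least privilege: a full secret key is allowed in test mode but flagged.
    warnings = []
    if key.startswith("sk_"):
        warnings.append("sk_ key is over-privileged; prefer a scoped rk_ key")

    # Human-confirm on money: money-moving calls wait for explicit confirmation.
    if request.get("moves_money") and not request.get("confirmed"):
        gate = "CONFIRM"
    else:
        gate = "GO"

    # The plan deliberately omits the key itself.
    return {
        "gate": gate,
        "operation": request.get("operation"),
        "warnings": warnings,
    }


# JSON in, JSON out: the agent would only execute the call when gate == "GO".
plan = guard({
    "api_key": "rk_test_example",
    "operation": "example_money_write",  # hypothetical operation name
    "moves_money": True,
})
print(json.dumps(plan))
```

Keeping the guard a pure function of its input is what would let an eval assert on the gate for each case without any Stripe traffic.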
Note for AUDIT
Skills are prefixed stripe- (the brief used bare names like deliver, stand-up). Rationale: a flat marketplace makes generic slugs collide and mis-trigger. Trivially reversible (folder name + frontmatter name) if the council prefers the bare names.

🤖 Generated with Claude Code