
feat(014): budget-guard skill — implement ADR-003 §G4#19

Open
wendeus0 wants to merge 2 commits into main from feat/014-budget-guard

Conversation

@wendeus0

@wendeus0 wendeus0 commented Apr 30, 2026

Owner

User description

Summary

Resolves P3 from PENDING_LOG by implementing ADR-003 §G4 (adopted by the council on 2026-04-27): creates the canonical budget-guard skill and restores the 5 invocations in the meta skills.

| Aspect | Decision (ADR-003 + council) |
| --- | --- |
| Type | Skill (not a subagent): fails the -er/-guard criteria |
| Recommended model | claude-haiku-4-5-20251001 |
| Frequency | Gate per feature, not per invocation |
| Circuit breaker | Auto-disable if >5% of the estimated budget |
| Triggers | high, multi-step, security, large-context (from triage) |
| State | Stateless cross-feature; resets with each new features/NNN/ |
| Override | budget_guard_model / budget_guard_threshold_pct in the local CLAUDE.md |
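
As a rough illustration of the override row, the sketch below reads the two override keys out of a local CLAUDE.md and falls back to the defaults from the table. It is only a sketch: the `key: value` line format, the `BudgetGuardConfig` class, and `load_config` are hypothetical names introduced here, not something the skill or the repository is known to provide.

```python
import re
from dataclasses import dataclass

# Defaults taken from the decision table (model ID and 5% threshold).
DEFAULT_MODEL = "claude-haiku-4-5-20251001"
DEFAULT_THRESHOLD_PCT = 5.0


@dataclass(frozen=True)
class BudgetGuardConfig:
    """Hypothetical container for the two overridable settings."""
    model: str = DEFAULT_MODEL
    threshold_pct: float = DEFAULT_THRESHOLD_PCT


# Assumption: each override sits on its own line as "key: value".
_OVERRIDE_RE = re.compile(
    r"^\s*(budget_guard_model|budget_guard_threshold_pct)\s*:\s*(\S+)\s*$",
    re.MULTILINE,
)


def load_config(claude_md_text: str) -> BudgetGuardConfig:
    """Return the defaults, replaced by any override keys found in the text."""
    model, threshold = DEFAULT_MODEL, DEFAULT_THRESHOLD_PCT
    for key, value in _OVERRIDE_RE.findall(claude_md_text):
        if key == "budget_guard_model":
            model = value
        else:
            threshold = float(value)
    return BudgetGuardConfig(model=model, threshold_pct=threshold)
```

A project that wants a looser gate would then only need a line such as `budget_guard_threshold_pct: 10` in its CLAUDE.md; everything else keeps the ADR defaults.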

Changes

| File | Type |
| --- | --- |
| core/skills/budget-guard/SKILL.md | new: canonical operational skill |
| core/skills/{fix-feature,implement-feature,refactor-feature,technical-triage,test-architecture-plan}/SKILL.md | single-line edit: restores `budget-guard` via `Task` |
| docs/adr/ADR-003-ghost-skills-treatment.md | G4 marked A (implemented; was PENDING-SPEC); distribution goes from 1A+2C+1B+1P to 2A+2C+1B |
| PENDING_LOG.md | P3 entry marked DONE |
| memory/next_steps.md | note 2 marked complete |
| features/014-budget-guard/{SPEC,PLAN,TASKS,CONTRACT,REPORT}.md | feature artifacts |

Validation

  • bash scripts/validate-spec.sh --feature 014-budget-guard → 0 findings
  • python3 features/009-skill-harmonization/scripts/audit.py (full) → 0 findings
  • audit.py --detector REFS → 0 (the token is no longer PHANTOM)
  • audit.py --detector SKILL_FORMAT → 0
  • audit.py --detector CONTRACT → 0 (BUDGET_* gates formally declared)
  • bash tests/smoke/all.sh → 9/9 green
  • bash tools/sync-skills.sh --check → drift 0 (18 skills propagated to 3 mirrors)
  • grep -l 'budget-guard\ via `Task' core/skills/{...}/SKILL.md` → 5/5
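
The last check depends on escaping backticks inside a shell pattern, which is easy to get wrong. A minimal sketch of the same check as a literal substring match is shown below; the `core/skills/<name>/SKILL.md` layout comes from the file table, while `missing_invocations` and `load_skill_texts` are hypothetical helpers written for illustration.

```python
from __future__ import annotations

from pathlib import Path

# The five meta skills listed in the changes table.
META_SKILLS = (
    "fix-feature",
    "implement-feature",
    "refactor-feature",
    "technical-triage",
    "test-architecture-plan",
)

# Literal text expected in each SKILL.md; no regex or shell escaping involved.
NEEDLE = "`budget-guard` via `Task`"


def missing_invocations(skill_texts: dict[str, str]) -> list[str]:
    """Return the meta skills whose text lacks the literal invocation."""
    return [name for name in META_SKILLS if NEEDLE not in skill_texts.get(name, "")]


def load_skill_texts(repo_root: Path) -> dict[str, str]:
    """Read each meta skill's SKILL.md; files that do not exist are skipped."""
    texts = {}
    for name in META_SKILLS:
        path = repo_root / "core" / "skills" / name / "SKILL.md"
        if path.is_file():
            texts[name] = path.read_text(encoding="utf-8")
    return texts
```

An empty list from `missing_invocations(load_skill_texts(Path(".")))` corresponds to the 5/5 result reported above.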

Execution decisions

  • No test-design: PLAN.md confirms 0 binary conditions (the skill is a textual contract definition). The unit layer is covered by the existing detectors.
  • code-review SKIP: doc-only diff (1 new SKILL + 5 single-line edits + 3 docs). No new testable logic.
  • security-review SKIP: zero attack surface (markdown only).

Test plan

  • Review bots (codeant-ai, coderabbitai, cubic-dev-ai) report no regressions
  • CI workflows passam

🤖 Generated with Claude Code


CodeAnt-AI Description

Add the canonical budget-guard skill and restore its use in meta skills

What Changed

  • Added a new budget-guard skill that tells meta skills when to keep, lower, or stop using a higher-cost model based on the current feature’s budget
  • Restored budget-guard calls in the five meta skills that use cost control so they invoke the skill again instead of only mentioning the ADR
  • Updated the ADR and pending work log to mark this gate as implemented and no longer pending
  • Kept the audit clean by registering the new skill and updating the audit rules for the new model and trigger names

Impact

✅ Clearer cost-gating for large or risky tasks
✅ Fewer missed budget checks in feature workflows
✅ No more phantom references for budget-guard


Resolves PENDING_LOG P3: Implementar feature dedicada budget-guard-spec.

Changes:
- core/skills/budget-guard/SKILL.md: new canonical operational skill
  (description 176 chars, 4 triggers, gate per feature, circuit breaker
  >5%, Haiku default, override via CLAUDE.md local, stateless cross-feature)
  - 3 domain gates: BUDGET_OK, BUDGET_WARN, BUDGET_BLOCKED declared
    formally in `# Estados de saída`
- 5 meta skills restored Task invocation:
  - fix-feature, implement-feature, refactor-feature, technical-triage,
    test-architecture-plan: neutral prose → `budget-guard` via `Task`
- ADR-003: G4 marked active (was PENDING-SPEC); distribution 1A → 2A
- PENDING_LOG: P3 entry marked DONE
- memory/next_steps.md: line 2 marked done

Validation:
- audit.py (full): 0 findings — REFS no longer reports PHANTOM
- validate-spec --feature 014-budget-guard: 0 findings
- smoke 9/9
- sync-skills drift 0 (18 skills propagated to 3 mirrors)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codeant-ai

codeant-ai Bot commented Apr 30, 2026


CodeAnt AI is reviewing your PR.

@coderabbitai

coderabbitai Bot commented Apr 30, 2026

📝 Walkthrough

Walkthrough

This PR implements the G4 "budget-guard" feature: adds a new canonical skill that emits budget gates and model recommendations, updates five meta-skills to invoke it via Task instead of ADR-003 prose, and marks the pending feature as DONE with ADR-003 §G4 activated and zero audit findings.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Budget-guard Skill: core/skills/budget-guard/SKILL.md | New skill spec emitting `BUDGET_OK` … |
| Orchestration Updates: core/skills/fix-feature/SKILL.md, core/skills/implement-feature/SKILL.md, core/skills/refactor-feature/SKILL.md, core/skills/technical-triage/SKILL.md, core/skills/test-architecture-plan/SKILL.md | Replace ADR-003 prose with explicit budget-guard invocation via Task when triage indicates high/multi-step/security/large context. |
| Feature Artifacts: features/014-budget-guard/CONTRACT.md, features/014-budget-guard/PLAN.md, features/014-budget-guard/REPORT.md, features/014-budget-guard/SPEC.md, features/014-budget-guard/TASKS.md | Add full feature contract, plan, spec, tasks, and report describing goals, validation steps, test strategy, and orchestration wiring. |
| Status & ADR Docs: PENDING_LOG.md, docs/adr/ADR-003-ghost-skills-treatment.md, memory/next_steps.md | Mark budget-guard-spec/feature 014 as DONE (2026-04-30), update ADR-003 §G4 from PENDING to active, record zero REFS findings. |
| Audit Helpers: features/009-skill-harmonization/scripts/audit.py | Add entries to EXTERNAL_TOOLS classification used by audits/detectors. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Developer
  participant Orchestrator
  participant BudgetGuard as "budget-guard (Skill)"
  participant ModelSelector
  participant Runtime

  Developer->>Orchestrator: submit task with triage_class, feature_scope, current_model, running_cost_pct?
  Orchestrator->>BudgetGuard: TaskInvoke(inputs)
  BudgetGuard-->>Orchestrator: BUDGET_OK / BUDGET_WARN / BUDGET_BLOCKED + model rec
  Orchestrator->>ModelSelector: choose worker/model based on gate & triage
  ModelSelector->>Runtime: execute using selected model/worker
  Runtime-->>Orchestrator: result / telemetry (or telemetry unavailable)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

Suggested labels

size:XL

Poem

🐰 A little guard hops on the scene,
Five skills now call it, crisp and keen,
Haiku whispers when budgets lean,
Block, warn, or pass — tidy and clean. ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the primary change: implementing the budget-guard skill as specified in ADR-003 §G4. It is concise, specific, and clearly conveys the main objective. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


@codeant-ai codeant-ai Bot added the size:L This PR changes 100-499 lines, ignoring generated files label Apr 30, 2026
@codeant-ai

codeant-ai Bot commented Apr 30, 2026


CodeAnt AI finished reviewing your PR.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 6

🧹 Nitpick comments (1)
core/skills/test-architecture-plan/SKILL.md (1)

66-68: ⚡ Quick win

Align trigger wording with the canonical triage classes.

Line 67 uses “projetos grandes”, while the feature contract standardizes triggers as high, multi-step, security, large-context. Using the same taxonomy here avoids ambiguity across meta-skills.

Suggested patch
-2. **Controle de custo:** projetos grandes — invoque a skill `budget-guard` via `Task` antes de iniciar o mapeamento completo.
+2. **Controle de custo:** se o `triage` indicar `high`, `multi-step`, `security` ou `large-context`, invoque a skill `budget-guard` via `Task` antes de iniciar o mapeamento completo.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/skills/test-architecture-plan/SKILL.md` around lines 66 - 68, The phrase
"projetos grandes" in the "Controle de custo" step is not using the canonical
trigger taxonomy; update that wording to use the standardized trigger token
`large-context` (and ensure the doc references the canonical trigger set:
`high`, `multi-step`, `security`, `large-context`) so the line reads something
like "Controle de custo: `large-context` — invoque a skill `budget-guard` via
`Task`..."; apply the same canonical trigger wording anywhere else in SKILL.md
that uses informal trigger phrases to avoid ambiguity across meta-skills.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/skills/budget-guard/SKILL.md`:
- Around line 55-57: The fenced output-format block in SKILL.md currently lacks
a language tag; update the triple-backtick fence that contains
"<BUDGET_OK|BUDGET_WARN|BUDGET_BLOCKED>: <recomendação curta com modelo
sugerido>" to include a language identifier (e.g., add "text" after the opening
```), so the block becomes ```text ... ``` to satisfy markdownlint MD040 (refer
to the fenced block showing the budget status tokens).
- Line 31: The doc has inconsistent threshold wording for the circuit breaker:
the phrase "Circuit breaker auto-disable em >5%" conflicts with the later lines
that say "5% and above"; pick one policy (either strictly greater than 5% or
greater-or-equal to 5%) and make the text consistent everywhere—update the
header phrase and any occurrences referring to the threshold and the behavior
that returns BUDGET_BLOCKED so they all use the same operator (e.g., change
">5%" to ">=5%" or vice versa) and ensure the description of required
override/flow control remains aligned with that chosen semantics.

In `@features/014-budget-guard/PLAN.md`:
- Around line 13-19: The fenced code block containing the diagram (starting with
implement-feature ──▶ Task(budget-guard) ──▶ skill avalia contexto ──▶ retorna
BUDGET_{OK,WARN,BLOCKED}) is missing a language identifier; add the language tag
`text` to the opening triple backticks so the block becomes ```text to satisfy
MD040 and keep linting green while preserving the diagram content.

In `@features/014-budget-guard/TASKS.md`:
- Around line 19-24: The fenced code block that contains the critical-path
diagram starting with "T1 (criar skill) ──▶ T2 (restaurar 5 invocações)" is
missing a language tag; add "text" after the opening triple backticks so the
block reads ```text and keep the block content unchanged to satisfy MD040
linting.

In `@memory/next_steps.md`:
- Line 8: Update the document header date after the new completion entry: locate
the "Atualizado: 2026-04-28" header in next_steps.md and change it to
"Atualizado: 2026-04-30" to match the entry "Concluído em 2026-04-30 via feature
`014-budget-guard`" so the header timestamp is consistent with the recorded
completion.

In `@PENDING_LOG.md`:
- Line 34: Update the DONE entry in PENDING_LOG.md to correct the output-state
count: change "5 estados de saída declarados" to "3 estados de saída declarados
(BUDGET_OK, BUDGET_WARN, BUDGET_BLOCKED)" to match the actual states declared in
core/skills/budget-guard/SKILL.md; ensure the rest of the line (feature name,
triggers, gate, circuit breaker, model, overrides, Task invocation, ADR-003,
auditor note) remains unchanged.

---

Nitpick comments:
In `@core/skills/test-architecture-plan/SKILL.md`:
- Around line 66-68: The phrase "projetos grandes" in the "Controle de custo"
step is not using the canonical trigger taxonomy; update that wording to use the
standardized trigger token `large-context` (and ensure the doc references the
canonical trigger set: `high`, `multi-step`, `security`, `large-context`) so the
line reads something like "Controle de custo: `large-context` — invoque a skill
`budget-guard` via `Task`..."; apply the same canonical trigger wording anywhere
else in SKILL.md that uses informal trigger phrases to avoid ambiguity across
meta-skills.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 06b4dd1a-7d7f-4b8b-a6e5-4414cd0668cd

📥 Commits

Reviewing files that changed from the base of the PR and between d92ea9a and 6d6af42.

📒 Files selected for processing (14)
  • PENDING_LOG.md
  • core/skills/budget-guard/SKILL.md
  • core/skills/fix-feature/SKILL.md
  • core/skills/implement-feature/SKILL.md
  • core/skills/refactor-feature/SKILL.md
  • core/skills/technical-triage/SKILL.md
  • core/skills/test-architecture-plan/SKILL.md
  • docs/adr/ADR-003-ghost-skills-treatment.md
  • features/014-budget-guard/CONTRACT.md
  • features/014-budget-guard/PLAN.md
  • features/014-budget-guard/REPORT.md
  • features/014-budget-guard/SPEC.md
  • features/014-budget-guard/TASKS.md
  • memory/next_steps.md

Comment thread core/skills/budget-guard/SKILL.md Outdated
Comment thread core/skills/budget-guard/SKILL.md Outdated
Comment thread features/014-budget-guard/PLAN.md Outdated
Comment thread features/014-budget-guard/TASKS.md Outdated
Comment thread memory/next_steps.md
Comment thread PENDING_LOG.md Outdated

@cubic-dev-ai cubic-dev-ai Bot left a comment

Contributor


2 issues found across 14 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="core/skills/budget-guard/SKILL.md">

<violation number="1" location="core/skills/budget-guard/SKILL.md:3">
P2: Use a single breaker threshold definition (`>= 5%` or `> 5%`) across the whole skill; current mixed wording creates ambiguous behavior at exactly 5%.</violation>
</file>

<file name="features/014-budget-guard/TASKS.md">

<violation number="1" location="features/014-budget-guard/TASKS.md:29">
P2: The T2 validation command uses a pattern containing `\`` that does not match the actual occurrences, which breaks the documented exit criterion.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread core/skills/budget-guard/SKILL.md Outdated
Comment thread features/014-budget-guard/TASKS.md Outdated
- Uniform ≥5% threshold in SKILL.md (description, rule, process)
  and in PENDING_LOG. Previously ambiguous: the description said >5%
  but the process and table used ≥5%.
- Added the `text` language tag to the 3 fenced blocks that lacked
  one, per MD040 (SKILL.md output format, PLAN.md diagram, TASKS.md
  critical path).
- TASKS.md T2: grep command simplified into a loop with -qF and
  verified to work (empty output confirms 5/5 restored invocations).
- memory/next_steps.md header synced to 2026-04-30.
- PENDING_LOG: count corrected to 3 output states (BUDGET_OK/
  WARN/BLOCKED), not 5 as before.
- audit.py EXTERNAL_TOOLS: added Anthropic model IDs
  (claude-haiku-4-5, sonnet-4-5/4-6, opus-4-6/4-7) and textual classes
  (multi-step, large-context) to avoid PHANTOM REFS on literal references.

Validation:
- audit.py (full): 0 findings
- smoke 9/9
- sync-skills drift 0

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
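
To make the behavior at exactly 5% concrete, here is a minimal sketch of a gate that uses the inclusive (≥) comparison this commit settles on. The three state names come from the skill; the `evaluate_gate` function and especially the `warn_fraction` parameter are assumptions made for illustration, since the conversation does not say when `BUDGET_WARN` is emitted.

```python
from enum import Enum


class Gate(str, Enum):
    """The three output states declared by the budget-guard skill."""
    OK = "BUDGET_OK"
    WARN = "BUDGET_WARN"
    BLOCKED = "BUDGET_BLOCKED"


def evaluate_gate(
    running_cost_pct: float,
    threshold_pct: float = 5.0,
    warn_fraction: float = 0.8,  # assumption: warn at 80% of the threshold
) -> Gate:
    """Classify accumulated cost (as % of the feature's estimated budget)."""
    # Inclusive comparison: reaching the threshold already trips the breaker.
    if running_cost_pct >= threshold_pct:
        return Gate.BLOCKED
    if running_cost_pct >= threshold_pct * warn_fraction:
        return Gate.WARN
    return Gate.OK
```

With these defaults, `evaluate_gate(5.0)` yields `Gate.BLOCKED`, whereas a strict `>` comparison would have let the same value through, which is the ambiguity both reviewers flagged.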

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/skills/budget-guard/SKILL.md`:
- Around line 31-36: The documentation currently leaves `BUDGET_BLOCKED`
ambiguous between an informational flag and a mandatory stop; update SKILL.md to
state that `BUDGET_BLOCKED` is an enforceable circuit-breaker which requires the
runtime invoker to abort execution unless an explicit override is present, and
describe the override mechanism (`budget_guard_model` and
`budget_guard_threshold_pct` in CLAUDE.md) and how to record an explicit
override (e.g., runtime must set an explicit "override" token or reduce scope)
before continuing; keep the recommended default model `claude-haiku-4-5`, the 5%
threshold language, and note that the skill still remains stateless but the
invoker must enforce the abort behavior when `BUDGET_BLOCKED` is returned.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 78dd685b-97cd-4c81-835f-b58988dc486f

📥 Commits

Reviewing files that changed from the base of the PR and between 6d6af42 and 104e156.

📒 Files selected for processing (6)
  • PENDING_LOG.md
  • core/skills/budget-guard/SKILL.md
  • features/009-skill-harmonization/scripts/audit.py
  • features/014-budget-guard/PLAN.md
  • features/014-budget-guard/TASKS.md
  • memory/next_steps.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • PENDING_LOG.md

Comment on lines +31 to +36
- **Circuit breaker auto-disable em ≥5%** — se o custo acumulado atingir ou superar 5% do orçamento estimado da feature, retorne `BUDGET_BLOCKED`; runtime invocador exige override explícito ou redução de escopo antes de prosseguir.
- **Modelo recomendado default: Haiku (`claude-haiku-4-5`)** — função classificatória/heurística simples; não exige premium.
- **Override por projeto** registrável em `CLAUDE.md` local com `budget_guard_model: <model-id>` ou `budget_guard_threshold_pct: <N>` — projetos com perfis distintos podem afrouxar/apertar o gate.
- **Stateless cross-feature** — não persiste em arquivo; estado vive na sessão da feature ativa.
- **Veredito é informativo, não bloqueante automático** — `BUDGET_BLOCKED` exige que o runtime invocador respeite, mas a skill em si não interrompe execução.


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

BUDGET_BLOCKED has no explicit operational obligation for the invoker.

Line 31 describes auto-disable, but Line 35 allows an "informational only" reading. This weakens the gate and can leave blocking optional in the consuming skills.

🔒 Suggested patch to close the ambiguity
-- **Veredito é informativo, não bloqueante automático** — `BUDGET_BLOCKED` exige que o runtime invocador respeite, mas a skill em si não interrompe execução.
+- **Veredito é gate obrigatório para o invocador** — ao receber `BUDGET_BLOCKED`, o runtime deve interromper escalonamento premium e só prosseguir com override explícito (ou redução de escopo) registrado.
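
One way to read the suggested wording is that the skill stays stateless and the invoker carries the enforcement. The sketch below shows that split under stated assumptions: `enforce_gate`, `BudgetBlockedError`, and the `override_reason` argument are hypothetical names, and the real runtime may record overrides differently.

```python
from typing import Optional


class BudgetBlockedError(RuntimeError):
    """Raised by the invoker when BUDGET_BLOCKED arrives without an override."""


def enforce_gate(verdict: str, override_reason: Optional[str] = None) -> str:
    """Decide how the invoker proceeds after receiving a budget verdict.

    Returns "proceed" for BUDGET_OK / BUDGET_WARN, "proceed-with-override"
    when a blocked verdict comes with a recorded, non-empty override reason,
    and raises otherwise so that the block cannot be silently ignored.
    """
    if verdict != "BUDGET_BLOCKED":
        return "proceed"
    if override_reason and override_reason.strip():
        return "proceed-with-override"
    raise BudgetBlockedError(
        "budget-guard returned BUDGET_BLOCKED; record an explicit override "
        "or reduce scope before escalating to a premium model"
    )
```

Raising instead of returning a flag is a deliberate choice in this sketch: a consuming skill that forgets to check the result still stops.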
🧰 Tools
🪛 LanguageTool

[misspelling] ~31-~31: Esta é uma palavra só.
Context: ...pct` cruzar limiar. - Circuit breaker auto-disable em ≥5% — se o custo acumulado atingir...

(AUTO)


[misspelling] ~33-~33: Possível erro ortográfico.
Context: ...** registrável em CLAUDE.md local com budget_guard_model: <model-id> ou budget_guard_threshold_pct: <N> ...

(PT_MULTITOKEN_SPELLING_HYPHEN)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/skills/budget-guard/SKILL.md` around lines 31 - 36, The documentation
currently leaves `BUDGET_BLOCKED` ambiguous between an informational flag and a
mandatory stop; update SKILL.md to state that `BUDGET_BLOCKED` is an enforceable
circuit-breaker which requires the runtime invoker to abort execution unless an
explicit override is present, and describe the override mechanism
(`budget_guard_model` and `budget_guard_threshold_pct` in CLAUDE.md) and how to
record an explicit override (e.g., runtime must set an explicit "override" token
or reduce scope) before continuing; keep the recommended default model
`claude-haiku-4-5`, the 5% threshold language, and note that the skill still
remains stateless but the invoker must enforce the abort behavior when
`BUDGET_BLOCKED` is returned.

@codeant-ai

codeant-ai Bot commented May 12, 2026


CodeAnt AI is running the review.

@codeant-ai codeant-ai Bot added size:L This PR changes 100-499 lines, ignoring generated files and removed size:L This PR changes 100-499 lines, ignoring generated files labels May 12, 2026
@codeant-ai

codeant-ai Bot commented May 12, 2026


Sequence Diagram

This PR introduces the budget-guard skill and reconnects five meta skills so they route high-impact work through a per-feature cost gate before choosing an LLM model.

sequenceDiagram
    participant User
    participant Meta as Meta skills
    participant Triage as Triage skill
    participant Task as Task runtime
    participant BudgetGuard as Budget guard skill

    User->>Meta: Request complex feature work
    Meta->>Triage: Run triage to classify impact and triggers
    Triage-->>Meta: Return triage class and triggers
    Meta->>Task: Invoke budget guard with scope, model, running cost
    Task->>BudgetGuard: Evaluate cost gate for this feature
    BudgetGuard-->>Task: Return budget state and model recommendation
    Task-->>Meta: Deliver BUDGET_OK or WARN or BLOCKED decision
    Meta-->>User: Continue with chosen model or require override based on gate

Generated by CodeAnt AI

@codeant-ai

codeant-ai Bot commented May 12, 2026


CodeAnt AI finished running the review.

@codeant-ai

codeant-ai Bot commented May 12, 2026


CodeAnt AI is running the review.

@codeant-ai codeant-ai Bot added size:L This PR changes 100-499 lines, ignoring generated files and removed size:L This PR changes 100-499 lines, ignoring generated files labels May 12, 2026
@codeant-ai

codeant-ai Bot commented May 12, 2026


Sequence Diagram

This PR introduces the budget-guard skill and wires five meta skills to invoke it via Task when triage flags high cost risk, so it returns a budget verdict that guides model choice and can block premium usage once a cost threshold is reached.

sequenceDiagram
    participant MetaSkill
    participant Triage
    participant Runtime
    participant BudgetGuard

    MetaSkill->>Triage: Classify task risk and scope
    Triage-->>MetaSkill: triage_class and context
    MetaSkill->>Runtime: Task call to budget-guard with cost data
    Runtime->>BudgetGuard: Invoke skill with triage_class and running_cost_pct
    BudgetGuard->>BudgetGuard: Compute budget verdict
    BudgetGuard-->>Runtime: BUDGET_OK or WARN or BLOCKED with model advice
    Runtime-->>MetaSkill: Return budget verdict

    alt Budget ok or warn
        MetaSkill->>Runtime: Proceed with recommended model
    else Budget blocked
        MetaSkill->>Runtime: Require override or reduce scope before premium
    end

Generated by CodeAnt AI

Comment on lines +49 to +50
"claude-haiku-4-5", "claude-sonnet-4-5", "claude-sonnet-4-6",
"claude-opus-4-6", "claude-opus-4-7",


Suggestion: Model identifiers were added to EXTERNAL_TOOLS, but they are not executable tools. Because classify() treats EXTERNAL as valid in via Task/invoque checks, a mistaken invocation like claude-opus-4-7 via Task will be silently accepted instead of flagged as PHANTOM. Keep model IDs out of EXTERNAL_TOOLS and handle them in a separate non-invocation whitelist if you only want to suppress prose/backtick false positives. [api mismatch]

Severity Level: Major ⚠️
- ❌ REFS detector skips phantom model Task invocations in skills.
- ⚠️ Skill harmonization audit misses miswired model invocation references.
Steps of Reproduction ✅
1. Open `core/skills/budget-guard/SKILL.md` (lines 10–18 show the `# Quando usar` section
with triggers like `high`, `multi-step`, etc.) and add a new bullet such as `- invoke
\`claude-opus-4-7\` via \`Task\`` to simulate a mis-specified Task target in a real skill
document.

2. From the repository root, run `python3
features/009-skill-harmonization/scripts/audit.py --detector REFS`; this executes `run()`
in `features/009-skill-harmonization/scripts/audit.py:70-107`, which loads all skills,
constructs a `Catalog`, and calls `detect_refs()` for each skill (lines 368-392).

3. In `detect_refs()` (lines 368-399 of
`features/009-skill-harmonization/scripts/audit.py`), the `_VIA_TASK_RE` regex defined at
lines 339-340 matches the new `claude-opus-4-7 via Task` text, and the detector invokes
`kind, mapped = catalog.classify(token)` at line 380.

4. Because `EXTERNAL_TOOLS` includes `"claude-opus-4-7"` (lines 34-52, specifically
49-50), `Catalog.classify()` at lines 291-301 returns `("EXTERNAL", "claude-opus-4-7")`,
so the `if kind in ("SKILL", "SUBAGENT", "NATIVE", "EXTERNAL", "ALIAS", "MCP", "PLUGIN"):`
check in `detect_refs()` (line 399-400) treats this as a valid invocation and skips
emitting a `PHANTOM ... via Task` finding, even though `claude-opus-4-7` is a model
identifier, not a Task-invocable skill or tool, confirming that misclassified model IDs
bypass phantom-reference detection.


Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** features/009-skill-harmonization/scripts/audit.py
**Line:** 49:50
**Comment:**
	*Api Mismatch: Model identifiers were added to `EXTERNAL_TOOLS`, but they are not executable tools. Because `classify()` treats `EXTERNAL` as valid in `via Task`/`invoque` checks, a mistaken invocation like `claude-opus-4-7` via `Task` will be silently accepted instead of flagged as PHANTOM. Keep model IDs out of `EXTERNAL_TOOLS` and handle them in a separate non-invocation whitelist if you only want to suppress prose/backtick false positives.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

"spring-cloud-contract-stub-runner",
"claude-haiku-4-5", "claude-sonnet-4-5", "claude-sonnet-4-6",
"claude-opus-4-6", "claude-opus-4-7",
"multi-step", "large-context",


Suggestion: Trigger labels were added as external tools, which broadens the validator in the wrong dimension: multi-step/large-context will now be treated as legitimate callable tokens in chain and invocation detectors. This creates false negatives where malformed workflow steps are no longer reported. Keep these as contextual trigger terms (not tools), or whitelist them only in prose-specific checks. [logic error]

Severity Level: Major ⚠️
- ❌ CHAINS detector treats trigger labels as callable workflow steps.
- ⚠️ Workflow chain validation loses coverage for phantom tokens.
Steps of Reproduction ✅
1. In `core/skills/budget-guard/SKILL.md` (lines 37-59 define `# Processo`), add a
chain-style line such as `multi-step → budget-guard` or `[multi-step] → [budget-guard]` to
describe a workflow step using the `multi-step` trigger label in chain notation.

2. Run `python3 features/009-skill-harmonization/scripts/audit.py --detector CHAINS` from
the repository root; this calls `run()` in
`features/009-skill-harmonization/scripts/audit.py:70-107`, which in turn invokes
`detect_chains()` for each skill via the `PER_SKILL_DETECTORS` registry at lines 55-63.

3. Inside `detect_chains()` (originally around lines 415-45 of
`features/009-skill-harmonization/scripts/audit.py`), `_CHAIN_LINE_RE` defined at lines
~411-412 matches the new arrow line, the loop extracts `multi-step` as a token (lines
~23-31 in that function), and `kind, mapped = catalog.classify(token)` at line ~436 calls
`Catalog.classify()` with `"multi-step"`.

4. Because `EXTERNAL_TOOLS` includes `"multi-step"` (lines 34-52, specifically line 51),
`Catalog.classify()` (lines 291-301) returns `("EXTERNAL", "multi-step")`, so
`detect_chains()`'s `if kind in ("SKILL", "SUBAGENT", "ALIAS", "NATIVE", "EXTERNAL",
"MCP", "PLUGIN"):` check at line ~437 treats `multi-step` as a legitimate callable token
and skips emitting a `PHANTOM step` finding; this confirms that adding trigger labels like
`multi-step`/`large-context` to `EXTERNAL_TOOLS` suppresses CHAINS phantom-token
validation for malformed workflow steps.


Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** features/009-skill-harmonization/scripts/audit.py
**Line:** 51:51
**Comment:**
	*Logic Error: Trigger labels were added as external tools, which broadens the validator in the wrong dimension: `multi-step`/`large-context` will now be treated as legitimate callable tokens in chain and invocation detectors. This creates false negatives where malformed workflow steps are no longer reported. Keep these as contextual trigger terms (not tools), or whitelist them only in prose-specific checks.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

@codeant-ai

codeant-ai Bot commented May 12, 2026


CodeAnt AI finished running the review.
