fix(agents,hands): per-agent/per-hand mcp_servers / skills allowlists (#87) by houko · Pull Request #92 · librefang/librefang-registry

houko · 2026-05-12T00:01:24Z

Fixes #87. Alternative to / supersedes #89 — see "Why not #89" below.

Summary

All 32 agent manifests and 17 hands ship with empty mcp_servers / skills lists, which the kernel interprets as "no filter" — every globally-configured MCP server's tools and every installed skill get injected into the prompt on every LLM call. On a typical instance (9 MCP servers ≈ 85 MCP tools + ~82 built-in tools) that's ~50k input tokens per turn spent on definitions the agent never uses.

What changed (51 files, +462/-8)

32 `agents/*/agent.toml`

mcp_servers — 1–4 per agent (e.g. coder → [memory, git, github, filesystem], devops-lead → [memory, git, github, filesystem], meeting-assistant → [memory, google-calendar, gmail], hello-world → [memory, fetch]).
skills — per-role allowlist driven by the system prompt (e.g. devops-lead → [docker, kubernetes, terraform, ansible, ci-cd, helm, prometheus, sysadmin]).
skills_disabled = true on the four pure-conversational agents (hello-world, recipe-assistant, health-tracker, home-automation). Their prompts never instruct the LLM to consult any skill, so loading all 60 was waste.
max_history_messages — tiered, all values at or above the rising kernel default (DEFAULT_MAX_HISTORY_MESSAGES = 60):
- inherit (no override) — the four conversational agents above; they get exactly the kernel default.
- 60 — single-turn task agents (writer, translator, doc-writer, email/customer/sales/recruiter, social-media, finance, tutor, travel, meeting, ops, devops-lead, planner). Explicit override at the default value to lock the cap if the kernel default moves again.
- 80 — multi-step / tool-heavy (coder, debugger, architect, code-reviewer, test-engineer, security-auditor, analyst, data-scientist, academic-researcher, researcher, legal).
- 120 — coordinators (assistant, orchestrator).

Pinning below the kernel default would thrash the prompt cache, the failure mode #91 fixed for the creator hand by raising the cap.

17 `hands/*/HAND.toml`

Every hand declares hand-level mcp_servers + skills (inherited by every [agents.*] inside).
clip + creator: skills_disabled = true placed on each [agents.*] sub-table, not at the hand top level. HandDefinitionRaw in librefang-hands doesn't carry a top-level skills_disabled field — a hand-level entry would be silently dropped by serde — so the setting must live on the AgentManifest of each sub-agent.
devteam extends its existing mcp_servers = ["github"] and replaces the placeholder skills = [] with the expected dev-team expertise.
wiki replaces placeholder mcp_servers = [] with [memory, fetch, filesystem].
lead: hand-level skills was originally [email-writer, writing-coach, interview-prep]; interview-prep is for job-interview prep, not lead generation. Replaced with data-analyst (used by the qualification-scoring step).

`schema.toml` + `agents/README.md`

Register mcp_servers / skills / max_history_messages on the agent field schema (so RegistrySchema in librefang-types sees them).
max_history_messages description points at librefang_runtime::agent_loop::DEFAULT_MAX_HISTORY_MESSAGES by name (60 today) so the schema doesn't go stale when the constant moves again.
README example block and "Adding a New Agent" checklist call out the allowlists; max_history_messages example is shown commented out with a prompt-cache caveat.

Open items / explicit trade-offs

assistant.skills = [] (no allowlist) — kept deliberately. assistant is the default generalist entry point; capping its skill surface at a small allowlist would defeat its "delegate to any specialist" job. The cost is that this single agent still pays the full skill-definition load on every turn. Operators who want a strict allowlist for assistant can override it after install. Open to a reviewer call here.

Estimated impact (per #87)

Before	After	Reduction
167 tool definitions / ~50k tokens per call	5–15 per agent / ~10–15k tokens per call	4–6×

Why not adopt #89

#89 covers similar ground but with three issues this PR avoids:

mcp_servers = ["_none"] sentinel — feat(agents,hands): per-agent skills, mcp_servers, profile, budget, and history limits #89's PR body explicitly notes it's pending upstream mcp_disabled (librefang#4808). This PR uses real allowlists.
max_history_messages = 8 / 12 / 15 / 20 — far below today's kernel default of 60 (raised from 40 on 2026-05-11) and fix(creator): raise max_history_messages + repair refresh-cache CI #91's direction for long-workflow hands (80–120). Every turn that hits the cap invalidates the cached prompt prefix; the cost of cache misses exceeds the saving from shorter history. This PR uses 60–120.
Doubling max_llm_tokens_per_hour widens the per-agent budget — the opposite direction from Agent manifests missing mcp_servers / skills / max_history_messages — 50k+ tokens wasted per call #87's "reduce per-call cost" goal. Left to the operator.

Verification

python scripts/validate.py → PASSED (32 agents, 17 hands, 33 mcp, 60 skills, 11 plugins, 267 models, 57 providers).
find . -name '*.toml' | xargs taplo fmt --check → clean.
Cross-reference: every mcp_servers entry resolves to an mcp/*.toml, every skills entry resolves to a skills/*/ directory (only pre-existing exception: agents.main.skills = ["wiki-librarian"] in hands/wiki/HAND.toml, untouched by this PR, references an out-of-tree skill).
Confirmed kernel field shapes against crates/librefang-types/src/agent.rs::AgentManifest (mcp_servers, skills, max_history_messages, skills_disabled — all top-level) and crates/librefang-hands/src/lib.rs::HandDefinitionRaw (mcp_servers, skills present; skills_disabled NOT present → must go on sub-agents).

Notes for reviewers

Allowlist values were derived from each agent's tools = […], shell = […], and the verbs/services its system prompt mentions. If your instance routinely needs an MCP server I didn't include for a given role, please call it out inline so we widen that one list rather than collapsing back to [].
Follow-up worth filing (not in this PR): scripts/validate.py does not cross-check mcp_servers / skills references against mcp/ and skills/ inventories. The interview-prep typo in lead and the wiki-librarian dangling reference would have been caught by a 20-line check.

chatgpt-codex-connector · 2026-05-12T00:01:31Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

…#87) All 32 agent manifests and 17 hands shipped with empty mcp_servers / skills lists, which the kernel interprets as "no filter" — every globally-configured MCP server's tools and every installed skill get injected into the prompt on every LLM call. On a typical instance (9 MCP servers, ~85 MCP tools + ~82 built-in tools) that's ~50k input tokens per turn spent on definitions the agent never uses. Changes ------- 32 agents/*/agent.toml: - mcp_servers: 1-4 per agent. memory wherever state persists across turns; fetch / exa-search / brave-search only where the prompt actually calls for web; git / github / filesystem on engineering agents; gmail / google-calendar / linear / jira on productivity agents whose prompts mention them. - skills: per-role allowlist driven by what the system_prompt names (e.g. coder → rust/python/typescript/git/shell-scripting; devops- lead → docker/kubernetes/terraform/ansible/ci-cd/helm/prometheus/ sysadmin). Generalists (assistant) keep skills = [] (see "Open items" below). - skills_disabled = true on the four short-conversational agents (hello-world, recipe-assistant, health-tracker, home-automation). Their system prompts never instruct the LLM to consult any skill, so loading all 60 was pure waste. They also drop the explicit max_history_messages override and inherit the kernel default (60). - max_history_messages tiered by workload shape: 60 short conversational (hello-world, recipe, health-tracker, home-automation) — inherits the rising kernel default (`DEFAULT_MAX_HISTORY_MESSAGES = 60`); no override needed. 60 single-turn task agents (writer, translator, doc-writer, email-assistant, customer-support, sales-assistant, recruit- er, social-media, personal-finance, tutor, travel-planner, meeting-assistant, ops, devops-lead, planner) — explicit override at the same value to lock the cap if the kernel default moves again. 80 multi-step / tool-heavy (coder, debugger, architect, code- reviewer, test-engineer, security-auditor, analyst, data- scientist, academic-researcher, researcher, legal-assistant) 120 coordinators (assistant, orchestrator) — long multi-agent sessions where prompt-cache continuity is critical All values sit at or above the kernel default. Pinning lower would thrash the prompt cache (the failure mode #91 fixed for the creator hand by *raising* the cap, not lowering it). 17 hands/*/HAND.toml: - hand-level mcp_servers / skills now declared on every hand, so every [agents.*] inside inherits a sensible allowlist. - skills_disabled = true placed on each [agents.*] inside clip and creator (pure media pipelines that don't benefit from any skill). HandDefinitionRaw in librefang-hands does NOT have a top-level skills_disabled field — declaring it at the hand top level would be silently dropped by serde, so the setting must live on the AgentManifest of each sub-agent role. - devteam: expand existing mcp_servers = ["github"] to include memory / git / filesystem; populate skills with the expected dev-team expertise (replacing the placeholder skills = []). - wiki: replace placeholder mcp_servers = [] with [memory, fetch, filesystem]. Hand-level skills stays []. - lead: hand-level skills was originally [email-writer, writing- coach, interview-prep]; interview-prep is for job-interview preparation, not lead generation. Replaced with data-analyst (used by the qualification-scoring step in the prompt). schema.toml: register mcp_servers / skills / max_history_messages on the agent field schema so machine consumers (RegistrySchema in librefang-types) see the new top-level fields. The max_history_messages description now points at librefang_runtime::agent_loop::DEFAULT_MAX_HISTORY_MESSAGES (60 today) by name, so the schema doesn't go stale when the constant moves again. agents/README.md: example block + "Adding a New Agent" checklist mention the allowlists; max_history_messages example is shown commented out with a prompt-cache caveat. Open items ---------- `assistant` (the default user-facing agent) keeps `skills = []` deliberately. It is the generalist entry point — capping its skill surface at a small allowlist would defeat its "delegate to any specialist" job. The trade-off is that this single agent still pays the full skill-definition load on every turn; operators who want a strict allowlist for `assistant` can override it after install. Why not adopt PR #89's approach ------------------------------- #89 covers similar ground but with three issues this PR avoids: 1. mcp_servers = ["_none"] sentinel. #89's body explicitly notes it's pending upstream librefang#4808 (mcp_disabled). Shipping a magic-string today means coming back later to clean it up. This PR uses real allowlists. 2. max_history_messages = 8 / 12 / 15 / 20. Far below today's kernel default (60) and #91's direction for long-workflow hands (80–120). Every turn that hits the cap invalidates the cached prompt prefix; the cost of cache misses exceeds the saving from shorter history. This PR uses 60–120. 3. Doubling max_llm_tokens_per_hour (coder 200k→500k, assistant 300k→500k) widens the per-agent budget — the opposite direction from #87's "reduce per-call cost" goal. Left to the operator's instance-specific tuning. Refs #87, #89

houko force-pushed the fix/agent-mcp-skills-allowlists branch 2 times, most recently from f9a9866 to e5645b8 Compare May 12, 2026 00:13

houko changed the title ~~fix(agents): add mcp_servers / skills / max_history_messages allowlists (#87)~~ fix(agents,hands): per-agent/per-hand mcp_servers / skills allowlists (#87) May 12, 2026

houko force-pushed the fix/agent-mcp-skills-allowlists branch from e5645b8 to f13c9b9 Compare May 12, 2026 00:25

houko merged commit 102b506 into main May 12, 2026
3 checks passed

houko deleted the fix/agent-mcp-skills-allowlists branch May 12, 2026 00:30

houko mentioned this pull request May 12, 2026

feat(agents,hands): per-agent skills, mcp_servers, profile, budget, and history limits #89

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(agents,hands): per-agent/per-hand mcp_servers / skills allowlists (#87)#92

fix(agents,hands): per-agent/per-hand mcp_servers / skills allowlists (#87)#92
houko merged 1 commit into
mainfrom
fix/agent-mcp-skills-allowlists

houko commented May 12, 2026 •

edited

Loading

Uh oh!

chatgpt-codex-connector Bot commented May 12, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

houko commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What changed (51 files, +462/-8)

32 agents/*/agent.toml

17 hands/*/HAND.toml

schema.toml + agents/README.md

Open items / explicit trade-offs

Estimated impact (per #87)

Why not adopt #89

Verification

Notes for reviewers

Uh oh!

chatgpt-codex-connector Bot commented May 12, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

houko commented May 12, 2026 •

edited

Loading

32 `agents/*/agent.toml`

17 `hands/*/HAND.toml`

`schema.toml` + `agents/README.md`