feat: Add strict tool mode diagnostics and report contracts#1

Closed
JasonZHANGTianrui wants to merge 466 commits into main from zhangtianrui1/strict-tool-mode

@JasonZHANGTianrui

Implements strict tool mode plumbing across embedded runs, transport streams, and OpenAI-compatible gateway responses.

Changes include:

- Adds strict/warn/off tool strictness handling for tool argument repairs, alias repairs, and tool-name normalization.
- Surfaces structured toolStrictnessReport / tool_strictness_report diagnostics.
- Wires config/runtime strictness mode into embedded agent runs and provider transports.
- Adds mock/no-API test coverage plus testing docs for real API validation.
- Updates config baseline hash for the new config surface.

Verification:

- pnpm config:docs:check
- pnpm check
- pnpm build
- Targeted strict-tool-mode tests
- src/channels/plugins/read-only.test.ts rerun with low concurrency

Note: full default-parallel pnpm test hit local no-output timeouts across unrelated shards; the failed shard passed when rerun with OPENCLAW_VITEST_MAX_WORKERS=1.
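
To make the strict/warn/off distinction concrete, here is a minimal sketch of how a strictness mode could gate tool-call repairs and produce a structured report. Every name and field below (`ToolStrictnessMode`, `ToolRepair`, `evaluateRepairs`, the report shape) is a hypothetical illustration, and the semantics (strict rejects, warn reports, off stays silent) are an assumption rather than the PR's confirmed behavior.

```typescript
// Hypothetical sketch: these types and semantics are assumptions, not the PR's actual API.
type ToolStrictnessMode = "strict" | "warn" | "off";

type RepairKind = "argument_repair" | "alias_repair" | "tool_name_normalization";

interface ToolRepair {
  kind: RepairKind;
  tool: string;
  detail: string;
}

interface ToolStrictnessReport {
  mode: ToolStrictnessMode;
  repairs: ToolRepair[];
  rejected: boolean;
}

interface StrictnessOutcome {
  allowed: boolean;
  report?: ToolStrictnessReport;
}

// Decide whether a repaired tool call may proceed under the given mode.
// Assumed semantics: "off" applies repairs silently, "warn" applies them and
// reports, "strict" reports and rejects the call.
function evaluateRepairs(mode: ToolStrictnessMode, repairs: ToolRepair[]): StrictnessOutcome {
  if (mode === "off" || repairs.length === 0) {
    return { allowed: true };
  }
  const rejected = mode === "strict";
  return { allowed: !rejected, report: { mode, repairs, rejected } };
}
```

Under these assumptions, a caller would attach the returned report to the run result in `warn` mode and surface it as an error payload in `strict` mode.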

steipete and others added 30 commits May 6, 2026 02:46
docs/install/clawdock.md: renamed '## Related pages' to '## Related'
for consistency with sibling install docs and converted the 3-bullet
list into a CardGroup linking docker, docker-vm-runtime, and updating.

docs/install/nix.md: replaced 2 typography characters with ASCII
equivalents and converted the 3-bullet Related list into a CardGroup,
adding an Updating card so readers wiring nix-openclaw next to a
managed install see the upgrade path.

docs/concepts/features.md: converted the 2-bullet Related list into a
CardGroup, adding cross-links to channels and plugins so the page now
points readers at both deeper concepts (experimental features, agent
runtime) and direct surfaces (channels, plugins).

docs/tools/pdf.md: replaced 2 typography characters with ASCII
equivalents.

docs/install/macos-vm.md: removed the duplicate '# OpenClaw on macOS
VMs (Sandboxing)' H1 (Mintlify renders title from frontmatter; the
in-body H1 plus parens produced a brittle anchor).

docs/install/development-channels.md: removed the duplicate
'# Development channels' H1.

docs/install/index.md: replaced 3 typography characters (curly quotes
and en-dash) with ASCII equivalents.

docs/concepts/delegate-architecture.md: replaced 10 typography
characters (curly quotes, apostrophes, em/en dashes) with ASCII
equivalents.

Fix iOS LAN/setup-code pairing policy for openclaw#47887.

- Allow explicit private LAN and .local plaintext ws:// setup/manual connects where policy allows it.
- Keep public hosts, .ts.net, and Tailscale CGNAT plaintext fail-closed.
- Prefer explicit passwords over stale bootstrap tokens in Swift and TypeScript gateway clients.
- Update setup-code/device-pair coverage, docs, and changelog with source credit for openclaw#65185.
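
The host policy in the first two bullets can be sketched as a small classifier. This is an illustrative approximation only: the function names are hypothetical, it handles IPv4 literals and hostnames but not IPv6, and it assumes "private LAN" means the RFC 1918 ranges and "Tailscale CGNAT" means 100.64.0.0/10.

```typescript
// Hypothetical sketch of a fail-closed plaintext ws:// host policy.
// Not the gateway client's actual implementation.

// Parse a dotted-quad IPv4 literal into four octets, or return null.
function parseIPv4(host: string): number[] | null {
  const parts = host.split(".");
  if (parts.length !== 4) return null;
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  return octets.every((o) => Number.isInteger(o) && o >= 0 && o <= 255) ? octets : null;
}

// Return true only for hosts where plaintext ws:// is assumed acceptable.
function isPlaintextWsAllowed(host: string): boolean {
  const h = host.trim().toLowerCase().replace(/\.$/, "");
  if (h.endsWith(".ts.net")) return false; // Tailscale names stay fail-closed
  if (h.endsWith(".local")) return true; // mDNS names on the local link
  const ip = parseIPv4(h);
  if (ip === null) return false; // any other hostname is treated as public
  const a = ip[0] ?? -1;
  const b = ip[1] ?? -1;
  if (a === 100 && b >= 64 && b <= 127) return false; // CGNAT 100.64.0.0/10
  if (a === 10) return true; // 10.0.0.0/8
  if (a === 172 && b >= 16 && b <= 31) return true; // 172.16.0.0/12
  if (a === 192 && b === 168) return true; // 192.168.0.0/16
  return false; // public IPv4 addresses
}
```

The default branch returns `false`, so anything the sketch does not recognize is rejected rather than allowed.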

Verification:
- pnpm install
- git diff --check origin/main..HEAD
- pnpm exec oxfmt --check --threads=1 src/gateway/client.ts src/gateway/client.test.ts src/pairing/setup-code.ts src/pairing/setup-code.test.ts extensions/device-pair/index.ts extensions/device-pair/index.test.ts
- pnpm format:docs:check
- pnpm test src/gateway/client.test.ts src/pairing/setup-code.test.ts extensions/device-pair/index.test.ts
- cd apps/shared/OpenClawKit && swift test --filter 'DeepLinksSecurityTests|GatewayNodeSessionTests'
- pnpm lint:swift passes with the existing TalkModeRuntime.swift type-body-length warning

Blocked locally:
- iOS app-target xcodebuild tests require unavailable watchOS 26.4 runtime here.
- Testbox check:changed previously failed because the image lacks swiftlint; local swiftlint passes.

Replaced 152 typography characters (curly quotes, apostrophes, em/en
dashes, non-breaking hyphens) with ASCII equivalents so grep,
copy-paste, and Mintlify search hit clean tokens. Per docs/CLAUDE.md
heading and content hygiene rules.
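
A replacement pass of this kind can be sketched as follows. The mapping table and the `toAscii` helper are hypothetical, and mapping both dash types to a single hyphen is an assumption; the commits do not say which ASCII forms were chosen.

```typescript
// Hypothetical sketch of a typography-to-ASCII pass; the exact mapping used
// in these commits is not stated, so this table is an assumption.
const ASCII_EQUIVALENTS: ReadonlyArray<[RegExp, string]> = [
  [/[\u2018\u2019]/g, "'"], // curly single quotes and apostrophes
  [/[\u201C\u201D]/g, '"'], // curly double quotes
  [/[\u2013\u2014]/g, "-"], // en and em dashes
  [/\u2011/g, "-"], // non-breaking hyphen
];

// Replace typography characters and count how many were changed.
function toAscii(input: string): { text: string; replaced: number } {
  let replaced = 0;
  let text = input;
  for (const [pattern, ascii] of ASCII_EQUIVALENTS) {
    text = text.replace(pattern, () => {
      replaced += 1;
      return ascii;
    });
  }
  return { text, replaced };
}
```

Returning the count alongside the text makes it straightforward to produce per-file totals like the ones listed in these commit messages.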

- docs/gateway/security/index.md: 59 chars
- docs/plugins/hooks.md: 34 chars
- docs/reference/session-management-compaction.md: 30 chars
- docs/tools/clawhub.md: 29 chars

Replaced 138 typography characters (curly quotes, apostrophes, em/en
dashes, non-breaking hyphens) with ASCII equivalents per
docs/CLAUDE.md heading and content hygiene rules so grep, copy-paste,
and Mintlify search hit clean tokens.

- docs/reference/AGENTS.default.md: 29 chars, plus removed the
  duplicate '# AGENTS.md - OpenClaw Personal Assistant (default)' H1
  (Mintlify renders title from frontmatter; the in-body H1 with
  parens and a bare hyphen produced a brittle anchor).
- docs/help/testing-live.md: 29 chars
- docs/tools/image-generation.md: 28 chars
- docs/channels/index.md: 27 chars
- docs/tools/video-generation.md: 25 chars

Replaced 112 typography characters (curly quotes, apostrophes, em/en
dashes, non-breaking hyphens) with ASCII equivalents per
docs/CLAUDE.md heading and content hygiene rules.

- docs/help/gpt55-codex-agentic-parity.md: 22 chars; removed the
  duplicate '# GPT-5.5 / Codex Agentic Parity in OpenClaw' H1 (Mintlify
  renders the title from frontmatter; the in-body H1 with the slash
  produced a brittle anchor).
- docs/platforms/mac/menu-bar.md: 21 chars; removed the duplicate
  '# Menu Bar Status Logic' H1.
- docs/tools/acp-agents.md: 23 chars
- docs/concepts/qa-matrix.md: 23 chars
- docs/concepts/qa-e2e-automation.md: 23 chars

…aw#73839)

Merged via squash.

Prepared head SHA: d554201
Co-authored-by: bradhallett <53977268+bradhallett@users.noreply.github.com>
Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com>
Reviewed-by: @jalehman

Replaced 98 typography characters (curly quotes, apostrophes, em/en
dashes, non-breaking hyphens) with ASCII equivalents per
docs/CLAUDE.md heading and content hygiene rules.

- docs/plugins/sdk-migration.md: 20 chars
- docs/help/testing.md: 20 chars
- docs/automation/tasks.md: 20 chars
- docs/plugins/sdk-channel-plugins.md: 19 chars
- docs/channels/yuanbao.md: 19 chars; removed the duplicate '# Yuanbao'
  H1 (Mintlify renders title from frontmatter).

Replaced 92 typography characters (curly quotes, apostrophes, em/en
dashes, non-breaking hyphens) with ASCII equivalents per
docs/CLAUDE.md heading and content hygiene rules.

- docs/channels/feishu.md: 19 chars; removed the duplicate
  '# Feishu / Lark' H1 (Mintlify renders title from frontmatter; the
  in-body H1 with a slash produced a brittle anchor).
- docs/gateway/bonjour.md: 18 chars; removed the duplicate
  '# Bonjour / mDNS discovery' H1.
- docs/channels/matrix.md: 19 chars
- docs/tools/browser.md: 18 chars
- docs/automation/standing-orders.md: 18 chars

Preserve visible assistant text from mixed text/tool-use transcript turns in chat.history while keeping commentary-only assistant turns hidden.

Fixes openclaw#77374.
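
The intended projection can be illustrated with a small sketch. The transcript shape here (`TranscriptBlock`, the `commentary` flag, `projectVisibleText`) is entirely hypothetical and stands in for whatever src/gateway/chat-display-projection.ts actually uses; it shows only the rule that mixed turns keep their visible text while commentary-only turns yield nothing.

```typescript
// Hypothetical transcript shape; the real chat.history types are not shown in this PR page.
type TranscriptBlock =
  | { kind: "text"; text: string; commentary?: boolean }
  | { kind: "tool_use"; name: string };

interface AssistantTurn {
  role: "assistant";
  blocks: TranscriptBlock[];
}

// Return the text a chat client should display for a turn, or null when the
// turn has no visible text (tool-use only or commentary only).
function projectVisibleText(turn: AssistantTurn): string | null {
  const parts: string[] = [];
  for (const block of turn.blocks) {
    if (block.kind === "text" && !block.commentary && block.text.trim() !== "") {
      parts.push(block.text.trim());
    }
  }
  return parts.length > 0 ? parts.join("\n") : null;
}
```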

Verification:
- pnpm test src/gateway/server-methods/server-methods.test.ts src/gateway/server.chat.gateway-server-chat-b.test.ts
- pnpm exec oxfmt --check --threads=1 src/gateway/chat-display-projection.ts src/gateway/server-methods/server-methods.test.ts src/gateway/server.chat.gateway-server-chat-b.test.ts
- git diff --check
- pnpm changed:lanes --json
- PR CI passed on 048266c

Fixes openclaw#76957.

Restores the Control UI /new hook lifecycle through an explicit sessions.create emitCommandHooks opt-in, preserving hook-free defaults for programmatic parent-session creates.

Validation:
- pnpm protocol:check
- pnpm test src/gateway/server.sessions.reset-hooks.test.ts ui/src/ui/app-render.helpers.node.test.ts
- pnpm exec oxlint on touched TS files
- pnpm exec oxfmt --check --threads=1 on touched files
- git diff --check
- OPENCLAW_LOCAL_CHECK=1 OPENCLAW_LOCAL_CHECK_MODE=throttled env NODE_OPTIONS=--max-old-space-size=4096 pnpm check:changed
- GitHub PR checks green on 3a446ec
- ClawSweeper re-review completed with no blocking findings and security cleared

Duplicate triage:
- openclaw#77376, openclaw#77004, and openclaw#76967 were superseded closed attempts for openclaw#76957
- openclaw#77562 is a closed duplicate issue
- openclaw#77880 mentions openclaw#76957 but is not a duplicate of this hook fix

Replaced 80 typography characters (curly quotes, apostrophes, em/en
dashes, non-breaking hyphens) with ASCII equivalents per
docs/CLAUDE.md heading and content hygiene rules.

- docs/plugins/sdk-entrypoints.md: 17 chars
- docs/help/index.md: 17 chars
- docs/concepts/agent-workspace.md: 16 chars
- docs/tools/lobster.md: 15 chars
- docs/tools/exec-approvals.md: 15 chars
steipete and others added 27 commits May 6, 2026 12:08
Summary:
- The PR adds a `before_agent_run` plugin hook with pass/block decisions, redacted blocked-turn persistence, diagnostics/docs/changelog updates, and focused runner, gateway, session, and plugin tests.
- Reproducibility: not applicable, as this is a feature PR rather than a current-main bug report. Current main lacks ` ... un`, while the PR head adds source coverage and copied live Gateway/WebChat log proof for the new behavior.
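
As a rough illustration of a pass/block hook with redacted persistence, consider the sketch below. The decision type, the hook signature, and the redaction placeholder are all hypothetical; only the hook name `before_agent_run` and the pass/block idea come from the summary above.

```typescript
// Hypothetical sketch: the real plugin hook contract is not shown in this PR page.
type BeforeAgentRunDecision =
  | { decision: "pass" }
  | { decision: "block"; reason: string };

type BeforeAgentRunHook = (ctx: { prompt: string }) => BeforeAgentRunDecision;

const REDACTED_PLACEHOLDER = "[blocked by before_agent_run]"; // assumed wording

// Run hooks in order; the first block decision stops the run and replaces
// the persisted turn text with a redacted placeholder.
function runBeforeAgentHooks(
  hooks: BeforeAgentRunHook[],
  prompt: string,
): { run: boolean; persistedText: string; reason?: string } {
  for (const hook of hooks) {
    const result = hook({ prompt });
    if (result.decision === "block") {
      return { run: false, persistedText: REDACTED_PLACEHOLDER, reason: result.reason };
    }
  }
  return { run: true, persistedText: prompt };
}
```

In this sketch the original prompt never reaches the persisted record once any hook blocks, which is one way to read "redacted blocked-turn persistence".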

Automerge notes:
- PR branch already contained follow-up commit before automerge: fix: trim before agent hook PR scope
- PR branch already contained follow-up commit before automerge: fix: keep before-agent blocks redacted
- PR branch already contained follow-up commit before automerge: fix: keep runtime context out of model prompt
- PR branch already contained follow-up commit before automerge: docs: refresh config baseline after rebase
- PR branch already contained follow-up commit before automerge: fix: align blocked turn clients with redacted content
- PR branch already contained follow-up commit before automerge: fix: remove out-of-scope client block UI changes

Validation:
- ClawSweeper review passed for head 767e46f.
- Required merge gates passed before the squash merge.

Prepared head SHA: 767e46f
Review: openclaw#75035 (comment)

Co-authored-by: Jesse Merhi <jessejmerhi@gmail.com>
Co-authored-by: jesse-merhi <79823012+jesse-merhi@users.noreply.github.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>

Summary:
- The PR changes the shared conversation-label generator to send label instructions as `systemPrompt`, omit `temperature` for Codex simple completions, log error stop reasons, and add focused tests plus a changelog entry.
- Reproducibility: yes. Source reproduction is high-confidence: current main sends the prompt only inside user ... ple transport reads instructions from `context.systemPrompt` and only includes `temperature` when supplied.
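
The request-shaping change described above might look like the following sketch. The `CompletionRequest` shape, the instruction text, the `isCodex` flag, and the default temperature value are hypothetical; the two points taken from the summary are that instructions go in `systemPrompt` and that `temperature` is omitted for Codex simple completions.

```typescript
// Hypothetical request shape; not the project's actual transport types.
interface CompletionRequest {
  systemPrompt: string;
  messages: { role: "user"; content: string }[];
  temperature?: number;
}

const LABEL_INSTRUCTIONS = "Reply with a short topic label for this conversation."; // assumed text

// Build a label request: instructions travel as systemPrompt, and the
// temperature key is left out entirely for Codex simple completions.
function buildLabelRequest(conversationText: string, opts: { isCodex: boolean }): CompletionRequest {
  const request: CompletionRequest = {
    systemPrompt: LABEL_INSTRUCTIONS,
    messages: [{ role: "user", content: conversationText }],
  };
  if (!opts.isCodex) {
    request.temperature = 0.2; // assumed default for non-Codex providers
  }
  return request;
}
```

Leaving the key out, rather than setting it to `undefined`, matters when a transport only forwards `temperature` if the property is present.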

Automerge notes:
- PR branch already contained follow-up commit before automerge: docs: note Codex topic label fix

Validation:
- ClawSweeper review passed for head 9380907.
- Required merge gates passed before the squash merge.

Prepared head SHA: 9380907
Review: openclaw#78450 (comment)

Co-authored-by: Clever <clever@users.noreply.github.com>

Keep startup-derived plugin enablement, gateway auth tokens, control UI origins, and owner-display secrets runtime-only instead of persisting them into openclaw.json.

Refuse config writers, mutating update/plugin lifecycle commands, and doctor repair/token generation in Nix mode with agent-first nix-openclaw guidance.
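
The two behaviors in this commit (runtime-only fields and write refusal in Nix mode) can be sketched together. The key names, the `nixMode` flag, and the error wording are hypothetical placeholders; how the project actually detects Nix mode or names these fields is not shown here.

```typescript
// Hypothetical sketch: key names and the nixMode flag are assumptions.
type Config = Record<string, unknown>;

// Drop keys that should exist only in memory and never reach openclaw.json.
function toPersistedConfig(config: Config, runtimeOnlyKeys: readonly string[]): Config {
  const persisted: Config = {};
  for (const key of Object.keys(config)) {
    if (runtimeOnlyKeys.indexOf(key) === -1) {
      persisted[key] = config[key];
    }
  }
  return persisted;
}

// Serialize a config for writing, refusing outright when running in Nix mode.
function serializeConfigForWrite(
  config: Config,
  opts: { nixMode: boolean; runtimeOnlyKeys: readonly string[] },
): string {
  if (opts.nixMode) {
    throw new Error("Config is managed by Nix; change it through nix-openclaw instead.");
  }
  return JSON.stringify(toPersistedConfig(config, opts.runtimeOnlyKeys), null, 2);
}
```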

Verification:
- pnpm check
- pnpm build
- pnpm test -- src/config/io.write-config.test.ts src/config/mutate.test.ts src/config/io.owner-display-secret.test.ts src/gateway/server-startup-config.recovery.test.ts src/gateway/startup-auth.test.ts src/gateway/startup-control-ui-origins.test.ts src/cli/plugins-cli.install.test.ts src/cli/plugins-cli.policy.test.ts src/cli/plugins-cli.uninstall.test.ts src/cli/plugins-cli.update.test.ts src/cli/update-cli.test.ts src/auto-reply/reply/commands-plugins.install.test.ts src/auto-reply/reply/commands-plugins.test.ts src/commands/onboarding-plugin-install.test.ts src/commands/doctor.runs-legacy-state-migrations-yes-mode-without.e2e.test.ts src/commands/doctor/shared/codex-route-warnings.test.ts src/commands/doctor/repair-sequencing.test.ts src/agents/auth-profile-runtime-contract.test.ts src/auto-reply/reply/agent-runner-execution.test.ts
- GitHub CI green on 05a2c71

Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: Alex Knight <15041791+amknight@users.noreply.github.com>

@JasonZHANGTianrui force-pushed the zhangtianrui1/strict-tool-mode branch from badf014 to 372ef49 on May 6, 2026 15:30
@wzhgba closed this on May 8, 2026