fix: harden session-scoped permissions and remove root temp scope by ideas24h · Pull Request #25 · cosmicstack-labs/mercury-agent

ideas24h · 2026-04-28T10:03:14Z

Summary

isolate permission state by session/channel instead of sharing a global allow-all state
keep Allow All restricted by shell blocklist and filesystem scoping, including scheduled/system runs
remove the vestigial root temp scope path and align CLI, Telegram, manual, and docs with the hardened model
add regression tests plus sandbox smoke docs/scripts for the permission model

Test Plan

npm test -- src/core/agent-permissions.test.ts src/core/permission-mode.test.ts src/capabilities/permissions.test.ts src/channels/telegram.test.ts

- isolate permission state by session/channel
- keep Allow All restricted by shell blocklist and filesystem scoping
- remove vestigial addRootTempScope contract/runtime path
- align CLI/Telegram/manual/docs with the hardened model
- add regression tests for permission mode and scheduled/system behavior

Copilot

Pull request overview

This PR hardens Mercury’s permission model by scoping “Allow All” and related permission state to the active session/channel, removing the previous “root temp scope” behavior, and aligning CLI/Telegram UX + docs/tests with the new model.

Changes:

Introduce session/channel-scoped permission state in PermissionManager and plumb channel context into approval prompts.
Remove the implicit addTempScope('/') behavior for “Allow All” and for system/scheduled runs, while keeping shell blocklist + filesystem scoping enforced.
Add regression tests plus sandbox smoke docs/scripts to validate the permission wiring and CLI streaming behavior.
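
To make the model above concrete, here is a minimal sketch of session/channel-scoped permission state in which "Allow All" only skips the prompt and never widens what is permitted. Everything in it is an assumption for illustration: `SessionScopedPermissions`, `checkPath`, `checkShell`, and the `Decision` type are hypothetical names, not the API of `PermissionManager`, and denying out-of-scope paths outright is a simplification (the real policy may prompt instead).

```typescript
// Hypothetical sketch; names and policy details are illustrative only.
type PermissionMode = 'ask' | 'allow-all';
type Decision = 'allow' | 'ask' | 'deny';

class SessionScopedPermissions {
  // One mode per session/channel id, so granting Allow All in one
  // channel does not affect any other channel.
  private readonly modes = new Map<string, PermissionMode>();

  constructor(
    private readonly allowedRoot: string,
    private readonly shellBlocklist: readonly string[],
  ) {}

  setMode(channelId: string, mode: PermissionMode): void {
    this.modes.set(channelId, mode);
  }

  // Hard restrictions are evaluated first; the mode only decides
  // whether a permitted action still needs a prompt.
  private decide(channelId: string, permitted: boolean): Decision {
    if (!permitted) return 'deny';
    return this.modes.get(channelId) === 'allow-all' ? 'allow' : 'ask';
  }

  // Assumes `path` is already absolute and normalized (no `..` segments).
  checkPath(channelId: string, path: string): Decision {
    const inScope =
      path === this.allowedRoot || path.startsWith(`${this.allowedRoot}/`);
    return this.decide(channelId, inScope);
  }

  // Simplified substring match against the blocklist.
  checkShell(channelId: string, command: string): Decision {
    const blocked = this.shellBlocklist.some((entry) => command.includes(entry));
    return this.decide(channelId, !blocked);
  }
}
```

With a workspace root of `/work` and Allow All set only for `telegram:1`, a path inside `/work` resolves to `allow` for that channel and to `ask` for `cli:default`, while a path outside `/work` or a blocklisted command resolves to `deny` for both.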

Reviewed changes

Copilot reviewed 19 out of 20 changed files in this pull request and generated 5 comments.

File	Description
src/utils/manual.ts	Updates user-facing manual text to reflect session/channel-scoped “Allow All” and restricted scoping.
src/index.ts	Wires permission prompting with context + applies session permission mode for CLI/Telegram sessions.
src/core/permission-mode.ts	Adds helper to apply permission mode scoped to a specific session/channel.
src/core/permission-mode.test.ts	Tests that applying “Allow All” does not grant filesystem root scope and stays channel-scoped.
src/core/agent.ts	Ensures per-message permission context is set via `setCurrentChannel` and system/scheduled auto-approval doesn’t imply root scope.
src/core/agent-permissions.test.ts	Adds tests for system/scheduled permission policy contract.
src/channels/telegram.ts	Updates Telegram permission mode copy to match the hardened model.
src/channels/telegram.test.ts	Verifies permission prompts are sent to the targeted Telegram chat.
src/channels/cli.ts	Adjusts CLI copy for permission modes and removes duplicate final streamed output behavior.
src/channels/cli.test.ts	Adds regression test ensuring streamed content isn’t printed twice in TTY mode.
src/capabilities/permissions.ts	Implements session/channel isolation (auto-approve, pending approvals, temp scopes) and passes context into ask handler.
src/capabilities/permissions.test.ts	Adds tests for session isolation + enforcement of blocklist/cwdOnly even under allow-all.
scripts/run_mercury_sandbox_smoke.sh	Adds a smoke runner script for sandbox verification.
scripts/mercury_sandbox_smoke.py	Adds a PTY/pexpect-based smoke test driver capturing transcripts and validating a minimal response.
docs/permissions-model.md	Documents the live session/channel permission model and supporting evidence/tests.
docs/mercury-sandbox-smoke.md	Documents how to run the sandbox smoke test.
README.md	Updates permission model description + adds a “Permission modes” section and links to detailed docs.
DECISIONS.md	Updates scheduler ADR to reflect system messages preserving source channel context.
CHANGELOG.md	Aligns changelog entries with restricted allow-all + scheduled task behavior.
.gitignore	Ignores additional generated dirs/files (e.g., tmp/ used by smoke transcripts).

+  private askHandler?: (prompt: string, context: PermissionAskContext) => Promise<string>;
+  private readonly sessionStates: Map<string, SessionPermissionsState> = new Map();
+  private currentChannelId = 'cli:default';


+REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
+SANDBOX_HOME="${MERCURY_SANDBOX_HOME:-/home/raul/dev/mercury-test/sandbox/mercury-home}"
+WORKSPACE="${MERCURY_SANDBOX_WORKSPACE:-/home/raul/dev/mercury-test/sandbox/workspace}"
+TRANSCRIPTS_DIR="${MERCURY_SMOKE_TRANSCRIPTS_DIR:-$REPO_ROOT/tmp/mercury-smoke}"


+def parse_args() -> argparse.Namespace:
+    parser = argparse.ArgumentParser(description="Smoke test reproducible para Mercury sandbox (glm-5.1).")
+    parser.add_argument("--workspace", required=True, type=Path)
+    parser.add_argument("--mercury-home", required=True, type=Path)


+# Mercury sandbox smoke test
+
+Smoke test reproducible para validar Mercury dentro del sandbox con `glm-5.1`, sin tocar la lógica del agente.
+


+- sandbox existente en:
+  - `MERCURY_SANDBOX_HOME=/home/raul/dev/mercury-test/sandbox/mercury-home`
+  - `MERCURY_SANDBOX_WORKSPACE=/home/raul/dev/mercury-test/sandbox/workspace`


ideas24h · 2026-04-28T13:11:02Z

Addressed the Copilot review items.

sessionStates growth

Added a bounded session-state cache in PermissionManager with a simple LRU-style eviction strategy (MAX_SESSION_STATES = 100).
Preserved per-channel isolation.
Added tests for bounded growth and eviction behavior.
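
For readers unfamiliar with the approach, a bounded cache with LRU-style eviction can be sketched as below. This is not the code from the PR: `BoundedSessionCache`, `getOrCreate`, and `has` are hypothetical names, and only the limit `MAX_SESSION_STATES = 100` comes from the description above. The sketch relies on the fact that a JavaScript `Map` iterates keys in insertion order.

```typescript
// Hypothetical sketch of LRU-style eviction; not the PR's implementation.
const MAX_SESSION_STATES = 100; // limit quoted in the comment above

class BoundedSessionCache<V> {
  // Map preserves insertion order, so the first key is the least recently used.
  private readonly entries = new Map<string, V>();

  constructor(
    private readonly createState: () => V,
    private readonly maxEntries: number = MAX_SESSION_STATES,
  ) {}

  getOrCreate(channelId: string): V {
    const existing = this.entries.get(channelId);
    if (existing !== undefined) {
      // Delete and re-insert to mark this channel as most recently used.
      this.entries.delete(channelId);
      this.entries.set(channelId, existing);
      return existing;
    }
    const created = this.createState();
    this.entries.set(channelId, created);
    if (this.entries.size > this.maxEntries) {
      // Evict the least recently used channel.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    return created;
  }

  has(channelId: string): boolean {
    return this.entries.has(channelId);
  }

  get size(): number {
    return this.entries.size;
  }
}
```

One property worth noting in this design: an evicted channel gets a fresh default state on its next access, so eviction can only drop a grant, never carry one over to another channel.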

Hardcoded sandbox paths

Removed /home/raul/... defaults from the smoke runner.
The script now derives portable repo-relative defaults:
- $REPO_ROOT/sandbox/...
- fallback: $REPO_ROOT/../sandbox/...
Explicit overrides via MERCURY_SANDBOX_HOME / MERCURY_SANDBOX_WORKSPACE still work.
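
The resolution order described above can be sketched as a pure function. The actual runner is a shell script; TypeScript is used here only to illustrate precedence. `resolveSandboxDir`, its parameters, and the injected `exists` check are hypothetical, and choosing the fallback whenever the primary candidate is missing is an assumption about the script's behavior. Treating an empty override as unset mirrors the shell's `${VAR:-default}` expansion.

```typescript
// Hypothetical helper illustrating precedence: override, then the
// repo-relative default, then the parent-directory fallback.
function resolveSandboxDir(
  override: string | undefined,
  repoRoot: string,
  subdir: string,
  exists: (path: string) => boolean,
): string {
  // An explicit, non-empty override always wins.
  if (override !== undefined && override !== '') return override;

  // Preferred default: a sandbox directory inside the repository.
  const primary = `${repoRoot}/sandbox/${subdir}`;
  if (exists(primary)) return primary;

  // Assumed fallback: a sandbox directory next to the repository.
  return `${repoRoot}/../sandbox/${subdir}`;
}
```

Injecting `exists` keeps the function free of filesystem access, so each branch can be exercised in a unit test without creating directories.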

Script help/errors in Spanish

Translated user-facing help and error messages in scripts/mercury_sandbox_smoke.py to English.

Smoke doc in Spanish + local-only examples

Translated docs/mercury-sandbox-smoke.md to English.
Replaced developer-specific path examples with portable/environment-based instructions.

Verification run:

npm test -- src/capabilities/permissions.test.ts src/core/agent-permissions.test.ts
bash -n scripts/run_mercury_sandbox_smoke.sh
python3 -m py_compile scripts/mercury_sandbox_smoke.py
python3 scripts/mercury_sandbox_smoke.py --help

The full end-to-end smoke path is currently blocked by an external 404 from the z.ai responses endpoint, not by these changes.

ideas24h added 2 commits April 28, 2026 10:45

fix: stop duplicating CLI streamed output

5d3b6cb

Copilot AI review requested due to automatic review settings April 28, 2026 10:03

Copilot started reviewing on behalf of ideas24h April 28, 2026 10:03

Copilot AI reviewed Apr 28, 2026

ideas24h added 3 commits April 28, 2026 12:32

Merge origin/main into fix/cli-stream-duplicate-output

3c17f2f

fix: bound permission session state growth

d6c8b7c

docs: make mercury sandbox smoke portable

31f340a

ideas24h added 2 commits April 28, 2026 18:12

fix: align permission modes with filesystem scoping

ca57405

fix: harden mercury sandbox smoke for z.ai

a89caf1

ideas24h wants to merge 7 commits into cosmicstack-labs:main from ideas24h:fix/cli-stream-duplicate-output
