Add MCP trace replay artifact #147

Merged
ProfRandom92 merged 2 commits into main from codex/add-mcp-trace-replay-artifact on May 20, 2026

@ProfRandom92
Owner

Motivation

  • Provide a deterministic, committed artifact for the mcp_trace_replay fixture family that proves whether MCP-style tool traces preserve tool order, validation-before-action, dependency chains, recovery paths, and capability boundaries under replay reconstruction.

Description

  • Add a deterministic generator script scripts/generate_mcp_trace_replay_artifact.py that reuses DegradationCurveGenerator to evaluate the mcp_trace_replay family and emits stable, ordered fixture entries for levels baseline, mild, moderate, severe.
  • Commit the artifact artifacts/mcp_trace_replay_results.json with formatted decimal-string overall scores, sorted contracts/labels, and a non-LLM/non-external summary block.
  • Add reproducibility and schema tests in tests/test_mcp_trace_replay_artifact.py that assert exact match with the committed artifact, stable schema (no time/env fields), deterministic ordering, and manifest-aligned labels/admissibility.
  • Add generate:mcp-trace-replay npm script to follow existing generator script conventions and keep the PR scope limited to artifact generation and tests.
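
The determinism properties listed above (decimal-string scores, sorted labels, fixed level order, no time/env fields) might be sketched as follows. This is a minimal illustration, not the script from this PR: the key names `fixtures`, `level`, `overall_score`, and `labels`, the four-decimal precision, and the `FORBIDDEN_KEYS` set are all assumptions, since the real artifact schema is not shown here.

```python
import json
from decimal import Decimal, ROUND_HALF_UP

# Assumed names: the real schema and forbidden fields are not shown in the PR.
FORBIDDEN_KEYS = {"timestamp", "generated_at", "hostname", "env"}
LEVEL_ORDER = ("baseline", "mild", "moderate", "severe")


def format_score(value: float, places: int = 4) -> str:
    """Render a score as a fixed-precision decimal string."""
    quantum = Decimal(1).scaleb(-places)  # Decimal("0.0001") when places == 4
    return str(Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP))


def build_entry(level: str, score: float, labels: list[str]) -> dict:
    """Build one fixture entry with a string score and sorted labels."""
    return {
        "level": level,
        "overall_score": format_score(score),
        "labels": sorted(labels),
    }


def serialize(entries: list[dict]) -> str:
    """Serialize entries in fixed level order with sorted keys."""
    ordered = sorted(entries, key=lambda e: LEVEL_ORDER.index(e["level"]))
    return json.dumps({"fixtures": ordered}, indent=2, sort_keys=True) + "\n"


def find_forbidden_keys(node) -> set[str]:
    """Recursively collect any time/env-style keys present in the payload."""
    found: set[str] = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key in FORBIDDEN_KEYS:
                found.add(key)
            found |= find_forbidden_keys(value)
    elif isinstance(node, list):
        for item in node:
            found |= find_forbidden_keys(item)
    return found
```

Because scores are stored as strings and every collection is sorted before serialization, the same inputs in any order should produce identical text, which is what makes an exact-match test against the committed file practical.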

Testing

  • Ran the generator with python scripts/generate_mcp_trace_replay_artifact.py and confirmed that the resulting artifacts/mcp_trace_replay_results.json is produced deterministically and matches the committed file.
  • Ran focused tests, all passing: pytest tests/test_mcp_trace_replay_artifact.py -q (5 passed), pytest tests/test_fixture_manifest.py -q (8 passed), pytest tests/test_multi_family_admissibility_artifact.py -q (6 passed), and pytest tests/test_failure_taxonomy.py -q (4 passed).
  • Ran full project validation with npm run check, which executed the layout, typecheck, validate, build, and test steps; the test suite passed (227 passed).

Codex Task

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a new artifact generation system for MCP trace replays. It includes a JSON artifact containing results for various degradation levels, a Python script to generate this artifact deterministically, and a comprehensive test suite to ensure schema stability and data integrity. Feedback focuses on improving the robustness of the generation script by using absolute paths anchored to the repository root for output and manifest files, ensuring the script functions correctly regardless of the execution context.

ARTIFACT_ID = "mcp_trace_replay_results_v1"
FAMILY = "mcp_trace_replay"
CURVE_LEVELS = ("baseline", "mild", "moderate", "severe")
OUTPUT_PATH = Path("artifacts/mcp_trace_replay_results.json")

medium

The OUTPUT_PATH is currently defined as a relative path, which makes the script sensitive to the current working directory. Since REPO_ROOT is already calculated, it is better to anchor all paths to it for robustness, especially if the script is executed from a subdirectory. I also recommend defining MANIFEST_PATH here to ensure the generator uses the correct manifest file regardless of the execution context.

Suggested change
- OUTPUT_PATH = Path("artifacts/mcp_trace_replay_results.json")
+ OUTPUT_PATH = REPO_ROOT / "artifacts" / "mcp_trace_replay_results.json"
+ MANIFEST_PATH = REPO_ROOT / "fixtures" / "manifest.json"
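
For context, one common way to derive a repository root from a script's own location is sketched below. How `REPO_ROOT` is actually computed in this PR is not shown, so `repo_root_from`, `artifact_paths`, and the assumption that the script sits one directory below the root (in `scripts/`) are illustrative only.

```python
from pathlib import Path


def repo_root_from(script_path: Path, levels_up: int = 1) -> Path:
    """Return the directory `levels_up` levels above the script's own directory."""
    return script_path.resolve().parents[levels_up]


def artifact_paths(repo_root: Path) -> dict[str, Path]:
    """Anchor the output and manifest locations to the repository root."""
    return {
        "output": repo_root / "artifacts" / "mcp_trace_replay_results.json",
        "manifest": repo_root / "fixtures" / "manifest.json",
    }


# Inside a script under scripts/, this would typically be used as:
#   REPO_ROOT = repo_root_from(Path(__file__))
#   OUTPUT_PATH = artifact_paths(REPO_ROOT)["output"]
```

Paths built this way stay the same whether the script is launched from the repository root or from a subdirectory.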


def generate_mcp_trace_replay_artifact(output_path: Path = OUTPUT_PATH) -> Path:
    generator = DegradationCurveGenerator()
    fixtures = generator.fixtures_for_manifest_family(FAMILY, levels=CURVE_LEVELS)

medium

To ensure the script is fully independent of the working directory, pass the absolute MANIFEST_PATH to the generator. This ensures that DegradationCurveGenerator looks for the manifest in the repository root rather than relative to the current shell location.

Suggested change
- fixtures = generator.fixtures_for_manifest_family(FAMILY, levels=CURVE_LEVELS)
+ fixtures = generator.fixtures_for_manifest_family(FAMILY, levels=CURVE_LEVELS, manifest_path=MANIFEST_PATH)
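
To illustrate why an explicit manifest path removes the dependence on the working directory, here is a self-contained sketch. It does not use the real `DegradationCurveGenerator`, and whether `fixtures_for_manifest_family` accepts a `manifest_path` keyword is not confirmed by this PR; `fixtures_for_family`, the manifest layout (`fixtures`, `family`, `level`), and `demo` are hypothetical stand-ins.

```python
import json
import os
import tempfile
from pathlib import Path


def fixtures_for_family(family: str, levels: tuple[str, ...], manifest_path: Path) -> list[dict]:
    """Hypothetical stand-in: read the manifest from an explicit path, then filter."""
    manifest = json.loads(Path(manifest_path).read_text(encoding="utf-8"))
    wanted = [
        f for f in manifest["fixtures"]
        if f["family"] == family and f["level"] in levels
    ]
    return sorted(wanted, key=lambda f: levels.index(f["level"]))


def demo() -> list[str]:
    """Load fixtures while the working directory is a subdirectory of the repo."""
    with tempfile.TemporaryDirectory() as repo:
        manifest_path = Path(repo).resolve() / "fixtures" / "manifest.json"
        manifest_path.parent.mkdir(parents=True)
        manifest_path.write_text(json.dumps({"fixtures": [
            {"family": "mcp_trace_replay", "level": "mild"},
            {"family": "mcp_trace_replay", "level": "baseline"},
            {"family": "other_family", "level": "baseline"},
        ]}), encoding="utf-8")
        original_cwd = os.getcwd()
        os.chdir(manifest_path.parent)  # simulate running from a subdirectory
        try:
            found = fixtures_for_family(
                "mcp_trace_replay", ("baseline", "mild"), manifest_path
            )
        finally:
            os.chdir(original_cwd)
        return [f["level"] for f in found]
```

Since the manifest path is absolute, the lookup succeeds even though the working directory changed before the call.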

@ProfRandom92 ProfRandom92 merged commit a761d95 into main May 20, 2026
4 checks passed