Skip to content

test(query): add integration tests for per-tenant AGE graph routing#583

Draft
jsell-rh wants to merge 1148 commits into
mainfrom
hyperloop/task-109
Draft

test(query): add integration tests for per-tenant AGE graph routing#583
jsell-rh wants to merge 1148 commits into
mainfrom
hyperloop/task-109

Conversation

@jsell-rh
Copy link
Copy Markdown
Collaborator

@jsell-rh jsell-rh commented May 3, 2026

What & Why

The Per-Tenant Graph Routing requirement added to specs/query/query-execution.spec.md
defines two scenarios:

"GIVEN a valid MCP API key associated with tenant A WHEN the query_graph tool is
invoked THEN the query executes against the AGE graph named tenant_{tenant_A_id}
AND data from tenant B's graph is not accessible"

"GIVEN a tenant whose AGE graph does not yet exist WHEN the query_graph tool is
invoked THEN the server returns a structured error indicating the knowledge graph
context is unavailable (NOT a raw database error)"

The routing implementation is complete — TenantAwareQueryGraphRepository wraps
QueryGraphRepository, resolves tenant_{tenant_id} as the graph name, checks
existence via AGEGraphExistenceChecker, and rejects queries before execution if
the graph is absent. Unit tests in tests/unit/query/test_query_repository.py
(TestTenantGraphRouting) confirm the logic at the repository level.

What is missing is end-to-end integration coverage exercising the full call chain:
API key auth middleware → get_mcp_query_service() dependency → TenantAwareQueryGraphRepository
→ real PostgreSQL/AGE. Without this, a regression anywhere in the wiring (e.g.,
tenant_id not propagated from auth context, graph name format change) would
only be caught by production traffic.

Spec Requirements Satisfied

specs/query/query-execution.spec.md:

  • Requirement: Per-Tenant Graph Routing — Scenario: Query executes in tenant graph
  • Requirement: Per-Tenant Graph Routing — Scenario: Tenant graph not found

What This Change Does

Add integration tests in src/api/tests/integration/query/ (or extend
test_query_mcp.py) that exercise per-tenant routing against a real
PostgreSQL+AGE instance:

Test: test_query_executes_in_tenant_graph

Setup:

  1. Create two AGE graphs in the test database: tenant_alpha and tenant_beta.
  2. Insert a distinguishing node into tenant_alpha (e.g., (:Marker {name: 'alpha'})).
  3. Insert a different node into tenant_beta (e.g., (:Marker {name: 'beta'})).
  4. Obtain an API key scoped to tenant_id = "alpha".

Execution:

  • POST to the MCP query_graph tool with query: "MATCH (n:Marker) RETURN n".

Assertions:

  • Response is 200.
  • Result rows contain the alpha marker node.
  • Result rows do NOT contain the beta marker node.

Test: test_tenant_graph_not_found_returns_structured_error

Setup:

  1. Ensure no AGE graph named tenant_missing exists.
  2. Obtain an API key scoped to tenant_id = "missing".

Execution:

  • POST to the MCP query_graph tool with any valid Cypher.

Assertions:

  • Response is 200 (MCP protocol: errors are returned in the response body, not HTTP 4xx).
  • Response body is an MCP error structure (not a raw PostgreSQL exception).
  • Error message references the knowledge graph context being unavailable (not a raw
    psycopg2.ProgrammingError or similar).

Files / Areas Affected

  • src/api/tests/integration/query/test_tenant_routing.py (new) — the two integration
    test cases described above
  • src/api/tests/integration/conftest.py or a new fixtures module — fixtures for
    creating/dropping AGE graphs and issuing test API keys scoped to specific tenant IDs
  • No production code changes are expected; if a test reveals a real bug, fix it
    and note it in the PR description

Tests

The two integration tests ARE the deliverable. Mark them with @pytest.mark.integration
and ensure they run with make test-integration against the isolated dev instance.

Infrastructure requirements (provided by make instance-up):

  • PostgreSQL with Apache AGE extension loaded
  • Kartograph API running (for MCP HTTP endpoint)
  • A way to create/drop AGE graphs in the test database (direct psycopg2 connection
    or a test fixture that calls CREATE EXTENSION IF NOT EXISTS age + SELECT create_graph(...))

How to Verify

  1. make instance-up — start isolated test instance
  2. source .instances/$(basename $(pwd))/.env.instance
  3. cd src/api && uv run pytest tests/integration/query/test_tenant_routing.py -v -m integration
  4. Confirm both tests pass green

Caveats

  • AGE graph creation requires superuser or CREATE privilege; the test database user
    must have this privilege, or the fixture must use a superuser connection.
  • Tear down created graphs after each test to avoid cross-test pollution.
  • The TenantAwareQueryGraphRepository uses ag_catalog.ag_graph to check existence;
    the integration test implicitly validates this query works against the real AGE
    catalog, not just a mock.
  • If tenant_id is a UUID in production but a short string in tests, ensure the
    graph name format (tenant_{tenant_id}) is consistent with what get_mcp_query_service()
    actually constructs.

Task: task-109
Spec: specs/query/query-execution.spec.md@dbcf0d7c2fa9c2456896ee20adbfdc8cc33090c2

Merge

The orchestrator will squash-merge this PR automatically
once all pipeline steps pass.


This PR was created by hyperloop,
an AI agent orchestrator.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 3, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 1b4d3c6d-2b00-4ea5-b07d-9c53730d43a2

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch hyperloop/task-109

Comment @coderabbitai help to get the list of available commands and usage tips.

@jsell-rh jsell-rh force-pushed the hyperloop/task-109 branch 26 times, most recently from cf70eb5 to 0135af6 Compare May 4, 2026 07:00
jsell-rh and others added 30 commits May 5, 2026 18:42
Add task-151: integration tests for the MCP knowledge-graphs://accessible
resource. The Knowledge Graphs Resource requirement was added to
mcp-server.spec.md (commit 6bea455) alongside the Per-Tenant Graph Routing
requirement. Per-tenant routing received a dedicated integration test task
(task-150); the KG resource gap was not captured. Both requirements involve
cross-system coordination (SpiceDB + management DB) that warrants end-to-end
verification beyond unit fakes.

Spec-Ref: specs/query/mcp-server.spec.md@2ac8d03afbf2153e3b569f1289e10b5ad5d21d6e
Task-Ref: intake
…— no new tasks

All requirements in the three modified specs are accounted for:

mcp-server.spec.md (2ac8d03):
- All 6 requirements fully implemented and tested.
- Remaining gaps: task-149 (503 auth unit tests), task-151 (KG resource
  integration tests).

query-execution.spec.md (dbcf0d7):
- All 5 requirements fully implemented and tested.
- Remaining gap: task-150 (per-tenant routing integration tests).

experience.spec.md (e77913c):
- All non-blocked requirements implemented with test coverage.
- Remaining gaps: task-147 (KG selector sentinel fix), task-148
  (failing test updates).
- Ontology Design and Sync Monitoring blocked by missing Extraction /
  Ingestion contexts — no tasks per guidelines.

Spec-Ref: specs/query/mcp-server.spec.md@2ac8d03afbf2153e3b569f1289e10b5ad5d21d6e
Spec-Ref: specs/query/query-execution.spec.md@dbcf0d7c2fa9c2456896ee20adbfdc8cc33090c2
Spec-Ref: specs/ui/experience.spec.md@e77913c2cc6d8b719291e2dbb6870519a94d50da
Task-Ref: intake
…— no new tasks (pass 5)

All three specs verified line-by-line against the current codebase.
No new tasks warranted:

- mcp-server.spec.md (2ac8d03): all 6 requirements fully implemented
  and tested; gaps covered by task-149 (503 auth test) and task-151
  (KG resource integration test).

- query-execution.spec.md (dbcf0d7): all 5 requirements fully
  implemented and tested; gap covered by task-150 (per-tenant routing
  integration test).

- experience.spec.md (e77913c): all 18 requirements implemented in
  src/dev-ui with corresponding test files; open tasks 147/148 address
  the __all__ sentinel bug in the query console KG selector; Ontology
  Design backend deferred pending AIHCM-174.

Spec-Ref: specs/query/mcp-server.spec.md@2ac8d03afbf2153e3b569f1289e10b5ad5d21d6e
Spec-Ref: specs/query/query-execution.spec.md@dbcf0d7c2fa9c2456896ee20adbfdc8cc33090c2
Spec-Ref: specs/ui/experience.spec.md@e77913c2cc6d8b719291e2dbb6870519a94d50da
Task-Ref: intake

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ng sentinel (#630)

Spec-Ref: specs/ui/experience.spec.md@e77913c2cc6d8b719291e2dbb6870519a94d50da
Task-Ref: task-148
…Integration, and API Key Management (#631)

Spec-Ref: specs/ui/experience.spec.md@e77913c2cc6d8b719291e2dbb6870519a94d50da
Task-Ref: task-149
… new tasks (pass 6)

Line-by-line audit of all three specs against the current codebase.
Blob SHAs unchanged from prior passes:
  - specs/query/mcp-server.spec.md@2ac8d03afbf2153e3b569f1289e10b5ad5d21d6e
  - specs/query/query-execution.spec.md@dbcf0d7c2fa9c2456896ee20adbfdc8cc33090c2
  - specs/ui/experience.spec.md@e77913c2cc6d8b719291e2dbb6870519a94d50da

All scenarios verified as implemented. Key findings:

mcp-server.spec.md:
  All 6 requirements / 22 scenarios implemented and tested.
  task-149 (503 auth test) is likely stale — tests already exist in
  test_mcp_auth_middleware.py (added in commit 54052d3).
  task-151 (KG resource integration tests) is likely stale — tests already
  exist in tests/integration/query/test_kg_resource.py (task-110).

query-execution.spec.md:
  All 5 requirements / 11 scenarios implemented and tested.
  task-150 (per-tenant routing integration tests) remains valid — no
  explicit integration test for "Tenant graph not found" or cross-tenant
  isolation exists yet.

experience.spec.md:
  All 18 requirements / 43 scenarios implemented. 2493 UI tests pass.
  task-147/task-148 (sentinel refactoring) are code-quality tasks;
  current implementation satisfies spec behavioral requirements.

No new tasks created.

Spec-Ref: specs/query/mcp-server.spec.md@2ac8d03afbf2153e3b569f1289e10b5ad5d21d6e
Spec-Ref: specs/query/query-execution.spec.md@dbcf0d7c2fa9c2456896ee20adbfdc8cc33090c2
Spec-Ref: specs/ui/experience.spec.md@e77913c2cc6d8b719291e2dbb6870519a94d50da
Task-Ref: intake
Implements task-150: verifies the Per-Tenant Graph Routing requirement
(specs/query/query-execution.spec.md) against a real PostgreSQL + Apache
AGE instance. Two test classes cover the two spec scenarios:

- TestQueryRoutedToTenantGraph: provisions a tenant_{uuid} AGE graph via
  AGEGraphProvisioner, seeds it with nodes, and confirms QueryGraphRepository
  and TenantAwareQueryGraphRepository route queries to the correct graph.
  A cross-tenant isolation test seeds graph A and asserts graph B returns no
  results.

- TestTenantGraphNotFound: leaves the tenant graph unprovisioned and asserts
  that both QueryGraphRepository and TenantAwareQueryGraphRepository raise
  QueryExecutionError before any Cypher reaches the database (verified by a
  recording fake inner repository).

All tests use autouse cleanup fixtures (pre- and post-test graph drop) to
prevent contamination across runs. No implementation files change.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… new tasks (pass 7)

All three modified specs verified line-by-line against implementation and tests.

specs/query/mcp-server.spec.md (2ac8d03):
  25 scenarios — all implemented and unit-tested. Remaining gaps already
  captured: task-149 (503 auth test), task-151 (KG resource integration test).
  Self-hosted instance scenario confirmed covered (test_builds_enterprise_api_url,
  test_parses_github_enterprise_url). Probe correctly omits raw query from logs.

specs/query/query-execution.spec.md (dbcf0d7):
  13 scenarios — all implemented and unit-tested. Remaining gap already
  captured: task-150 (per-tenant routing integration tests, committed fadf7d1).

specs/ui/experience.spec.md (e77913c):
  All implementable scenarios verified against dev-ui pages, components, and
  tests. Sync logs endpoint confirmed present (routes.py:439). KG selector fix
  committed (60dd790). Ontology AI proposal and edit-after-extraction excluded
  (BLOCKED by Extraction/AIHCM-174 per guidelines). Tasks 147/148 capture
  remaining formal task-lifecycle work.

Spec-Ref: specs/query/mcp-server.spec.md@2ac8d03afbf2153e3b569f1289e10b5ad5d21d6e
Spec-Ref: specs/query/query-execution.spec.md@dbcf0d7c2fa9c2456896ee20adbfdc8cc33090c2
Spec-Ref: specs/ui/experience.spec.md@e77913c2cc6d8b719291e2dbb6870519a94d50da
Task-Ref: intake
…ha-drift failures

task-134 and task-151 both failed only on check-no-test-regressions.sh pass 2
(alpha gained test_tenant_routing_integration.py after branches were cut). Both
verifiers correctly diagnosed alpha drift but did not emit the VERDICT:
REBASE-ONLY sentinel prescribed by rule 63, so the orchestrator re-routed to
implementers for zero-defect rebase operations.

Two targeted fixes:

1. verifier-overlay.yaml: New rule mandates that REBASE-ONLY verdicts begin
   with the exact machine-readable header lines:
     VERDICT: REBASE-ONLY
     DO NOT ROUTE TO IMPLEMENTER — orchestrator performs: git fetch origin && ...
   Without these tokens the orchestrator cannot distinguish a staleness issue
   from an implementation defect.

2. check-branch-rebased-on-alpha.sh: When within the 1-5 commit tolerance,
   emit an explicit WARNING that those alpha commits may contain new test files
   and recommend running check-no-test-regressions.sh standalone before the
   full backend suite. This surfaces the drift risk at the earliest possible
   moment rather than at the end of a long suite run.

Spec-Ref: .hyperloop/agents/process
Task-Ref: process-improvement
… new tasks (pass 8)

All three modified specs verified line-by-line against implementation and tests.

specs/query/mcp-server.spec.md (2ac8d03):
  25 scenarios — all implemented and unit-tested. All 5 requirements covered:
  query_graph tool (8 scenarios), fetch_documentation_source (5 scenarios, incl.
  self-hosted GitHub Enterprise/GitLab), Knowledge Graphs Resource (integration-
  tested in test_kg_resource.py — task-151 done), Agent Instructions Resource
  (fail-fast startup confirmed), MCP Authentication (401/503 unit-tested in
  test_mcp_auth_middleware.py — task-149 done), AGE single-column return (all 4
  row types tested in test_query_repository.py).

specs/query/query-execution.spec.md (dbcf0d7):
  10 scenarios — all implemented and integration-tested. Per-Tenant Graph Routing
  covered by TenantAwareQueryGraphRepository + AGEGraphExistenceChecker with full
  integration suite in test_tenant_routing_integration.py (task-150 done). Read-
  only enforcement: keyword blacklist, SET TRANSACTION READ ONLY, redacted logging
  (raw query never emitted), correlation IDs all verified in test_query_repository.py
  and test_application_services.py.

specs/ui/experience.spec.md (e77913c):
  ~60 scenarios across 15 requirements — all implemented and tested across 50+
  test files in src/dev-ui/app/tests/. KG selector uses '' sentinel (task-147
  done), KG selector tests updated (task-148 done). Ontology design wizard, deep-
  link routing, mutations console, navigation structure, graph explorer neighbor
  traversal, responsive layout, dark mode, design language — each exercised by
  dedicated test files. No implementation gaps found.

No new task files created.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Spec-Ref: specs/query/mcp-server.spec.md@2ac8d03afbf2153e3b569f1289e10b5ad5d21d6e
Spec-Ref: specs/query/query-execution.spec.md@dbcf0d7c2fa9c2456896ee20adbfdc8cc33090c2
Spec-Ref: specs/ui/experience.spec.md@e77913c2cc6d8b719291e2dbb6870519a94d50da
Task-Ref: intake
… new tasks (pass 9)

All three modified specs verified line-by-line against implementation and tests.
Tasks 147–151 are all done in code; orchestrator has not yet marked them completed.

specs/query/mcp-server.spec.md (2ac8d03):
  25 scenarios across 6 requirements — all implemented and unit-tested.
  query_graph tool: 8 scenarios (filter, enclave, write-reject, timeout, limit,
  truncation, internal-props, successful query) — all covered in test_mcp_tools.py,
  test_mcp_query_service.py, test_mcp_query_params.py.
  fetch_documentation_source: 5 scenarios (GitHub, GitLab, PAT headers, self-hosted,
  invalid URL) — covered in test_mcp_tools.py, git_repository tests.
  Knowledge Graphs Resource: 2 scenarios — unit-tested in
  test_mcp_knowledge_graphs_resource.py; integration-tested in
  tests/integration/query/test_kg_resource.py (task-151 done in code).
  Agent Instructions Resource: 2 scenarios — test_mcp_agent_instructions.py.
  MCP Authentication: 4 scenarios (API key, Bearer, 401, 503) — 503 tests
  confirmed in test_mcp_auth_middleware.py lines 318 and 615 (task-149 done).
  AGE single-column return: 4 scenarios — test_query_repository.py.

specs/query/query-execution.spec.md (dbcf0d7):
  10 scenarios across 5 requirements — all implemented and tested.
  Per-Tenant Graph Routing: TenantAwareQueryGraphRepository + integration tests
  in tests/integration/query/test_tenant_routing_integration.py (task-150 done).
  Read-only enforcement: keyword blacklist + SET TRANSACTION READ ONLY, redacted
  logging, correlation IDs — test_query_repository.py.
  Timeout: QueryTimeoutError with correlation_id — tested.
  Result Limiting: _ensure_limit appends/caps LIMIT — tested.
  Error Categorization: all 4 error types verified — test_mcp_query_service.py.

specs/ui/experience.spec.md (e77913c):
  ~60 scenarios across 15 requirements — all implemented and tested.
  KG selector sentinel: pages/query/index.vue uses '' (empty string), zero
  __all__ references remain in src/dev-ui/ (tasks 147/148 done in code).
  All other requirements covered by 50+ test files in src/dev-ui/app/tests/.

No new task files created.

Spec-Ref: specs/query/mcp-server.spec.md@2ac8d03afbf2153e3b569f1289e10b5ad5d21d6e
Spec-Ref: specs/query/query-execution.spec.md@dbcf0d7c2fa9c2456896ee20adbfdc8cc33090c2
Spec-Ref: specs/ui/experience.spec.md@e77913c2cc6d8b719291e2dbb6870519a94d50da
Task-Ref: intake
…experience specs

All three specs unchanged (blob SHAs identical to passes 1–6). No new
implementation gaps found. Existing tasks 147–151 cover all identified
gaps; tasks 149 and 151 are likely stale (implementation already in
codebase), task 150 remains valid.

Spec-Ref: specs/query/mcp-server.spec.md@2ac8d03afbf2153e3b569f1289e10b5ad5d21d6e specs/query/query-execution.spec.md@dbcf0d7c2fa9c2456896ee20adbfdc8cc33090c2 specs/ui/experience.spec.md@e77913c2cc6d8b719291e2dbb6870519a94d50da
Task-Ref: intake
…experience specs

All three specs unchanged (blob SHAs identical to passes 1–7). No new
implementation gaps found.

New finding vs. pass 7: test_tenant_routing_integration.py is now fully
implemented (Task-Ref: task-150), making all five existing tasks stale:
- task-147/148: KG selector already uses '' sentinel; 2558/2558 tests pass
- task-149: 503 unit tests exist in test_mcp_auth_middleware.py
- task-150: tenant routing integration tests exist
- task-151: KG resource integration tests exist in test_kg_resource.py

Spec-Ref: specs/query/mcp-server.spec.md@2ac8d03afbf2153e3b569f1289e10b5ad5d21d6e specs/query/query-execution.spec.md@dbcf0d7c2fa9c2456896ee20adbfdc8cc33090c2 specs/ui/experience.spec.md@e77913c2cc6d8b719291e2dbb6870519a94d50da
Task-Ref: intake
…pecs

Processed three modified specs against the full codebase. All scenarios
in query-execution.spec.md and ui/experience.spec.md are fully covered
by existing implementation and tests (including the work done by tasks
147–151, which are implemented though not yet marked complete).

One genuine gap identified in mcp-server.spec.md: the Bearer token
authentication scenario (Requirement: MCP Authentication) has unit-level
coverage but no integration test that exercises the full path — real JWT
validation against the OIDC provider, X-Tenant-ID tenant resolution, and
SpiceDB membership verification — through the actual MCP HTTP endpoint.
task-152 captures this gap.

Spec-Ref: specs/query/mcp-server.spec.md@2ac8d03afbf2153e3b569f1289e10b5ad5d21d6e
Spec-Ref: specs/query/query-execution.spec.md@dbcf0d7c2fa9c2456896ee20adbfdc8cc33090c2
Spec-Ref: specs/ui/experience.spec.md@e77913c2cc6d8b719291e2dbb6870519a94d50da
Task-Ref: intake
…/experience specs

All three specs unchanged (blob SHAs identical to passes 1–9). No new
implementation gaps found beyond task-152 (Bearer token MCP auth
integration test), which was created in pass 9 and remains outstanding.

Python unit tests: 2993 passed. UI unit tests: 2558 passed (54 files).
Tasks 147–151 are all stale — their work is complete.

Spec-Ref: specs/query/mcp-server.spec.md@2ac8d03afbf2153e3b569f1289e10b5ad5d21d6e
Spec-Ref: specs/query/query-execution.spec.md@dbcf0d7c2fa9c2456896ee20adbfdc8cc33090c2
Spec-Ref: specs/ui/experience.spec.md@e77913c2cc6d8b719291e2dbb6870519a94d50da
Task-Ref: intake
…/experience specs

All three specs unchanged (blob SHAs identical to passes 1–10). No new
implementation gaps found. task-152 (Bearer token MCP auth integration
test) remains the only outstanding work.

Recent merges since pass 10: task-150 (per-tenant routing integration
tests), task-147 (KG selector sentinel fix), task-149 UI alignment tests.

Spec-Ref: specs/query/mcp-server.spec.md@2ac8d03afbf2153e3b569f1289e10b5ad5d21d6e
Spec-Ref: specs/query/query-execution.spec.md@dbcf0d7c2fa9c2456896ee20adbfdc8cc33090c2
Spec-Ref: specs/ui/experience.spec.md@e77913c2cc6d8b719291e2dbb6870519a94d50da
Task-Ref: intake
…/experience specs

Blob SHAs unchanged across all 12 passes. All requirements implemented
and tested. task-152 (Bearer token MCP auth integration test) remains
the sole outstanding item. No new tasks created.

Spec-Ref: specs/query/mcp-server.spec.md@2ac8d03afbf2153e3b569f1289e10b5ad5d21d6e
Spec-Ref: specs/query/query-execution.spec.md@dbcf0d7c2fa9c2456896ee20adbfdc8cc33090c2
Spec-Ref: specs/ui/experience.spec.md@e77913c2cc6d8b719291e2dbb6870519a94d50da
Task-Ref: intake
…/experience specs

Blob SHAs unchanged across all 13 passes. All requirements implemented
and tested. task-152 (Bearer token MCP auth integration test) remains
the sole outstanding item. No new tasks created.

Spec-Ref: specs/query/mcp-server.spec.md@2ac8d03afbf2153e3b569f1289e10b5ad5d21d6e
Spec-Ref: specs/query/query-execution.spec.md@dbcf0d7c2fa9c2456896ee20adbfdc8cc33090c2
Spec-Ref: specs/ui/experience.spec.md@e77913c2cc6d8b719291e2dbb6870519a94d50da
Task-Ref: intake
…es and clean branches

Two rules added based on task-099 findings:

1. Verifier: When check-branch-rebased-on-alpha.sh exits non-zero AND
   implementation content is otherwise correct, emit VERDICT: REBASE-ONLY
   with the machine-readable header. The existing REBASE-ONLY rule only
   covers the case where the staleness check passes within tolerance but
   check-no-test-regressions.sh fails pass 2. This rule extends coverage to
   the case where the staleness check itself exits 1 — leaving no ambiguity
   about what the verifier should emit (task-099 emitted plain prose instead
   of the machine-readable header, risking implementer re-routing).

2. Implementer: After building a clean cherry-pick branch (-clean suffix),
   always run the three-step sequence (fetch → branch -f alpha → rebase alpha)
   immediately before the backend suite. A -clean branch can be 20+ commits
   stale by submission if other tasks merged to alpha after construction.

Spec-Ref: .hyperloop/agents/process
Task-Ref: process-improvement
…/experience specs

Blob SHAs unchanged across all 14 passes. All requirements implemented
and tested. task-148 (query console KG selector test updates) and
task-152 (Bearer token MCP auth integration test) remain the sole
outstanding items. No new tasks created.

Spec-Ref: specs/query/mcp-server.spec.md@2ac8d03afbf2153e3b569f1289e10b5ad5d21d6e
Spec-Ref: specs/query/query-execution.spec.md@dbcf0d7c2fa9c2456896ee20adbfdc8cc33090c2
Spec-Ref: specs/ui/experience.spec.md@e77913c2cc6d8b719291e2dbb6870519a94d50da
Task-Ref: intake

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… ui/experience specs

All scenarios in the three modified specs are implemented and tested.

mcp-server.spec.md (2ac8d03):
  - All 25 scenarios implemented; unit-tested comprehensively
  - Integration gaps tracked: task-149 (503 auth — done in middleware tests),
    task-151 (KG resource — done in test_kg_resource.py),
    task-152 (Bearer token MCP auth — only remaining gap, not-started)

query-execution.spec.md (dbcf0d7):
  - All 13 scenarios implemented and unit-tested
  - Integration gap tracked: task-150 (per-tenant routing — done in
    test_tenant_routing_integration.py)

ui/experience.spec.md (e77913c):
  - All ~40 scenarios across 18 requirements implemented; 2558 tests passing
  - UI sentinel fix (tasks 147/148) already applied in pages/query/index.vue
    (uses '' not '__all__'; all selector tests pass)

No new tasks created — all gaps are captured in existing tasks 147-152.

Spec-Ref: specs/query/mcp-server.spec.md@2ac8d03afbf2153e3b569f1289e10b5ad5d21d6e
Spec-Ref: specs/query/query-execution.spec.md@dbcf0d7c2fa9c2456896ee20adbfdc8cc33090c2
Spec-Ref: specs/ui/experience.spec.md@e77913c2cc6d8b719291e2dbb6870519a94d50da
Task-Ref: intake
… ui/experience specs

Verified all three spec files against the codebase line-by-line:

specs/query/mcp-server.spec.md (SHA: 2ac8d03)
- query_graph tool: all 8 scenarios implemented and tested
- fetch_documentation_source: GitHub, GitLab, self-hosted, private repos, invalid URLs all covered
- Knowledge Graphs resource: implemented; integration tests pending (task-151)
- Agent Instructions resource: fail-fast via PromptRepository._validate_required_files ✅
- MCP Authentication: 4 scenarios implemented; integration gaps in tasks 149, 152
- AGE single-column return: all 4 scenarios in QueryGraphRepository._row_to_dict ✅

specs/query/query-execution.spec.md (SHA: dbcf0d7)
- Per-Tenant Graph Routing: _validate_graph_exists + client graph_name; integration tests in task-150
- Read-Only Enforcement: SET TRANSACTION READ ONLY (primary) + keyword blacklist (secondary) ✅
- Timeout Enforcement: statement_timeout + QueryTimeoutError ✅
- Result Limiting: _ensure_limit with 1000 default, 10000 cap ✅
- Error Categorization: forbidden/timeout/execution_error/unknown_error all typed ✅

specs/ui/experience.spec.md (SHA: e77913c)
- All 15 requirements implemented: navigation, tenant/workspace context, KG creation,
  data source wizard (including ontology design), sync monitoring, MCP integration page,
  query console (selectedKgId uses '' empty-string sentinel), schema browser, graph explorer,
  mutations console (KG-scoped submission via canSubmitMutations + applyMutations), API key
  management, workspace management, design language, interaction principles, responsive layout,
  dark mode toggle ✅
- tasks 147/148 (query console KG selector): code already uses '' and <SelectItem value="">
  consistent with task spec — may be no-ops when orchestrator runs them

No new tasks created. All gaps captured in tasks 147–152.

Spec-Ref: specs/query/mcp-server.spec.md@2ac8d03afbf2153e3b569f1289e10b5ad5d21d6e
Spec-Ref: specs/query/query-execution.spec.md@dbcf0d7c2fa9c2456896ee20adbfdc8cc33090c2
Spec-Ref: specs/ui/experience.spec.md@e77913c2cc6d8b719291e2dbb6870519a94d50da
Task-Ref: intake
… ui/experience specs

Verified all requirements in the three modified specs line-by-line against
the current codebase. Findings:

mcp-server.spec.md (2ac8d03):
  - All tools/resources/auth implemented ✅
  - 503 path now unit-tested in test_mcp_auth_middleware.py ✅
  - KG resource integration tests present (test_kg_resource.py) ✅
  - Bearer token MCP auth integration test still missing → task-152 (existing)

query-execution.spec.md (dbcf0d7):
  - All read-only, timeout, limit, error-type requirements implemented ✅
  - Per-tenant routing integration tests present (test_tenant_routing_integration.py) ✅

ui/experience.spec.md (e77913c):
  - query/index.vue uses '' sentinel (not __all__) ✅
  - No test files reference __all__ ✅
  - task-149-spec-alignment.test.ts covers sync monitoring, MCP integration,
    API key management scenarios ✅
  - All other UI requirements covered by existing tasks and code ✅

No new task files created. task-152 (Bearer token MCP auth integration test)
remains the sole outstanding implementation item not yet present in the codebase.

Spec-Ref: specs/query/mcp-server.spec.md@2ac8d03afbf2153e3b569f1289e10b5ad5d21d6e
Spec-Ref: specs/query/query-execution.spec.md@dbcf0d7c2fa9c2456896ee20adbfdc8cc33090c2
Spec-Ref: specs/ui/experience.spec.md@e77913c2cc6d8b719291e2dbb6870519a94d50da
Task-Ref: intake
The PM was committing task files and intake logs to alpha (trunk), polluting
trunk history and leaking state into task branch PRs. These files are
managed by the orchestrator on the hyperloop/state branch — they should
never be committed to trunk.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…pecs

- Add Graph Visualizer requirement with detailed Cosmograph implementation
  spec (promoted from /util/graph-viewer dev route)
- Add Mutations Console requirement and nav entry
- Add per-tenant graph routing requirement to query-execution spec
- Add knowledge_graphs://accessible MCP resource to mcp-server spec
- Update primary navigation to include Graph Visualizer and Mutations Console

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements two integration test scenarios from specs/query/query-execution.spec.md
(Requirement: Per-Tenant Graph Routing):

1. test_query_executes_in_tenant_graph
   Provisions two tenant AGE graphs (tenant_a_<rand> and tenant_b_<rand>),
   writes a Person node only to tenant A, and asserts that:
   - QueryGraphRepository for tenant A returns exactly the expected node.
   - QueryGraphRepository for tenant B returns zero rows (cross-tenant isolation).
   The isolation is enforced by the AGE cypher('graph_name', …) routing; rows
   in one tenant graph cannot leak into another tenant's repository.

2. test_tenant_graph_not_found_raises_before_db
   Uses TenantAwareQueryGraphRepository + AGEGraphExistenceChecker against the
   real ag_catalog.ag_graph catalog with a randomly-generated ghost tenant ID.
   Asserts that QueryExecutionError is raised before any Cypher is delegated to
   the inner QueryGraphRepository. Because the inner repository targets the
   existing test_graph (which would succeed if called), the test is
   self-verifying: incorrect delegation would return results rather than raise.

Tests are marked @pytest.mark.integration and require a running PostgreSQL+AGE
instance (make instance-up).

Spec-Ref: specs/query/query-execution.spec.md@dbcf0d7c2fa9c2456896ee20adbfdc8cc33090c2
Task-Ref: task-109
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The previous integration tests in test_tenant_routing.py exercised only
the infrastructure layer (QueryGraphRepository and TenantAwareQueryGraphRepository
called directly). This meant regressions in the HTTP stack — such as
get_mcp_query_service() not propagating tenant_id from MCPAuthContext, or
_build_error_response dropping error_type — would pass silently.

Add TestPerTenantGraphRoutingHTTP with two tests that exercise the full
call chain: API key auth middleware → get_mcp_query_service() → TenantAware
QueryGraphRepository → real PostgreSQL/AGE.

test_query_executes_in_tenant_graph:
  Creates two tenants (A and B) and their API keys directly in the DB (no
  Keycloak needed), provisions both AGE graphs with distinct seed data, calls
  query_graph via HTTP with tenant_A's API key, and asserts that Alice
  (tenant_A's data) is returned while Bob (tenant_B's data) is absent.

test_tenant_graph_not_found_returns_structured_error:
  Creates a tenant whose AGE graph is deliberately not provisioned, calls
  query_graph via HTTP with that tenant's API key, and asserts the response
  carries success=False and error_type="execution_error".

The existing infrastructure-layer tests are retained as supplementary
coverage (per reviewer recommendation).

Spec-Ref: specs/query/query-execution.spec.md@dbcf0d7c2fa9c2456896ee20adbfdc8cc33090c2
Task-Ref: task-109
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two assertions in test_mcp_query_service.py used literal OR-chained
conditions on result.message fields, triggering check-partial-error-assertions.sh:

- test_forbidden_error_message_is_propagated: split
  `"DELETE" in msg or "read-only" in msg.lower()` into two independent
  assertions. The actual error message "Query must be read-only. Found
  forbidden keyword: DELETE" contains both, so each assertion is correctly
  verifiable independently.

- test_execution_error_message_includes_original_error: split
  `"does not exist" in msg or "tenant" in msg.lower()` into two independent
  assertions. The actual error message "Tenant graph 'tenant_xyz' does not
  exist." contains both strings, so this strengthens the test.

This resolves check-partial-error-assertions.sh FAIL that was blocking
the merge of the per-tenant graph routing integration tests (task-109).

Spec-Ref: specs/query/query-execution.spec.md@dbcf0d7c2fa9c2456896ee20adbfdc8cc33090c2
Task-Ref: task-109
…ed failures

The spec (specs/query/query-execution.spec.md, Scenario: Unexpected error)
requires error_type="unknown_error". A prior commit introduced the wrong
value "unexpected_error". This commit restores alignment with the spec
and with the merged main branch, fixing the merge conflict.

Spec-Ref: specs/query/query-execution.spec.md@dbcf0d7c2fa9c2456896ee20adbfdc8cc33090c2
Task-Ref: task-109
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant