Skip to content

test: A2A ITK interoperability harness + baseline (no SDK changes)#35

Open
zeroasterisk wants to merge 1 commit into
actioncard:mainfrom
zeroasterisk:itk-harness-showcase
Open

test: A2A ITK interoperability harness + baseline (no SDK changes)#35
zeroasterisk wants to merge 1 commit into
actioncard:mainfrom
zeroasterisk:itk-harness-showcase

Conversation

@zeroasterisk
Copy link
Copy Markdown

ITK Interoperability Harness + Baseline (showcase — no SDK changes)

This PR adds a test-harness that drives the unmodified A2A Elixir SDK against the official A2A Interoperability Test Kit (ITK), plus an honest capability/gap baseline report. It's a measuring stick, not a fix — zero changes to lib/.

Why

Establish a reproducible baseline so future v1.0-compliance work is gap-driven and regression-gated, and so reviewers can see exactly what the SDK can/can't do against a real A2A client today.

What's added (harness + docs only)

  • test/support/itk/instruction.ex — ITK Instruction protobuf codec
  • test/support/itk/agent.ex — JSON-RPC handler / instruction interpreter
  • test/itk/server.exs — standalone Bandit server (v0.3 card, JSON-RPC, SSE)
  • test/itk/{instruction,agent}_test.exs + binary fixtures
  • docs/ITK_BASELINE.md — full capability/gap report with reproducible evidence

git diff --stat origin/main -- lib/ is empty (verified).

Baseline summary

✅ Works (404 unit tests green against pristine SDK):

  • v0.3-shaped agent card (preferredTransport: JSONRPC, protocolVersion: 0.3.0)
  • ITK Instruction proto decode (return_response, steps, call_agent)
  • Interpreter + Task construction; non-streaming message/send round-trip

❌ Gaps (vs a2a-sdk 0.3.24 Python client — documented, not fixed here):

  • JSON-RPC enum encoding: SDK emits ROLE_AGENT / TASK_STATE_COMPLETED (proto-style); v0.3 client expects agent / completed
  • SSE streaming event envelopes: Task snapshot vs the TaskStatusUpdateEvent / TaskArtifactUpdateEvent union (missing taskId/kind)
  • Card shape: SDK's native encode_agent_card emits v1.0 supportedInterfaces; v0.3 client wants preferredTransport/additionalInterfaces

Prioritized v1.0 gap list (seeds future, separate PRs)

  1. JSON-RPC enum encoding (highest leverage — blocks every traversal)
  2. Streaming event envelopes
  3. v0.3 agent-card emission
  4. gRPC / REST transports (deferred)

Reproduce

mix test                                              # 404 + 2 doctests, 0 failures
MIX_ENV=test mix run test/itk/server.exs --httpPort 10130
# + ITK driver: uv run --no-sources python itk_baseline_elixir.py

Note: the standalone server must run under MIX_ENV=test (harness modules live in test/support/, only on elixirc_paths in test env).

Adds a test-harness that drives the unmodified A2A Elixir SDK against the
official A2A Interoperability Test Kit (ITK), plus an honest capability/gap
baseline report. This is a measuring-stick PR — no changes to lib/.

Harness:
- test/support/itk/instruction.ex  ITK Instruction protobuf codec
- test/support/itk/agent.ex         JSON-RPC handler / instruction interpreter
- test/itk/server.exs               standalone Bandit server (v0.3 card, JSON-RPC, SSE)
- test/itk/{instruction,agent}_test.exs + fixtures

Report (docs/ITK_BASELINE.md) documents, with reproducible evidence:
- WORKS: v0.3 agent card, proto decode, interpreter, non-streaming message/send (404 tests green)
- GAPS:  JSON-RPC enum encoding (ROLE_AGENT/TASK_STATE_* vs v0.3 agent/completed),
         SSE streaming event envelopes, v1.0 vs v0.3 card shape
- Prioritized v1.0 gap list seeding future (separate) SDK work

Reference client: a2a-sdk 0.3.24.
@zeroasterisk
Copy link
Copy Markdown
Author

The plan is: Land this PR with dustin for v0.3 and 1.0. then upgrade.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant