feat: storyboard coverage and comply() scenario filtering #1985
Merged
This was referenced Apr 8, 2026
bokelley added a commit that referenced this pull request Apr 8, 2026
Remove storyboard YAMLs (brand_rights, capability_discovery) and brand compliance track — these belong in #1985 which covers full storyboard coverage and comply() migration. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
bokelley added a commit that referenced this pull request Apr 8, 2026
* fix: storyboard validation failures against training agent

- Add creative and account capability blocks to get_adcp_capabilities response so capability_discovery validation passes (#1990)
- Increase training agent rate limit from 60 to 300 req/min to prevent cascading failures during bulk storyboard evaluation (#1991, #1992, #1994)
- Add brand_rights and capability_discovery storyboard YAMLs (#1993, #1992)
- Add brand compliance track to TRACK_SCENARIOS infrastructure (#1993)
- Update storyboard test to accept core/ and brand/ schema_ref prefixes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address code review and security review findings

- Update stale rate limit comment (60→300 req/min) in task TTL calculation
- Add 'brand' to supported_protocols since training agent implements brand tools
- Update test expectation for supported_protocols

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: training agent schema compliance for creative handlers

- Add created_date and updated_date to list_creatives response (required by schema)
- Add include_snapshot param to list_creatives tool schema and return snapshot_unavailable_reason when requested
- Fix build_creative assets format: use object map with content field instead of array with html field (matches creative-manifest.json schema)
- Add generative build mode (target_format_id only, no manifest/library) since training agent declares supports_generation: true
- Fix preview_creative render format: use render_id, output_format, role, preview_url/preview_html per preview-render.json schema

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address code review and security review findings (round 2)

- Fix preview_creative response: add response_type, preview_id, input fields per preview-creative-response.json schema
- Fix preview_creative: reject invalid format_ids with INVALID_FORMAT error
- Fix BuildCreativeArgs/PreviewCreativeArgs: type assets as Record (object map) matching creative-manifest.json schema, not Array
- Cap target_format_ids at 50 to prevent response amplification
- Remove non-schema synced_at field from list_creatives response

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: scope PR to training agent fixes only

Remove storyboard YAMLs (brand_rights, capability_discovery) and brand compliance track; these belong in #1985, which covers full storyboard coverage and comply() migration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: type training agent responses with @adcp/client types

Import BuildCreativeResponse, ListCreativesResponse, PreviewCreativeResponse, and CreativeManifest from @adcp/client and use them as return types. The compiler now catches shape mismatches at build time:

- Removed creative_id from CreativeManifest (not in schema)
- Changed multi-format response from { results } to { creative_manifests }
- buildHtmlAssets returns AdcpCreativeManifest['assets'] type

Previously, handlers returned untyped object literals; the MCP SDK wraps them in content[0].text as opaque strings, so TypeScript couldn't validate the domain payload shape.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
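The assets-format fix and the target_format_ids cap in the commit above could look roughly like the sketch below. It is illustrative only: the LegacyAsset shape, the asset_id key, and the helper names toAssetMap and capTargetFormatIds are hypothetical, and rejecting oversized requests (rather than truncating them) is an assumption the commit message does not confirm.

```typescript
// Hypothetical legacy shape: an array of assets, each carrying an `html` field.
interface LegacyAsset {
  asset_id: string;
  html: string;
}

// Object-map shape: keyed by asset id, each entry carrying a `content` field.
type AssetMap = Record<string, { content: string }>;

// Cap taken from the commit message; how it is enforced is an assumption.
const MAX_TARGET_FORMAT_IDS = 50;

// Convert the legacy array form into the object-map form.
function toAssetMap(assets: LegacyAsset[]): AssetMap {
  const map: AssetMap = {};
  for (const asset of assets) {
    map[asset.asset_id] = { content: asset.html };
  }
  return map;
}

// Assumption: reject requests over the cap instead of silently truncating.
function capTargetFormatIds(ids: string[]): string[] {
  if (ids.length > MAX_TARGET_FORMAT_IDS) {
    throw new Error(
      `target_format_ids accepts at most ${MAX_TARGET_FORMAT_IDS} entries, got ${ids.length}`
    );
  }
  return ids;
}
```

Rejecting makes the limit visible to callers; truncating would bound the response size just as well but could hide dropped formats.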
Add five new storyboards and fix comply() performance for storyboard evaluations.

Storyboard coverage:
- capability_discovery: get_adcp_capabilities validation (#429)
- campaign_governance_denied: hard denial with no escalation (#430)
- campaign_governance_conditions: conditional approval flow (#430)
- campaign_governance_delivery: delivery drift monitoring (#430)
- creative_lifecycle: multi-format sync, list, build, preview (#431)

Comply performance (#433):
- extractScenariosFromStoryboard() extracts comply_scenario values from YAML
- filterToKnownScenarios() validates scenario names against TRACK_SCENARIOS
- Storyboard run endpoint now passes only referenced scenarios to comply(), reducing calls from 30+ to 2-5 per storyboard evaluation

Track changes:
- Add campaign_governance track with 5 scenarios
- Add creative_lifecycle to creative track

Closes #429, #430, #431, #433. Partial progress on #432.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
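The scenario filtering described in this commit can be sketched as follows. This is a minimal illustration, not the merged implementation: it assumes the storyboard YAML has already been parsed into phases and steps (a hypothetical shape), and the TRACK_SCENARIOS contents shown are a small illustrative subset whose track assignments are guesses.

```typescript
// Hypothetical parsed-storyboard shape; only the fields used here.
interface StoryboardStep {
  id: string;
  comply_scenario?: string;
}
interface StoryboardPhase {
  steps?: StoryboardStep[];
}
interface Storyboard {
  id: string;
  phases?: StoryboardPhase[];
}

// Illustrative subset; the real table is larger and may group differently.
const TRACK_SCENARIOS: Record<string, string[]> = {
  core: ["capability_discovery"],
  media_buy: ["full_sales_flow", "create_media_buy"],
  creative: ["creative_lifecycle"],
};

// Collect every comply_scenario referenced by a storyboard, deduplicated,
// in first-seen order. Returns [] when nothing is annotated.
function extractScenariosFromStoryboard(storyboard: Storyboard): string[] {
  const seen = new Set<string>();
  for (const phase of storyboard.phases ?? []) {
    for (const step of phase.steps ?? []) {
      if (step.comply_scenario) seen.add(step.comply_scenario);
    }
  }
  return Array.from(seen);
}

// Drop names that no track defines, so phantom scenarios never reach comply().
function filterToKnownScenarios(names: string[]): string[] {
  const known = new Set<string>();
  for (const track of Object.keys(TRACK_SCENARIOS)) {
    for (const scenario of TRACK_SCENARIOS[track] ?? []) known.add(scenario);
  }
  return names.filter((name) => known.has(name));
}
```

Passing only the filtered list to comply() is what would cut a storyboard evaluation from the full scenario set down to the handful the storyboard actually references.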
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Five new storyboards covering uncovered protocol domains:
- social_platform: accounts, audiences, native creatives, events, financials
- si_session: SI offering discovery, session lifecycle
- brand_rights: identity, rights licensing, creative approval
- property_governance: property list CRUD, delivery validation
- content_standards: standards CRUD, calibration, delivery validation

comply() migration from track-based to storyboard-based routing:
- PLATFORM_STORYBOARDS maps each platform type to recommended storyboards
- comply() accepts storyboards option (priority: scenarios > storyboards > tracks)
- evaluate_agent_quality uses PLATFORM_STORYBOARDS when no explicit tracks
- Heartbeat job uses storyboard routing when agent has a registered platform type
- Compare endpoint now filters to storyboard scenarios

Fixed comply_scenario annotations across all original storyboards:
- Replaced phantom names (account_setup, governance_setup, media_buy_flow) with real TestScenario values (full_sales_flow, create_media_buy, etc.)
- Added comply_scenario to all 7 storyboards that had none
- 20/21 storyboards now have test coverage (brand_rights pending #1993)

Validation against training agent filed as issues:
- #1990: capability_discovery creative validation failure
- #1991: creative_ad_server SSE/rate limit errors
- #1992: 15 storyboards skip (agent tool discovery gap)
- #1993: Brand protocol needs @adcp/client scenarios
- #1994: Bulk validation needs backoff

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
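The option priority stated above (scenarios > storyboards > tracks) might be resolved along these lines. The ComplyOptions shape, the resolveScenarios helper, and both lookup tables are hypothetical stand-ins; treating an empty array as "not provided" and falling back to every track scenario when nothing is given are assumptions.

```typescript
// Hypothetical option shape mirroring the three routing inputs.
interface ComplyOptions {
  scenarios?: string[];
  storyboards?: string[];
  tracks?: string[];
}

// Illustrative lookup tables; the real mappings are larger.
const STORYBOARD_SCENARIOS = new Map<string, string[]>([
  ["capability_discovery", ["capability_discovery"]],
  ["creative_lifecycle", ["creative_lifecycle"]],
]);
const TRACK_SCENARIOS = new Map<string, string[]>([
  ["media_buy", ["full_sales_flow", "create_media_buy"]],
  ["creative", ["creative_lifecycle"]],
]);

// Union of the scenarios behind the given keys, deduplicated in first-seen order.
function collect(keys: string[], table: Map<string, string[]>): string[] {
  const out = new Set<string>();
  for (const key of keys) {
    for (const scenario of table.get(key) ?? []) out.add(scenario);
  }
  return Array.from(out);
}

// Explicit scenarios win, then storyboards, then tracks.
function resolveScenarios(options: ComplyOptions): string[] {
  if (options.scenarios && options.scenarios.length > 0) {
    return Array.from(new Set(options.scenarios));
  }
  if (options.storyboards && options.storyboards.length > 0) {
    return collect(options.storyboards, STORYBOARD_SCENARIOS);
  }
  if (options.tracks && options.tracks.length > 0) {
    return collect(options.tracks, TRACK_SCENARIOS);
  }
  // Assumption: with nothing specified, run every track scenario.
  return collect(Array.from(TRACK_SCENARIOS.keys()), TRACK_SCENARIOS);
}
```

Resolving everything to a flat scenario list in one place keeps the downstream runner unaware of which routing input the caller used.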
fc5f3be to b5ee24c
This was referenced Apr 8, 2026
The capability_discovery scenario fails with "Tools suggest unreported protocols: compliance (has: [comply_test_controller])" because the training agent exposes comply_test_controller but doesn't declare compliance in supported_protocols. Task retrieval failures (callToolStream over stateless HTTP) tracked separately in adcontextprotocol/adcp-client#442. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
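A check of the kind that produces this failure could be sketched as below. The tool-to-protocol table and both function names are hypothetical; only the comply_test_controller → compliance pairing and the message wording come from the text above, and the list_creatives entry is purely illustrative.

```typescript
// Hypothetical mapping from tool name to the protocol it implies.
const TOOL_PROTOCOLS = new Map<string, string>([
  ["comply_test_controller", "compliance"],
  ["list_creatives", "creative"], // illustrative entry
]);

// Group advertised tools by implied protocols the agent did not declare.
function findUnreportedProtocols(
  tools: string[],
  supportedProtocols: string[]
): Map<string, string[]> {
  const declared = new Set(supportedProtocols);
  const unreported = new Map<string, string[]>();
  for (const tool of tools) {
    const protocol = TOOL_PROTOCOLS.get(tool);
    if (protocol === undefined || declared.has(protocol)) continue;
    const existing = unreported.get(protocol) ?? [];
    existing.push(tool);
    unreported.set(protocol, existing);
  }
  return unreported;
}

// Render the finding in the style of the failure message; null when clean.
function formatFinding(unreported: Map<string, string[]>): string | null {
  if (unreported.size === 0) return null;
  const parts: string[] = [];
  unreported.forEach((tools, protocol) => {
    parts.push(`${protocol} (has: [${tools.join(", ")}])`);
  });
  return `Tools suggest unreported protocols: ${parts.join(", ")}`;
}
```

Under this sketch the fix is either to add compliance to supported_protocols or to stop advertising the tool; both make the finding disappear.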
Picks up the serve() task store fix (adcp-client#443) and skip counting fix (adcp-client#441). Local validation shows 20/20 storyboards passing against the training agent. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rsion) Keep storyboard routing + @adcp/client 4.22.1 from our branch. Add userAgent and outbound request logging from main. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
bokelley added a commit that referenced this pull request Apr 8, 2026
…efinitions

Add 4 storyboards that define protocol compliance requirements:
- schema_validation: response schema conformance + temporal constraints
- behavioral_analysis: brief filtering, response consistency, pricing edge cases
- error_compliance: error codes, recovery hints, transport bindings (L1/L2/L3)
- media_buy_state_machine: state transitions + terminal state enforcement

These were developed in @adcp/client but describe agent behavior requirements, not client internals. The adcp repo is the canonical source for protocol storyboards; @adcp/client will pull from here.

Also updates PLATFORM_STORYBOARDS to include the new storyboards for all sales platforms (schema_validation, behavioral_analysis, error_compliance, media_buy_state_machine where applicable).

25 storyboards total. brand_rights already existed from #1985. deterministic_testing stays in @adcp/client (test harness, not protocol).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
bokelley added a commit that referenced this pull request Apr 8, 2026
* feat: promote 4 protocol storyboards from @adcp/client to canonical definitions

Add 4 storyboards that define protocol compliance requirements:
- schema_validation: response schema conformance + temporal constraints
- behavioral_analysis: brief filtering, response consistency, pricing edge cases
- error_compliance: error codes, recovery hints, transport bindings (L1/L2/L3)
- media_buy_state_machine: state transitions + terminal state enforcement

These were developed in @adcp/client but describe agent behavior requirements, not client internals. The adcp repo is the canonical source for protocol storyboards; @adcp/client will pull from here.

Also updates PLATFORM_STORYBOARDS to include the new storyboards for all sales platforms (schema_validation, behavioral_analysis, error_compliance, media_buy_state_machine where applicable).

25 storyboards total. brand_rights already existed from #1985. deterministic_testing stays in @adcp/client (test harness, not protocol).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: add empty changeset

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address code review (consistent platform mappings and per-storyboard tests)

- Add media_buy_state_machine to all media-buy-capable platforms (retail_media, search_platform, audio_platform, ai_ad_network, social_platform)
- Add behavioral_analysis and error_compliance to ai_platform
- Add per-storyboard test blocks for schema_validation, behavioral_analysis, error_compliance, and media_buy_state_machine

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add track, required_tools, platform_types to all storyboards

All 25 storyboards now include:
- track: compliance track for grouping results (core, products, media_buy, etc.)
- required_tools: tools the agent must advertise for the storyboard to run
- platform_types: (where applicable) which platform types this storyboard applies to

These fields are required by @adcp/client's loader functions (getStoryboardsForPlatformType, getComplianceStoryboardsForTrack). Without them, the client cannot filter or group storyboards.

Also updates the Storyboard TypeScript interface to include the new fields.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: mark PLATFORM_STORYBOARDS as deprecated

This mapping will move to @adcp/client once adcp-client#445 ships storyboard-based comply(). Callers will pass platform_type to comply() and let the client resolve storyboards internally.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
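To show why the loader needs track, required_tools, and platform_types, here is a rough sketch of filtering on those fields. The helper names are deliberately different from @adcp/client's real loader functions, whose signatures are not shown in this thread; the sample entries, their track assignments, and the rule that a storyboard without platform_types applies to every platform are all assumptions.

```typescript
// Metadata fields named in the commit; other storyboard fields omitted.
interface StoryboardMeta {
  id: string;
  track: string;
  required_tools: string[];
  platform_types?: string[];
}

// Illustrative entries only; real track and tool assignments may differ.
const STORYBOARDS: StoryboardMeta[] = [
  { id: "schema_validation", track: "core", required_tools: ["get_adcp_capabilities"] },
  {
    id: "media_buy_state_machine",
    track: "media_buy",
    required_tools: ["create_media_buy"],
    platform_types: ["retail_media", "social_platform"],
  },
  {
    id: "creative_lifecycle",
    track: "creative",
    required_tools: ["list_creatives", "build_creative", "preview_creative"],
  },
];

// Assumption: a storyboard with no platform_types applies to every platform.
function storyboardsForPlatform(all: StoryboardMeta[], platformType: string): StoryboardMeta[] {
  return all.filter(
    (s) => s.platform_types === undefined || s.platform_types.some((p) => p === platformType)
  );
}

// Group results by compliance track.
function storyboardsForTrack(all: StoryboardMeta[], track: string): StoryboardMeta[] {
  return all.filter((s) => s.track === track);
}

// Tools the agent would still need to advertise before the storyboard can run.
function missingTools(storyboard: StoryboardMeta, advertised: string[]): string[] {
  const have = new Set(advertised);
  return storyboard.required_tools.filter((tool) => !have.has(tool));
}
```

A storyboard with a non-empty missingTools result can be skipped up front with a clear reason instead of failing midway through its steps.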
bokelley added a commit that referenced this pull request Apr 8, 2026
Integrates scenario filtering from #1985 with two-phase comply():
- Phase 1 filters scenarios to foundation tracks only
- Phase 2 filters to remaining tracks
- Both respect the storyboard-specific scenario list when provided

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
bokelley added a commit that referenced this pull request Apr 8, 2026
The two-phase approach (run core+products first, then skip dependent tracks) is no longer needed: #1985's scenario filtering already limits comply() to only the scenarios a storyboard references, and @adcp/client's storyboard runner will handle step-by-step short-circuiting. Remove isTrackProductDependent(), foundation/remaining track splitting, and the product-availability check. comply() now runs all requested scenarios in a single testAllScenarios() call. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
bokelley added a commit that referenced this pull request Apr 8, 2026
* feat: storyboard UX improvements and OAuth auth support

Short-circuit comply() when product discovery fails: skip dependent tracks instead of repeating the same error across 10+ scenarios. Saves significant time for agents with broken product schemas.

Add inline agent connection form to dashboard; it replaces the chat bounce for saving auth tokens and platform type. New PUT /registry/agents/:url/connect endpoint handles token storage directly.

Filter storyboard picker by agent capabilities: groups into Recommended and Other based on the agent's compliance tracks. Signal storyboards show for signal agents, creative for creative agents, etc.

Add creative_generative.yaml storyboard, which covers the brief-driven generation flow (OpenAds, generative DSPs) with 5 phases: format discovery, generate from brief, refine, multi-format, production build.

Fix resolveOwnerAuth() to fall back to OAuth tokens when no static bearer token exists, with 5-minute expiration buffer.

Auto-scroll storyboard results into view after rendering.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: add changeset for storyboard UX improvements

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address code review findings

- Add max-length validation (4096) on auth_token input
- Validate auth_type against allowed values instead of silent fallback
- Validate platform_type is a string before set membership check
- Combine ownership verification and org lookup into single query
- Log warning when OAuth token has no expiration recorded
- Preserve agentTracks context through storyboard back button navigation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add rate limiter to agent connect endpoint

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: collapse two-phase comply() into single pass

The two-phase approach (run core+products first, then skip dependent tracks) is no longer needed: #1985's scenario filtering already limits comply() to only the scenarios a storyboard references, and @adcp/client's storyboard runner will handle step-by-step short-circuiting.

Remove isTrackProductDependent(), foundation/remaining track splitting, and the product-availability check. comply() now runs all requested scenarios in a single testAllScenarios() call.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: remove unused variable flagged by CodeQL

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
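The resolveOwnerAuth() fallback described in this squash commit might behave roughly as sketched below. The function is named resolveOwnerAuthSketch to make clear it is not the real one; the StoredCredentials shape, the injected clock and warn callback, and the decisions to return null for a near-expiry token and to still use a token with no recorded expiration are all assumptions.

```typescript
// Hypothetical stored-credential shape; expiresAt is epoch milliseconds.
interface StoredCredentials {
  bearerToken?: string;
  oauth?: { accessToken: string; expiresAt?: number };
}

type ResolvedAuth = { type: "bearer" | "oauth"; token: string } | null;

// 5-minute buffer from the commit message.
const EXPIRY_BUFFER_MS = 5 * 60 * 1000;

function resolveOwnerAuthSketch(
  creds: StoredCredentials,
  now: number = Date.now(),
  warn: (message: string) => void = () => {}
): ResolvedAuth {
  // A static bearer token, when present, takes precedence.
  if (creds.bearerToken) {
    return { type: "bearer", token: creds.bearerToken };
  }
  const oauth = creds.oauth;
  if (!oauth) return null;

  if (oauth.expiresAt === undefined) {
    // Assumption: still use the token, but surface the missing expiration.
    warn("OAuth token has no expiration recorded");
    return { type: "oauth", token: oauth.accessToken };
  }
  // Assumption: a token expiring within the buffer is treated as unusable.
  if (oauth.expiresAt - now <= EXPIRY_BUFFER_MS) return null;
  return { type: "oauth", token: oauth.accessToken };
}
```

The buffer avoids handing out a token that would expire partway through a multi-scenario evaluation; a real implementation would more likely refresh the token at that point than give up.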
Summary
- capability_discovery.yaml storyboard covering get_adcp_capabilities, the only task that had no storyboard coverage
- Campaign governance storyboards (denied, conditions, delivery) with a new campaign_governance compliance track
- creative_lifecycle.yaml covering multi-format sync, library listing, build, and preview

Changes

New storyboards (5):
- capability_discovery.yaml: protocol-level capability introspection
- campaign_governance_denied.yaml: hard denial, no escalation
- campaign_governance_conditions.yaml: conditional approval with binding conditions
- campaign_governance_delivery.yaml: delivery monitoring with drift re-check
- creative_lifecycle.yaml: sync 3 formats, list/filter, build/preview

Comply performance fix:
- extractScenariosFromStoryboard() extracts comply_scenario values from storyboard YAML
- filterToKnownScenarios() validates extracted names against TRACK_SCENARIOS
- Storyboard run endpoint passes only referenced scenarios to comply()

Track changes:
- Add campaign_governance track with 5 scenarios
- Add creative_lifecycle to existing creative track

Test plan
- protocol/refs
- extractScenariosFromStoryboard tested for dedup and empty cases

Closes #429, #430, #431, #433. Partial progress on #432.
🤖 Generated with Claude Code