Cross-Pipeline A/B Testing Framework for Intelligence Feature Validation #486

@github-actions

Description

Strategic Improvement

The config sets ab_test_ratio=0.2, but nothing implements it. Build an A/B test harness that routes pipelines to variant configurations (template, model routing, iteration counts), tracks outcomes per variant, and computes statistical significance. This accelerates data-driven optimization and validates intelligence improvements.
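
As a rough illustration of the routing step, here is a minimal Python sketch of deterministic assignment driven by ab_test_ratio. The function name assign_variant, the experiment id string, and the choice of SHA-256 with 10,000 buckets are all assumptions made for illustration; the issue does not specify a hashing scheme or an implementation language.

```python
import hashlib

BUCKETS = 10_000  # hypothetical resolution: ratios are honored to 0.01%


def assign_variant(experiment_id: str, issue_number: int,
                   ab_test_ratio: float = 0.2) -> str:
    """Deterministically map an issue to 'control' or 'variant'.

    The same (experiment_id, issue_number) pair always hashes to the same
    bucket, so re-spawning a pipeline for an issue never flips its arm.
    """
    if not 0.0 <= ab_test_ratio <= 1.0:
        raise ValueError("ab_test_ratio must be between 0 and 1")
    key = f"{experiment_id}:{issue_number}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an integer bucket in 0..9999.
    bucket = int.from_bytes(digest[:8], "big") % BUCKETS
    threshold = round(ab_test_ratio * BUCKETS)
    return "variant" if bucket < threshold else "control"


if __name__ == "__main__":
    # "model-routing-v2" is a made-up experiment id for the example.
    for issue in (486, 487, 488):
        print(issue, assign_variant("model-routing-v2", issue))
```

Including the experiment id in the hash key means an issue can land in different arms for different experiments, which avoids correlating assignments across concurrent experiments.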

Acceptance Criteria

  • An A/B test registry in daemon-config.json defines active experiments with their variant configs
  • Pipeline spawn routes each issue to control or variant according to ab_test_ratio, using stable hashing so the same issue always gets the same variant
  • Outcomes are tracked per variant: success rate, duration, cost, DORA metrics
  • The CLI command shipwright ab-test report <experiment-id> shows a variant comparison with confidence intervals
  • The winning variant auto-graduates once the significance threshold is met (p<0.05, n>30 per arm)
  • Integration with the intelligence engine suggests experiments based on optimization opportunities
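
The auto-graduate criterion (p<0.05, n>30 per arm) could be checked along the following lines. This is a sketch under stated assumptions: the issue does not name a statistical test, so a pooled two-proportion z-test on success rate is used here as one common choice, and the names ArmStats, two_proportion_p_value, and should_graduate are hypothetical. Duration, cost, and DORA metrics would need their own comparisons.

```python
import math
from dataclasses import dataclass


@dataclass
class ArmStats:
    """Outcome counts for one arm of an experiment (hypothetical shape)."""
    successes: int
    runs: int

    @property
    def rate(self) -> float:
        return self.successes / self.runs if self.runs else 0.0


def two_proportion_p_value(control: ArmStats, variant: ArmStats) -> float:
    """Two-sided p-value from a pooled two-proportion z-test."""
    if control.runs == 0 or variant.runs == 0:
        return 1.0
    total = control.runs + variant.runs
    pooled = (control.successes + variant.successes) / total
    se = math.sqrt(pooled * (1 - pooled)
                   * (1 / control.runs + 1 / variant.runs))
    if se == 0:
        # Both arms are all-success or all-failure: no evidence of a difference.
        return 1.0
    z = (variant.rate - control.rate) / se
    # P(|Z| > |z|) for a standard normal Z.
    return math.erfc(abs(z) / math.sqrt(2))


def should_graduate(control: ArmStats, variant: ArmStats,
                    alpha: float = 0.05, min_runs: int = 30) -> bool:
    """True when the variant beats control under the stated thresholds."""
    if control.runs <= min_runs or variant.runs <= min_runs:
        return False  # the criterion requires n > 30 in each arm
    if variant.rate <= control.rate:
        return False  # only a better-performing variant can graduate
    return two_proportion_p_value(control, variant) < alpha


if __name__ == "__main__":
    control = ArmStats(successes=20, runs=40)
    variant = ArmStats(successes=36, runs=40)
    print(two_proportion_p_value(control, variant))
    print(should_graduate(control, variant))
```

The normal approximation behind the z-test is weak for small samples or success rates near 0 or 1, so an exact test may be a better fit if arms stay close to the n>30 minimum.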

Context

  • Priority: P2
  • Complexity: standard
  • Generated by: Strategic Intelligence Agent
  • Strategy alignment: P2: Intelligence & Learning

Metadata

Labels

  • auto-patrol: Created by autonomous patrol agents
  • ready-to-build: Issue is ready for autonomous pipeline processing
  • strategic: Created by strategic intelligence agent

