Strategic Improvement
The config sets ab_test_ratio=0.2, but nothing implements it. Build an A/B test harness that routes pipelines to variant configurations (template, model routing, iteration counts), tracks outcomes per variant, and computes statistical significance. This accelerates data-driven optimization and validates intelligence improvements.
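One possible shape for the routing step, as a minimal sketch: hash the experiment ID together with the issue ID and map the result to a bucket, so the same issue always lands in the same arm. The function name `assign_variant`, the key format, and the 10,000-bucket granularity are assumptions for illustration, not existing Shipwright code.

```python
import hashlib

# Hypothetical helper: not part of the existing codebase.
BUCKETS = 10_000  # assumed granularity; gives 0.01% resolution on the ratio


def assign_variant(experiment_id: str, issue_id: str, ab_test_ratio: float = 0.2) -> str:
    """Deterministically assign an issue to "control" or "variant".

    Hashing experiment_id together with issue_id keeps the assignment
    stable across runs and independent between experiments.
    """
    key = f"{experiment_id}:{issue_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Take the first 8 bytes as an integer and reduce to a bucket number.
    bucket = int.from_bytes(digest[:8], "big") % BUCKETS
    return "variant" if bucket < ab_test_ratio * BUCKETS else "control"
```

Using a cryptographic hash instead of Python's built-in `hash()` matters here, because `hash()` on strings is randomized per process and would break the "same issue, same variant" guarantee.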
Acceptance Criteria
- A/B test registry in daemon-config.json defines active experiments with variant configs
- Pipeline spawn routes to control/variant based on ab_test_ratio with stable hashing (same issue always same variant)
- Outcomes tracked per variant: success rate, duration, cost, DORA metrics
- CLI command shipwright ab-test report <experiment-id> shows variant comparison with confidence intervals
- Auto-graduate winning variant after significance threshold (p<0.05, n>30 per arm)
- Integration with intelligence engine to suggest experiments based on optimization opportunities
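The auto-graduation criterion could be checked along these lines. This is a sketch under stated assumptions: it compares success rates only, using a pooled two-proportion z-test, which is one reasonable choice and not something this issue prescribes. The function names and argument layout are hypothetical; the thresholds (p<0.05, n>30 per arm) come from the criteria above.

```python
import math


def two_proportion_p_value(success_a: int, n_a: int, success_b: int, n_b: int) -> float:
    """Two-sided p-value for a difference in success rates (pooled z-test)."""
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Both arms are all-success or all-failure: no evidence of a difference.
        return 1.0
    z = (success_b / n_b - success_a / n_a) / se
    # For a standard normal, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical helper: names and defaults are illustrative assumptions.
def should_graduate(success_control: int, n_control: int,
                    success_variant: int, n_variant: int,
                    alpha: float = 0.05, min_n: int = 30) -> bool:
    """True when the variant beats control at p < alpha with n > min_n per arm."""
    if n_control <= min_n or n_variant <= min_n:
        return False
    if success_variant / n_variant <= success_control / n_control:
        return False  # significant or not, the variant is not the winner
    p = two_proportion_p_value(success_control, n_control, success_variant, n_variant)
    return p < alpha
```

Duration and cost are continuous metrics and would need a different test, for example a t-test or a bootstrap. Checking significance repeatedly as data arrives also inflates the false-positive rate, so a real implementation may want a fixed evaluation point or a sequential-testing correction.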
Context
- Priority: P2
- Complexity: standard
- Generated by: Strategic Intelligence Agent
- Strategy alignment: P2: Intelligence & Learning