[v0.8] Add semantic reviewer agent pack for evidence and release risk rating #92

Parent: #89

Depends on: #91

## Summary

Add a semantic reviewer agent pack that rates evidence and release risks against fixed rubrics and writes its findings to `output/intermediate/semantic_review_report.json`.

The agents judge; they do not edit. Python validates the judgments and later converts them into evidence/release blockers.

## Motivation

Several v0.8 risks are semantic and cannot be robustly solved by deterministic code alone:

- whether a source authority is sufficient for a claim type;
- whether a source actually supports the claim or only part of it;
- whether fund-flow / price / inventory metrics mix incompatible scopes;
- whether a legal/policy/company-event claim lacks official source coverage;
- whether institutional branding or confidential labels lack authorization context.

MABW should use agents for rubric-based review, but keep authority in deterministic schemas and policy packs.

## Proposed agent files

Add Claude Code agent prompts first, then mirror them to other runtimes if needed:

```text
.claude/agents/source-authority-judge.md
.claude/agents/source-support-judge.md
.claude/agents/metric-scope-judge.md
.claude/agents/official-source-coverage-judge.md
.claude/agents/branding-authorization-judge.md
.claude/agents/release-committee-judge.md
```

For later runtime parity, the prompts may be mirrored to:

```text
.opencode/agents/
.codex/agents/
.agents/skills/
```

## Inputs

Judges may read:

```text
output/intermediate/audited_brief.md
output/intermediate/claim_ledger.json
output/source_appendix.md
output/delivery/brief.md
config.yaml
```

Judges must not mutate:

```text
output/intermediate/audited_brief.md
output/intermediate/claim_ledger.json
output/delivery/*
```
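
The no-mutation rule could be checked around a judge run by hashing the protected files before and after. The following is a minimal sketch, not part of the proposal itself: the helper names `protected_files`, `snapshot`, and `mutated` are hypothetical, and it assumes paths are resolved relative to a project root.

```python
import hashlib
from pathlib import Path

# Patterns copied from the "must not mutate" list above.
PROTECTED = [
    "output/intermediate/audited_brief.md",
    "output/intermediate/claim_ledger.json",
    "output/delivery/*",
]


def protected_files(root):
    """Expand the protected patterns under root into a sorted list of existing files."""
    root = Path(root)
    return sorted(p for pattern in PROTECTED for p in root.glob(pattern) if p.is_file())


def snapshot(paths):
    """Map each file path to the SHA-256 hex digest of its bytes."""
    return {str(p): hashlib.sha256(Path(p).read_bytes()).hexdigest() for p in paths}


def mutated(before, after):
    """Return sorted paths that were changed, added, or removed between two snapshots."""
    return sorted(p for p in before.keys() | after.keys() if before.get(p) != after.get(p))
```

A wrapper would take `snapshot(protected_files(root))` before invoking a judge, take it again afterwards, and treat any non-empty result from `mutated` as a failed run.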

## Output

All judges contribute to:

```text
output/intermediate/semantic_review_report.json
```

Output must satisfy the contract from #91.

## Required judge roles

### 1. source-authority-judge
Rates whether each source authority fits the claim category.

Example finding:

```json
{
  "judge_id": "source_authority_judge",
  "finding_type": "source_authority_insufficient",
  "claim_category": "legal_trade_remedy",
  "source_authority": "reputable_financial_media",
  "required_authority_for_mode": "official_legal_regulatory",
  "rating": 1,
  "rating_label": "insufficient_for_formal_release",
  "verification_path": "Attach official legal/regulatory text."
}
```
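
A finding shaped like the example could be sanity-checked on the Python side before it reaches policy code. This is a sketch under stated assumptions: the authoritative schema is the contract from #91, which is not reproduced here, so the required keys are inferred from the example above, and the threshold at which `verification_path` becomes mandatory (`REVIEW_THRESHOLD`) is a hypothetical stand-in for "warning/blocker".

```python
# Keys inferred from the example finding; the real list lives in the #91 contract.
REQUIRED_KEYS = ("judge_id", "finding_type", "rating", "rating_label")

# Assumption: ratings at or below this value count as a warning or blocker.
REVIEW_THRESHOLD = 2


def check_finding(finding):
    """Return a list of human-readable problems; an empty list means the finding passed."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in finding]

    rating = finding.get("rating")
    # bool is a subclass of int in Python, so exclude it explicitly.
    valid = isinstance(rating, int) and not isinstance(rating, bool) and 0 <= rating <= 4
    if "rating" in finding and not valid:
        problems.append("rating must be an integer from 0 to 4")

    path = finding.get("verification_path")
    has_path = isinstance(path, str) and bool(path.strip())
    if valid and rating <= REVIEW_THRESHOLD and not has_path:
        problems.append("verification_path is required for low ratings")

    return problems
```

Returning a list of problems rather than raising keeps the check usable for reporting every defect in a report at once.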

### 2. source-support-judge
Rates whether the cited source/evidence directly supports the claim.

Must identify:

- supported parts;
- unsupported or overbroad parts;
- recommended rewrite if the claim should be narrowed.

### 3. metric-scope-judge
Rates whether market metrics have comparable scope.

Must check:

- provider;
- universe;
- time window;
- unit;
- classification system;
- calculation method where available.

Typical blocker: multiple A-share fund-flow numbers are cited but come from different providers/universes/time windows.
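
The scope comparison itself is mechanical once the judge has extracted the scope attributes; the semantic work is the extraction. A minimal sketch of the comparison step, assuming each metric has already been reduced to a dict whose keys (hypothetical names) mirror the checklist above:

```python
# Field names are illustrative; they mirror the checklist above.
SCOPE_FIELDS = (
    "provider",
    "universe",
    "time_window",
    "unit",
    "classification_system",
    "calculation_method",
)


def scope_mismatches(metrics):
    """Return {field: sorted distinct values} for every scope field the metrics disagree on.

    Fields that are absent or None are skipped, which matches the
    "where available" wording for calculation method.
    """
    mismatches = {}
    for field in SCOPE_FIELDS:
        values = {m[field] for m in metrics if m.get(field) is not None}
        if len(values) > 1:
            mismatches[field] = sorted(values)
    return mismatches
```

For two fund-flow figures from different providers over different windows, the result would name `provider` and `time_window` as the mismatched fields.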

### 4. official-source-coverage-judge
Rates whether legal, policy, company-event, exchange, and official statistics claims have appropriate official coverage.

Must distinguish:

- official text present;
- media source only;
- market expectation;
- latest official check missing.

### 5. branding-authorization-judge
Rates whether an institution name, confidential/internal labels, or formal-distribution wording appear without authorization context.

Typical blocker:

```text
Confidential — Internal Use Only
<Institution> research weekly
for <Institution> research use
```

### 6. release-committee-judge
Aggregates upstream semantic findings only. It must not re-litigate every claim or override Python policy packs.
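
One way to read "aggregates only" is that the committee summarizes the ratings it receives without producing new per-claim judgments. The sketch below assumes a lowest-rating-wins rule, which this issue does not specify; the function name and output keys are hypothetical.

```python
def aggregate(findings):
    """Summarize upstream findings by the lowest rating each judge reported.

    Assumption: the most conservative (lowest) rating drives the summary.
    No finding is re-rated and no policy decision is made here.
    """
    lowest_by_judge = {}
    for finding in findings:
        judge = finding["judge_id"]
        # 4 is the top of the scale, so it is a neutral starting value for min().
        lowest_by_judge[judge] = min(lowest_by_judge.get(judge, 4), finding["rating"])
    return {
        "lowest_rating_by_judge": lowest_by_judge,
        "lowest_rating": min(lowest_by_judge.values(), default=None),
    }
```

Whether a given summary blocks a release would still be decided by the Python policy packs, not by this step.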

## Prompt guardrails

Every judge prompt must include:

```text
You are a reviewer, not an editor.
Do not modify the brief, Claim Ledger, delivery files, or source files.
Return schema-valid JSON only.
Use the rubric.
If uncertain, choose the lower release eligibility and require human review.
Your finding is evidence for downstream policy; it is not final publication authority.
```
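
Because every prompt must carry the same guardrail text, a plain substring check can catch a prompt file that drifted. A minimal sketch, assuming the guardrails are included verbatim; `missing_guardrails` is a hypothetical helper name.

```python
# Guardrail lines copied verbatim from the block above.
REQUIRED_GUARDRAILS = [
    "You are a reviewer, not an editor.",
    "Do not modify the brief, Claim Ledger, delivery files, or source files.",
    "Return schema-valid JSON only.",
    "Use the rubric.",
    "If uncertain, choose the lower release eligibility and require human review.",
    "Your finding is evidence for downstream policy; it is not final publication authority.",
]


def missing_guardrails(prompt_text):
    """Return the guardrail lines that do not appear verbatim in the prompt text."""
    return [line for line in REQUIRED_GUARDRAILS if line not in prompt_text]
```

A test could read each file under `.claude/agents/` and assert that `missing_guardrails` returns an empty list for it.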

## Rating scale

Use 0-4:

```yaml
4: strong / required authority present / directly supported
3: usable, minor caveat
2: draft-usable but not research/formal-release ready
1: insufficient for the selected mode
0: unsupported, misleading, or unauthorized
```
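
On the Python side, the scale could be represented as an ordered enum so that "choose the lower release eligibility" becomes a simple `min`. This is a sketch only: the member names are hypothetical shorthand for the descriptions above, not labels defined by this issue.

```python
from enum import IntEnum


class Rating(IntEnum):
    """0-4 rating scale; member names are illustrative shorthand."""

    UNSUPPORTED = 0    # unsupported, misleading, or unauthorized
    INSUFFICIENT = 1   # insufficient for the selected mode
    DRAFT_USABLE = 2   # draft-usable but not research/formal-release ready
    USABLE = 3         # usable, minor caveat
    STRONG = 4         # strong / required authority present / directly supported


def more_conservative(a, b):
    """Return the lower of two ratings, following the 'if uncertain, choose lower' guardrail."""
    return min(a, b)
```

Using `IntEnum` keeps the values comparable with the plain integers that appear in the JSON findings.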

## Acceptance criteria

- [ ] Agent prompts exist for all six judge roles.
- [ ] Prompts explicitly forbid editing content artifacts.
- [ ] Prompts require schema-valid JSON.
- [ ] Prompts include rating scale and judge-specific rubric.
- [ ] Prompts require `verification_path` for every warning/blocker.
- [ ] Prompts state that final authority remains deterministic policy + human approval.
- [ ] At least one synthetic semantic review fixture validates through the contract from #91.

## Non-goals

- Do not implement automatic fact proof.
- Do not implement source recrawl.
- Do not let reviewer agents write evidence_report or release_readiness_report.
- Do not change final report content in this issue.
