# Implement ASPIC+ Argumentation Semantics in Propstore
## Context
The propstore is a propositional knowledge store that manages scientific claims extracted from research papers. Claims can conflict (different values for the same concept under the same conditions), and the system needs to determine which claims survive scrutiny.
Currently, conflict resolution uses a naive weighted-score approach (`_resolve_stance` in `propstore/world/resolution.py`). This must be replaced with formal argumentation semantics grounded in Dung's abstract argumentation framework and the ASPIC+ structured argumentation framework.
The theoretical foundation is in `papers/Modgil_2018_GeneralAccountArgumentationPreferences/notes.md`. Read this file first — it contains the definitions, theorems, and design rationale you need. The key references in the collection are:
- `papers/Dung_1995_AcceptabilityArguments/` — Dung's original abstract argumentation (grounded/preferred/stable extensions)
- `papers/Modgil_2018_GeneralAccountArgumentationPreferences/` — ASPIC+ with preferences (THIS IS YOUR PRIMARY REFERENCE)
- `papers/Modgil_2014_ASPICFrameworkStructuredArgumentation/` — Tutorial introduction to ASPIC+
- `papers/Pollock_1987_DefeasibleReasoning/` — Rebutting vs. undercutting defeaters
Read the `notes.md` for each before writing code.
## What Exists Now
### Stance graph (the attack/support relation)
The `claim_stance` table stores pairwise epistemic relationships between claims:
```sql
CREATE TABLE claim_stance (
  claim_id TEXT NOT NULL,           -- the source of the attack/support
  target_claim_id TEXT NOT NULL,    -- the target
  stance_type TEXT NOT NULL,        -- rebuts|undercuts|undermines|supports|explains|supersedes
  strength TEXT,                    -- strong|moderate|weak
  conditions_differ TEXT,
  note TEXT,
  resolution_method TEXT,           -- nli_first_pass|nli_second_pass
  resolution_model TEXT,
  embedding_model TEXT,
  embedding_distance REAL,
  pass_number INTEGER,
  confidence REAL
);
```
**Stance types map to ASPIC+ attack types as follows:**
| Propstore stance | ASPIC+ attack type | Preference-dependent? | Reference |
|---|---|---|---|
| undermines | Undermining (Def 8.i, p.11) | Yes — fails if attacker is strictly weaker | Def 9, p.12 |
| rebuts | Rebutting (Def 8.ii, p.11) | Yes — fails if attacker is strictly weaker | Def 9, p.12 |
| undercuts | Undercutting (Def 8.iii, p.11) | **No — always succeeds** | Def 9, p.12 |
| supports | Not in ASPIC+ (bipolar extension) | N/A | Cayrol 2005 |
| explains | Not in ASPIC+ (bipolar extension) | N/A | Cayrol 2005 |
| supersedes | Special case: unconditional defeat | No — always succeeds | Propstore-specific |
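The mapping above can be read as a small predicate. A minimal sketch (the function and parameter names are illustrative, not taken from the codebase; the strictly-weaker flag is assumed to come from the preference comparison described later):

```python
# Stance types that count as attacks, split by whether the attack must survive
# a preference check before it becomes a defeat (per the table above).
PREFERENCE_DEPENDENT = {"rebuts", "undermines"}
PREFERENCE_INDEPENDENT = {"undercuts", "supersedes"}


def attack_becomes_defeat(stance_type: str, attacker_strictly_weaker: bool) -> bool:
    """Decide whether a stance edge yields a defeat edge.

    attacker_strictly_weaker is the result of the preference comparison
    (computed elsewhere); it is ignored for preference-independent attacks.
    """
    if stance_type in PREFERENCE_INDEPENDENT:
        return True                           # undercuts/supersedes always succeed
    if stance_type in PREFERENCE_DEPENDENT:
        return not attacker_strictly_weaker   # Def 9: fails iff strictly weaker
    return False                              # supports/explains are not attacks
```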
### Current resolution (propstore/world/resolution.py)
The `_resolve_stance` function sums weighted scores across stance types and picks the claim with the highest net score. This has no formal backing. The full resolution dispatch is in `resolve()`, which selects between strategies: recency, sample_size, stance, override.
### Conflict detection (propstore/conflict_detector.py)
The conflict detector identifies pairs of claims that bind to the same concept with incompatible values. It classifies pairs as CONFLICT, PHI\_NODE, OVERLAP, or PARAM\_CONFLICT. This module does NOT need to change — it detects conflicts. The argumentation module resolves them.
### World model (propstore/world/)
- model.py — WorldModel: read-only access to the sidecar SQLite
- bound.py — BoundWorld: condition-bound view, filters active claims
- resolution.py — resolve() and strategy implementations
- hypothetical.py — HypotheticalWorld: in-memory overlay
- types.py — dataclasses and protocols
## What to Build
### 1. Dung Extension Computation (propstore/dung.py)
New module. Implements Dung's abstract argumentation semantics over a directed attack graph.
**Input:** A set of arguments (claim IDs) and a defeat relation (directed edges where attacks survive preference filtering).
**Output:** Extensions under different semantics.
```python
@dataclass
class ArgumentationFramework:
    """Dung's abstract argumentation framework AF = (Args, Defeats)."""
    arguments: set[str]             # claim IDs
    defeats: dict[str, set[str]]    # arg -> set of args it defeats


def grounded_extension(af: ArgumentationFramework) -> set[str]:
    """Compute the unique grounded extension.

    The grounded extension is the least fixed point of the characteristic
    function F(S) = {A | A is defended by S}.

    Start with S = unattacked arguments. Iterate: add arguments defended
    by S. Stop when the fixed point is reached.

    Reference: Dung 1995, Definition 20 + Theorem 25.
    """


def preferred_extensions(af: ArgumentationFramework) -> list[set[str]]:
    """Compute all preferred extensions.

    A preferred extension is a maximal (w.r.t. set inclusion) admissible set.
    An admissible set is conflict-free and defends all its members.

    Reference: Dung 1995, Definition 8.
    """


def stable_extensions(af: ArgumentationFramework) -> list[set[str]]:
    """Compute all stable extensions.

    A stable extension is conflict-free and defeats every argument not in it.

    Reference: Dung 1995, Definition 12.

    WARNING: Stable extensions may not exist. Preferred always exist.
    """
```
**Implementation notes:**
- The grounded extension is polynomial (iterate characteristic function to fixed point).
- Preferred/stable extension enumeration is intractable in general (verifying that a set is a preferred extension is coNP-complete, and the number of extensions can be exponential). For propstore's scale (hundreds to low thousands of claims), brute-force enumeration with pruning is acceptable. If this becomes a bottleneck, implement the algorithm from Nofal et al. 2014 ("Algorithms for argumentation semantics"). But don't optimize prematurely — get it correct first.
- The defeats dict is the FILTERED attack relation (after preference ordering converts attacks to defeats per Def 9, p.12). Raw attacks where the attacker is strictly weaker are excluded.
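The fixed-point iteration for grounded semantics and the brute-force enumeration for preferred semantics can be sketched as follows (a minimal, self-contained version; the dataclass repeats the signature given above, and the helper names are illustrative):

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class ArgumentationFramework:
    arguments: set[str]
    defeats: dict[str, set[str]]  # attacker -> set of arguments it defeats


def _attackers(af: ArgumentationFramework, arg: str) -> set[str]:
    return {a for a, targets in af.defeats.items() if arg in targets}


def _defeated_by(af: ArgumentationFramework, s: set[str]) -> set[str]:
    return set().union(*(af.defeats.get(a, set()) for a in s))


def grounded_extension(af: ArgumentationFramework) -> set[str]:
    """Iterate F(S) = {A | every attacker of A is defeated by S} to a fixed point.
    Unattacked arguments are vacuously defended, so starting from {} works."""
    extension: set[str] = set()
    while True:
        defended = {a for a in af.arguments if _attackers(af, a) <= _defeated_by(af, extension)}
        if defended == extension:
            return extension
        extension = defended


def _admissible(af: ArgumentationFramework, s: set[str]) -> bool:
    """Conflict-free and self-defending."""
    conflict_free = not any(b in af.defeats.get(a, set()) for a in s for b in s)
    return conflict_free and all(_attackers(af, a) <= _defeated_by(af, s) for a in s)


def preferred_extensions(af: ArgumentationFramework) -> list[set[str]]:
    """Brute force over all subsets: keep maximal admissible sets. Small inputs only."""
    args = sorted(af.arguments)
    candidates = [set(c) for r in range(len(args) + 1) for c in combinations(args, r)]
    admissible = [s for s in candidates if _admissible(af, s)]
    return [s for s in admissible if not any(s < t for t in admissible)]
```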
### 2. Preference Ordering (propstore/preference.py)
New module. Computes the preference ordering over claims that determines which attacks succeed as defeats.
**The core operation:** Given two claims A and B where A attacks B, determine whether A is "not strictly weaker than" B. If A IS strictly weaker, the attack fails (does not become a defeat). If A is not strictly weaker, the attack succeeds as a defeat.
Implement BOTH preference principles from Modgil & Prakken 2018:
#### Last-link (Def 20, p.21)
Compare arguments by their most recent defeasible inference step. In propstore terms: compare claims by the quality of the methodology/source that produced them.
```python
def last_link_preference(claim_a: dict, claim_b: dict, set_comparison: str = "elitist") -> bool:
    """Return True if claim_a is strictly preferred to claim_b under last-link.

    Last-link compares the 'last defeasible rules' — in propstore terms,
    the immediate source/methodology of the claim.

    Ordering signals (highest to lowest within each axis):
    - Methodology: direct measurement > derived/computed > estimated > assumed
    - Publication recency (as proxy for methodology currency)
    - Journal impact / peer review status

    Reference: Def 20, p.21
    """
```
#### Weakest-link (Def 21, p.21)
Compare arguments by their weakest component. In propstore terms: a claim is only as strong as its weakest input — the least reliable premise or the least rigorous inference step.
```python
def weakest_link_preference(claim_a: dict, claim_b: dict, set_comparison: str = "elitist") -> bool:
    """Return True if claim_a is strictly preferred to claim_b under weakest-link.

    Weakest-link considers ALL components: premises AND inference steps.
    A claim derived from 10 excellent sources but one terrible one is
    weaker than a claim derived from 5 good sources.

    Ordering signals:
    - sample_size (larger is stronger)
    - uncertainty/uncertainty_type (smaller uncertainty is stronger)
    - provenance quality (peer-reviewed > preprint > informal)
    - source paper recency

    Reference: Def 21, p.21
    """
```
#### Set comparison (Def 19, p.21)
Both principles use a set comparison operator. Implement both options:
```python
def elitist_comparison(set_a: list[float], set_b: list[float]) -> bool:
    """Elitist: set_a < set_b iff EXISTS x in set_a such that FORALL y in set_b, x < y.

    One bad apple spoils the barrel — if any element of set_a is worse
    than ALL elements of set_b, set_a loses.

    Reference: Def 19, p.21
    """


def democratic_comparison(set_a: list[float], set_b: list[float]) -> bool:
    """Democratic: set_a < set_b iff FORALL x in set_a, EXISTS y in set_b, x < y.

    Every element of set_a is beaten by some element of set_b.

    Reference: Def 19, p.21
    """
```
#### Mapping claim metadata to ordinal strength
The preference ordering operates on an ordinal scale. Define a function that maps claim metadata to a comparable strength value:
```python
def claim_strength(claim: dict) -> float:
    """Compute a scalar strength for preference ordering.

    Compose from available metadata:
    - sample_size: log-scaled (sample of 1000 vs 10 matters; 10000 vs 1000 less so)
    - uncertainty: inverse (lower uncertainty = stronger)
    - recency: year from provenance, normalized
    - confidence: from the stance classification (NLI pass confidence)

    Claims with more metadata dimensions populated are NOT automatically stronger.
    Missing metadata = no opinion on that axis, not weakness.

    Return a float where higher = stronger. The absolute scale doesn't matter,
    only the ordering.
    """
```
**Critical design decision:** The preference ordering must satisfy the "reasonable" criterion (Def 18, p.16-17):
1. Strict+firm arguments (claims from axiom premises with only strict inference) dominate all others
2. Strict+firm arguments are never dominated by anything
3. Strict continuation preserves preference
In propstore terms: claims with `conditions: []` (unconditional) derived from well-established sources with large sample sizes and low uncertainty should be treated as near-axiomatic. Claims with conditions, small samples, or high uncertainty are defeasible.
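The two set-comparison operators and a strength mapping of the shape described above can be sketched as follows (the weighting terms in `claim_strength` are illustrative placeholders, not tuned values; note both comparisons use strict `<`, matching the docstrings above):

```python
import math


def elitist_comparison(set_a: list[float], set_b: list[float]) -> bool:
    """set_a < set_b iff some element of set_a is below ALL elements of set_b."""
    return any(all(x < y for y in set_b) for x in set_a)


def democratic_comparison(set_a: list[float], set_b: list[float]) -> bool:
    """set_a < set_b iff every element of set_a is below SOME element of set_b."""
    return all(any(x < y for y in set_b) for x in set_a)


def claim_strength(claim: dict) -> float:
    """Scalar strength from available metadata; missing fields contribute nothing."""
    strength = 0.0
    if claim.get("sample_size"):
        strength += math.log10(claim["sample_size"])     # diminishing returns
    if claim.get("uncertainty") is not None:
        strength += 1.0 / (1.0 + claim["uncertainty"])   # lower uncertainty is stronger
    if claim.get("confidence") is not None:
        strength += claim["confidence"]                  # NLI pass confidence
    return strength
```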
### 3. Attack-to-Defeat Conversion (propstore/argumentation.py)
New module. The bridge between the stance graph (raw attacks) and the argumentation framework (filtered defeats).
```python
def build_argumentation_framework(
    conn: sqlite3.Connection,
    active_claim_ids: set[str],
    preference_principle: str = "last_link",   # or "weakest_link"
    set_comparison: str = "elitist",           # or "democratic"
) -> ArgumentationFramework:
    """Build a Dung AF from the stance graph filtered through preferences.

    Steps:
    1. Load all stances between active claims from the claim_stance table.
    2. For each attack stance (rebuts, undermines):
       - Compute the preference ordering between attacker and target.
       - If the attacker is NOT strictly weaker: the attack becomes a defeat (include edge).
       - If the attacker IS strictly weaker: the attack fails (exclude edge).
       (Reference: Def 9, p.12)
    3. For each undercutting stance:
       - ALWAYS include as a defeat (preference-independent).
       (Reference: Def 9, p.12)
    4. For each supersedes stance:
       - ALWAYS include as a defeat (propstore-specific, unconditional).
    5. Support/explains stances are NOT attacks — exclude from the defeat relation.
       (These could feed a bipolar framework in the future, but not now.)
    6. Return ArgumentationFramework(arguments=active_claim_ids, defeats=defeat_edges).
    """


def compute_justified_claims(
    conn: sqlite3.Connection,
    active_claim_ids: set[str],
    semantics: str = "grounded",               # or "preferred" or "stable"
    preference_principle: str = "last_link",
    set_comparison: str = "elitist",
) -> set[str] | list[set[str]]:
    """End-to-end: build the AF, compute extensions, return justified claim IDs.

    For grounded semantics: returns a single set (the unique grounded extension).
    For preferred/stable: returns a list of sets (multiple possible extensions).
    """
```
### 4. Integration with Resolution (propstore/world/resolution.py)
Replace `_resolve_stance` with the argumentation-based resolver:
```python
def _resolve_argumentation(
    claims: list[dict],
    world: WorldModel,
    semantics: str = "grounded",
    preference_principle: str = "last_link",
    set_comparison: str = "elitist",
) -> tuple[str | None, str | None]:
    """Resolve a conflicted concept using ASPIC+ argumentation semantics.

    1. Get active claim IDs for this concept.
    2. Build the ArgumentationFramework from the stance graph + preferences.
    3. Compute extension(s) under the chosen semantics.
    4. If exactly one claim for this concept survives in the extension: winner.
    5. If multiple survive: still conflicted (but a reduced conflict set).
    6. If none survive: underdetermined.

    Returns (winning_claim_id, reason_string).
    """
```
Add a new ResolutionStrategy:
```python
class ResolutionStrategy(Enum):
    RECENCY = "recency"
    SAMPLE_SIZE = "sample_size"
    STANCE = "stance"                  # legacy naive scoring — KEEP for backward compat
    OVERRIDE = "override"
    ARGUMENTATION = "argumentation"    # NEW: formal ASPIC+ semantics
```
Wire it into `resolve()` alongside the existing strategies. Do NOT remove the existing stance strategy — keep it for backward compatibility, but document it as deprecated.
### 5. CLI Integration (propstore/cli/compiler_cmds.py)
Add `--semantics`, `--preference`, and `--set-comparison` options to `world resolve`:

```shell
pks world resolve concept1 domain=example --strategy argumentation --semantics grounded --preference last_link
pks world resolve concept1 domain=example --strategy argumentation --semantics preferred --preference weakest_link --set-comparison democratic
```

Add a new `world extensions` command:

```shell
pks world extensions [--semantics grounded|preferred|stable] [--preference last_link|weakest_link] [--set-comparison elitist|democratic] [bindings...]
```
This should show the full extension(s) — all claims that survive — not just resolution of a single concept. This is the "what does the world look like if we accept this argumentation framework?" view.
### 6. Sidecar Schema Extension (propstore/build_sidecar.py)
Add a `defeat` table to cache the computed defeat relation:

```sql
CREATE TABLE defeat (
  attacker_id TEXT NOT NULL,
  target_id TEXT NOT NULL,
  attack_type TEXT NOT NULL,            -- rebuts|undermines|undercuts|supersedes
  preference_principle TEXT NOT NULL,   -- last_link|weakest_link
  set_comparison TEXT NOT NULL,         -- elitist|democratic
  attacker_strength REAL,
  target_strength REAL,
  FOREIGN KEY (attacker_id) REFERENCES claim(id),
  FOREIGN KEY (target_id) REFERENCES claim(id)
);
```

This is a materialized view — recomputed on `pks build`. The extension computation queries this table rather than recomputing preferences on every query.
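The rebuild step might look like the following sketch (the `populate_defeat_table` name and the shape of the `defeats` and `meta` arguments are illustrative assumptions; the DDL is the table above):

```python
import sqlite3


def populate_defeat_table(conn: sqlite3.Connection, defeats: list[tuple], meta: dict) -> None:
    """Rebuild the defeat cache from scratch (it's a materialized view).

    `defeats` holds (attacker_id, target_id, attack_type,
    attacker_strength, target_strength) tuples; `meta` carries the
    configuration the edges were computed under.
    """
    conn.execute("DELETE FROM defeat")  # full rebuild on every pks build
    conn.executemany(
        """INSERT INTO defeat
           (attacker_id, target_id, attack_type, preference_principle,
            set_comparison, attacker_strength, target_strength)
           VALUES (?, ?, ?, ?, ?, ?, ?)""",
        [
            (a, t, kind, meta["preference_principle"], meta["set_comparison"], s_att, s_tgt)
            for a, t, kind, s_att, s_tgt in defeats
        ],
    )
    conn.commit()
```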
## Testing Strategy
### Unit tests for Dung semantics (tests/test_dung.py)
Test against known examples from Dung 1995:
```python
def test_grounded_nixon_diamond():
    """Nixon diamond: A attacks B, B attacks A. Grounded = empty set."""
    af = ArgumentationFramework(
        arguments={"A", "B"},
        defeats={"A": {"B"}, "B": {"A"}},
    )
    assert grounded_extension(af) == set()


def test_grounded_unattacked_wins():
    """A attacks B, nothing attacks A. Grounded = {A}."""
    af = ArgumentationFramework(
        arguments={"A", "B"},
        defeats={"A": {"B"}},
    )
    assert grounded_extension(af) == {"A"}


def test_preferred_nixon_diamond():
    """Nixon diamond has two preferred extensions: {A} and {B}."""
    af = ArgumentationFramework(
        arguments={"A", "B"},
        defeats={"A": {"B"}, "B": {"A"}},
    )
    exts = preferred_extensions(af)
    assert len(exts) == 2
    assert {"A"} in exts
    assert {"B"} in exts


def test_stable_may_not_exist():
    """A attacks B, B attacks C, C attacks A. No stable extension."""
    af = ArgumentationFramework(
        arguments={"A", "B", "C"},
        defeats={"A": {"B"}, "B": {"C"}, "C": {"A"}},
    )
    assert stable_extensions(af) == []


def test_grounded_reinstatement():
    """A attacks B, B attacks C. A reinstates C. Grounded = {A, C}."""
    af = ArgumentationFramework(
        arguments={"A", "B", "C"},
        defeats={"A": {"B"}, "B": {"C"}},
    )
    assert grounded_extension(af) == {"A", "C"}


def test_grounded_defeats_mutual_attackers():
    """A attacks B, A attacks C, B attacks C, C attacks B.
    Grounded = {A}. B and C are both defeated by A."""
    af = ArgumentationFramework(
        arguments={"A", "B", "C"},
        defeats={"A": {"B", "C"}, "B": {"C"}, "C": {"B"}},
    )
    ext = grounded_extension(af)
    assert "A" in ext
    assert "B" not in ext
    assert "C" not in ext
```
### Unit tests for preference ordering (tests/test_preference.py)
```python
def test_elitist_comparison():
    """Elitist: {1, 5} < {3, 4} because 1 < 3 and 1 < 4."""
    assert elitist_comparison([1, 5], [3, 4]) is True


def test_democratic_comparison():
    """Democratic: {1, 2} < {3, 4} because 1 < 3 and 2 < 4."""
    assert democratic_comparison([1, 2], [3, 4]) is True


def test_democratic_not_satisfied():
    """Democratic: {1, 5} NOT < {3, 4} because 5 is not < any element of {3, 4}."""
    assert democratic_comparison([1, 5], [3, 4]) is False


def test_undercutting_always_defeats():
    """Undercutting attack succeeds regardless of preference ordering."""
    # Even if the attacker is strictly weaker, undercutting succeeds.


def test_rebutting_fails_if_weaker():
    """Rebutting attack by a weaker argument does not become a defeat."""
    # Claim A (sample_size=10) rebuts Claim B (sample_size=1000).
    # Under last-link with sample_size as the ordering signal,
    # A is strictly weaker, so the attack fails.
```
### Integration tests (tests/test_argumentation_integration.py)
Build a small knowledge base with known conflicts and verify that:
1. The grounded extension excludes defeated claims
2. Preference ordering correctly filters weak attacks
3. `pks world resolve --strategy argumentation` returns the expected winner
4. The `defeat` table in the sidecar is populated correctly
5. Undercutting attacks are never filtered by preferences
6. Supersedes stances always produce defeats
### Property tests with Hypothesis
```python
from hypothesis import given, strategies as st


def af_strategy():
    """Random small AFs: draw argument names, then a random subset of directed edges."""
    return st.lists(st.text(min_size=1, max_size=5), min_size=2, max_size=10, unique=True).flatmap(
        lambda args: st.sets(st.tuples(st.sampled_from(args), st.sampled_from(args))).map(
            lambda edges: ArgumentationFramework(
                arguments=set(args),
                defeats={a: {b for x, b in edges if x == a} for a in args},
            )
        )
    )


@given(af_strategy())
def test_grounded_is_subset_of_every_preferred(af):
    """The grounded extension is a subset of every preferred extension.
    Reference: Dung 1995, Theorem 25."""
    for preferred in preferred_extensions(af):
        assert grounded_extension(af) <= preferred


@given(af_strategy())
def test_preferred_is_admissible(af):
    """Every preferred extension is admissible (conflict-free + self-defending).
    Reference: Dung 1995, Definition 8."""


@given(af_strategy())
def test_stable_implies_preferred(af):
    """Every stable extension is a preferred extension.
    Reference: Dung 1995, Theorem 13."""
```
## File Layout (new and modified)
```
propstore/
  dung.py                # NEW: Dung extension computation
  preference.py          # NEW: preference ordering (last-link, weakest-link)
  argumentation.py       # NEW: attack→defeat conversion, AF construction
  world/
    resolution.py        # MODIFIED: add ARGUMENTATION strategy
    types.py             # MODIFIED: add ResolutionStrategy.ARGUMENTATION
  build_sidecar.py       # MODIFIED: add defeat table
  cli/
    compiler_cmds.py     # MODIFIED: add --semantics/--preference/--set-comparison to resolve; add world extensions
tests/
  test_dung.py           # NEW
  test_preference.py     # NEW
  test_argumentation_integration.py  # NEW
```
## Constraints
1. **Do not break existing tests.** Run the existing test suite before and after. The stance resolution strategy must continue to work unchanged.
2. **Do not modify the claim_stance schema.** The stance table is populated by propstore/relate.py via LLM classification. Your code consumes it, not produces it.
3. **Read the paper notes.** The notes in `papers/Modgil_2018_GeneralAccountArgumentationPreferences/notes.md` contain every definition, theorem number, and page reference you need. If you're unsure about a formal detail, read the notes rather than guessing.
4. **The preference ordering is the hard part.** Dung extension computation is well-understood algorithms. The novel design work is mapping propstore claim metadata (sample_size, uncertainty, recency, confidence) to the ordinal preference ordering that ASPIC+ requires. Make this mapping explicit, configurable, and documented.
5. **Grounded extension is the default.** It's unique, polynomial, and skeptical (only includes claims that survive all attacks). Preferred and stable are options for users who want credulous reasoning.
6. **Support stances are not attacks.** The supports and explains stance types do NOT participate in defeat computation. They are recorded in the stance graph but excluded from the ArgumentationFramework. A future bipolar extension (per Cayrol 2005) may use them, but not this implementation.
7. **supersedes is unconditional defeat.** Like undercutting, it always succeeds regardless of preferences. Unlike undercutting (which attacks an inference method), supersedes means "this claim replaces that claim entirely."
8. **Confidence threshold.** Only stances with confidence >= 0.5 should be treated as attacks. Lower-confidence stances are too uncertain to ground defeat relations. This threshold should be configurable.
## Definition Reference (from Modgil & Prakken 2018 notes)
These are the key formal definitions. Page numbers reference the paper.
- **Def 8 (p.11):** Three attack types — undermining, rebutting, undercutting
- **Def 9 (p.12):** Defeat = attack that survives preference filtering. Undercutting always succeeds. Rebutting/undermining succeed iff attacker is not strictly weaker.
- **Def 13 (p.14):** Defeat-based conflict-free (DEPRECATED by this paper)
- **Def 14 (p.14):** Attack-based conflict-free (RECOMMENDED — use this)
- **Def 18 (p.16):** Reasonable argument ordering
- **Def 19 (p.21):** Elitist and Democratic set comparisons
- **Def 20 (p.21):** Last-link preference principle
- **Def 21 (p.21):** Weakest-link preference principle
- **Prop 16 (p.19):** Under reasonable orderings, attack-based and defeat-based conflict-free yield the same extensions
- **Thms 12-15 (p.18-19):** Rationality postulates (sub-argument closure, strict rule closure, direct consistency, indirect consistency) — these are properties your implementation must satisfy, not things you implement directly