propstore has canonical authored artifacts in knowledge/, plus a compiled SQLite sidecar used as the read/query surface. Git/YAML is the source of truth; the sidecar is a versioned materialization.
The main authored entities are sources, concepts, forms, claims, stances, authored justifications, and contexts. Conditions are fields on claims and relations, not a separate top-level entity.
Sources live in sources/<slug>.yaml and are first-class provenance records. Claims compile with a stable source_slug foreign-key-style reference to these rows.
id: tag:example.org,2026:halpin-2010
kind: academic_paper
origin:
type: doi
value: 10.1016/j.websem.2010.01.001
retrieved: 2026-04-07
trust:
prior_base_rate: 0.6
quality:
venue: peer_reviewed
artifact_code: sha256:...Source-branch notes.md and metadata remain canonical git artifacts. They are not compiled into claim reasoning tables.
A concept is a named quantity, category, or structural entity. One YAML file per concept in concepts/, filename matches canonical_name.
id: concept1
canonical_name: fundamental_frequency
status: accepted
definition: Rate of vocal fold vibration during phonation.
domain: speech
form: frequency
aliases:
- name: F0
source: common
- name: pitch_frequency
source: Titze_1994
relationships:
- type: component_of
target: concept5
parameterization_relationships:
- formula: "f0 = 1 / T0"
sympy: "Eq(concept1, 1 / concept2)"
inputs: [concept2]
exactness: exact
source: definition
bidirectional: trueStatus lifecycle: proposed -> accepted -> deprecated (with replaced_by pointer). Concepts are never deleted.
Kind system: Each concept has a form referencing a form definition file. The form determines the concept's kind:
| Kind | Examples | CEL behavior |
|---|---|---|
quantity |
frequency, pressure, duration | Numeric comparisons and arithmetic |
category |
voice_quality_type, language | Equality and in checks against value sets |
boolean |
is_voiced | Boolean logic |
structural |
voice_source | Cannot appear in CEL expressions |
timepoint |
valid_from, valid_until | Numeric comparisons (epoch seconds); automatic interval ordering constraints; not valid for parameterization/dimensional algebra |
Form definitions live in forms/<name>.yaml and define dimensional type signatures:
name: frequency
unit_symbol: Hz
dimensionless: false
dimensions:
T: -1
common_alternatives:
- unit: kHz
type: multiplicative
multiplier: 0.001The compiler uses form definitions for unit validation via dimensional analysis, checking that claim units are compatible with concept dimensions.
The common_alternatives array defines how non-SI units convert to the form's SI base unit. Three conversion types:
- Multiplicative:
si_value = raw * multiplier. Example: kHz has multiplier 1000, so 5 kHz becomes 5000 Hz. - Affine:
si_value = raw * multiplier + offset. Used for temperature scales — degC uses offset 273.15 to convert to Kelvin. - Logarithmic:
si_value = reference * base^(raw / divisor). Used for decibel scales — dB SPL uses base 10, divisor 20, reference 0.00002 Pa.
During sidecar build, all claim values are normalized to SI via these conversions (with pint as fallback for standard unit prefixes). The sidecar stores both raw and SI-normalized values (value_si, lower_bound_si, upper_bound_si).
The extra_units field registers domain-specific units not recognized by pint. Each entry has a symbol and optionally dimensions. These units are added to the form's allowed unit set and registered into the dimensional analysis symbol table.
See units-and-forms.md for full details on SI normalization, form algebra, and dimensional analysis.
Claims are extracted from papers and stored in claims/<paper_name>.yaml. There are nine claim types.
A numeric value binding for a concept under stated conditions:
- id: claim1
type: parameter
concept: concept1
value: 120.0
uncertainty: 15.0
uncertainty_type: sd
unit: Hz
conditions:
- "speaker_sex == 'male'"
provenance:
paper: Titze_1994
page: 42- id: claim1
type: parameter
concept: ad_reading_speed
value: 180.0
unit: "words/min"
conditions:
- "task == 'audio_description'"
provenance:
paper: Li_2026_ADCanvas
page: 8A mathematical relationship with variable bindings:
- id: claim10
type: equation
expression: "OQ = (T_o) / T_0"
sympy: "Eq(OQ, T_o / T_0)"
variables:
- symbol: OQ
concept: concept3
- symbol: T_o
concept: concept4
- symbol: T_0
concept: concept2
provenance:
paper: Henrich_2003
page: 8A perceptual or behavioral measurement:
- id: claim20
type: measurement
target_concept: concept1
measure: jnd_relative
value: 0.003
unit: ratio
listener_population: native_english
provenance:
paper: Moore_1973
page: 15A qualitative claim that resists parameterization:
- id: claim30
type: observation
statement: "Breathiness increases with incomplete glottal closure"
concepts: [concept7, concept8]
provenance:
paper: Klatt_1990
page: 22A parameterized equation system:
- id: claim40
type: model
name: "Klatt cascade formant synthesizer"
equations:
- "output = cascade(F1, F2, F3, F4, F5)"
parameters:
- name: F1
concept: concept10
provenance:
paper: Klatt_1980
page: 5A procedural computation as a Python function body:
- id: claim50
type: algorithm
concept: concept12
stage: excitation
body: |
def glottal_pulse(t, T0, Tp, Tn):
if t < Tp:
return 0.5 * (1 - math.cos(math.pi * t / Tp))
elif t < Tp + Tn:
return math.cos(math.pi * (t - Tp) / (2 * Tn))
else:
return 0.0
variables:
- name: t
concept: concept60
- name: T0
concept: concept2
provenance:
paper: Klatt_1980
page: 12A causal or explanatory process linking concepts:
- id: claim60
type: mechanism
statement: "Undercutting defeat removes the connection between premise and conclusion without challenging the premise itself"
concepts: [undercutting_attack, defeasible_reasoning]
provenance:
paper: Pollock_1987
page: 485A comparative claim between approaches, methods, or systems:
- id: claim61
type: comparison
statement: "Preferred semantics produces more extensions than grounded semantics on frameworks with even-length cycles"
concepts: [preferred_extension, grounded_extension]
provenance:
paper: Dung_1995
page: 331A known boundary, failure case, or applicability constraint:
- id: claim62
type: limitation
statement: "Stable extensions are not guaranteed to exist for all argumentation frameworks"
concepts: [stable_extension, argumentation_framework]
provenance:
paper: Dung_1995
page: 328Claims and relationships can be scoped by conditions — CEL (Common Expression Language) expressions that define when they hold:
conditions:
- "speaker_sex == 'male'"
- "task == 'speech'"The compiler type-checks conditions against the concept registry, and production runtime evaluation uses the same Z3-backed CEL semantics:
- quantity concepts use numeric comparisons
- boolean concepts use boolean logic
- closed categories (
extensible: false) use finite enum semantics, so undeclared literals are hard errors - open categories (
extensible: true) use symbolic string semantics, so undeclared literals remain semantically valid and only warn at check time - unknown concept names are hard errors everywhere
Justifications are inference rules that connect premise claims to conclusion claims. There are two distinct cases:
- Authored justifications live in
justifications/<source>.yamland compile into the sidecarjustificationtable. - Runtime-derived justifications such as
reported:claim_idandsupports:a->bare built from the active claim graph when argumentation code needs them. They are not persisted in the sidecar.
Each justification is a CanonicalJustification with these fields:
| Field | Type | Default | Description |
|---|---|---|---|
justification_id |
str |
— | Unique identifier (e.g., reported:claim1 or supports:claim2->claim3) |
conclusion_claim_id |
str |
— | The claim this justification concludes |
premise_claim_ids |
tuple[str, ...] |
() |
Claims that serve as premises |
rule_kind |
str |
"reported_claim" |
Type of inference rule |
rule_strength |
str |
"defeasible" |
Whether the rule is strict or defeasible |
provenance |
ProvenanceRecord | None |
None |
Source attribution |
attributes |
tuple[tuple[str, Any], ...] |
() |
Additional metadata as sorted key-value pairs |
Three values:
reported_claim— Every claim automatically gets areported_claimjustification. This represents the claim's direct assertion from its source paper, with no premises.supports— A premise claim provides corroborating evidence for the conclusion claim.explains— A premise claim provides a mechanistic explanation for the conclusion claim.
Two values, corresponding to ASPIC+ rule types (Modgil & Prakken 2018, Def 2):
strict— The inference is logically unattackable. Strict rules have no name and cannot be undercut.defeasible— The inference is tentative and can be undercut. Defeasible rules are named by theirjustification_id, which enables targeted undercutting (Def 8c).
Justifications translate directly to ASPIC+ rules via the bridge in aspic_bridge.py. reported_claim justifications become knowledge base premises (not rules). Justifications with premises become strict or defeasible rules depending on rule_strength. See structured-argumentation.md for the full translation pipeline (T1–T7).
An undercuts stance can include a target_justification_id field to attack a specific defeasible rule rather than all rules concluding a given claim. When multiple defeasible rules support the same conclusion, omitting target_justification_id raises an ambiguity error. This implements Pollock's (1987) undercutting defeat: the attacker targets the inference rule itself, not the conclusion.
Authored justifications are optional and look like:
justifications:
- id: just1
conclusion: claim_observation
premises: [claim_parameter]
rule_kind: reported_claim
rule_strength: defeasibleThe sidecar stores authored justifications exactly so source-authored inference structure remains queryable. Support and explanation edges still participate in argumentation, but their CanonicalJustification records are synthesized at runtime from the active graph.
Claims can express epistemic relations to other claims:
stances:
- type: rebuts
target: claim15
strength: strong
note: "Contradicting conclusion with larger sample size"
- type: supersedes
target: claim42Six stance types (ASPIC+ taxonomy, active voice — the claim holding the stance acts on the target):
| Type | Category | Weight | Meaning |
|---|---|---|---|
rebuts |
Attack | -1.0 | Directly contradicts the target's conclusion |
undercuts |
Attack | -1.0 | Attacks the inference method or reasoning |
undermines |
Attack | -0.5 | Weakens a premise or evidence quality |
supports |
Support | +1.0 | Provides corroborating evidence |
explains |
Support | +0.5 | Provides a mechanistic explanation |
supersedes |
Preference | --- | Replaces the target entirely (short-circuits resolution) |
Based on ASPIC+ (Modgil & Prakken 2014) and Pollock's rebutting vs undercutting distinction (Prakken & Horty 2012).
Stances feed into the argumentation framework — attacks become defeat candidates filtered through preference ordering, supports contribute to claim strength.
Contexts represent research traditions, theoretical frameworks, or experimental paradigms that scope groups of claims. One YAML file per context in contexts/:
id: ctx_abstract_argumentation
name: ctx_abstract_argumentation
description: Dung's abstract argumentation framework tradition — arguments as abstract
entities with attack relations, multiple acceptability semantics
structure:
assumptions:
- "domain == 'argumentation'"
parameters:
tradition: abstract
perspective: dung
lifting_rules:
- id: lift_dung_to_argumentation
source:
id: ctx_abstract_argumentation
target:
id: ctx_argumentation
mode: monotonicClaims reference their context via context: {id: ...}. The compiler validates that all context references resolve to registered contexts. Contexts are structured logical terms with authored assumptions, parameters, and perspective metadata. Visibility inheritance is not a production concept; cross-context visibility is granted only by explicit lifting rules.
The data model is defined in LinkML at schema/concept_registry.linkml.yaml and schema/claim.linkml.yaml. JSON Schema is generated from these for validation. Run schema/generate.py to regenerate.