import-nanopub-chain: follow the FORRT backbone, not just refersToNanopub #8

Closed
annefou wants to merge 1 commit into main from feat/importer-forrt-backbone

annefou (Contributor) commented Jun 28, 2026

Problem

The constellation importer (scripts/import-nanopub-chain.py) discovered neighbours only through the curated KnowledgePixels npa:refersToNanopub graph. For real FORRT chains that graph materialises essentially one edge (CiTO ↔ Outcome), so a breadth-first search (BFS) entering from the CiTO stopped after 2 nodes and never reached the Study, Claim, AIDA or Quote.

Those steps are linked by domain predicates the network graph doesn't index:

Outcome --isOutcomeOf-->     Study
Study   --targetsClaim-->    Claim
Claim   --asAidaStatement--> <purl.org/aida/…>  --(asserted by)-->  AIDA
AIDA    --related-->         Quote
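
To make the failure mode concrete, here is a minimal sketch of the traversal on a toy graph. It is not the importer's code: the dictionaries and the `bfs` helper are hypothetical, and the nodes are just the step labels used above rather than real nanopub URIs.

```python
from collections import deque

# Toy edge maps keyed by step label (illustrative only, not real nanopub URIs).
REFERS_TO_NANOPUB = {
    "CiTO": ["Outcome"],
    "Outcome": ["CiTO"],
}
BACKBONE = {
    "Outcome": ["Study"],  # isOutcomeOf
    "Study": ["Claim"],    # targetsClaim
    "Claim": ["AIDA"],     # asAidaStatement, then the nanopub asserting it
    "AIDA": ["Quote"],     # related
}


def bfs(start, *edge_maps):
    """Return nodes in the order BFS first reaches them over the union of edge maps."""
    seen = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for edges in edge_maps:
            for nxt in edges.get(node, []):
                if nxt not in seen:
                    seen.append(nxt)
                    queue.append(nxt)
    return seen


print(bfs("CiTO", REFERS_TO_NANOPUB))            # ['CiTO', 'Outcome']
print(bfs("CiTO", REFERS_TO_NANOPUB, BACKBONE))  # all six step labels
```

With only the refersToNanopub edges the walk ends at Outcome; once the domain-predicate edges are included, the same entry point reaches every step.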

Fix

  • backbone_neighbours() — reads every nanopub a node points at straight from its TriG, plus resolves the Claim→AIDA hop via a new aida-sentence-nanopub.rq query, then keeps only targets that are themselves FORRT chain steps (chain_step_kind()). Value-lists, templates, papers and other noise are dropped and never crawled, so this stays robust without hard-coding a predicate list.
  • walk() merges these with the existing refersToNanopub neighbours (new edge relation backbone).
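
The first bullet can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: the PR mentions rdflib, whereas this sketch swaps in a plain regular expression over the raw TriG text so that it runs with the standard library alone. The URI pattern (`https://w3id.org/np/RA` followed by 43 characters) is an assumption about how the nanopubs are identified, and `referenced_nanopubs`, `backbone_candidates` and the `step_kind` callable are hypothetical stand-ins for `backbone_neighbours()` and `chain_step_kind()`.

```python
import re

# Assumption: nanopub URIs have the form https://w3id.org/np/RA<43 chars>.
NANOPUB_URI = re.compile(r"https://w3id\.org/np/RA[A-Za-z0-9_-]{43}")


def referenced_nanopubs(trig_text, own_uri):
    """Distinct nanopub URIs mentioned in a TriG document, excluding the nanopub itself."""
    found = []
    for uri in NANOPUB_URI.findall(trig_text):
        if uri != own_uri and uri not in found:
            found.append(uri)
    return found


def backbone_candidates(trig_text, own_uri, step_kind):
    """Keep only referenced nanopubs that step_kind() recognises as chain steps.

    step_kind is a hypothetical callable returning a kind string, or None
    for value-lists, templates, papers and anything else to be dropped.
    """
    kept = []
    for uri in referenced_nanopubs(trig_text, own_uri):
        kind = step_kind(uri)
        if kind is not None:
            kept.append((uri, kind))
    return kept


# Usage with made-up URIs and made-up predicates.
own = "https://w3id.org/np/RA" + "A" * 43
study = "https://w3id.org/np/RA" + "B" * 43
template = "https://w3id.org/np/RA" + "C" * 43
trig = (
    f"<{own}/assertion> {{ <{own}/outcome> <http://example.org/isOutcomeOf> <{study}> . }}\n"
    f"<{own}/pubinfo> {{ <{own}> <http://example.org/usesTemplate> <{template}> . }}\n"
)
kinds = {study: "study"}  # the template URI is unknown to the classifier, so it is dropped
print(backbone_candidates(trig, own, kinds.get))
```

A text scan like this would miss references written as prefixed names, which a real TriG parse would resolve; it is only meant to show the "collect every pointed-at nanopub, then filter by step kind" shape.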

Verified

Against a published 6-step chain (white-shark geolocation), entering from the CiTO:

Before: Imported 2 nanopubs (CiTO + Outcome).
After: Imported 6 nanopubs, 8 edges — Quote, AIDA, Claim, Study, Outcome, CiTO, each classified by template.

No new dependencies (rdflib already required). This also flows downstream to the per-replication repos and the GRID4EARTH benchmark template on their next sync.

…opub

The constellation importer walked only the curated KnowledgePixels
npa:refersToNanopub graph, which in practice links just CiTO <-> Outcome — so a
BFS from the CiTO stopped after 2 nodes and missed the Study, Claim, AIDA and
Quote. Those steps are connected by domain predicates the network graph does not
index:
  Outcome --isOutcomeOf--> Study --targetsClaim--> Claim
  Claim   --asAidaStatement--> <purl.org/aida/...> --(asserted by)--> AIDA
  AIDA    --related--> Quote

Add backbone_neighbours(): read every nanopub a node points at from its TriG,
resolve the Claim->AIDA hop via a new aida-sentence-nanopub.rq query, and keep
only targets that are themselves FORRT chain steps (chain_step_kind) so
value-lists/templates/papers are dropped and never crawled. walk() now merges
these with the refersToNanopub neighbours (edge relation 'backbone').

Verified against a published 6-step chain: entering from the CiTO now returns all
six steps (Quote, AIDA, Claim, Study, Outcome, CiTO) instead of two.

annefou (Contributor, Author) commented Jun 28, 2026

Closing — wrong layer and wrong approach.

The canonical constellation is science-live-platform (api/src/np/constellation.ts, served at /np/constellation), and it already walks the full FORRT backbone: it mines every nanopub URI from each TriG (catching isOutcomeOf / targetsClaim) and bridges the Claim→AIDA gap via discoverAidaStatementNeighbours + the existing aida-statement-nanopub.rq query (committed 2026-05-31). Verified on dev: the white-shark CiTO returns all 6 steps (cito, outcome, study, claim, aida, quote).

This PR instead added hand-coded link-walking + heuristic step-type classification to the legacy import-nanopub-chain.py, which is exactly what that script's docstring says discovery should NOT do ("driven by the platform's pre-built SPARQL queries, rather than by hand-coded link-walking + heuristic step-type classification"), and it used the wrong placeholder convention (${aidaUri} instead of ?_name). If the legacy fallback ever needs to reach AIDA/Quote, the right fix is a query-driven re-sync that re-copies the platform's aida-statement-nanopub.rq and mirrors constellation.ts's TriG URI-mining + AIDA bridge — not these heuristics.
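
The placeholder mismatch mentioned above can be caught mechanically before a query file is committed. The sketch below only detects the two spellings named in this comment (`${name}` versus `?_name`); how the platform substitutes values for `?_name` is not shown in this thread, so no substitution is attempted. The function name and the sample queries are hypothetical.

```python
import re

TEMPLATE_STYLE = re.compile(r"\$\{(\w+)\}")  # e.g. ${aidaUri}
VARIABLE_STYLE = re.compile(r"\?_(\w+)")     # e.g. ?_name


def placeholder_styles(query_text):
    """Report which placeholder spellings appear in a SPARQL query string."""
    return {
        "template": TEMPLATE_STYLE.findall(query_text),
        "variable": VARIABLE_STYLE.findall(query_text),
    }


# Made-up queries for illustration; the predicate is not a real vocabulary term.
wrong = "SELECT ?np WHERE { ?np <http://example.org/asserts> <${aidaUri}> . }"
right = "SELECT ?np WHERE { ?np <http://example.org/asserts> ?_aida . }"

print(placeholder_styles(wrong))
print(placeholder_styles(right))
```

A check like this could run over every copied .rq file during a re-sync and flag any query whose "template" list is non-empty.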

annefou closed this Jun 28, 2026
annefou deleted the feat/importer-forrt-backbone branch June 28, 2026 11:06