A walkthrough project building a decentralized baseball knowledge graph on the Geo protocol using historical data from Retrosheet.
This repo is the reference codebase for a video walkthrough series covering: data exploration, ontology design, publishing to Geo, iterating, and building an app on top.
data/parsed/ contains 14 parsed files from Retrosheet, totalling ~9.6 GB:
| File | Records | Description |
|---|---|---|
| players.json | 26,961 | Player bios (names, birth info, handedness, HoF) |
| teams.json | 293 | Franchise history |
| ballparks.json | 656 | Stadium records |
| rosters.json | 125,566 | Year-by-year roster entries |
| gamelogs.ndjson | 237,580 | Game-level summaries (1871–2025) |
| ejections.json | 19,730 | Ejection records |
| schedules.ndjson | 238,816 | Schedule entries (1877–2026) |
| batting.ndjson | 5,746,328 | Per-game batting performances |
| pitching.ndjson | 1,269,889 | Per-game pitching performances |
| fielding.ndjson | 1,738,253 | Per-game fielding performances |
| plays.ndjson | 6,515,744 | Play-by-play records |
| allplayers.ndjson | 130,791 | All player records (alternate source) |
| gameinfo.ndjson | 224,877 | Detailed game info |
| special_collections.json | 2,733 | Special game collections |
See data_samples.txt for field-level examples from every file.
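Most of the larger files are NDJSON, which holds one JSON object per line. A minimal sketch of reading that format is shown below. The helpers `parseNdjson` and `countBy` are illustrative and not part of this repo, and the `team` and `hits` fields in the sample are made up; see data_samples.txt for the real field names.

```typescript
// Sketch only: parseNdjson and countBy are illustrative helpers, not repo code.
type JsonRecord = Record<string, unknown>;

// Parse NDJSON text: one JSON value per non-empty line.
function parseNdjson(text: string): JsonRecord[] {
  const records: JsonRecord[] = [];
  text.split(/\r?\n/).forEach((raw, index) => {
    const line = raw.trim();
    if (line === "") return; // tolerate blank lines and a trailing newline
    try {
      records.push(JSON.parse(line) as JsonRecord);
    } catch (err) {
      // Report the 1-based line number so a bad record is easy to find.
      throw new Error(`Invalid JSON on line ${index + 1}: ${(err as Error).message}`);
    }
  });
  return records;
}

// Count records per value of one field (the field name is supplied by the caller).
function countBy(records: JsonRecord[], field: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const record of records) {
    const key = String(record[field]);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}

// Made-up sample rows; real files have different fields.
const sample =
  '{"team":"BOS","hits":2}\n{"team":"NYA","hits":0}\n{"team":"BOS","hits":1}\n';
const parsed = parseNdjson(sample);
console.log(parsed.length, countBy(parsed, "team").get("BOS"));
```

This version loads the whole text into memory, which is fine for small samples. For the multi-million-record files a line-by-line streaming reader would likely be a better fit.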
Retrosheet Disclosure: The information used here was obtained free of charge from and is copyrighted by Retrosheet. Interested parties may contact Retrosheet at www.retrosheet.org.
- scripts/download_retrosheet.ts: downloads all Retrosheet data (already run)
- scripts/parse_*.ts: parses raw files to JSON/NDJSON (already run)
- scripts/summarize_data.ts: validates parsed data and prints a summary

- src/constants.ts: system ontology IDs from the Geo root space
- src/functions.ts: gql(), publishOps(), entity lookup helpers
- src/entity_ops.ts: deleteEntity(), changeEntityId(), changeSpace(), mergeEntities()

- 01_api_demo.ts: GraphQL API exploration examples
- 02_publish_demo.ts: Reference pattern for publishing entities to Geo
- 03_delete_demo.ts: Delete entities from a space
- 04_delete_entity.ts: Standalone entity delete utility
- 07_entity_operations.ts: Template for running entity operations

- knowledge-graph-ontology.md: Full GRC-20 ontology spec
- spec.md: Full GRC-20 protocol spec
- docs/: SDK patterns, GraphQL API reference, entity operations, ontology IDs
- walkthrough_plan.md: Phase-by-phase walkthrough plan
- walkthrough_prompts.txt: Claude prompts for each walkthrough step
archive/ holds work from a previous run-through of the ontology design and publishing steps. It is useful as a reference when recreating the ontology.
- Bun installed
- .env file configured: PK_SW=0x... DEMO_SPACE_ID=... SW_ADDRESS=0x...
- Geo Browser: https://geobrowser.io
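A missing or malformed .env value tends to surface late, in the middle of a publish. The sketch below checks the three variables up front. `loadConfig` is a hypothetical helper, not part of this repo, and the `0x` prefix check is an assumption based only on the placeholder values shown above.

```typescript
// Sketch only: loadConfig is a hypothetical helper, not repo code.
const REQUIRED_KEYS = ["PK_SW", "DEMO_SPACE_ID", "SW_ADDRESS"] as const;
type ConfigKey = (typeof REQUIRED_KEYS)[number];
type Config = Record<ConfigKey, string>;

// Keys whose placeholder in the README starts with 0x (assumed to be hex strings).
const HEX_KEYS: readonly ConfigKey[] = ["PK_SW", "SW_ADDRESS"];

function loadConfig(env: Record<string, string | undefined>): Config {
  const values: Partial<Config> = {};
  const problems: string[] = [];
  for (const key of REQUIRED_KEYS) {
    const value = (env[key] ?? "").trim();
    if (value === "") {
      problems.push(`${key} is missing`);
    } else if (HEX_KEYS.includes(key) && !value.startsWith("0x")) {
      problems.push(`${key} should start with 0x`);
    }
    values[key] = value;
  }
  // Report every problem at once rather than failing on the first one.
  if (problems.length > 0) {
    throw new Error(`Invalid .env: ${problems.join("; ")}`);
  }
  return values as Config;
}

// Typical use at the top of a script:
//   const config = loadConfig(process.env);
```

Bun reads .env files into `process.env` on startup, so passing `process.env` should be enough when the scripts are run with `bun run`. Taking the environment as a parameter keeps the check easy to test.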
bun run scripts/summarize_data.ts # validate parsed data
bun run 01_api_demo.ts # explore the Geo API
bun run 02_publish_demo.ts # publish demo entities