Add akd_ext.artifacts: Artifact model + ArtifactStore ABC #75

Merged
NISH1001 merged 7 commits into develop from feature/artifacts-structure on Apr 21, 2026

NISH1001 commented on Apr 21, 2026

Summary

Introduces akd_ext.artifacts — a generic Artifact[T] pydantic model and ArtifactStore[T] ABC that will be the foundation for dynamic, context-loading agents. Artifacts are path-keyed markdown files (or any T-typed content) organized hierarchically; stores are pluggable backends (local filesystem, GitHub, S3, DB) that all share the same POSIX-style path semantics.

This PR lands the abstractions only; concrete backends come in follow-up PRs.

The ArtifactStore is designed to be injected directly into a pydantic-ai agent and its read_artifact / list_artifacts exposed as tools — so the agent can dynamically load only the context it needs at any step, instead of baking everything into the system prompt up front.

Broader goal: Collaborative Agent Reasoning Engineering (CARE). SMEs + devs + helper agents co-design an agent's artifacts in akd-labs (DB-backed), then auto-publish to nasa-impact/akd/agents/<x>/ on GitHub. A dynamic-context agent in akd-ext loads those artifacts from whichever backend it's pointed at — same slug-based paths, different backend.

What

  • Artifact[T] — generic pydantic model with path, name, description, content: T, metadata, created_at, updated_at. Path validator rejects .. segments / null bytes / empty paths and normalizes //, ./, leading /.
  • ArtifactStore[T] — ABC exposing:
    • Abstract (backend-specific I/O): load_artifacts, read_artifact, write_artifact. load_artifacts (and the concrete refresh helper) return Self for fluent chaining.
    • Concrete (cache-based, backend-free): refresh, list_artifacts(prefix), keys(prefix), index_for(dir_path) (returns the per-backend overview file: index.md / README.md / SKILL.md), mapping dunders (__getitem__ / __setitem__ / __delitem__ / __contains__ / __len__ / __iter__), and a __str__ that renders a flat bulleted list of paths, which drops straight into an LLM system prompt.
  • Path is a slug but treated like a POSIX filesystem path across all backends. Dict keys stay str for portability; PurePosixPath is used transiently for manipulation.
  • akd_ext/artifacts/stores/ stubbed as a subpackage so future backends each land in their own file with independent optional deps.
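
The path rules in the Artifact[T] bullet above can be sketched with the standard library alone. This is a minimal illustration, not the validator shipped in this PR: the function name normalize_artifact_path and the error messages are hypothetical, and only the reject/normalize behaviour described above is taken from the PR.

```python
from pathlib import PurePosixPath


def normalize_artifact_path(raw: str) -> str:
    """Hypothetical helper mirroring the path rules described in this PR."""
    if not raw or not raw.strip():
        raise ValueError("artifact path must not be empty")
    if "\x00" in raw:
        raise ValueError("artifact path must not contain null bytes")
    # Reject '..' before normalizing, so traversal is refused rather than resolved.
    if ".." in raw.split("/"):
        raise ValueError("artifact path must not contain '..' segments")
    # PurePosixPath collapses '//' and drops '.' segments; the leading '/'
    # is stripped afterwards because slugs are always relative.
    normalized = PurePosixPath(raw).as_posix().lstrip("/")
    if normalized in ("", "."):
        raise ValueError("artifact path must not be empty after normalization")
    return normalized
```

The result stays a plain str, so it can be used as a dict key; PurePosixPath is only used transiently for the manipulation.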

Usage

Implementing a concrete store

Subclass ArtifactStore[T] and implement the three abstracts. A runnable walkthrough is in notebooks/marimo/artifacts.py. Minimal in-memory example:

from akd_ext.artifacts import Artifact, ArtifactStore


class MemoryArtifactStore(ArtifactStore[str]):
    async def load_artifacts(self):
        # Backend-specific: walk fs / call GitHub / SELECT from DB,
        # then populate self[path] = Artifact(...)
        return self

    async def read_artifact(self, path: str) -> Artifact[str]:
        return self[path]

    async def write_artifact(self, artifact: Artifact[str]) -> Artifact[str]:
        self[artifact.path] = artifact
        return artifact


# Top-level await: run this in a notebook cell or inside an async function.
store = await MemoryArtifactStore(
    root="tmp/", index_file="index.md"
).load_artifacts()

await store.write_artifact(
    Artifact(path="tmp/index.md", content="# Overview")
)
art = await store.read_artifact("tmp/index.md")
idx = store.index_for("tmp")          # directory-level overview
print(store)                           # bulleted list, LLM-ready

Each concrete store interprets root natively: filesystem path for LocalDir, owner/repo/sub/path for GitHub, bucket/prefix for S3, path-prefix filter for DB. Slugs returned are always relative to root — agent code stays backend-agnostic.
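
To make "slugs are always relative to root" concrete, here is a rough sketch of how a filesystem backend might derive slugs and guard against traversal. LocalDirArtifactStore is not part of this PR, so the helper names below (slug_for, resolve_under_root, discover_slugs) and the "*.md" glob are assumptions for illustration only; the is_relative_to(root) check follows the approach mentioned in the commit notes.

```python
from pathlib import Path


def slug_for(root: Path, file_path: Path) -> str:
    """Return the POSIX-style slug of file_path relative to root."""
    return file_path.resolve().relative_to(root.resolve()).as_posix()


def resolve_under_root(root: Path, slug: str) -> Path:
    """Map a slug back to a filesystem path, refusing anything outside root."""
    root = root.resolve()
    candidate = (root / slug).resolve()
    # Backend-specific traversal defense (Path.is_relative_to needs Python 3.9+).
    if not candidate.is_relative_to(root):
        raise ValueError(f"{slug!r} escapes the store root")
    return candidate


def discover_slugs(root: Path, pattern: str = "*.md") -> list[str]:
    """Walk root recursively and return sorted, root-relative slugs."""
    return sorted(slug_for(root, p) for p in root.rglob(pattern) if p.is_file())
```

A GitHub, S3, or DB backend would swap the directory walk for its own listing call but hand back the same kind of slug, which is what keeps agent code backend-agnostic.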

Injecting into a pydantic-ai agent (follow-up, depends on #74)

Once the pydantic-ai base agent merges, the store plugs straight in:

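from pydantic_ai import RunContext

from akd_ext.artifacts import ArtifactStore

# `agent` is assumed to be a pydantic_ai.Agent created with deps_type=ArtifactStore.


@agent.system_prompt
async def system_prompt(ctx: RunContext[ArtifactStore]) -> str:
    store = ctx.deps
    overview = store.index_for()  # root-level overview artifact; may be None
    return (
        f"{overview.content if overview else ''}\n\n---\n\n"
        f"## Available artifacts\n{store}\n\n"
        "Use `read_artifact(path=...)` to load any artifact."
    )


@agent.tool
async def read_artifact(ctx: RunContext[ArtifactStore], path: str) -> str:
    # Exposed to the model as a tool so it can pull in context on demand.
    return (await ctx.deps.read_artifact(path)).content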
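
For a feel of what the prompt above expands to, the following dependency-free sketch stands in for the store's cache helpers. It is not the akd_ext implementation: the module-level artifacts dict, the sample slugs, and the default arguments are all assumptions, and the bullet rendering shows paths only, following the `"\n".join(f"- {k}" for k in store.keys())` pattern quoted in the commit notes.

```python
from pathlib import PurePosixPath
from typing import Optional

# Stand-in for the store's in-memory cache: slug -> content (sample data).
artifacts = {
    "index.md": "# Overview",
    "skills/index.md": "# Skills",
    "skills/search.md": "How to search.",
}


def keys(prefix: str = "") -> list[str]:
    """Sorted slugs, optionally limited to those under a directory prefix."""
    prefix = prefix.strip("/")
    return sorted(
        k for k in artifacts
        if not prefix or k == prefix or k.startswith(prefix + "/")
    )


def index_for(dir_path: str = "", index_file: str = "index.md") -> Optional[str]:
    """Overview content for a directory; tolerates leading/trailing slashes."""
    slug = (PurePosixPath(dir_path.strip("/")) / index_file).as_posix()
    return artifacts.get(slug)


def render() -> str:
    """Flat bulleted list: each line is the exact path to pass to read_artifact."""
    return "\n".join(f"- {k}" for k in keys())


def build_system_prompt() -> str:
    overview = index_for()
    return (
        f"{overview or ''}\n\n---\n\n"
        f"## Available artifacts\n{render()}\n\n"
        "Use `read_artifact(path=...)` to load any artifact."
    )
```

Because every bullet is a complete path, the model never has to reconstruct a location from indentation before calling the tool.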

Testing

tests/artifacts/test_base.py covers the three core ops (load, read, write) via a minimal in-memory subclass. Also exercised interactively in notebooks/marimo/artifacts.py.

Running uv run pytest currently errors on all tests in the repo due to a pre-existing logfire / opentelemetry-api version mismatch in the venv (orthogonal to this PR). The new tests pass cleanly when invoked directly.

Next Steps

  • LocalDirArtifactStore — stdlib-only, ada-style eager load.
  • GitHubArtifactStore — via existing PyGithub dep.
  • Pydantic-ai agent integration (optional artifact_store on PydanticAIBaseAgentConfig, tool auto-wiring), which depends on #74 (Add PydanticAI runtime via PydanticAIBaseAgent).
  • DBArtifactStore — lives in akd-labs.

- Holds path-keyed `_artifacts` dict as the in-memory index, inspired
  by NISH1001/ada but generalized across backends (LocalDir, GitHub,
  S3, DB)
- Abstract methods: read_artifact, write_artifact, load_artifacts
- Concrete helpers on top of the cache: refresh, list_artifacts, keys,
  and mapping dunders (__getitem__, __contains__, __len__, __iter__)
- Also export ArtifactStore alongside Artifact in the package __init__
- Add required `root` positional to __init__ so every store has a
  uniform identifier (filesystem path for LocalDir, `owner/repo/path`
  for GitHub, `bucket/prefix` for S3, path-prefix filter for DB)
- `debug` is now keyword-only after `*`
- load_artifacts and refresh return Self for fluent chaining, mirroring
  ada's `_load_artifacts` pattern — e.g.
  `store = await LocalDirArtifactStore(root).load_artifacts()`
- Add __setitem__ and __delitem__ for cache-only mutation; subclasses
  use these inside read_artifact / write_artifact to keep the cache in
  sync after real backend I/O
- Add `index_file: str | None = "index.md"` kwarg so stores can be
  configured per-backend convention (`index.md` for dev/local,
  `README.md` for GitHub, `SKILL.md` for Anthropic skills, `AGENT.md`
  for agent manifests, `None` for DB)
- Add `index_for(dir_path)` helper that returns the designated overview
  artifact for a directory using the store's convention; tolerates
  leading/trailing slashes since paths are always relative to `root`
- Use `PurePosixPath` for path joining inside the ABC so slug ops stay
  POSIX-safe across backends (no Windows backslash surprise)
- Move `debug` to the last kwarg after `index_file`
- Reject clearly broken paths at model creation time: empty/whitespace,
  null byte, `..` segments (ambiguous traversal)
- Normalize obvious oddities via PurePosixPath: strip leading `/`,
  collapse `//` and `./` sequences
- Backend-specific path-traversal defense still belongs inside each
  store (e.g. LocalDirArtifactStore checking `is_relative_to(root)`)
- Keeps the model contract strict enough to catch typos/footguns
  without over-prescribing legitimate path shapes
- Flat bulleted list of paths with descriptions where available,
  matching ada's pattern: `"\n".join(f"- {k}" for k in store.keys())`
- Each line contains the exact path the caller passes to read_artifact,
  no mental reconstruction from indentation needed (LLM-friendly)
- Drops straight into an f-string in the pydantic-ai system_prompt:
  `f"## Available artifacts\n{store}\n\nUse read_artifact(path=...)"`
- Add 3 barebone tests covering the core ArtifactStore ops (load, read,
  write) via a minimal in-memory concrete subclass
- Stub out akd_ext/artifacts/stores/ as an empty subpackage so future
  backends (LocalDir, GitHub, S3) each land in their own file and can
  gate optional deps independently
NISH1001 requested review from igaurab and sanzog03 on April 21, 2026 16:47
NISH1001 merged commit 67f191e into develop on Apr 21, 2026
NISH1001 deleted the feature/artifacts-structure branch on April 21, 2026 19:56