feat(skills): add hindsight-memory skill for persistent agent memory across sandbox sessions #371

@benfrank241

Problem Statement

Sandboxed agents are amnesiac by design — when a sandbox is destroyed, everything the agent learned is lost. This creates a poor experience for iterative workflows where agents work on the same project across multiple sessions. Agents repeatedly rediscover the same conventions, re-encounter the same bugs, and lose context about in-progress work.

Hindsight is an open-source agent memory system that provides long-term memory using biomimetic data structures (world facts, experience facts, mental models). Integrating it as an OpenShell skill would give sandboxed agents persistent memory without compromising sandbox isolation.

Technical Context

OpenShell's existing skill and provider architecture already supports this integration with no core changes required. The 17 existing skills all focus on agent actions (build, debug, review, triage) — none address agent cognition or continuity. This would be the first skill in that category.

Affected Components

| Component | Key Files | Role |
| --- | --- | --- |
| Agent skills | .agents/skills/ | Skill discovery and loading |
| Provider registry | crates/openshell-providers/src/ | Credential injection into sandboxes |
| Policy engine | crates/openshell-policy/ | Network egress control for Hindsight API access |

Technical Investigation

Architecture Overview

The integration touches three layers of OpenShell, all using existing extension points:

  1. Agent Skill (.agents/skills/hindsight-memory/SKILL.md): teaches agents the recall/retain/reflect workflow pattern. This is purely markdown, following the same structure as openshell-cli, debug-inference, and generate-sandbox-policy.

  2. Provider — Hindsight credentials (HINDSIGHT_API_KEY, HINDSIGHT_API_URL) are injected via the existing generic provider type. No new Rust provider plugin is required for the initial implementation. A first-class hindsight provider type could be added later for auto-discovery.

  3. Network Policy: a read-write policy allowing the hindsight binary to reach the Hindsight API on port 443. This uses the existing policy schema with no extensions needed. Choosing read-write rather than full blocks DELETE operations, in line with least-privilege principles.
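
To make the read-write versus full distinction concrete, here is a minimal sketch of the method check as this issue describes it. The `Access` and `Method` types and the `is_allowed` function are hypothetical names made up for illustration, not OpenShell's policy engine, and the sketch covers only the three methods the issue mentions.

```rust
/// HTTP methods named in this issue; other methods are out of scope for the sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Method {
    Get,
    Post,
    Delete,
}

/// Access presets named in this issue. Hypothetical type, not OpenShell's own.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Access {
    ReadWrite,
    Full,
}

/// Returns true when `access` lets a request with `method` through.
pub fn is_allowed(access: Access, method: Method) -> bool {
    match (access, method) {
        // `full` places no restriction on the method.
        (Access::Full, _) => true,
        // `read-write` permits reads and writes but not deletion.
        (Access::ReadWrite, Method::Get | Method::Post) => true,
        (Access::ReadWrite, Method::Delete) => false,
    }
}
```

Under this model an agent can recall and retain memories, but a compromised or confused agent cannot delete them through the sandbox.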

Code References

| Location | Description |
| --- | --- |
| .agents/skills/*/SKILL.md | Existing skill format: YAML frontmatter with name/description, numbered workflows, companion skills table |
| crates/openshell-providers/src/providers/generic.rs | Generic provider: supports arbitrary credential injection today |
| architecture/security-policy.md | Policy schema reference: read-write preset allows GET/POST but not DELETE |
| crates/openshell-providers/src/lib.rs:45-67 | ProviderPlugin trait: extension point for a future first-class hindsight provider |

Current Behavior

Agents in sandboxes have no mechanism for persisting knowledge across sessions. Each sandbox starts with a clean environment. The only continuity comes from the git repository itself and whatever the user manually communicates.

What Would Need to Change

Phase 1 (skill + policy, no Rust):

  • Add .agents/skills/hindsight-memory/SKILL.md with recall/retain/reflect workflows
  • Add .agents/skills/hindsight-memory/cli-reference.md with Hindsight CLI command reference
  • Add .agents/skills/hindsight-memory/example-policy.yaml with a complete sandbox policy template
  • Update CONTRIBUTING.md skills table to include the new skill

Phase 2 (optional, Rust):

  • Add crates/openshell-providers/src/providers/hindsight.rs implementing ProviderPlugin
  • Register in ProviderRegistry::new() and normalize_provider_type()
  • Auto-discover HINDSIGHT_API_KEY and HINDSIGHT_API_URL from the environment
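
As a rough illustration of what Phase 2 discovery could look like, the sketch below reads both variables through an injected lookup function so it can be exercised without touching the real environment. It does not use the actual ProviderPlugin trait or ProviderDiscoverySpec, whose signatures are not reproduced here. `HindsightCredentials`, `discover_hindsight`, and `map_lookup` are hypothetical names, and treating both variables as required is an assumption of the sketch.

```rust
use std::collections::HashMap;

/// Credentials named in this issue. Hypothetical struct, not an OpenShell type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct HindsightCredentials {
    pub api_key: String,
    pub api_url: String,
}

/// Looks up both variables through `lookup` and returns them only when
/// both are set and non-empty (an assumption made for this sketch).
pub fn discover_hindsight<F>(lookup: F) -> Option<HindsightCredentials>
where
    F: Fn(&str) -> Option<String>,
{
    // Treat a blank value the same as a missing one.
    let non_empty = |name: &str| lookup(name).filter(|v| !v.trim().is_empty());
    Some(HindsightCredentials {
        api_key: non_empty("HINDSIGHT_API_KEY")?,
        api_url: non_empty("HINDSIGHT_API_URL")?,
    })
}

/// Builds a lookup over a fixed map, standing in for a mock discovery context.
pub fn map_lookup(vars: HashMap<String, String>) -> impl Fn(&str) -> Option<String> {
    move |name| vars.get(name).cloned()
}
```

Against the real environment the call would be `discover_hindsight(|name| std::env::var(name).ok())`. The map-backed lookup plays the role a mock context would play in tests.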

Patterns to Follow

  • Skill structure matches openshell-cli (SKILL.md + reference doc) and generate-sandbox-policy (SKILL.md + examples)
  • Policy YAML follows the schema in architecture/security-policy.md
  • Provider plugin follows the pattern in anthropic.rs (simple env var discovery via ProviderDiscoverySpec)

Proposed Approach

Start with Phase 1: a pure-markdown skill that teaches agents the Hindsight recall/retain/reflect workflow, paired with an example network policy. This requires no Rust changes and fills a unique gap in the skill catalog — agent memory and cross-session continuity. A first-class Rust provider can follow in Phase 2 once the skill proves useful.

Scope Assessment

  • Complexity: Low
  • Confidence: High — uses only existing extension points
  • Estimated files to change: 4 (3 new skill files + CONTRIBUTING.md update)
  • Issue type: feat

Risks & Open Questions

  • Should the skill be bundled with OpenShell or live in the Hindsight repo? Bundling it here makes it discoverable alongside other skills; keeping it external reduces maintenance burden on OpenShell maintainers.
  • Should the example policy use access: read-write or explicit rules with specific API paths? read-write is simpler; explicit rules are more restrictive but couple the policy to the Hindsight API surface.
  • Should the Hindsight CLI be included in the base sandbox image, or left to custom images?
  • The Hindsight API host (api.hindsight.vectorize.io) belongs to the hosted service. Self-hosted users would need to change the host and possibly add allowed_ips. The skill should document both paths.
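
The trade-off in the second question above can be sketched as follows. With explicit rules, every method and path the agent uses has to be listed, so any endpoint missing from the list is denied until the policy is updated. The `Rule` shape, the `rules_allow` function, and the `/example/...` paths are all invented placeholders. They reflect neither the real Hindsight API surface nor OpenShell's policy schema.

```rust
/// One explicit allow rule: an HTTP method plus a path prefix.
/// Hypothetical shape, not OpenShell's policy schema.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Rule {
    pub method: &'static str,
    pub path_prefix: &'static str,
}

/// Returns true when some rule matches both the method (case-insensitively)
/// and the start of the request path.
pub fn rules_allow(rules: &[Rule], method: &str, path: &str) -> bool {
    rules
        .iter()
        .any(|r| r.method.eq_ignore_ascii_case(method) && path.starts_with(r.path_prefix))
}

/// Placeholder rules; the paths are invented for illustration only.
pub fn example_rules() -> Vec<Rule> {
    vec![
        Rule { method: "POST", path_prefix: "/example/recall" },
        Rule { method: "POST", path_prefix: "/example/retain" },
    ]
}
```

In this sketch a POST to `/example/reflect` is rejected because no rule lists it. That is the coupling cost: each new API operation needs a matching policy change, whereas access: read-write would admit it automatically.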

Test Considerations

  • Skill files are markdown — no unit tests needed
  • Policy YAML should be validated against the policy schema (manually or via generate-sandbox-policy)
  • End-to-end testing would require a running Hindsight instance and a sandbox with the policy applied
  • A future provider plugin would follow the existing test pattern in anthropic.rs with MockDiscoveryContext

Created by spike investigation. Use build-from-issue to plan and implement.

Labels: wontfix (This will not be worked on)
