37 changes: 37 additions & 0 deletions .github/workflows/test.yml
@@ -35,6 +35,7 @@ jobs:
integrations-crewai: ${{ steps.filter.outputs.integrations-crewai }}
integrations-litellm: ${{ steps.filter.outputs.integrations-litellm }}
integrations-pydantic-ai: ${{ steps.filter.outputs.integrations-pydantic-ai }}
integrations-hermes: ${{ steps.filter.outputs.integrations-hermes }}
dev: ${{ steps.filter.outputs.dev }}
ci: ${{ steps.filter.outputs.ci }}
steps:
@@ -93,6 +94,8 @@ jobs:
- 'hindsight-integrations/litellm/**'
integrations-pydantic-ai:
- 'hindsight-integrations/pydantic-ai/**'
integrations-hermes:
- 'hindsight-integrations/hermes/**'
dev:
- 'hindsight-dev/**'
ci:
@@ -1586,6 +1589,40 @@ jobs:
        working-directory: ./hindsight-integrations/pydantic-ai
        run: uv run pytest tests -v

  test-hermes-integration:
    needs: [detect-changes]
    if: >-
      github.event_name == 'workflow_dispatch' ||
      needs.detect-changes.outputs.integrations-hermes == 'true' ||
      needs.detect-changes.outputs.ci == 'true'
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v6

      - name: Install uv
        uses: astral-sh/setup-uv@v7
        with:
          enable-cache: true
          prune-cache: false

      - name: Set up Python
        uses: actions/setup-python@v6
        with:
          python-version-file: ".python-version"

      - name: Build hermes integration
        working-directory: ./hindsight-integrations/hermes
        run: uv build

      - name: Install dependencies
        working-directory: ./hindsight-integrations/hermes
        run: uv sync --frozen

      - name: Run tests
        working-directory: ./hindsight-integrations/hermes
        run: uv run pytest tests -v

  test-pip-slim:
    needs: [detect-changes]
    if: >-
280 changes: 118 additions & 162 deletions hindsight-docs/docs/sdks/integrations/hermes.md
@@ -4,216 +4,172 @@ sidebar_position: 10

# Hermes Agent

Hindsight memory integration for [Hermes Agent](https://github.com/NousResearch/hermes-agent). Gives your Hermes agent persistent long-term memory via retain, recall, and reflect tools.
Persistent long-term memory for [Hermes Agent](https://github.com/NousResearch/hermes-agent) using [Hindsight](https://vectorize.io/hindsight). The plugin automatically recalls relevant context before every LLM call and retains conversations for future sessions; it also registers explicit retain/recall/reflect tools.

## What it does

This package registers three tools into Hermes via its plugin system:

- **`hindsight_retain`** — Stores information to long-term memory. Hermes calls this when the user shares facts, preferences, or anything worth remembering.
- **`hindsight_recall`** — Searches long-term memory for relevant information. Returns a numbered list of matching memories.
- **`hindsight_reflect`** — Synthesizes a thoughtful answer from stored memories. Use this when you want Hermes to reason over what it knows rather than return raw facts.

These tools appear under the `[hindsight]` toolset in Hermes's `/tools` list.

## Setup

### 1. Install hindsight-hermes into the Hermes venv

The package must be installed in the **same Python environment** that Hermes runs in, so the entry point is discoverable.
## Quick Start

```bash
# 1. Install the plugin into Hermes's Python environment
uv pip install hindsight-hermes --python $HOME/.hermes/hermes-agent/venv/bin/python

# 2. Configure (choose one)
# Option A: Config file (recommended)
mkdir -p ~/.hindsight
cat > ~/.hindsight/hermes.json << 'EOF'
{
"hindsightApiUrl": "http://localhost:9077",
"bankId": "hermes"
}
EOF

# Option B: Environment variables
export HINDSIGHT_API_URL=http://localhost:9077
export HINDSIGHT_BANK_ID=hermes

# 3. Start Hermes — the plugin activates automatically
hermes
```

### 2. Set environment variables
## Features

The plugin reads its configuration from environment variables. Set these before launching Hermes:
- **Auto-recall**: on every turn, queries Hindsight for relevant memories and injects them into the system prompt (via the `pre_llm_call` hook)
- **Auto-retain**: after every response, retains the user/assistant exchange to Hindsight (via the `post_llm_call` hook)
- **Explicit tools**: `hindsight_retain`, `hindsight_recall`, and `hindsight_reflect` for direct model control
- **Config file**: `~/.hindsight/hermes.json`, with the same field names as the openclaw and claude-code integrations
- **Zero config overhead**: environment variables still work as overrides, so CI/automation can run without a config file

```bash
# Required — tells the plugin where Hindsight is running
export HINDSIGHT_API_URL=http://localhost:8888
:::note
The lifecycle hooks (`pre_llm_call`/`post_llm_call`) require a hermes-agent version that includes [PR #2823](https://github.com/NousResearch/hermes-agent/pull/2823). On older versions, only the three tools are registered and the hooks are silently skipped.
:::

# Required — the memory bank to read/write. Think of this as a "brain" for one user or agent.
export HINDSIGHT_BANK_ID=my-agent
## Architecture

# Optional — only needed if using Hindsight Cloud (https://api.hindsight.vectorize.io)
export HINDSIGHT_API_KEY=your-api-key
The plugin registers via Hermes's `hermes_agent.plugins` entry point system:

# Optional — recall budget: low (fast), mid (default), high (thorough)
export HINDSIGHT_BUDGET=mid
```
| Component | Purpose |
|-----------|---------|
| `pre_llm_call` hook | **Auto-recall** — query memories, inject as ephemeral system prompt context |
| `post_llm_call` hook | **Auto-retain** — store user/assistant exchange to Hindsight |
| `hindsight_retain` tool | Explicit memory storage (model-initiated) |
| `hindsight_recall` tool | Explicit memory search (model-initiated) |
| `hindsight_reflect` tool | LLM-synthesized answer from stored memories |

If neither `HINDSIGHT_API_URL` nor `HINDSIGHT_API_KEY` is set, the plugin silently skips registration — Hermes starts normally without the Hindsight tools.
## Connection Modes

### 3. Disable Hermes's built-in memory tool
### 1. External API (recommended for production)

Hermes has its own `memory` tool that saves to local files (`~/.hermes/`). If both are active, the LLM tends to prefer the built-in one since it's familiar. Disable it so the LLM uses Hindsight instead:
Connect to a running Hindsight server (cloud or self-hosted). No local LLM needed — the server handles fact extraction.

```bash
hermes tools disable memory
```json
{
"hindsightApiUrl": "https://your-hindsight-server.com",
"hindsightApiToken": "your-token",
"bankId": "hermes"
}
```

This persists across sessions. You can re-enable it later with `hermes tools enable memory`.

### 4. Start Hindsight API
### 2. Local Daemon

Follow the [Quick Start](/developer/api/quickstart) guide to get the Hindsight API running, then come back here.
If you're running `hindsight-embed` locally, point the plugin at it:

### 5. Launch Hermes

```bash
hermes
```json
{
"hindsightApiUrl": "http://localhost:9077",
"bankId": "hermes"
}
```

Verify the plugin loaded by typing `/tools` — you should see:
Follow the [Quick Start](/developer/api/quickstart) guide to get the Hindsight API running.

```
[hindsight]
* hindsight_recall - Search long-term memory for relevant information.
* hindsight_reflect - Synthesize a thoughtful answer from long-term memories.
* hindsight_retain - Store information to long-term memory for later retrieval.
```
## Configuration

### 6. Test it
All settings are in `~/.hindsight/hermes.json`. Every setting can also be overridden via environment variables (env vars take priority).
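
The precedence described above (config file first, environment variables on top) can be sketched roughly as follows. This is an illustration only, not the plugin's actual loader: `load_settings`, `ENV_OVERRIDES`, and `DEFAULTS` are hypothetical names, only a few of the documented settings are mapped, and the order in which `HINDSIGHT_API_TOKEN` and `HINDSIGHT_API_KEY` are checked is an assumption.

```python
import json
import os
from pathlib import Path

# Hypothetical mapping from documented setting names to their env vars.
# Where two env vars are documented, the first one that is set wins here
# (that ordering is an assumption, not confirmed plugin behaviour).
ENV_OVERRIDES = {
    "hindsightApiUrl": ("HINDSIGHT_API_URL",),
    "hindsightApiToken": ("HINDSIGHT_API_TOKEN", "HINDSIGHT_API_KEY"),
    "bankId": ("HINDSIGHT_BANK_ID",),
    "recallBudget": ("HINDSIGHT_RECALL_BUDGET",),
}

# A subset of the documented defaults.
DEFAULTS = {"hindsightApiToken": None, "recallBudget": "mid"}


def load_settings(path=None, env=None):
    """Merge defaults, then the JSON config file, then env var overrides."""
    path = Path(path) if path else Path.home() / ".hindsight" / "hermes.json"
    env = os.environ if env is None else env

    settings = dict(DEFAULTS)
    if path.is_file():
        settings.update(json.loads(path.read_text(encoding="utf-8")))

    # Environment variables take priority over file values.
    for key, names in ENV_OVERRIDES.items():
        for name in names:
            if env.get(name):
                settings[key] = env[name]
                break
    return settings
```

With a file that sets `bankId` to `hermes` and an environment that sets `HINDSIGHT_BANK_ID=ci`, this sketch resolves `bankId` to `ci`.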

**Store a memory:**
> Remember that my favourite colour is red
### Connection & Daemon

You should see `⚡ hindsight` in the response, confirming it called `hindsight_retain`.
| Setting | Default | Env Var | Description |
|---------|---------|---------|-------------|
| `hindsightApiUrl` | — | `HINDSIGHT_API_URL` | Hindsight API URL |
| `hindsightApiToken` | `null` | `HINDSIGHT_API_TOKEN` / `HINDSIGHT_API_KEY` | Auth token for API |
| `apiPort` | `9077` | `HINDSIGHT_API_PORT` | Port for local Hindsight daemon |
| `daemonIdleTimeout` | `0` | `HINDSIGHT_DAEMON_IDLE_TIMEOUT` | Seconds before idle daemon shuts down (0 = never) |
| `embedVersion` | `"latest"` | `HINDSIGHT_EMBED_VERSION` | `hindsight-embed` version for `uvx` |

**Recall a memory:**
> What's my favourite colour?
### LLM Provider (daemon mode only)

**Reflect on memories:**
> Based on what you know about me, suggest a colour scheme for my IDE
| Setting | Default | Env Var | Description |
|---------|---------|---------|-------------|
| `llmProvider` | auto-detect | `HINDSIGHT_LLM_PROVIDER` | LLM provider: `openai`, `anthropic`, `gemini`, `groq`, `ollama` |
| `llmModel` | provider default | `HINDSIGHT_LLM_MODEL` | Model override |

This calls `hindsight_reflect`, which synthesizes a response from all stored memories.
### Memory Bank

**Verify via API:**
| Setting | Default | Env Var | Description |
|---------|---------|---------|-------------|
| `bankId` | — | `HINDSIGHT_BANK_ID` | Memory bank ID |
| `bankMission` | `""` | `HINDSIGHT_BANK_MISSION` | Agent identity/purpose for the memory bank |
| `retainMission` | `null` | — | Custom retain mission (what to extract from conversations) |
| `bankIdPrefix` | `""` | — | Prefix for all bank IDs |

```bash
curl -s http://localhost:8888/v1/default/banks/my-agent/memories/recall \
-H "Content-Type: application/json" \
-d '{"query": "favourite colour", "budget": "low"}' | python3 -m json.tool
```
### Auto-Recall

## Troubleshooting
| Setting | Default | Env Var | Description |
|---------|---------|---------|-------------|
| `autoRecall` | `true` | `HINDSIGHT_AUTO_RECALL` | Enable automatic memory recall via `pre_llm_call` hook |
| `recallBudget` | `"mid"` | `HINDSIGHT_RECALL_BUDGET` | Recall effort: `low`, `mid`, `high` |
| `recallMaxTokens` | `4096` | `HINDSIGHT_RECALL_MAX_TOKENS` | Max tokens in recall response |
| `recallMaxQueryChars` | `800` | `HINDSIGHT_RECALL_MAX_QUERY_CHARS` | Max chars of user message used as query |
| `recallPromptPreamble` | see below | — | Header text injected before recalled memories |

### Tools don't appear in `/tools`
Default preamble:
> Relevant memories from past conversations (prioritize recent when conflicting). Only use memories that are directly useful to continue this conversation; ignore the rest:
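
To make the auto-recall settings concrete, here is a rough sketch of how a query could be trimmed to `recallMaxQueryChars` and how recalled memories could be rendered under the preamble. The function names are hypothetical, and both the truncation side (keeping the start of the message) and the numbered-list layout are assumptions rather than confirmed plugin behaviour.

```python
DEFAULT_PREAMBLE = (
    "Relevant memories from past conversations (prioritize recent when "
    "conflicting). Only use memories that are directly useful to continue "
    "this conversation; ignore the rest:"
)


def build_recall_query(user_message, max_query_chars=800):
    """Trim the user message to the configured query length.

    Assumption: the first `max_query_chars` characters are kept.
    """
    return user_message.strip()[:max_query_chars]


def format_recall_block(memories, preamble=DEFAULT_PREAMBLE):
    """Render recalled memories as text to add to the system prompt.

    Returns an empty string when nothing was recalled, so the caller can
    skip injection entirely.
    """
    if not memories:
        return ""
    lines = [f"{i}. {text}" for i, text in enumerate(memories, start=1)]
    return preamble + "\n" + "\n".join(lines)
```

Returning an empty string for an empty result keeps the system prompt unchanged on turns where Hindsight has nothing relevant.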

1. **Check the plugin is installed in the right venv.** Run this from the Hermes venv:
```bash
python -c "from hindsight_hermes import register; print('OK')"
```
### Auto-Retain

2. **Check the entry point is registered:**
```bash
python -c "
import importlib.metadata
eps = importlib.metadata.entry_points(group='hermes_agent.plugins')
print(list(eps))
"
```
You should see `EntryPoint(name='hindsight', value='hindsight_hermes', group='hermes_agent.plugins')`.
| Setting | Default | Env Var | Description |
|---------|---------|---------|-------------|
| `autoRetain` | `true` | `HINDSIGHT_AUTO_RETAIN` | Enable automatic retention via `post_llm_call` hook |
| `retainEveryNTurns` | `1` | — | Retain every Nth turn |
| `retainOverlapTurns` | `2` | — | Extra overlap turns for continuity |
| `retainRoles` | `["user", "assistant"]` | — | Which message roles to retain |
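
One plausible reading of how `retainEveryNTurns`, `retainOverlapTurns`, and `retainRoles` interact is sketched below. The helper `turns_to_retain` is hypothetical, and the interpretation is an assumption: retention fires on every Nth turn and sends the last N turns plus the overlap turns, where a turn starts at each user message.

```python
def turns_to_retain(messages, turn_count, every_n=1, overlap=2,
                    roles=("user", "assistant")):
    """Return the messages to send to Hindsight after turn `turn_count`.

    `messages` is an ordered list of {"role": ..., "content": ...} dicts.
    Assumption: a "turn" begins at each user message.
    """
    if every_n < 1 or turn_count % every_n != 0:
        return []  # not a retain turn

    # Keep only the configured roles (drops tool/system messages by default).
    kept = [m for m in messages if m["role"] in roles]

    # Indices where each turn starts (one per user message).
    starts = [i for i, m in enumerate(kept) if m["role"] == "user"]
    window = every_n + overlap
    if len(starts) <= window:
        return kept
    return kept[starts[-window]:]
```

With the defaults (`every_n=1`, `overlap=2`), each retain call under this reading carries the current turn plus the two before it, which gives the extractor some surrounding context.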

3. **Check env vars are set.** The plugin skips registration silently if `HINDSIGHT_API_URL` and `HINDSIGHT_API_KEY` are both unset.
### Miscellaneous

### Hermes uses built-in memory instead of Hindsight
| Setting | Default | Env Var | Description |
|---------|---------|---------|-------------|
| `debug` | `false` | `HINDSIGHT_DEBUG` | Enable debug logging to stderr |

Run `hermes tools disable memory` and restart. The built-in `memory` tool and Hindsight tools have overlapping purposes — the LLM will prefer whichever it's more familiar with, which is usually the built-in one.
## Hermes Gateway (Telegram, Discord, Slack)

### Bank not found errors
When Hermes runs in gateway mode (multi-platform messaging), the plugin works across all platforms. Hermes creates a fresh `AIAgent` per message, and the plugin's `pre_llm_call` hook recalls relevant memories on each turn regardless of platform.

The plugin auto-creates banks on first use. If you see bank errors, check that the Hindsight API is running and `HINDSIGHT_API_URL` is correct.
## Disabling Hermes's Built-in Memory

### Connection refused
Hermes has a built-in `memory` tool that saves to local markdown files. If both are active, the LLM may prefer the built-in one. Disable it:

Make sure the Hindsight API is running and listening on the URL you configured. Test with:
```bash
curl http://localhost:8888/health
```

## Manual registration (advanced)

If you don't want to use the plugin system, you can register tools directly in a Hermes startup script or custom agent:

```python
from hindsight_hermes import register_tools

register_tools(
bank_id="my-agent",
hindsight_api_url="http://localhost:8888",
budget="mid",
tags=["hermes"], # applied to all retained memories
recall_tags=["hermes"], # filter recall to only these tags
)
```

This imports `tools.registry` from Hermes at call time and registers the three tools directly. This approach gives you more control over parameters but requires Hermes to be importable.

## Memory instructions (system prompt injection)

Pre-recall memories at startup and inject them into the system prompt, so the agent starts every conversation with relevant context:

```python
from hindsight_hermes import memory_instructions

context = memory_instructions(
bank_id="my-agent",
hindsight_api_url="http://localhost:8888",
query="user preferences and important context",
budget="low",
max_results=5,
)
# Returns:
# Relevant memories:
# 1. User's favourite colour is red
# 2. User prefers dark mode
hermes tools disable memory
```

This never raises — if the API is down or no memories exist, it returns an empty string.

## Global configuration (advanced)

Instead of passing parameters to every call, configure once:
Re-enable later with `hermes tools enable memory`.

```python
from hindsight_hermes import configure
## Troubleshooting

configure(
hindsight_api_url="http://localhost:8888",
api_key="your-key",
budget="mid",
tags=["hermes"],
)
**Plugin not loading**: Verify the entry point is registered:
```bash
python -c "
import importlib.metadata
eps = importlib.metadata.entry_points(group='hermes_agent.plugins')
print(list(eps))
"
```
You should see `EntryPoint(name='hindsight', value='hindsight_hermes', ...)`.

Subsequent calls to `register_tools()` or `memory_instructions()` will use these defaults if no explicit values are provided.

## MCP alternative
**Tools don't appear in `/tools`**: Check that `hindsightApiUrl` (or `HINDSIGHT_API_URL`) is set. The plugin silently skips registration when unconfigured.

Hermes also supports MCP servers natively. You can use Hindsight's MCP server directly instead of this plugin — no `hindsight-hermes` package needed:

```yaml
# In your Hermes config
mcp_servers:
- name: hindsight
url: http://localhost:8888/mcp
**Connection refused**: Verify the Hindsight API is running:
```bash
curl http://localhost:9077/health
```
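
The same reachability check can be scripted, for example in a startup or CI step. The sketch below uses only the standard library; `hindsight_is_reachable` is a hypothetical helper, and treating HTTP 200 from `/health` as "healthy" is an assumption based on the `curl` command above.

```python
import urllib.error
import urllib.request


def hindsight_is_reachable(base_url, timeout=3.0):
    """Return True if GET {base_url}/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, DNS failure, timeout, HTTP error, or bad URL.
        return False
```

A `False` result for `http://localhost:9077` usually means the daemon is not running or is listening on a different port than the one in `hindsightApiUrl`.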

This exposes the same retain/recall/reflect operations through Hermes's MCP integration. The tradeoff is that MCP tools may have different naming and the LLM needs to discover them, whereas the plugin registers tools with Hermes-native schemas.

## Configuration reference

| Parameter | Env Var | Default | Description |
|-----------|---------|---------|-------------|
| `hindsight_api_url` | `HINDSIGHT_API_URL` | `https://api.hindsight.vectorize.io` | Hindsight API URL |
| `api_key` | `HINDSIGHT_API_KEY` | — | API key for authentication |
| `bank_id` | `HINDSIGHT_BANK_ID` | — | Memory bank ID |
| `budget` | `HINDSIGHT_BUDGET` | `mid` | Recall budget (low/mid/high) |
| `max_tokens` | — | `4096` | Max tokens for recall results |
| `tags` | — | — | Tags applied when storing memories |
| `recall_tags` | — | — | Tags to filter recall results |
| `recall_tags_match` | — | `any` | Tag matching mode (any/all/any_strict/all_strict) |
| `toolset` | — | `hindsight` | Hermes toolset group name |
**Recall returning no memories**: Recall can only return what has already been retained, so at least one retain cycle must complete first. Try storing a fact, then asking about it in a new session.