tinyMem

tinyMem logo

License: MIT Go 1.25.6+ Build Status

Local, project-scoped memory system for language models with evidence-based truth validation.


tinyMem gives small and medium language models (7B–13B) reliable long-term memory in complex codebases. It sits between you and the LLM, injecting verified context and capturing validated facts, all locally, without model retraining or cloud dependencies.


πŸ” What tinyMem Is (and Isn't)

tinyMem IS:

  • A deterministic, evidence-gated memory system for LLMs in long-lived codebases
  • Lexical recall engine (FTS5) with CoVe filtering for noise reduction
  • Truth state authority enforcement preventing hallucinated facts
  • Memory governance layer that decides what is known, recalled, and trusted

tinyMem IS NOT:

  • ❌ An autonomous agent or execution engine
  • ❌ A repair/retry loop system
  • ❌ A semantic/vector search system
  • ❌ A task execution framework

Core Principle: tinyMem governs memory, not behavior. It decides what is known, never what is done.

Evidence Boundary

tinyMem records and evaluates evidence but never executes commands to gather evidence.

Clear boundary:

  • Agents execute - Run tests, build code, verify behavior
  • tinyMem records - Stores evidence results (exit codes, file existence, grep matches)
  • tinyMem evaluates - Gates fact promotion based on evidence validity

Example:

  • Agent: Runs go test ./... and gets exit code 0
  • Agent: Submits evidence cmd_exit0::go test ./... with memory
  • tinyMem: Verifies evidence format and gates fact promotion
  • tinyMem: DOES NOT re-run the command itself

This keeps tinyMem as pure memory governance, never execution.


🎯 Why tinyMem?

If you've ever used an AI for a large project, you know it eventually starts to "forget." It forgets which database you chose, it forgets the naming conventions you agreed on, and it starts making things up (hallucinating).

tinyMem is a "Hard Drive for your AI's Brain."

🧬 Evolution: From Memory to Protocol

tinyMem was initially built to solve a specific problem: improving the reliability of small, locally hosted LLMs (7B–13B). These models often suffer from "context drift," where they lose track of project decisions over long sessions.

As the project grew, we realized that memory alone wasn't enough. Reliability requires Truth Discipline. This led to the expansion of tinyMem into what it is today: a comprehensive Control Protocol that mandates evidence-based validation and strict execution phases for any agent touching a repository.

  • No more repeating yourself: "Remember, we use Go for the backend."
  • No more AI hallucinations: If the AI isn't sure, it checks its memory.
  • Total Privacy: Your project data never leaves your machine to "train" a model.

🚀 Quick Start

Get up and running in seconds.

1. Initialize

Go to your project root and bootstrap the memory system (this also downloads docs/agents/AGENT_CONTRACT*.md so your proxy/MCP layers have the exact system prompt they inject):

cd /path/to/your/project
tinymem init

If you just want to verify the installation afterward, run tinymem health.

2. Run

Start the server (choose one mode):

Option A: Proxy Mode (for generic LLM clients)

tinymem proxy
# Then point your client (e.g., OpenAI SDK) to http://localhost:8080/v1
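As a minimal sketch of Option A, the snippet below builds a request against the proxy using nothing but the Python standard library. The model name is a placeholder for whatever your backend serves; only the base URL and the OpenAI-compatible path come from the text above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/v1"  # tinymem proxy, default port shown above

def build_chat_request(prompt: str) -> urllib.request.Request:
    """Build a standard OpenAI-style chat completion request aimed at the proxy."""
    body = json.dumps({
        "model": "local-model",  # placeholder: whatever your backend serves
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# urllib.request.urlopen(build_chat_request("What database do we use?"))
```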

Option B: MCP Mode (for Claude Desktop, Cursor, VS Code)

tinymem mcp
# Configure your IDE to run this command

📦 Installation

See the Quick Start Guide for Beginners for a detailed walkthrough.

Option 1: Pre-built Binary (Recommended)

Download from the Releases Page.

macOS / Linux:

os="$(uname -s | tr '[:upper:]' '[:lower:]')"
arch="$(uname -m)"
case "$arch" in
  x86_64|amd64) arch="amd64" ;;
  aarch64|arm64) arch="arm64" ;;
  *) echo "Unsupported arch: $arch" >&2; exit 1 ;;
esac
curl -L "https://github.com/daverage/tinyMem/releases/latest/download/tinymem-${os}-${arch}" -o tinymem
chmod +x tinymem
sudo mv tinymem /usr/local/bin/

Windows: Download tinymem-windows-amd64.exe, rename to tinymem.exe, and add to your system PATH.

Option 2: Build from Source

Requires Go 1.25.6+.

git clone https://github.com/daverage/tinyMem.git
cd tinyMem
./build/build.sh   # Build only
# or
./build/build.sh patch  # Release (patch version bump)

Cross-Compilation (on Mac): To build Windows or Linux binaries on a Mac, you need C cross-compilers:

  • For Windows (Intel/AMD): brew install mingw-w64
  • For Windows (ARM64): brew install zig
  • For Linux: brew install FiloSottile/musl-cross/musl-cross (static) or brew install zig

Cross-Compilation (on Windows): To build macOS or Linux binaries on Windows, you need Zig:

  • winget install zig.zig

Tip: zig is the recommended way to enable cross-compilation for all platforms with a single tool, regardless of whether you are on Mac or Windows.

Option 3: Container Image (GHCR)

Use the GitHub Container Registry image. Replace OWNER with your GitHub username or org (for this repo, daverage).

docker pull ghcr.io/OWNER/tinymem:latest
docker run --rm ghcr.io/OWNER/tinymem:latest health

💻 Usage

CLI Commands

The tinyMem CLI is your primary way to interact with the system from your terminal.

| Command | What it is | Why use it? | Example |
| --- | --- | --- | --- |
| health | System Check | To make sure tinyMem is installed correctly and can talk to its database. | tinymem health |
| stats | Memory Overview | To see how many memories you've stored and how your tasks are progressing. | tinymem stats |
| dashboard | Visual Status | To get a quick, beautiful summary of your project's memory "health." | tinymem dashboard |
| query | Search | To find specific information you or the AI saved previously. | tinymem query "API" |
| recent | Recent History | To see the last few things tinyMem learned or recorded. | tinymem recent |
| write | Manual Note | To tell the AI something important that it should never forget. | tinymem write --type decision --summary "Use Go 1.25" |
| run | Command Wrapper | To run a script or tool (like make or npm test) while "reminding" it of project context. | tinymem run make build |
| proxy / mcp | Server Modes | To start the "brain" that connects tinyMem to your IDE or AI client. | tinymem mcp |
| doctor | Diagnostics | To fix the system if it stops working or has configuration issues. | tinymem doctor |
| init | Project Bootstrap | Creates .tinyMem, writes the config, and downloads the AGENT_CONTRACT/AGENT_CONTRACT_SMALL files into docs/agents so proxy and MCP modes can inject them without touching your README. | tinymem init |
| update | Refresh | Re-runs migrations and refreshes the configured agent contract files under docs/agents (large or small) to keep proxy/MCP injections in sync. | tinymem update |

Writing Memories

Think of writing memories as "tagging" reality for the AI.

# Record a decision so the AI doesn't suggest an alternative later
tinymem write --type decision --summary "Switching to REST" --detail "GraphQL was too complex for this scale."

# Add a simple note for yourself or the AI
tinymem write --type note --summary "The database password is in the vault, not .env"

Memory Types & Truth

| Type | Evidence Required? | Truth State | Recall Tier |
| --- | --- | --- | --- |
| Fact | ✅ Yes | Verified | Always |
| Decision | ✅ Yes (Confirmation) | Asserted | Contextual |
| Constraint | ✅ Yes | Asserted | Always |
| Claim | ❌ No | Tentative | Contextual |
| Plan | ❌ No | Tentative | Opportunistic |

Evidence types supported: file_exists, grep_hit, cmd_exit0, test_pass.
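A client-side pre-check for those four types might look like the following sketch. The type::payload framing follows the cmd_exit0::go test ./... example above; the server's own validation rules may be stricter.

```python
# The four evidence types named above; the framing is "type::payload".
EVIDENCE_TYPES = {"file_exists", "grep_hit", "cmd_exit0", "test_pass"}

def parse_evidence(evidence: str) -> tuple[str, str]:
    """Split a 'type::payload' evidence string and reject unknown types."""
    kind, sep, payload = evidence.partition("::")
    if sep == "" or not payload:
        raise ValueError(f"malformed evidence: {evidence!r}")
    if kind not in EVIDENCE_TYPES:
        raise ValueError(f"unknown evidence type: {kind!r}")
    return kind, payload
```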


πŸ“ tinyTasks: File-Authoritative Task Ledger

tinyTasks logo

tinyTasks – file-authoritative task ledger enforced by tinyMem

tinyTasks is a built-in task management system that lives alongside your code in tinyTasks.md.

What tinyTasks Is:

  • File-authoritative - tinyTasks.md is the single source of truth
  • Human-authored - Only humans create and define tasks
  • Intent ledger - Grounds what work is authorized
  • Enforcement anchor - STRICT mode refuses work without tasks

What tinyMem Does With tinyTasks:

  • Reads the ledger through a server-managed TaskManager so it can identify which subtask still has authority.
  • Enforces that only the shared TaskManager path may mutate tasks and that MCP and proxy mode both obey the same deterministic boundary.
  • Guards against unauthorized writes and false completion claims by rejecting any file-level access to tinyTasks.md that bypasses the TaskManager.
  • Reports the validated state so agents know what work remains authorized before receiving execution feedback.

What tinyMem Does NOT Do:

  • Allow an LLM to read or write tinyTasks.md directly
  • Permit mutations outside the TaskManager/API boundary
  • Automatically mark tasks complete without explicit TaskManager commands
  • Create new tasks without a human-reviewed plan that the server accepts

tinyTasks exists to ground authority, not to drive execution.

Agents request intent and task updates via the server; every add/update/complete/list operation passes through the TaskManager, is validated for prerequisites, and only then reports success to the agent.

πŸ” REDIRECTION Enforcement Prompts

tinyMem enforces the five server-controlled prompts from REDIRECTION.md. The code paths that validate these constraints are:

  1. Prompt 1 – TaskManager Ownership: internal/tasks/manager.go defines the TaskManager APIs, and both internal/server/mcp/server.go (lines 90-106) and internal/server/proxy/server.go (lines 81-110) instantiate the same manager, so every tinyTasks.md mutation flows through a single server path.
  2. Prompt 2 – Intent Interpretation: internal/intent/definition.go defines every tool's metadata, internal/server/tool_definitions.go adds that schema to each MCP tool, and internal/server/mcp/server.go#ensureIntent validates the declared category, minimum mode, recall requirement, and scope before any tool executes.
  3. Prompt 3 – Unified Enforcement: The shared gate is ensureIntent, which delegates to execution.Controller and enforcement.Recorder (internal/execution/controller.go, internal/enforcement/recorder.go), so MCP tool calls and proxy mutations share the same deterministic, policy-driven decisions.
  4. Prompt 4 – Memory Governance: internal/server/mcp/server.go#handleMemoryWrite parses the structured proposal (type, summary, detail, evidence), enforces recall/mode/evidence prerequisites, and only then persists through memory.Service, ensuring the server owns every memory write.
  5. Prompt 5 – Metadata as Protocol: intent.Definition.Metadata plus server.ToolMetadata publish machine-readable intent data that enforcement consumes directly, while the tool descriptions in internal/server/tool_definitions.go stay concise.

🔌 Integration

Proxy Mode

Intercepts standard OpenAI-compatible requests.

export OPENAI_API_BASE_URL=http://localhost:8080/v1
# Your existing scripts now use tinyMem automatically

tinymem init seeds docs/agents/AGENT_CONTRACT.md and AGENT_CONTRACT_SMALL.md, and the proxy loads the configured file at startup, injecting it as the first system message unless the client already shipped the **Start of tinyMem Protocol** marker. This means your SDKs never need to resend the contract; tinyMem enforces it once per request.

While proxying, tinyMem reports recall activity back to the client so that downstream UIs or agents can show "memory checked" indicators:

  • Streaming responses append an SSE event of type tinymem.memory_status once the upstream LLM finishes. The payload includes recall_count, recall_status (none/injected/failed), and a timestamp.
  • Non-streaming responses carry the same data via new headers: X-TinyMem-Recall-Status and X-TinyMem-Recall-Count. Agents or dashboards that read those fields can display whenever recall was applied or when the proxy skipped it.

MCP Server (IDE Integration)

Compatible with Claude Desktop, Cursor, and other MCP clients.

Claude Desktop Configuration (claude_desktop_config.json):

{
  "mcpServers": {
    "tinymem": {
      "command": "/absolute/path/to/tinymem",
      "args": ["mcp"]
    }
  }
}

Available MCP Tools:

When tinyMem is running in MCP mode, your AI agent gains a fixed set of tools for recall, enforcement, and diagnostics:

  • memory_query: Search project memories with lexical recall (PASSIVE). Always run this before writing anything in GUARDED/STRICT modes.
  • memory_recent: Fetch the latest memories to understand the current context before acting.
  • memory_run_metadata: Inspect the enforcement run metadata (execution mode, events, proven counts) to prove the protocol was followed.
  • memory_claim_success: Report whether a claimed success was observed and enforced; adversarial claims without enforcement get flagged.
  • memory_set_mode: Declare PASSIVE, GUARDED, or STRICT before mutations; this is the intent gatekeeper for all memory writes.
  • memory_check_task_authority: Ask tinyMem to read tinyTasks.md and confirm which task (if any) has authority so the agent knows what work is authorized.
  • memory_write: Create or update memories with evidence (facts require verified evidence, and recall must happen first in GUARDED/STRICT modes).
  • memory_stats & memory_health: Surface memory counts, evidence coverage, and database health for monitoring.
  • memory_doctor: Run a self-diagnostic that checks configuration, database connectivity, and enforcement hooks.
  • memory_eval_stats: Retrieve evaluation metrics that measure how well memories, tasks, and enforcement performed during a session.
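Since MCP tool invocations travel as JSON-RPC 2.0 tools/call messages, a client frames a call to one of these tools roughly as follows. The argument name query is an illustrative assumption, not the tool's published schema.

```python
import json

def mcp_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Frame an MCP tools/call invocation as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

# Recall before writing, as GUARDED/STRICT modes require:
# mcp_tool_call(1, "memory_query", {"query": "database choice"})
```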

AI Agent Directives

CRITICAL: If you are building an AI agent, you MUST include the appropriate directive in its system prompt to ensure it uses tinyMem correctly.

Quick Setup: Run tinymem init once to bootstrap .tinyMem, create config, and download the AGENT_CONTRACT/AGENT_CONTRACT_SMALL files that the proxy and MCP layers inject. tinymem update reruns migrations and refreshes the same contract files (large or small) without mutating your README.


📚 Guides & Examples

Detailed integration guides for various tools and ecosystems can be found in the examples/ directory.


πŸ— Architecture

flowchart TD
    User[LLM Client / IDE] <-->|Request/Response| Proxy[TinyMem Proxy / MCP]

    subgraph "1. Recall Phase"
        Proxy --> Recall[Recall Engine]
        Recall -->|FTS5 Lexical| DB[(SQLite)]
        Recall -->|CoVe Filter| Tiers{Recall Tiers}
        Tiers -->|Always/Contextual| Context[Context Injection]
    end

    subgraph "2. Extraction Phase"
        LLM[LLM Backend] -->|Stream| Proxy
        Proxy --> Extractor[Extractor]
        Extractor -->|Parse| CoVe{CoVe Filter}
        CoVe -->|High Conf| Evidence{Evidence Check}
        Evidence -->|Verified| DB
    end

    Context --> LLM

File Structure

.
├── .tinyMem/             # Project-scoped storage (DB, logs, config)
├── assets/               # Logos and icons
├── build/                # Build scripts
├── cmd/                  # Application entry points
├── docs/                 # Documentation & Agent Contracts
├── internal/             # Core logic (Memory, Evidence, Recall)
└── README.md             # This file

πŸ” Visualizing & Diagnostics

tinyMem provides built-in tools to help you understand your project's memory state and health.

  • Dashboard: Run tinymem dashboard to see a visual summary of memories, tasks, and CoVe performance.
  • Doctor: Run tinymem doctor to perform a comprehensive diagnostic check of the database, configuration, and connectivity.
  • Stats: Run tinymem stats for a detailed terminal breakdown of memory types and task completion rates.

📉 Token Efficiency & Economics

These savings are empirically measured under identical workloads, not theoretical. See the Evidence section below for enforcement-backed benchmarks.

tinyMem uses more tokens per minute but significantly fewer tokens per task compared to standard agents.

| Feature | Token Impact | Why? |
| --- | --- | --- |
| Recall Engine | 📉 Saves | Replaces "Read All Files" with targeted context snippets. |
| CoVe Filtering | 📉 Saves | Reduces noise and improves recall precision, avoiding irrelevant context. |
| Context Reset | 📉 Saves | Prevents chat history from snowballing by starting iterations fresh. |
| Truth Discipline | 📉 Saves | Stops expensive "hallucination rabbit holes" before they start. |

The Verdict: tinyMem acts as a "Sniper Rifle" for context. By ensuring the few tokens sent are the correct ones, it avoids the massive waste of re-reading files and debugging hallucinated code.


⚙ Configuration

Zero-config by default. Override in .tinyMem/config.toml:

[recall]
max_items = 10           # Maximum memories to recall per query

[cove]
enabled = true           # Chain-of-Verification (Extraction + Recall filtering)
confidence_threshold = 0.6

[execution]
mode = "STRICT"          # PASSIVE, GUARDED, or STRICT (default: STRICT)

[logging]
level = "info"           # "debug", "info", "warn", "error", "off"
file = "tinymem.log"     # Relative to .tinyMem/logs/

Environment Variables

For quick overrides, you can use:

  • TINYMEM_LOG_LEVEL=debug
  • TINYMEM_LLM_API_KEY=sk-...
  • TINYMEM_PROXY_PORT=8080

See Configuration Docs for details.


🛠 Development

# Run tests
go test ./...

# Build
./build/build.sh

See Task Management for how we track work.


🧪 Evidence: What tinyMem Actually Changes

tinyMem is designed to be provable, not aspirational. Its core claims are backed by automated, adversarial benchmarks that measure enforcement, memory stability, and token usage under identical conditions.

Benchmark Setup (Summary)

  • Runs: 40 identical scenarios per mode
  • Models: Local LLMs (7B–13B class)
  • Temperature: 0 (deterministic)
  • Scenarios:
    • Forbidden task mutation
    • Fact promotion without evidence
    • Noisy / ambiguous memory extraction
  • Comparison:
    • Baseline (no memory governance)
    • tinyMem (full enforcement enabled)

All measurements are derived from enforced outcomes, not model claims.

🔒 Enforcement & Reliability

tinyMem treats blocking forbidden actions as success.

Across 40 runs:

  • Violations: 0
  • Forbidden actions blocked: 100%
  • False success claims detected: reduced by ~66%

This means:

  • The model may attempt unsafe or incorrect actions
  • tinyMem consistently detects and prevents them
  • No forbidden task edits or fact promotions slipped through

Enforcement failures are the only failure condition. None were observed.

This directly addresses:

  • hallucinated facts
  • silent task corruption
  • "looks right but is wrong" behavior

🧠 Memory Drift Prevention

Without governance, models routinely:

  • re-assert previously rejected decisions
  • contradict earlier facts
  • invent new "truths" under pressure

tinyMem prevents this structurally by:

  • Requiring evidence for fact promotion
  • Persisting verified facts across runs
  • Refusing contradictory durable writes

In benchmarks:

  • Baseline runs produced frequent unverified success claims
  • tinyMem downgraded or blocked these automatically
  • Verified facts remained stable across all runs

This is not prompt discipline. It is enforced state.

📉 Token Usage & Context Efficiency

tinyMem reduces token usage per completed task, even though it performs additional checks.

Across identical workloads:

  • Total tokens (baseline): ~32k
  • Total tokens (tinyMem): ~18k
  • Reduction: ~44%

Why this happens:

  • Targeted recall replaces "read everything"
  • CoVe filtering removes irrelevant context
  • Enforcement stops hallucination-driven retries
  • Context resets prevent runaway conversations

The result is fewer tokens wasted on:

  • re-reading files
  • debugging imaginary bugs
  • correcting false assumptions

What This Evidence Does Not Claim

tinyMem does not claim to:

  • make models smarter
  • increase raw success rates
  • eliminate hallucinations at generation time

It does guarantee:

  • hallucinations cannot become durable truth
  • unsafe actions are blocked, not trusted
  • memory remains consistent across time

🎯 Benchmarks

tinyMem is benchmarked on enforcement, not persuasion.

Tests measure whether forbidden actions are reliably blocked, whether hallucinated facts are prevented from becoming durable, and whether task and memory boundaries hold under repeated runs. Agent compliance is measured separately and never treated as authority.

Full methodology and results: BENCHMARK.md


✨ Key Features

  • Evidence-Based Truth: Typed memories (fact, claim, decision, etc.). Only verified claims become facts.
  • Chain-of-Verification (CoVe): LLM-based quality filter to reduce hallucinations before storage and improve recall relevance (enabled by default). See docs/COVE.md for details.
  • FTS5 Lexical Recall: Fast, deterministic full-text search across memory summaries and details using SQLite's FTS5 extension.
  • Automatic Database Maintenance: Self-healing database with automatic compaction (PRAGMA optimize + incremental vacuum) and optional retention policies to prevent unbounded growth.
  • Local & Private: Runs as a single binary. Data lives in .tinyMem/.
  • Zero Configuration: Works out of the box.
  • Dual Mode: Works as an HTTP Proxy or Model Context Protocol (MCP) server.
  • Mode Enforcement: PASSIVE, GUARDED, STRICT execution modes with authority boundaries.
  • Recall Tiers: Prioritizes Always (facts) > Contextual (decisions) > Opportunistic (notes).

🤝 Contributing

We value truth and reliability.

  1. Truth Discipline: No shortcuts on verification.
  2. Streaming: No buffering allowed.
  3. Tests: Must pass go test ./....

See CONTRIBUTING.md.


📄 License

MIT © 2026 Andrzej Marczewski
