tinyMem

tinyMem logo

License: MIT Go 1.25.6+ Build Status

Local, project-scoped memory system for language models with evidence-based truth validation.


tinyMem gives small and medium language models (7B–13B) reliable long-term memory in complex codebases. It sits between you and the LLM, injecting verified context and capturing validated facts, all locally, without model retraining or cloud dependencies.


πŸ” What tinyMem Is (and Isn't)

tinyMem IS:

  • A deterministic, evidence-gated memory system for LLMs in long-lived codebases
  • Lexical recall engine (FTS5) with CoVe filtering for noise reduction
  • Truth state authority enforcement preventing hallucinated facts
  • Memory governance layer that decides what is known, recalled, and trusted

tinyMem IS NOT:

  • ❌ An autonomous agent or execution engine
  • ❌ A repair/retry loop system
  • ❌ A semantic/vector search system
  • ❌ A task execution framework

Core Principle: tinyMem governs memory, not behavior. It decides what is known, never what is done.

Evidence Boundary

tinyMem records and evaluates evidence but never executes commands to gather evidence.

Clear boundary:

  • Agents execute - Run tests, build code, verify behavior
  • tinyMem records - Stores evidence results (exit codes, file existence, grep matches)
  • tinyMem evaluates - Gates fact promotion based on evidence validity

Example:

  • Agent: Runs go test ./... and gets exit code 0
  • Agent: Submits evidence cmd_exit0::go test ./... with memory
  • tinyMem: Verifies evidence format and gates fact promotion
  • tinyMem: DOES NOT re-run the command itself

This keeps tinyMem as pure memory governance, never execution.


🎯 Why tinyMem?

If you've ever used an AI for a large project, you know it eventually starts to "forget." It forgets which database you chose, it forgets the naming conventions you agreed on, and it starts making things up (hallucinating).

tinyMem is a "Hard Drive for your AI's Brain."

🧬 Evolution: From Memory to Protocol

tinyMem was initially built to solve a specific problem: improving the reliability of small, locally hosted LLMs (7B–13B). These models often suffer from "context drift," where they lose track of project decisions over long sessions.

As the project grew, we realized that memory alone wasn't enough. Reliability requires Truth Discipline. This led to the expansion of tinyMem into what it is today: a comprehensive Control Protocol that mandates evidence-based validation and strict execution phases for any agent touching a repository.

  • No more repeating yourself: "Remember, we use Go for the backend."
  • No more AI hallucinations: If the AI isn't sure, it checks its memory.
  • Total Privacy: Your project data never leaves your machine to "train" a model.

🚀 Quick Start

Get up and running in seconds.

1. Initialize

Go to your project root and bootstrap the memory system (this also downloads docs/agents/AGENT_CONTRACT*.md so your proxy/MCP layers have the exact system prompt they inject):

cd /path/to/your/project
tinymem init

If you just want to verify the installation afterward, run tinymem health.

2. Run

Start the server (choose one mode):

Option A: Proxy Mode (for generic LLM clients)

tinymem proxy
# Then point your client (e.g., OpenAI SDK) to http://localhost:8080/v1
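As a minimal sketch of Option A, the snippet below builds a request against the proxy using nothing but the Python standard library. The model name is a placeholder for whatever your backend serves; only the base URL and the OpenAI-compatible path come from the text above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/v1"  # tinymem proxy, default port shown above

def build_chat_request(prompt: str) -> urllib.request.Request:
    """Build a standard OpenAI-style chat completion request aimed at the proxy."""
    body = json.dumps({
        "model": "local-model",  # placeholder: whatever your backend serves
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# urllib.request.urlopen(build_chat_request("What database do we use?"))
```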

Option B: MCP Mode (for Claude Desktop, Cursor, VS Code)

tinymem mcp
# Configure your IDE to run this command

📦 Installation

See the Quick Start Guide for Beginners for a detailed walkthrough.

Option 1: Pre-built Binary (Recommended)

Download from the Releases Page.

macOS / Linux:

os="$(uname -s | tr '[:upper:]' '[:lower:]')"
arch="$(uname -m)"
case "$arch" in
  x86_64|amd64) arch="amd64" ;;
  aarch64|arm64) arch="arm64" ;;
  *) echo "Unsupported arch: $arch" >&2; exit 1 ;;
esac
curl -L "https://github.com/daverage/tinyMem/releases/latest/download/tinymem-${os}-${arch}" -o tinymem
chmod +x tinymem
sudo mv tinymem /usr/local/bin/

Windows: Download tinymem-windows-amd64.exe, rename to tinymem.exe, and add to your system PATH.

Option 2: Build from Source

Requires Go 1.25.6+.

git clone https://github.com/daverage/tinyMem.git
cd tinyMem
./build/build.sh   # Build only
# or
./build/build.sh patch  # Release (patch version bump)

Cross-Compilation (on Mac): To build Windows or Linux binaries on a Mac, you need C cross-compilers:

  • For Windows (Intel/AMD): brew install mingw-w64
  • For Windows (ARM64): brew install zig
  • For Linux: brew install FiloSottile/musl-cross/musl-cross (static) or brew install zig

Cross-Compilation (on Windows): To build macOS or Linux binaries on Windows, you need Zig:

  • winget install zig.zig

Tip: zig is the recommended way to enable cross-compilation for all platforms with a single tool, regardless of whether you are on Mac or Windows.

Option 3: Container Image (GHCR)

Use the GitHub Container Registry image. Replace OWNER with your GitHub username or org (for this repo, daverage).

docker pull ghcr.io/OWNER/tinymem:latest
docker run --rm ghcr.io/OWNER/tinymem:latest health

💻 Usage

CLI Commands

The tinyMem CLI is your primary way to interact with the system from your terminal.

| Command | What it is | Why use it? | Example |
| --- | --- | --- | --- |
| health | System Check | To make sure tinyMem is installed correctly and can talk to its database. | tinymem health |
| stats | Memory Overview | To see how many memories you've stored and how your tasks are progressing. | tinymem stats |
| dashboard | Visual Status | To get a quick, beautiful summary of your project's memory "health." | tinymem dashboard |
| query | Search | To find specific information you or the AI saved previously. | tinymem query "API" |
| recent | Recent History | To see the last few things tinyMem learned or recorded. | tinymem recent |
| write | Manual Note | To tell the AI something important that it should never forget. | tinymem write --type decision --summary "Use Go 1.25" |
| run | Command Wrapper | To run a script or tool (like make or npm test) while "reminding" it of project context. | tinymem run make build |
| proxy / mcp | Server Modes | To start the "brain" that connects tinyMem to your IDE or AI client. | tinymem mcp |
| doctor | Diagnostics | To fix the system if it stops working or has configuration issues. | tinymem doctor |
| init | Project Bootstrap | Creates .tinyMem, writes the config, and downloads the AGENT_CONTRACT/AGENT_CONTRACT_SMALL files into docs/agents so proxy and MCP modes can inject them without touching your README. | tinymem init |
| update | Refresh | Re-runs migrations and refreshes the configured agent contract files under docs/agents (large or small) to keep proxy/MCP injections in sync. | tinymem update |

Writing Memories

Think of writing memories as "tagging" reality for the AI.

# Record a decision so the AI doesn't suggest an alternative later
tinymem write --type decision --summary "Switching to REST" --detail "GraphQL was too complex for this scale."

# Add a simple note for yourself or the AI
tinymem write --type note --summary "The database password is in the vault, not .env"

Memory Types & Truth

| Type | Evidence Required? | Truth State | Recall Tier |
| --- | --- | --- | --- |
| Fact | ✅ Yes | Verified | Always |
| Decision | ✅ Yes (Confirmation) | Asserted | Contextual |
| Constraint | ✅ Yes | Asserted | Always |
| Claim | ❌ No | Tentative | Contextual |
| Plan | ❌ No | Tentative | Opportunistic |

Evidence types supported: file_exists, grep_hit, cmd_exit0, test_pass.
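A client-side pre-check for those four types might look like the following sketch. The type::payload framing follows the cmd_exit0::go test ./... example above; the server's own validation rules may be stricter.

```python
# The four evidence types named above; the framing is "type::payload".
EVIDENCE_TYPES = {"file_exists", "grep_hit", "cmd_exit0", "test_pass"}

def parse_evidence(evidence: str) -> tuple[str, str]:
    """Split a 'type::payload' evidence string and reject unknown types."""
    kind, sep, payload = evidence.partition("::")
    if sep == "" or not payload:
        raise ValueError(f"malformed evidence: {evidence!r}")
    if kind not in EVIDENCE_TYPES:
        raise ValueError(f"unknown evidence type: {kind!r}")
    return kind, payload
```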


πŸ“ tinyTasks: File-Authoritative Task Ledger

tinyTasks logo

tinyTasks – file-authoritative task ledger enforced by tinyMem

tinyTasks is a built-in task management system that lives alongside your code in tinyTasks.md.

What tinyTasks Is:

  • File-authoritative - tinyTasks.md is the single source of truth
  • Human-authored - Only humans create and define tasks
  • Intent ledger - Grounds what work is authorized
  • Enforcement anchor - STRICT mode refuses work without tasks

What tinyMem Does With tinyTasks:

  • Reads the ledger through a server-managed TaskManager so it can identify which subtask still has authority.
  • Enforces that only the shared TaskManager path may mutate tasks and that MCP and proxy mode both obey the same deterministic boundary.
  • Guards against unauthorized writes and false completion claims by rejecting any file-level access to tinyTasks.md that bypasses the TaskManager.
  • Reports the validated state so agents know what work remains authorized before receiving execution feedback.

What tinyMem Does NOT Do:

  • Allow an LLM to read or write tinyTasks.md directly
  • Permit mutations outside the TaskManager/API boundary
  • Automatically mark tasks complete without explicit TaskManager commands
  • Create new tasks without a human-reviewed plan that the server accepts

tinyTasks exists to ground authority, not to drive execution.

Agents request intent and task updates via the server; every add/update/complete/list operation passes through the TaskManager, is validated for prerequisites, and only then reports success to the agent.

πŸ” REDIRECTION Enforcement Prompts

tinyMem enforces the five server-controlled prompts from REDIRECTION.md. The code paths that validate these constraints are:

  1. Prompt 1 – TaskManager Ownership: internal/tasks/manager.go defines the TaskManager APIs, and both internal/server/mcp/server.go (lines 90-106) and internal/server/proxy/server.go (lines 81-110) instantiate the same manager, so every tinyTasks.md mutation flows through a single server path.
  2. Prompt 2 – Intent Interpretation: internal/intent/definition.go defines every tool's metadata, internal/server/tool_definitions.go adds that schema to each MCP tool, and internal/server/mcp/server.go#ensureIntent validates the declared category, minimum mode, recall requirement, and scope before any tool executes.
  3. Prompt 3 – Unified Enforcement: The shared gate is ensureIntent, which delegates to execution.Controller and enforcement.Recorder (internal/execution/controller.go, internal/enforcement/recorder.go), so MCP tool calls and proxy mutations share the same deterministic, policy-driven decisions.
  4. Prompt 4 – Memory Governance: internal/server/mcp/server.go#handleMemoryWrite parses the structured proposal (type, summary, detail, evidence), enforces recall/mode/evidence prerequisites, and only then persists through memory.Service, ensuring the server owns every memory write.
  5. Prompt 5 – Metadata as Protocol: intent.Definition.Metadata plus server.ToolMetadata publish machine-readable intent data that enforcement consumes directly, while the tool descriptions in internal/server/tool_definitions.go stay concise.

🔌 Integration

Proxy Mode

Intercepts standard OpenAI-compatible requests.

export OPENAI_API_BASE_URL=http://localhost:8080/v1
# Your existing scripts now use tinyMem automatically

tinymem init seeds docs/agents/AGENT_CONTRACT.md and AGENT_CONTRACT_SMALL.md, and the proxy loads the configured file at startup, injecting it as the first system message unless the client already shipped the **Start of tinyMem Protocol** marker. This means your SDKs never need to resend the contract; tinyMem enforces it once per request.

While proxying, tinyMem reports recall activity back to the client so that downstream UIs or agents can show "memory checked" indicators:

  • Streaming responses append an SSE event of type tinymem.memory_status once the upstream LLM finishes. The payload includes recall_count, recall_status (none/injected/failed), and a timestamp.
  • Non-streaming responses carry the same data via new headers: X-TinyMem-Recall-Status and X-TinyMem-Recall-Count. Agents or dashboards that read those fields can display whenever recall was applied or when the proxy skipped it.

MCP Server (IDE Integration)

Compatible with Claude Desktop, Cursor, and other MCP clients.

Claude Desktop Configuration (claude_desktop_config.json):

{
  "mcpServers": {
    "tinymem": {
      "command": "/absolute/path/to/tinymem",
      "args": ["mcp"]
    }
  }
}

Available MCP Tools:

When tinyMem is running in MCP mode, your AI agent gains a fixed set of tools for recall, enforcement, and diagnostics:

  • memory_query: Search project memories with lexical recall (PASSIVE). Always run this before writing anything in GUARDED/STRICT modes.
  • memory_recent: Fetch the latest memories to understand the current context before acting.
  • memory_run_metadata: Inspect the enforcement run metadata (execution mode, events, proven counts) to prove the protocol was followed.
  • memory_claim_success: Report whether a claimed success was observed and enforced; adversarial claims without enforcement get flagged.
  • memory_set_mode: Declare PASSIVE, GUARDED, or STRICT before mutations; this is the intent gatekeeper for all memory writes.
  • memory_check_task_authority: Ask tinyMem to read tinyTasks.md and confirm which task (if any) has authority so the agent knows what work is authorized.
  • memory_write: Create or update memories with evidence (facts require verified evidence, and recall must happen first in GUARDED/STRICT modes).
  • memory_stats & memory_health: Surface memory counts, evidence coverage, and database health for monitoring.
  • memory_doctor: Run a self-diagnostic that checks configuration, database connectivity, and enforcement hooks.
  • memory_eval_stats: Retrieve evaluation metrics that measure how well memories, tasks, and enforcement performed during a session.
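Since MCP tool invocations travel as JSON-RPC 2.0 tools/call messages, a client frames a call to one of these tools roughly as follows. The argument name query is an illustrative assumption, not the tool's published schema.

```python
import json

def mcp_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Frame an MCP tools/call invocation as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

# Recall before writing, as GUARDED/STRICT modes require:
# mcp_tool_call(1, "memory_query", {"query": "database choice"})
```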

AI Agent Directives

CRITICAL: If you are building an AI agent, you MUST include the appropriate directive in its system prompt to ensure it uses tinyMem correctly.

Quick Setup: Run tinymem init once to bootstrap .tinyMem, create config, and download the AGENT_CONTRACT/AGENT_CONTRACT_SMALL files that the proxy and MCP layers inject. tinymem update reruns migrations and refreshes the same contract files (large or small) without mutating your README.


📚 Guides & Examples

Detailed integration guides for various tools and ecosystems can be found in the examples/ directory.


πŸ— Architecture

flowchart TD
    User[LLM Client / IDE] <-->|Request/Response| Proxy[TinyMem Proxy / MCP]

    subgraph "1. Recall Phase"
        Proxy --> Recall[Recall Engine]
        Recall -->|FTS5 Lexical| DB[(SQLite)]
        Recall -->|CoVe Filter| Tiers{Recall Tiers}
        Tiers -->|Always/Contextual| Context[Context Injection]
    end

    subgraph "2. Extraction Phase"
        LLM[LLM Backend] -->|Stream| Proxy
        Proxy --> Extractor[Extractor]
        Extractor -->|Parse| CoVe{CoVe Filter}
        CoVe -->|High Conf| Evidence{Evidence Check}
        Evidence -->|Verified| DB
    end

    Context --> LLM

File Structure

.
├── .tinyMem/             # Project-scoped storage (DB, logs, config)
├── assets/               # Logos and icons
├── build/                # Build scripts
├── cmd/                  # Application entry points
├── docs/                 # Documentation & Agent Contracts
├── internal/             # Core logic (Memory, Evidence, Recall)
└── README.md             # This file

πŸ” Visualizing & Diagnostics

tinyMem provides built-in tools to help you understand your project's memory state and health.

  • Dashboard: Run tinymem dashboard to see a visual summary of memories, tasks, and CoVe performance.
  • Doctor: Run tinymem doctor to perform a comprehensive diagnostic check of the database, configuration, and connectivity.
  • Stats: Run tinymem stats for a detailed terminal breakdown of memory types and task completion rates.

📉 Token Efficiency & Economics

These savings are empirically measured under identical workloads, not theoretical. See the Evidence section below for enforcement-backed benchmarks.

tinyMem uses more tokens per minute but significantly fewer tokens per task compared to standard agents.

| Feature | Token Impact | Why? |
| --- | --- | --- |
| Recall Engine | 📉 Saves | Replaces "Read All Files" with targeted context snippets. |
| CoVe Filtering | 📉 Saves | Reduces noise and improves recall precision, avoiding irrelevant context. |
| Context Reset | 📉 Saves | Prevents chat history from snowballing by starting iterations fresh. |
| Truth Discipline | 📉 Saves | Stops expensive "hallucination rabbit holes" before they start. |

The Verdict: tinyMem acts as a "Sniper Rifle" for context. By ensuring the few tokens sent are the correct ones, it avoids the massive waste of re-reading files and debugging hallucinated code.


⚙ Configuration

Zero-config by default. Override in .tinyMem/config.toml:

[recall]
max_items = 10           # Maximum memories to recall per query

[cove]
enabled = true           # Chain-of-Verification (Extraction + Recall filtering)
confidence_threshold = 0.6

[execution]
mode = "STRICT"          # PASSIVE, GUARDED, or STRICT (default: STRICT)

[logging]
level = "info"           # "debug", "info", "warn", "error", "off"
file = "tinymem.log"     # Relative to .tinyMem/logs/

Environment Variables

For quick overrides, you can use:

  • TINYMEM_LOG_LEVEL=debug
  • TINYMEM_LLM_API_KEY=sk-...
  • TINYMEM_PROXY_PORT=8080

See Configuration Docs for details.


🛠 Development

# Run tests
go test ./...

# Build
./build/build.sh

See Task Management for how we track work.


🧪 Evidence: What tinyMem Actually Changes

tinyMem is designed to be provable, not aspirational. Its core claims are backed by automated, adversarial benchmarks that measure enforcement, memory stability, and token usage under identical conditions.

Benchmark Setup (Summary)

  • Runs: 40 identical scenarios per mode
  • Models: Local LLMs (7B–13B class)
  • Temperature: 0 (deterministic)
  • Scenarios:
    • Forbidden task mutation
    • Fact promotion without evidence
    • Noisy / ambiguous memory extraction
  • Comparison:
    • Baseline (no memory governance)
    • tinyMem (full enforcement enabled)

All measurements are derived from enforced outcomes, not model claims.

🔒 Enforcement & Reliability

tinyMem treats blocking forbidden actions as success.

Across 40 runs:

  • Violations: 0
  • Forbidden actions blocked: 100%
  • False success claims detected: reduced by ~66%

This means:

  • The model may attempt unsafe or incorrect actions
  • tinyMem consistently detects and prevents them
  • No forbidden task edits or fact promotions slipped through

Enforcement failures are the only failure condition. None were observed.

This directly addresses:

  • hallucinated facts
  • silent task corruption
  • "looks right but is wrong" behavior

🧠 Memory Drift Prevention

Without governance, models routinely:

  • re-assert previously rejected decisions
  • contradict earlier facts
  • invent new "truths" under pressure

tinyMem prevents this structurally by:

  • Requiring evidence for fact promotion
  • Persisting verified facts across runs
  • Refusing contradictory durable writes

In benchmarks:

  • Baseline runs produced frequent unverified success claims
  • tinyMem downgraded or blocked these automatically
  • Verified facts remained stable across all runs

This is not prompt discipline. It is enforced state.

📉 Token Usage & Context Efficiency

tinyMem reduces token usage per completed task, even though it performs additional checks.

Across identical workloads:

  • Total tokens (baseline): ~32k
  • Total tokens (tinyMem): ~18k
  • Reduction: ~44%

Why this happens:

  • Targeted recall replaces "read everything"
  • CoVe filtering removes irrelevant context
  • Enforcement stops hallucination-driven retries
  • Context resets prevent runaway conversations

The result is fewer tokens wasted on:

  • re-reading files
  • debugging imaginary bugs
  • correcting false assumptions

What This Evidence Does Not Claim

tinyMem does not claim to:

  • make models smarter
  • increase raw success rates
  • eliminate hallucinations at generation time

It does guarantee:

  • hallucinations cannot become durable truth
  • unsafe actions are blocked, not trusted
  • memory remains consistent across time

🎯 Benchmarks

tinyMem is benchmarked on enforcement, not persuasion.

Tests measure whether forbidden actions are reliably blocked, whether hallucinated facts are prevented from becoming durable, and whether task and memory boundaries hold under repeated runs. Agent compliance is measured separately and never treated as authority.

Full methodology and results: BENCHMARK.md


✨ Key Features

  • Evidence-Based Truth: Typed memories (fact, claim, decision, etc.). Only verified claims become facts.
  • Chain-of-Verification (CoVe): LLM-based quality filter to reduce hallucinations before storage and improve recall relevance (enabled by default). See docs/COVE.md for details.
  • FTS5 Lexical Recall: Fast, deterministic full-text search across memory summaries and details using SQLite's FTS5 extension.
  • Automatic Database Maintenance: Self-healing database with automatic compaction (PRAGMA optimize + incremental vacuum) and optional retention policies to prevent unbounded growth.
  • Local & Private: Runs as a single binary. Data lives in .tinyMem/.
  • Zero Configuration: Works out of the box.
  • Dual Mode: Works as an HTTP Proxy or Model Context Protocol (MCP) server.
  • Mode Enforcement: PASSIVE, GUARDED, STRICT execution modes with authority boundaries.
  • Recall Tiers: Prioritizes Always (facts) > Contextual (decisions) > Opportunistic (notes).

🤝 Contributing

We value truth and reliability.

  1. Truth Discipline: No shortcuts on verification.
  2. Streaming: No buffering allowed.
  3. Tests: Must pass go test ./....

See CONTRIBUTING.md.


📄 License

MIT © 2026 Andrzej Marczewski
