Your AI assistant that never forgets.
Anna is a self-hosted AI assistant that runs on your machine and talks to you through your terminal, Telegram, QQ, or Feishu. She keeps every conversation in a local SQLite database, compresses old context automatically so the LLM never hits its limit, and can recover the original detail whenever she needs it.
She supports multiple agents running simultaneously, each with its own personality, model, and provider. Multiple users are handled automatically -- each person gets isolated per-agent memory that persists across sessions.
She also schedules tasks, monitors files, and sends you notifications across channels without waiting for you to ask.
Most AI assistants lose your context. You hit the token limit, the old messages get truncated, and the assistant forgets what you were working on. Start a new chat, re-explain everything, repeat.
Anna solves this with LCM (Lossless Context Management). As conversations grow, older messages get compressed into summaries organized in a DAG. Summaries get condensed into higher-level summaries. But the originals stay in the database. The agent has tools to search its history and drill back into any summary to pull up the full text. You can talk to Anna for weeks and she'll still know what you said on day one.
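The compression scheme can be sketched as a tiny data model. This is an illustration only -- the type and function names below are hypothetical, and Anna's real schema lives in SQLite, not in memory:

```go
package main

import "fmt"

// Node is one entry in the summary DAG: either a raw message (leaf)
// or a summary whose Children point at the nodes it condenses.
// Originals are never deleted, so Expand can always recover them.
type Node struct {
	ID       int
	Summary  string  // condensed text (empty for raw messages)
	Text     string  // original content (kept for leaves)
	Children []*Node
}

// Condense groups nodes under a new higher-level summary node.
// A real implementation would ask the LLM to write Summary.
func Condense(id int, summary string, children ...*Node) *Node {
	return &Node{ID: id, Summary: summary, Children: children}
}

// Expand drills back down to the original messages beneath a node,
// mirroring how the agent recovers full detail from any summary.
func Expand(n *Node) []string {
	if len(n.Children) == 0 {
		return []string{n.Text}
	}
	var out []string
	for _, c := range n.Children {
		out = append(out, Expand(c)...)
	}
	return out
}

func main() {
	m1 := &Node{ID: 1, Text: "we chose SQLite for storage"}
	m2 := &Node{ID: 2, Text: "schema migration happens on startup"}
	leaf := Condense(3, "storage decisions", m1, m2)
	root := Condense(4, "project setup", leaf)
	fmt.Println(Expand(root)) // both originals recoverable from the top-level summary
}
```

The key property is that condensing only adds nodes on top; the leaves holding the original text stay reachable from every summary above them.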
Beyond memory, there are a few other things worth calling out.
Anna meets you where you are. Terminal, Telegram, QQ, Feishu, all sharing the same session pool and memory. Chat from your laptop in the morning, pick it up on Telegram from your phone in the evening.
She does things on her own. Tell her "remind me every morning at 9am to check my email" and she will. Built-in scheduler, heartbeat file monitoring, push notifications across whatever channels you have connected.
Run multiple agents at once. A coding assistant, a writing partner, a daily planner -- each with its own model, provider, system prompt, and isolated workspace. Switch between them with /agent in Telegram or --agent on the CLI.
Multiple users out of the box. Users are auto-created from platform identity (Telegram user ID, QQ ID, etc). Each user gets per-agent memory stored in the database, so Anna remembers different things about different people.
And the whole thing is a single Go binary with a SQLite database. Your machine, your API keys, nothing leaves your network.
```
Users (Telegram / QQ / Feishu / Terminal)
    |
    |  /agent to switch agents
    v
anna (single binary, your machine)
    |
    |- Agents (multiple, each with own model/provider/personality)
    |    |- Workspace (~/.anna/workspaces/{agent-id}/skills/)
    |    |- 3-layer system prompt (SYSTEM.md -> SOUL.md -> user memory)
    |    '- LCM Memory (DAG-based context compression)
    |
    |- Admin Panel (web UI for all configuration)
    |- Scheduler (jobs, reminders, heartbeat)
    |- Skills (extensible via skills.sh)
    '- Notifications (pushes results back to you)
    |
    v
LLM Provider (Anthropic / OpenAI / any compatible API)
```
The memory system stores every message in SQLite and organizes summaries into a directed acyclic graph. When the conversation gets long, older messages are grouped and summarized into leaf nodes. Groups of leaf nodes get condensed into higher-level nodes. This happens automatically.
The agent carries a unified memory tool with four actions:
- `grep` -- search messages and summaries by keyword
- `describe` -- inspect a summary node's metadata and lineage
- `expand` -- drill into a summary to retrieve the source content
- `user_memory_update` -- update persistent per-user notes across sessions (write-only, injected into the system prompt automatically)
When the context window fills up, Anna isn't working with truncated history. She's working with compressed summaries and can pull up specifics on demand. A conversation can be a thousand messages long and she'll still find what she needs.
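As a toy illustration of the `grep` action, the sketch below searches raw messages and summaries together. The function name and in-memory slices are assumptions for the example; the real tool queries the SQLite tables:

```go
package main

import (
	"fmt"
	"strings"
)

// grepMemory does a case-insensitive keyword search across both
// raw messages and summary nodes, so a hit in a summary can then
// be expanded back into its source content.
func grepMemory(keyword string, messages, summaries []string) []string {
	var hits []string
	for _, s := range append(append([]string{}, messages...), summaries...) {
		if strings.Contains(strings.ToLower(s), strings.ToLower(keyword)) {
			hits = append(hits, s)
		}
	}
	return hits
}

func main() {
	msgs := []string{"deploy target is a Raspberry Pi", "use Go for the rewrite"}
	sums := []string{"summary: deployment discussion from day one"}
	fmt.Println(grepMemory("deploy", msgs, sums)) // matches one message and one summary
}
```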
Anna supports running multiple agents simultaneously. Each agent has:
- Its own model and provider configuration
- An isolated workspace at `~/.anna/workspaces/{agent-id}/skills/`
- A system prompt defined in the DB (`settings_agents.system_prompt`), overridable by placing a `SOUL.md` in the workspace
- A 3-layer system prompt: basic system prompt (overridable by `SYSTEM.md`), then agent soul (overridable by `SOUL.md`), then per-user memory from the database
Users are auto-created from platform identity. Each user gets per-agent memory stored in the ctx_agent_memory table, which is injected into the system prompt and updated via the user_memory_update action on the memory tool. Anna remembers different things about different people, per agent.
In Telegram, use /agent to switch between agents. In DMs, your default agent is remembered. In groups, the agent is set per-group. On the CLI, use anna chat --agent <name>.
Four channels, all sharing the same memory:
| Channel | Connection | Streaming | Groups |
|---|---|---|---|
| Terminal | Local TUI (Bubble Tea) | Token-by-token | n/a |
| Telegram | Long polling, no public IP | Draft API | Mention / always / disabled |
| QQ | WebSocket | Native Stream API | Mention support |
| Feishu | WebSocket, no public IP | Edit-in-place | Mention support |
One bot per platform. Agent selection is handled via the /agent command rather than separate bots.
Every channel supports /new, /compact, /model, /agent, and /whoami, plus access control and image input.
You don't write cron expressions by hand. You just tell Anna what you need.
"Check the weather in Beijing every morning at 8am" creates a recurring job. "Remind me at 2:30 PM to call the dentist" creates a one-shot timer that cleans up after it fires. Jobs persist across restarts.
There's also a heartbeat mode. Anna polls a markdown file on an interval, uses a cheap fast model to decide if anything needs attention, and only spins up the main model when there's real work. Results get pushed to whatever channels you have connected.
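The gate-then-run pattern can be sketched like this. Both function arguments stand in for LLM calls -- the gate for the fast model, the runner for the main model -- and the names are illustrative, not Anna's API:

```go
package main

import "fmt"

// runHeartbeat implements the two-stage heartbeat: a cheap gate
// decides whether anything needs attention, and the expensive
// main model runs only when the gate says so.
func runHeartbeat(note string, gate func(string) bool, run func(string) string) (string, bool) {
	if !gate(note) {
		return "", false // skip: no main-model call, no cost
	}
	return run(note), true
}

func main() {
	gate := func(s string) bool { return s != "" }        // stand-in for the fast model
	run := func(s string) string { return "handled: " + s } // stand-in for the main model

	if out, ok := runHeartbeat("server disk at 91%", gate, run); ok {
		fmt.Println(out)
	}
	_, ok := runHeartbeat("", gate, run)
	fmt.Println("main model invoked:", ok) // the empty note is skipped
}
```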
Anna's identity system is DB-backed. No more markdown files to manage by hand.
- Agent soul: stored in `settings_agents.system_prompt`, overridable by placing a `SOUL.md` in the agent's workspace (`~/.anna/workspaces/{agent-id}/`)
- System prompt: base instructions, overridable by `SYSTEM.md` in the workspace
- User memory: per-user, per-agent notes stored in the `ctx_agent_memory` table, injected into the system prompt automatically
The 3-layer system prompt builds up as: base system prompt, then agent soul, then user memory. Anna updates user memory over time as she learns your name, timezone, and preferences.
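A minimal sketch of that layering, with hypothetical names (the real builder resolves the SYSTEM.md / SOUL.md file overrides and reads user memory from SQLite; here the caller just passes the resolved strings):

```go
package main

import (
	"fmt"
	"strings"
)

// buildSystemPrompt stacks the three layers in order: base system
// prompt, then agent soul, then per-user memory. Empty layers are
// skipped so a brand-new user with no memory gets a clean prompt.
func buildSystemPrompt(base, soul, userMemory string) string {
	var layers []string
	for _, l := range []string{base, soul, userMemory} {
		if l != "" {
			layers = append(layers, l)
		}
	}
	return strings.Join(layers, "\n\n")
}

func main() {
	p := buildSystemPrompt(
		"You are a helpful assistant.",        // SYSTEM.md or built-in default
		"You are Anna, warm and concise.",     // SOUL.md or settings_agents.system_prompt
		"User's name is Sam; timezone UTC+8.", // ctx_agent_memory
	)
	fmt.Println(p)
}
```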
Works with Anthropic, OpenAI, and any OpenAI-compatible API (Perplexity, Together.ai, local models via Ollama, etc). Provider configuration is managed through the admin panel.
Environment variables `ANTHROPIC_API_KEY` and `OPENAI_API_KEY` still work as fallbacks.
Three model tiers:
- `model_strong` for hard problems
- `model` for everyday use (the default)
- `model_fast` for cheap checks and gate decisions
The heartbeat system uses the fast model to decide "skip or run" and only calls the default model when there's actual work. Keeps costs down without you having to think about it.
Anna connects to the skills.sh ecosystem:
```
anna skills search "web scraping"
anna skills install owner/repo@skill-name
anna skills list
anna skills remove skill-name
```

Search, install, and manage skills from the CLI or mid-conversation. Each agent has its own skills directory at `~/.anna/workspaces/{agent-id}/skills/`.
```
go install github.com/vaayne/anna@latest
```

Or grab a binary from Releases, or self-update with `anna upgrade`.
```
anna onboard
```

This opens a web admin panel in your browser where you can configure everything: providers, API keys, agents, channels (Telegram, QQ, Feishu), users, scheduled jobs, and settings. All configuration is stored in `~/.anna/anna.db`. There are no YAML config files.
```
anna chat                        # Terminal chat (default agent)
anna chat --agent helper         # Terminal chat with a specific agent
anna gateway                     # Start daemon (bots + scheduler)
anna gateway --admin-port 8080   # Start daemon with admin panel
```

`anna chat` gives you a terminal conversation. `anna gateway` starts all your configured channels and the scheduler. Add `--admin-port` to expose the admin panel alongside the gateway for runtime configuration.
```
anna onboard                     # Open web admin panel to configure anna
anna chat                        # Interactive terminal chat
anna chat --agent <name>         # Chat with a specific agent
anna chat --stream               # Pipe stdin, stream to stdout
anna gateway                     # Start daemon (bots + scheduler)
anna gateway --admin-port <port> # Start daemon with admin panel
anna models list                 # List available models
anna models set <p/m>            # Switch model (e.g. openai/gpt-4o)
anna models search <q>           # Search models
anna skills search <q>           # Search skills.sh
anna skills install <s>          # Install a skill
anna version                     # Print version
anna upgrade                     # Self-update to latest release
```

| Document | Description |
|---|---|
| Configuration | Full config reference, admin panel, defaults |
| Deployment | Binary install, Docker, systemd, compose |
| Architecture | System design, packages, providers, tools |
| Models | Tiers, CLI commands, provider setup |
| Memory System | LCM deep dive, DAG structure, retrieval tools |
| Session Compaction | How context compression works |
| Telegram | Bot setup, streaming, groups, access control |
| QQ Bot | Bot setup, webhook, streaming |
| Feishu Bot | Bot setup, WebSocket, streaming |
| Scheduler System | Scheduler system, heartbeat, persistence |
| Plugin System | JavaScript plugins, tools, hooks |
| Notification System | Dispatcher, backends, routing |
```
mise run build    # Build binary -> bin/anna
mise run test     # Run tests with -race
mise run lint     # golangci-lint
mise run format   # gofmt + go mod tidy
```

Or: `go build -o anna . && go test -race ./...`
MIT