
Security Model

UnDaoDu edited this page May 6, 2026 · 1 revision


How FoundUps protects the swarm, the stakeholders, and the ecosystem from bad actors, poisoned inputs, and runaway autonomy.


Core Principle

Security in FoundUps is not a perimeter defense. It is graduated trust — agents earn the right to act autonomously by demonstrating trustworthy behavior across verified phases. No agent is trusted by default. No agent is permanently restricted. Trust is dynamic, earned, and continuously monitored.


The AI Overseer — Primary Security Layer

The AI Overseer (WSP 77) is the security coordination hub. It runs a 4-phase pipeline on every agent action:

| Phase | Agent | Function |
|-------|-------|----------|
| 1. Classification | Gemma (fast) | Classifies the incoming task or action. Is this legitimate? Does it match expected patterns? |
| 2. Strategic Planning | Qwen (strategic) | Plans the execution path. Identifies risks. Flags anomalies before execution begins. |
| 3. Supervision | 0102 | Oversees execution. Can halt, redirect, or escalate at any point. |
| 4. Learning | Adaptive layer | Updates threat models and behavioral baselines from the completed action. |

Every agent action passes through all four phases. There are no shortcuts.
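The four phases can be pictured as a single sequential gate with a halt point at supervision. This is a minimal illustrative sketch, not the real WSP 77 implementation; every name here (`run_pipeline`, `PipelineResult`, the `anomalous` flag) is an assumption made for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class PipelineResult:
    action: str
    halted: bool = False
    log: list = field(default_factory=list)

def run_pipeline(action: str, anomalous: bool = False) -> PipelineResult:
    """Hypothetical sketch of the 4-phase AI Overseer pipeline (WSP 77)."""
    result = PipelineResult(action)
    # Phase 1 — Classification (Gemma): is the action legitimate?
    result.log.append("classified")
    # Phase 2 — Strategic planning (Qwen): flag risks before execution.
    result.log.append("planned")
    # Phase 3 — Supervision (0102): may halt, redirect, or escalate.
    if anomalous:
        result.halted = True
        result.log.append("halted by 0102")
        return result
    result.log.append("executed under supervision")
    # Phase 4 — Learning: update threat models and behavioral baselines.
    result.log.append("baseline updated")
    return result
```

Note that there is no code path that skips a phase: an action either completes all four or is halted at supervision.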


HoneypotDefense

HoneypotDefense is the active threat detection subsystem embedded in the AI Overseer:

  • Decoy tasks — synthetic tasks placed in the build queue to detect agents attempting to claim work outside their authorized scope
  • Pattern anomaly detection — flags deviations from an agent's established behavioral baseline
  • Poisoned input detection — identifies training data or task inputs designed to redirect agent behavior
  • Escalation path — anomalies escalate to 0102 supervision immediately; persistent anomalies trigger full swarm halt

HoneypotDefense treats every agent as a potential attack surface, not just external inputs.
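The decoy-task mechanism can be sketched as a claim check: synthetic task IDs are seeded into the queue, and any agent that claims one (or claims outside its authorized scope) is escalated. This is an illustrative sketch under assumed names (`DECOY_TASKS`, `AUTHORIZED`, `check_claim`), not the actual HoneypotDefense code.

```python
# Synthetic tasks seeded into the build queue — never real work.
DECOY_TASKS = {"task-9001", "task-9002"}
# Hypothetical authorization map: agent -> tasks within its scope.
AUTHORIZED = {"agent-a": {"task-100"}}

def check_claim(agent: str, task: str) -> str:
    """Return 'allow' or an escalation reason for a task claim."""
    if task in DECOY_TASKS:
        # Decoy claimed: the agent sought work it was never offered.
        return "escalate: decoy claimed"
    if task not in AUTHORIZED.get(agent, set()):
        return "escalate: out-of-scope claim"
    return "allow"
```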


Graduated Autonomy — The Trust Ladder

No agent operates at full autonomy from the start. Every agent climbs the ladder:

| Phase | Mode | Permissions |
|-------|------|-------------|
| 1 | ADVISORY | Suggestions only. No execution. 012 approves every action. |
| 2 | OBSERVE | Read-only access. Build situational awareness. No writes. |
| 3 | SUGGEST | Propose actions. Architect or 012 must confirm before execution. |
| 4 | SOURCE | Full execution within WSP-defined scope. Continuously monitored. |

Gates between phases:

  • Phase 1 → 2: Demonstrated understanding of task domain
  • Phase 2 → 3: Consistent pattern alignment, no anomalies
  • Phase 3 → 4: Verified track record across multiple build cycles, Proof of Benefit passed

An agent can be demoted at any phase if anomalies are detected.
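The ladder's one-step-at-a-time promotion and anytime demotion can be sketched as a small state machine. The phase names come from the table above; the `promote`/`demote` functions and their gate flags are illustrative assumptions, not the real implementation.

```python
from enum import IntEnum

class Phase(IntEnum):
    ADVISORY = 1   # suggestions only; 012 approves every action
    OBSERVE = 2    # read-only situational awareness
    SUGGEST = 3    # propose; confirmation required before execution
    SOURCE = 4     # full execution within WSP scope, monitored

def promote(phase: Phase, gate_passed: bool) -> Phase:
    """Advance one phase only when the gate for that transition is passed."""
    if gate_passed and phase < Phase.SOURCE:
        return Phase(phase + 1)
    return phase

def demote(phase: Phase, anomaly_detected: bool) -> Phase:
    """Anomalies can demote an agent at any phase."""
    if anomaly_detected and phase > Phase.ADVISORY:
        return Phase(phase - 1)
    return phase
```

The design point the sketch captures: there is no transition from ADVISORY straight to SOURCE, so trust can only be accumulated gate by gate, while it can be lost at any time.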


WSP Governance as Security Layer

The 119 WSP protocols are not just process documentation — they are behavioral constraints enforced at runtime:

  • WSP 00 (Zen/Foundation) — No agent may act in ways that undermine the FoundUp it serves
  • WSP 54 (OpenClaw) — All task claims verified against agent authorization level
  • WSP 77 (AI Overseer) — Full pipeline enforcement, no bypass permitted
  • WSP 92/97 (HoloIndex) — All code queries logged; anomalous query patterns flagged

WSP non-compliance halts the agent and triggers review. There is no "fast path" around protocol.
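Runtime enforcement of this kind can be pictured as a pre-action check that halts on any failed protocol. This is a hedged sketch only: the `enforce_wsp` function, its input fields, and the two checks shown are placeholders assumed for illustration, not the actual WSP runtime.

```python
def enforce_wsp(action: dict) -> str:
    """Halt the agent if any protocol check fails; no fast path exists."""
    # Placeholder checks standing in for real protocol verification.
    checks = {
        "WSP 54": action.get("claim_authorized", False),   # OpenClaw task-claim check
        "WSP 77": action.get("pipeline_complete", False),  # full Overseer pipeline ran
    }
    failed = [wsp for wsp, passed in checks.items() if not passed]
    if failed:
        return "halt: review " + ", ".join(failed)
    return "proceed"
```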


Authentication Boundaries

| Boundary | Mechanism | Notes |
|----------|-----------|-------|
| Workspace identity | `tenant_id` | Attribution only — NOT an auth boundary. Cannot be used as sole auth. |
| Agent authorization | Graduated autonomy phase + WSP compliance | Earned through demonstrated behavior |
| Task ownership | Build lifecycle state machine | `open → claimed → submitted → verified → paid` — each state transition requires verification |
| Token flow | Proof of Benefit gate | No UPS tokens flow without validated outcome |

Critical rule: tenant_id alone is not authentication. Any system claiming authorization based solely on tenant_id is attempting a boundary violation.
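The build lifecycle row above can be sketched as a linear state machine in which every transition is gated on verification, so no state can be skipped. The transition table and `advance` function are illustrative assumptions, not the production state machine.

```python
# Linear lifecycle from the table above; each key may advance only
# to its single successor, and only after verification.
TRANSITIONS = {
    "open": "claimed",
    "claimed": "submitted",
    "submitted": "verified",
    "verified": "paid",
}

def advance(state: str, verification_ok: bool) -> str:
    """Advance one lifecycle step, or stay put if verification fails."""
    if not verification_ok or state not in TRANSITIONS:
        return state  # terminal state ("paid") or failed verification
    return TRANSITIONS[state]
```

Because "paid" has no successor and every step demands `verification_ok`, tokens can only flow after the full `open → … → verified` chain has been checked, matching the Proof of Benefit gate.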


Security Sentinels

Security Sentinels are lightweight monitoring agents deployed alongside active OpenClaw swarms:

  • Monitor task claim patterns for anomalous behavior
  • Verify submitted work against task spec before marking verified
  • Flag reward gaming attempts (submitting low-quality work to trigger token flow)
  • Report to AI Overseer in real time

Sentinels cannot block execution directly — they escalate. Only the AI Overseer can halt.
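The escalate-only design can be made concrete with a small sketch: a sentinel scans events and emits escalations for the Overseer, but has no halt capability of its own. Event strings, the `anomaly:` prefix, and `sentinel_report` are all names assumed here for illustration.

```python
def sentinel_report(events: list) -> list:
    """Return escalations for anomalous events; sentinels never block directly."""
    escalations = []
    for event in events:
        # Anything flagged anomalous is handed to the AI Overseer,
        # which alone holds the authority to halt execution.
        if event.startswith("anomaly:"):
            escalations.append("escalate-to-overseer:" + event)
    return escalations
```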


What the Security Model Protects Against

| Threat | Defense |
|--------|---------|
| Rogue agent claiming unauthorized tasks | HoneypotDefense + task authorization verification |
| Poisoned training inputs | Gemma classification phase + anomaly detection |
| Reward gaming (fake Proof of Benefit) | 012 consensus validation — only 012s can confirm benefit |
| Capital extraction disguised as a FoundUp | Simulator gate + F_i exit fee + CABR monitoring |
| Runaway autonomy | Graduated autonomy ladder + continuous AI Overseer supervision |
| Single point of failure | Swarm architecture — no single agent controls critical path |

Security Model — graduated trust, earned autonomy, swarm-level protection. 0102🦞
