
ICR User Guide

A comprehensive guide for users of the ICR (Intent-Check-Receipt) plugin.

Acknowledgement: This project is inspired by the YouTube video "The AI Failure Mode Nobody Warned You About (And how to prevent it from happening)" which identifies the critical "intent problem" in AI agents that ICR solves.

Table of Contents

  1. Introduction
  2. Core Concepts
  3. Understanding Intent Documents
  4. Severity Levels Explained
  5. Confidence Scores Explained
  6. Decision Routes
  7. Using Slash Commands
  8. Trust Mode
  9. Configuration Options
  10. Working with Receipts
  11. Best Practices
  12. Glossary

Introduction

What is ICR?

ICR (Intent-Check-Receipt) is a safety plugin for Claude Code that makes AI actions transparent and controllable. When Claude is about to take an action, ICR:

  1. Shows you what Claude intends to do (the Intent)
  2. Asks for your approval if needed (the Check)
  3. Logs everything for accountability (the Receipt)

Why Use ICR?

When you tell an AI "clean up old files," several things could go wrong:

  • The AI might interpret "clean up" as "delete" when you meant "organize"
  • The AI might interpret "old" as "30 days" when you meant "1 year"
  • The AI might affect more files than you expected

ICR prevents these problems by making the AI's interpretation visible BEFORE it acts.

The Philosophy

Ask before you act. Log after you're done.

ICR follows a simple principle: the cost of showing you what's about to happen is low, but the cost of doing the wrong thing can be high.


Core Concepts

The Three Phases

Every action goes through three phases:

┌──────────────────────────────────────────────────────────────────┐
│                                                                  │
│   ┌─────────────┐      ┌─────────────┐      ┌─────────────┐      │
│   │             │      │             │      │             │      │
│   │   INTENT    │ ───▶ │    CHECK    │ ───▶ │   RECEIPT   │      │
│   │             │      │             │      │             │      │
│   │ Claude      │      │ You (or AI) │      │ System logs │      │
│   │ explains    │      │ approve or  │      │ everything  │      │
│   │ what it     │      │ reject      │      │ that        │      │
│   │ will do     │      │             │      │ happened    │      │
│   │             │      │             │      │             │      │
│   └─────────────┘      └─────────────┘      └─────────────┘      │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘

What Gets Checked?

Not every action triggers a visible check. ICR uses two factors to decide:

  1. Severity: How risky is this type of action?
  2. Confidence: How certain is ICR about the interpretation?

Low-risk actions with high confidence auto-approve silently. High-risk actions with low confidence always ask you.


Understanding Intent Documents

When ICR asks for your approval, it shows an Intent Document. Here's how to read it:

Example Intent Document

ICR INTENT CHECK
================

TASK:
Delete all .log files in /var/logs/ older than 7 days

WHO/WHAT:
  Affected:
    - /var/logs/*.log (23 files, 450MB total)
    - Files from 2025-12-20 through 2025-12-26
  Excluded:
    - Files newer than 7 days (5 files)
    - Files in /var/logs/keep/ subdirectory
    - Non-.log files

BOUNDARIES:
  - Will NOT delete files outside /var/logs/
  - Will NOT delete the directory itself
  - Will NOT delete files matching 'important' in name
  - Will NOT proceed if any file is currently open

IF UNCERTAIN:
  Stop and ask for clarification. Do not delete partially.

REVERSIBILITY:
  Can Undo: NO
  Method: Restore from backup
  Effort: Significant (requires backup restore)

ALTERNATIVES CONSIDERED:
  1. "Archive instead of delete"
     → Would move files to /archive/
     → Not chosen because: User said "clean up" implying deletion

  2. "Delete ALL log files regardless of age"
     → Would remove 28 files
     → Not chosen because: User specified "old"

CONFIDENCE: 0.67
  Ambiguity:    0.55 ("clean up" has multiple meanings)
  Distance:     0.75 (direct match to deletion)
  Historical:   0.70 (similar patterns approved before)
  Uncertainty:  0.72 (clear language, no hedging)

Severity: HIGH | Route: HUMAN_REVIEW

[1] Proceed  [2] Edit  [3] Abort  [4] Explain  [5] Bypass

Breaking Down Each Section

TASK

One sentence describing exactly what will happen. Look for:

  • Specific file paths
  • Counts and sizes
  • Any inferred values (like "7 days" if you said "old")

WHO/WHAT

Two lists showing scope:

  • Affected: What WILL be changed
  • Excluded: What will NOT be changed (this catches scope mistakes)

BOUNDARIES

Explicit statements about limits. These prevent scope creep.

IF UNCERTAIN

What Claude will do if something unexpected happens during execution.

REVERSIBILITY

Critical information:

  • Can Undo: Yes, No, or Partial
  • Method: How to undo if possible
  • Effort: Trivial, Moderate, Significant, or Impossible

ALTERNATIVES CONSIDERED

Other interpretations Claude thought about and why they weren't chosen. This helps you catch misunderstandings.

CONFIDENCE

A score from 0.0 to 1.0 showing how certain ICR is. See Confidence Scores Explained.
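
The sections described above map naturally onto a small record type. The Python sketch below is illustrative only: the class names, field names, and the needs_extra_care helper are assumptions chosen to mirror the section headings, not ICR's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Reversibility:
    can_undo: str  # "Yes", "No", or "Partial"
    method: str    # how to undo, if possible
    effort: str    # "Trivial", "Moderate", "Significant", or "Impossible"


@dataclass
class Alternative:
    description: str
    effect: str
    not_chosen_because: str


@dataclass
class IntentDocument:
    """Hypothetical container with one field per section of the intent document."""
    task: str
    affected: List[str]
    excluded: List[str]
    boundaries: List[str]
    if_uncertain: str
    reversibility: Reversibility
    alternatives: List[Alternative] = field(default_factory=list)
    confidence: Optional[float] = None  # 0.0 to 1.0 once scored

    def needs_extra_care(self) -> bool:
        """Flag intents that cannot be undone at all."""
        return self.reversibility.can_undo == "No"


# Values copied from the example intent document in this guide.
doc = IntentDocument(
    task="Delete all .log files in /var/logs/ older than 7 days",
    affected=["/var/logs/*.log (23 files, 450MB total)"],
    excluded=["Files newer than 7 days (5 files)", "Non-.log files"],
    boundaries=["Will NOT delete files outside /var/logs/"],
    if_uncertain="Stop and ask for clarification. Do not delete partially.",
    reversibility=Reversibility("No", "Restore from backup", "Significant"),
)
```

Reading the example this way makes the review order explicit: check the task, then the scope lists, then whether the action can be undone.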

Your Response Options

Option        What It Does
[1] Proceed   Accept the interpretation and execute the action
[2] Edit      Modify the intent before executing
[3] Abort     Cancel the action entirely
[4] Explain   Get more detailed reasoning
[5] Bypass    Proceed but log that you overrode a recommendation

Severity Levels Explained

ICR classifies every action into one of four severity levels:

LOW Severity

What it means: Minimal risk, easily reversible, no external effects.

Examples:

  • Reading files (Read, Glob, Grep)
  • Viewing directory contents
  • Searching for text

Default behavior: Auto-approve (you won't see a prompt)

MEDIUM Severity

What it means: Moderate risk, changes files but usually reversible.

Examples:

  • Creating new files (Write)
  • Editing existing files (Edit)
  • Installing packages

Default behavior: Based on confidence score

HIGH Severity

What it means: Significant risk, uses powerful capabilities.

Examples:

  • Running shell commands (Bash)
  • Network operations
  • System modifications

Default behavior: Usually requires review

CRITICAL Severity

What it means: Dangerous, potentially irreversible, could cause data loss.

Examples:

  • Deleting files (especially with rm -rf)
  • Modifying system configuration
  • Sending emails or making external API calls

Default behavior: Always requires human review (never auto-approves)

How Severity is Determined

ICR uses three layers:

  1. Static Rules: Built-in rules based on tool type (e.g., "Bash is HIGH")
  2. Metadata Inference: Analysis of what the action will do (e.g., "command contains 'delete'")
  3. User Rules: Your custom rules override everything else
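
The three layers can be pictured as a lookup with overrides. The following Python sketch is a rough illustration under stated assumptions: the rule shapes, the keyword check, the MEDIUM fallback for unknown tools, and the choice that metadata inference can only raise a severity are all hypothetical, not ICR's documented behavior.

```python
from typing import Callable, Dict, List, Optional, Tuple

SEVERITY_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]

# Layer 1: static rules keyed by tool name (values taken from the examples above).
STATIC_RULES: Dict[str, str] = {
    "Read": "LOW", "Glob": "LOW", "Grep": "LOW",
    "Write": "MEDIUM", "Edit": "MEDIUM",
    "Bash": "HIGH",
}

# Layer 3: user rules as (tool, predicate, severity); a hypothetical shape.
UserRule = Tuple[str, Callable[[dict], bool], str]


def infer_from_metadata(tool: str, args: dict) -> Optional[str]:
    """Layer 2: hypothetical keyword check on the command text."""
    command = str(args.get("command", ""))
    if "rm -rf" in command or "delete" in command:
        return "CRITICAL"
    return None


def classify(tool: str, args: dict, user_rules: List[UserRule]) -> str:
    # User rules override everything else, so they are checked first.
    for rule_tool, predicate, severity in user_rules:
        if rule_tool == tool and predicate(args):
            return severity
    static = STATIC_RULES.get(tool, "MEDIUM")  # assumed fallback for unknown tools
    inferred = infer_from_metadata(tool, args)
    if inferred is None:
        return static
    # Assumption: inference may raise the static severity but never lower it.
    return max(static, inferred, key=SEVERITY_ORDER.index)
```

For example, classify("Bash", {"command": "ls"}, []) falls through to the static rule, while a user rule matching the same call would win regardless of the other two layers.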

Confidence Scores Explained

The confidence score (0.0 to 1.0) represents how certain ICR is about its interpretation.

Score Ranges

Score         Label       Meaning
0.85 - 1.00   Very High   Strong confidence, clear request
0.70 - 0.84   High        Good confidence, likely correct
0.50 - 0.69   Medium      Moderate confidence, some ambiguity
0.30 - 0.49   Low         Significant uncertainty
0.00 - 0.29   Very Low    High uncertainty, needs clarification
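
As a small illustration, the ranges above can be expressed as a lookup function. This is a sketch, not ICR's code: the function name is hypothetical, and it assumes each range's lower bound is an inclusive cutoff, so a value such as 0.845 falls in the band below 0.85.

```python
def confidence_label(score: float) -> str:
    """Map a 0.0-1.0 confidence score to the label used in the table above."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if score >= 0.85:
        return "Very High"
    if score >= 0.70:
        return "High"
    if score >= 0.50:
        return "Medium"
    if score >= 0.30:
        return "Low"
    return "Very Low"
```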

The Four Components

Confidence is calculated from four weighted signals:

1. Ambiguity Analysis (30% weight)

How many valid interpretations exist for your request?

Interpretations   Impact
1 (unambiguous)   High score (1.0)
2                 Good score (0.85)
3                 Medium score (0.70)
4+                Low score (0.55 or less)

Triggers low score: Vague words like "handle," "fix," "clean up," "deal with"

2. Intent-to-Action Distance (25% weight)

How big is the semantic gap between what you said and what Claude will do?

Example                                         Distance     Score
"delete file.txt" → Delete("file.txt")          Very close   0.95
"remove the config" → Delete("config.json")     Close        0.80
"clean up old stuff" → Delete(multiple files)   Far          0.50

3. Historical Patterns (20% weight)

Has ICR seen similar requests before? How did they turn out?

  • Similar pattern approved before: Higher score
  • Similar pattern was blocked: Lower score
  • Never seen this pattern: Neutral (0.50)

4. Uncertainty Markers (25% weight)

Does your request contain hedging or uncertainty language?

Lower score (uncertainty):

  • "maybe," "might," "could," "perhaps"
  • "should I?" "what if?" "is it okay?"
  • "that thing," "you know," "stuff"

Higher score (certainty):

  • Imperative commands: "Delete," "Remove," "Run"
  • Specific paths and names
  • Explicit confirmation: "yes, delete it"
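
One plausible way to combine the four signals is a plain weighted sum using the weights listed above. The sketch below assumes exactly that; the function and key names are hypothetical, and the plugin may combine or round the values differently. The sample inputs are the four sub-scores from the example intent document.

```python
from typing import Dict

# Weights as stated in this section: 30%, 25%, 20%, 25%.
WEIGHTS: Dict[str, float] = {
    "ambiguity": 0.30,
    "distance": 0.25,
    "historical": 0.20,
    "uncertainty": 0.25,
}


def combine_confidence(signals: Dict[str, float],
                       weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of the four signals; assumes the weights total 1.0."""
    if set(signals) != set(weights):
        raise ValueError("signals and weights must use the same keys")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    return sum(weights[name] * signals[name] for name in weights)


score = combine_confidence({
    "ambiguity": 0.55,
    "distance": 0.75,
    "historical": 0.70,
    "uncertainty": 0.72,
})
print(round(score, 2))  # approximately 0.67
```

Because ambiguity carries the largest weight, a vague phrase such as "clean up" pulls the total down even when the other three signals are fairly strong.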

Viewing Confidence Details

To see exactly how confidence was calculated:

/icr:debug last

Or for a specific receipt:

/icr:debug <receipt-id>

Decision Routes

Based on severity and confidence, ICR routes actions to one of three paths:

AUTO_APPROVE

When: High confidence + Low/Medium severity

What happens: Action proceeds immediately without prompting you.

Example: Reading a file with clear path specified.

AI_REVIEW

When: Medium confidence + Any severity, or High confidence + High severity

What happens: An AI reviewer validates the interpretation. If uncertain, escalates to you.

Example: Creating a file based on somewhat ambiguous instructions.

HUMAN_REVIEW

When:

  • Low confidence + Any severity
  • Any confidence + CRITICAL severity
  • AI review is uncertain
  • You used /icr:check

What happens: You see the full intent document and must approve, edit, or abort.

Example: Any command containing "rm -rf" or "delete."

The Decision Matrix

                    LOW         MEDIUM        HIGH
                 Confidence   Confidence   Confidence
              ┌──────────────┬──────────────┬──────────────┐
LOW Severity  │    HUMAN     │     AI       │    AUTO      │
              ├──────────────┼──────────────┼──────────────┤
MEDIUM        │    HUMAN     │    HUMAN     │     AI       │
Severity      ├──────────────┼──────────────┼──────────────┤
HIGH          │    HUMAN     │    HUMAN     │     AI       │
Severity      ├──────────────┼──────────────┼──────────────┤
CRITICAL      │    HUMAN     │    HUMAN     │    HUMAN*    │
Severity      │              │              │  (always)    │
              └──────────────┴──────────────┴──────────────┘

* CRITICAL never auto-approves, regardless of confidence
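
For readers who prefer code to diagrams, the matrix can be transcribed as a nested lookup. This Python sketch copies the cells exactly as printed above; the names ROUTE_MATRIX and route, and the force_check flag standing in for /icr:check, are hypothetical and not part of ICR.

```python
from typing import Dict

HUMAN, AI, AUTO = "HUMAN_REVIEW", "AI_REVIEW", "AUTO_APPROVE"

# Outer key: severity. Inner key: confidence band (LOW, MEDIUM, HIGH).
ROUTE_MATRIX: Dict[str, Dict[str, str]] = {
    "LOW":      {"LOW": HUMAN, "MEDIUM": AI,    "HIGH": AUTO},
    "MEDIUM":   {"LOW": HUMAN, "MEDIUM": HUMAN, "HIGH": AI},
    "HIGH":     {"LOW": HUMAN, "MEDIUM": HUMAN, "HIGH": AI},
    "CRITICAL": {"LOW": HUMAN, "MEDIUM": HUMAN, "HIGH": HUMAN},
}


def route(severity: str, confidence_band: str, force_check: bool = False) -> str:
    """Look up the decision route; force_check models a pending /icr:check."""
    if force_check:
        return HUMAN
    return ROUTE_MATRIX[severity][confidence_band]
```

A table lookup keeps the CRITICAL row visibly uniform, which makes it easy to verify that no confidence band can route a CRITICAL action away from human review.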

Using Slash Commands

ICR provides nine commands for managing safety and auditing.

/icr:receipts

View the audit trail of actions.

/icr:receipts              # Recent receipts in current session
/icr:receipts session      # All receipts in current session
/icr:receipts all          # All receipts across all sessions
/icr:receipts <id>         # Details of a specific receipt

/icr:config

Manage ICR settings.

/icr:config                # Show current configuration
/icr:config show           # Same as above
/icr:config edit           # Open config file in editor
/icr:config reset          # Reset to defaults
/icr:config set <key> <value>  # Set a specific value

Examples:

/icr:config set thresholds.HIGH.autoApprove 0.85
/icr:config set hooks.preToolUse.excludeTools ["Read", "Glob", "Grep", "Ls"]

/icr:check

Force a manual check on the next action.

/icr:check

The next tool call will require human review regardless of severity or confidence. Useful for:

  • Seeing what ICR would show for a normally auto-approved action
  • Adding extra scrutiny to a specific operation

/icr:trust

Toggle trust mode for reduced friction.

/icr:trust on      # Enable trust mode
/icr:trust off     # Disable trust mode
/icr:trust status  # Show current status

See Trust Mode for details.

/icr:audit

Analyze patterns in your receipt history.

/icr:audit                 # Audit current session
/icr:audit session         # Same as above
/icr:audit all             # Audit all receipts
/icr:audit 2026-01-01:2026-01-02  # Audit date range

Provides:

  • Decision distribution (auto/AI/human/blocked)
  • Severity distribution
  • Confidence statistics
  • Patterns of interest
  • Recommendations

/icr:export

Export receipts to a file.

/icr:export json           # Export as JSON
/icr:export csv            # Export as CSV
/icr:export json --filter severity=CRITICAL
/icr:export csv --filter decision=blocked

/icr:stats

View quick statistics.

/icr:stats         # Current session
/icr:stats all     # All-time stats
/icr:stats today   # Today's stats

/icr:debug

Debug confidence calculations.

/icr:debug         # Debug last check
/icr:debug last    # Same as above
/icr:debug <id>    # Debug specific receipt

/icr:simulate

Dry-run an intent check without executing.

/icr:simulate Bash "rm -rf /tmp/test"
/icr:simulate Write "/path/to/file.txt" "content"

Trust Mode

Trust mode reduces verification friction for focused work sessions.

What Trust Mode Changes

Severity   Normal Mode        Trust Mode
LOW        Auto-approve       Auto-approve
MEDIUM     Confidence-based   Auto-approve
HIGH       Usually review     AI review only
CRITICAL   Always human       Still human review
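
The table can be read as an override applied on top of the normal route. The sketch below shows one way to express that in Python; the function name and signature are assumptions made for illustration only.

```python
def apply_trust_mode(severity: str, normal_route: str, trust_enabled: bool) -> str:
    """Return the route after applying the trust-mode column of the table above."""
    if not trust_enabled:
        return normal_route
    if severity in ("LOW", "MEDIUM"):
        return "AUTO_APPROVE"
    if severity == "HIGH":
        return "AI_REVIEW"
    # CRITICAL is unchanged: it still requires human review.
    return "HUMAN_REVIEW"
```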

When to Use Trust Mode

Good times:

  • Routine development work
  • Known-safe refactoring
  • Batch file operations you understand

Bad times:

  • Working with production data
  • Running unfamiliar commands
  • Sensitive operations

Enabling Trust Mode

/icr:trust on

You'll see:

Trust mode ENABLED

   - LOW/MEDIUM actions will auto-approve
   - HIGH actions use AI review only
   - CRITICAL actions still require human review

   All actions remain logged. Trust mode is for workflow efficiency,
   not for bypassing safety measures.

Disabling Trust Mode

/icr:trust off

You'll see a summary of what happened during trust mode:

Trust mode DISABLED

Session summary:
  12 actions | 10 auto-approved | 1 AI reviewed | 1 human reviewed

Important Notes

  • Trust mode does NOT bypass CRITICAL actions by default
  • All actions are still logged to receipts
  • Trust mode state is not persisted across sessions

Configuration Options

ICR is highly configurable. Here are the key options:

Viewing Current Configuration

/icr:config show

Key Configuration Sections

Thresholds

Control when actions auto-approve vs. require review:

{
  "thresholds": {
    "LOW": {
      "autoApprove": 0.50,
      "aiReview": 0.30
    },
    "MEDIUM": {
      "autoApprove": 0.70,
      "aiReview": 0.50
    },
    "HIGH": {
      "autoApprove": 0.90,
      "aiReview": 0.70
    },
    "CRITICAL": {
      "autoApprove": 1.01,  // Never auto-approve (> 1.0)
      "aiReview": 0.90
    }
  }
}
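
To make the thresholds concrete, here is a minimal Python sketch of how such values could drive routing. It assumes a confidence at or above autoApprove auto-approves, one at or above aiReview goes to AI review, and anything lower goes to a human. It also short-circuits CRITICAL to human review, matching this guide's statement that CRITICAL always requires human review; how the plugin itself uses the CRITICAL aiReview value is not specified here. The function name is hypothetical.

```python
from typing import Dict

# Default thresholds copied from the configuration above.
THRESHOLDS: Dict[str, Dict[str, float]] = {
    "LOW":      {"autoApprove": 0.50, "aiReview": 0.30},
    "MEDIUM":   {"autoApprove": 0.70, "aiReview": 0.50},
    "HIGH":     {"autoApprove": 0.90, "aiReview": 0.70},
    "CRITICAL": {"autoApprove": 1.01, "aiReview": 0.90},
}


def route_by_threshold(severity: str, confidence: float,
                       thresholds: Dict[str, Dict[str, float]] = THRESHOLDS) -> str:
    """Pick a route by comparing confidence with the severity's thresholds."""
    if severity == "CRITICAL":
        return "HUMAN_REVIEW"  # assumption: follows the guide's "always human" rule
    limits = thresholds[severity]
    if confidence >= limits["autoApprove"]:
        return "AUTO_APPROVE"
    if confidence >= limits["aiReview"]:
        return "AI_REVIEW"
    return "HUMAN_REVIEW"
```

Under these assumptions, raising a severity's autoApprove value makes ICR stricter for that severity, while lowering aiReview sends fewer actions to human review.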

Excluded Tools

Skip checks for certain tools:

{
  "hooks": {
    "preToolUse": {
      "excludeTools": ["Read", "Glob", "Grep", "Ls"]
    }
  }
}

Confidence Weights

Adjust how confidence is calculated:

{
  "confidence": {
    "weights": {
      "ambiguityAnalysis": 0.30,
      "intentToActionDistance": 0.25,
      "historicalPatterns": 0.20,
      "uncertaintyMarkers": 0.25
    }
  }
}

Custom Severity Rules

Add your own severity classifications:

{
  "severity": {
    "userRules": [
      {
        "pattern": "Bash",
        "condition": "args.command.includes('sudo')",
        "severity": "CRITICAL",
        "reason": "Commands with sudo are always critical"
      }
    ]
  }
}

Working with Receipts

Receipts are the audit trail of all ICR checks.

Where Receipts Are Stored

.claude/icr/receipts/
├── 2026-01-01/
│   └── session-abc123/
│       ├── session-meta.json
│       ├── 001-receipt.json
│       ├── 002-receipt.json
│       └── ...
├── 2026-01-02/
│   └── session-def456/
│       └── ...
└── index.json
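
If you want to inspect receipts outside of the slash commands, the layout above is easy to walk with a script. The Python sketch below assumes each receipt is a standalone JSON file named as shown; the helper name iter_receipts is hypothetical, and the fields inside each receipt are not assumed.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_receipts(root: Path) -> Iterator[dict]:
    """Yield parsed receipts from <root>/<date>/session-*/NNN-receipt.json.

    Files are visited in sorted path order, so receipts come out grouped by
    date folder, then session, then receipt number.
    """
    for path in sorted(root.glob("*/session-*/*-receipt.json")):
        with path.open(encoding="utf-8") as handle:
            yield json.load(handle)
```

A call such as sum(1 for _ in iter_receipts(Path(".claude/icr/receipts"))) would count the stored receipts without touching session-meta.json or index.json.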

What's in a Receipt

Each receipt contains:

  • Timestamp and session info
  • Tool name and arguments
  • Severity classification
  • Intent document
  • Confidence breakdown
  • Decision route
  • Your response (if human review)
  • Execution outcome

Retention Policy

By default, receipts are kept for 90 days. Configure with:

{
  "receipts": {
    "retentionDays": 90
  }
}
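
To see which receipts a given retention setting would affect, you can compare the date-named folders against a cutoff. This Python sketch is an illustration only: it assumes folders dated strictly before today minus retentionDays count as expired, and the function name is hypothetical. It does not delete anything.

```python
from datetime import date, timedelta
from typing import Iterable, List


def expired_date_folders(folder_names: Iterable[str], today: date,
                         retention_days: int = 90) -> List[str]:
    """Return the YYYY-MM-DD folder names older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in folder_names:
        try:
            folder_date = date.fromisoformat(name)
        except ValueError:
            continue  # skip non-date entries such as index.json
        if folder_date < cutoff:
            expired.append(name)
    return expired
```

Running a check like this before the retention window closes shows which date folders to export first.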

Exporting Receipts

For compliance or analysis:

# Export all as JSON
/icr:export json

# Export only blocked actions as CSV
/icr:export csv --filter decision=blocked

Best Practices

1. Start with Defaults

ICR ships with "Conservative" defaults. Use them until you understand your patterns.

2. Review the Audit Regularly

Run /icr:audit weekly to:

  • See patterns in your usage
  • Identify opportunities to add custom rules
  • Catch any bypasses that need attention

3. Use Trust Mode Judiciously

Trust mode is for productivity, not for bypassing safety. Use it when:

  • You're doing known-safe work
  • You understand what's happening
  • You can review receipts afterward

4. Add Custom Rules for Your Patterns

If you always want certain commands flagged as CRITICAL:

/icr:config set severity.userRules '[
  {"pattern": "Bash", "condition": "args.command.includes(\"prod\")", "severity": "CRITICAL", "reason": "Production commands need review"}
]'

5. Export Before Cleanup

Before receipt retention kicks in, export important records:

/icr:export json --filter severity=CRITICAL

6. Use Simulate for Exploration

Before asking Claude to do something risky, simulate it:

/icr:simulate Bash "your command here"

Glossary

Term              Definition
Auto-Approve      Action proceeds without prompting user
Bypass            User explicitly overrides ICR's recommendation
Checkpoint        Save point created before risky actions for rollback
Confidence        0-1 score indicating certainty about interpretation
Decision Route    Path an action takes: AUTO_APPROVE, AI_REVIEW, or HUMAN_REVIEW
Escalate          Move to stricter verification
Hook              Claude Code event that ICR intercepts
Intent Document   Structured explanation of what AI will do
Receipt           Audit log entry for an action
Severity          Risk classification: LOW, MEDIUM, HIGH, CRITICAL
Trust Mode        Reduced friction mode for known-safe work

Getting Help