dfinity · raymondk · May 11, 2026 · May 6, 2026 · May 6, 2026
@@ -0,0 +1,116 @@
+{
+  "skill": "canister-security",
+  "description": "Evaluation cases for the canister-security skill. Tests whether agents avoid documented pitfalls: relying on inspect_message for access control, missing CallerGuard reentrancy protection, the let _ = footgun that silently disables locking, TOCTOU state mutations across await, callback trap cleanup placement, and fetchRootKey in production.",
+
+  "output_evals": [
+    {
+      "name": "Adversarial: inspect_message as access control",
+      "prompt": "My Motoko canister has admin methods that should only be callable by the owner. Can I use canister_inspect_message to reject unauthorized callers and skip the check inside each method? Just explain whether this is the right approach.",
+      "expected_behaviors": [
+        "Says inspect_message is NOT sufficient as an access control boundary",
+        "Explains that inspect_message runs on a single replica without consensus, so a malicious node can skip the check",
+        "States that access checks must be duplicated inside every update method",
+        "Mentions that inspect_message is only useful as a cycle-saving optimization",
+        "Does NOT suggest relying on inspect_message as the sole guard for admin methods"
+      ]
+    },
+    {
+      "name": "Adversarial: reentrancy guard with global boolean",
+      "prompt": "Here's my Motoko reentrancy protection. Is it correct?\n\n```motoko\nvar processing = false;\n\npublic shared ({ caller }) func withdraw(amount : Nat) : async Result<(), Text> {\n  if (processing) { return #err(\"request already in progress\") };\n  processing := true;\n  let result = await ledger.transfer(caller, amount);\n  processing := false;\n  #ok\n};\n```",
+      "expected_behaviors": [
+        "Identifies that a single global boolean blocks ALL callers, not just the same caller making concurrent requests",
+        "Identifies that if the callback traps, `processing` stays true permanently, locking the canister for everyone",
+        "States that lock release must be in a `finally` block to survive callback traps",
+        "Recommends per-caller locking (e.g., a Set of in-flight principals) so only the same caller is blocked",
+        "Does NOT accept the code as a correct reentrancy guard"
+      ]
+    },
+    {
+      "name": "Adversarial: let _ = CallerGuard footgun",
+      "prompt": "Here's my Rust CallerGuard usage. Is it correct?\n\n```rust\n#[update]\nasync fn withdraw(amount: u64) -> Result<(), String> {\n    let caller = msg_caller();\n    let _ = CallerGuard::new(caller)?;\n    transfer_funds(caller, amount).await\n}\n```",
+      "expected_behaviors": [
+        "Identifies that `let _ = CallerGuard::new(caller)?` is incorrect",
+        "Explains that `let _` (without a name) immediately drops the guard, releasing the lock before the async call begins",
+        "Shows the fix: use a named binding like `let _guard = CallerGuard::new(caller)?`",
+        "Does NOT validate or accept the original code as correct"
+      ]
+    },
+    {
+      "name": "Adversarial: balance deducted after await",
+      "prompt": "Here's my Motoko send function. Is there a security issue?\n\n```motoko\npublic shared ({ caller }) func send(recipient : Principal, amount : Nat) : async Result<(), Text> {\n  let bal = switch (Map.get(balances, Principal.compare, caller)) {\n    case null { return #err(\"no balance\") };\n    case (?b) { b };\n  };\n  if (bal < amount) { return #err(\"insufficient funds\") };\n  let _ = await ledger.transfer(recipient, amount);\n  Map.put(balances, Principal.compare, caller, bal - amount);\n  #ok\n};\n```",
+      "expected_behaviors": [
+        "Identifies the TOCTOU vulnerability: balance is checked before the await but only deducted after",
+        "Explains that other messages can interleave during the await, allowing the same balance to be spent multiple times",
+        "States that the balance must be deducted BEFORE the inter-canister call",
+        "Mentions that a try/finally or equivalent is needed to restore the balance if the call fails or traps",
+        "Does NOT accept the code as safe"
+      ]
+    },
+    {
+      "name": "Callback trap: where to release a lock",
+      "prompt": "In my Motoko canister I acquire a lock at the start of an update function, make an inter-canister call with await, then release the lock. If I put the lock release after the await in normal code flow, what happens if the callback traps? Where should the release go?",
+      "expected_behaviors": [
+        "Explains that if the callback traps, code after the await does NOT run",
+        "States the lock release must go in a `finally` block",
+        "Explains that `finally` runs in cleanup context where state changes persist even when the callback traps",
+        "Warns that putting cleanup in `catch` only is not sufficient — `catch` does not run on a trap"
+      ]
+    },
+    {
+      "name": "Adversarial: fetchRootKey in production",
+      "prompt": "I'm building an IC frontend. Is this agent setup safe for production?\n\n```typescript\nconst agent = new HttpAgent({ host: 'https://ic0.app' });\nawait agent.fetchRootKey();\n```\n\nJust answer yes or no and explain why.",
+      "expected_behaviors": [
+        "Answers NO — this is not safe for production",
+        "Explains that on mainnet the root key is hardcoded in the agent",
+        "Explains that calling fetchRootKey() on mainnet trusts whatever the replica returns, enabling a man-in-the-middle attack",
+        "States fetchRootKey() should only be called in local development",
+        "Does NOT suggest the code is acceptable as-is"
+      ]
+    },
+    {
+      "name": "Anonymous principal check",
+      "prompt": "Here's a Rust endpoint that stores user data. Is there a security issue?\n\n```rust\n#[update]\nfn set_profile(name: String) {\n    let caller = msg_caller();\n    PROFILES.with(|p| p.borrow_mut().insert(caller, name));\n}\n```",
+      "expected_behaviors": [
+        "Identifies that the anonymous principal (2vxsx-fae) is not rejected",
+        "Explains that any unauthenticated caller can write to the shared anonymous principal's slot",
+        "Shows or describes the fix: check `msg_caller() != Principal::anonymous()` before inserting",
+        "Does NOT accept the original code as correct"
+      ]
+    },
+    {
+      "name": "Adversarial: serializing heap data in preupgrade",
+      "prompt": "My Motoko canister stores user profiles in a HashMap and serializes them during upgrades. Is there a risk with this pattern as the user base grows?\n\n```motoko\nactor {\n  var profiles : HashMap.HashMap<Principal, Profile> = HashMap.HashMap(0, Principal.equal, Principal.hash);\n  stable var profilesStable : [(Principal, Profile)] = [];\n\n  system func preupgrade() {\n    profilesStable := Iter.toArray(profiles.entries());\n  };\n\n  system func postupgrade() {\n    profiles := HashMap.fromIter(profilesStable.vals(), 0, Principal.equal, Principal.hash);\n    profilesStable := [];\n  };\n};\n```",
+      "expected_behaviors": [
+        "Identifies that if the HashMap grows large enough, preupgrade will exceed the instruction limit and trap",
+        "Explains that a trapping preupgrade makes the canister permanently non-upgradeable",
+        "Recommends using `persistent actor` in modern Motoko to eliminate manual serialization entirely",
+        "Does NOT say this pattern is safe as data grows"
+      ]
+    }
+  ],
+
+  "trigger_evals": {
+    "description": "Queries to test whether the skill activates correctly. 'should_trigger' queries should cause the skill to load; 'should_not_trigger' queries should NOT activate this skill.",
+    "should_trigger": [
+      "How do I add access control to my canister?",
+      "Is my canister vulnerable to reentrancy attacks?",
+      "How do I protect my canister against cycle drain?",
+      "Secure the admin methods in my Rust canister",
+      "How do I safely handle inter-canister calls with state mutations?",
+      "What security patterns should I use for a DeFi canister?",
+      "Can I use canister_inspect_message to block unauthorized calls?",
+      "How do I prevent my canister from becoming non-upgradeable?",
+      "Reject anonymous callers in my canister endpoints"
+    ],
+    "should_not_trigger": [
+      "How do I deploy my canister to mainnet?",
+      "Set up Internet Identity authentication for my dapp",
+      "How does stable memory work in Rust canisters?",
+      "Implement ICRC-1 token transfer in my canister",
+      "How does IC subnet consensus work?",
+      "Add a React frontend to my canister",
+      "How do I store large data on the IC?",
+      "Generate TypeScript bindings for my canister"
+    ]
+  }
+}
@@ -20,7 +20,7 @@ Security patterns for IC canisters in Motoko and Rust. The async messaging model
 
 ## Security Pitfalls
 
-1. **Relying on `canister_inspect_message` for access control.** This hook runs on a single replica without full consensus. A malicious boundary node can bypass it by forwarding the message anyway. It is also never called for inter-canister calls, query calls, or management canister calls. Always duplicate access checks inside every update method. Use `inspect_message` only as a cycle-saving optimization, never as a security boundary.
+1. **Relying on `canister_inspect_message` for access control.** This hook runs on a single replica without full consensus. If that replica is malicious, it can simply skip the check and execute the update call without any message inspection. It is also never called for inter-canister calls, query calls, or management canister calls. Always duplicate access checks inside every update method. Use `inspect_message` only as a cycle-saving optimization, never as a security boundary.
 
 2. **Forgetting to reject the anonymous principal.** Every endpoint that requires authentication must check that the caller is not the anonymous principal (`2vxsx-fae`). In Motoko use `Principal.isAnonymous(caller)`, in Rust compare `msg_caller() != Principal::anonymous()`. Without this, unauthenticated callers can invoke protected methods — and if the canister uses the caller principal as an identity key (e.g., for balances), the anonymous principal becomes a shared identity anyone can use.