From 94c4db750ce4d50931ac8111be1a2dee689ccd43 Mon Sep 17 00:00:00 2001
From: Marco Walz
Date: Tue, 17 Mar 2026 17:36:16 +0100
Subject: [PATCH 1/2] docs: data persistence guide

---
 docs/guides/backends/data-persistence.md | 415 ++++++++++++++++++++++-
 1 file changed, 399 insertions(+), 16 deletions(-)

diff --git a/docs/guides/backends/data-persistence.md b/docs/guides/backends/data-persistence.md
index 968196d2..df429270 100644
--- a/docs/guides/backends/data-persistence.md
+++ b/docs/guides/backends/data-persistence.md
@@ -6,19 +6,402 @@ sidebar:
 icskills: [stable-memory]
 ---

-TODO: Write content for this page.
-
-
-Guide developers through storing data in canisters. This is the how-to companion to concepts/orthogonal-persistence (which explains what and why; this page shows how). Cover: Motoko persistent actors (persistent actor, transient var, let/var persistence, schema evolution rules), Rust stable structures (StableBTreeMap, StableCell, StableLog, MemoryManager, MemoryId partitioning, Storable trait implementations for custom types, #[init]/#[post_upgrade] hook patterns), and the dangerous pre_upgrade heap serialization anti-pattern. Include idempotency patterns for safe data mutation. Show complete code examples for both Rust and Motoko.
-
-
-- Portal: building-apps/canister-management/storage.mdx, best-practices/storage.mdx, best-practices/idempotency.mdx
-- icskills: stable-memory
-- Examples: daily_planner (both), superheroes (Motoko), photo_gallery (Rust)
-- Rust CDK: https://docs.rs/ic-cdk/latest/ic_cdk/
-
-
-- concepts/orthogonal-persistence -- conceptual background
-- guides/canister-management/lifecycle -- upgrade hooks
-- languages/rust/stable-structures -- Rust-specific deep dive
-- languages/motoko/ -- Motoko persistence patterns
+Canister state lives in two places: **heap memory** (fast, temporary, wiped on upgrade) and **stable memory** (persistent, survives upgrades). Any data you care about must end up in stable memory, or it will be lost the next time the canister is deployed.
+
+This guide shows how to store data durably in both Motoko and Rust. For a conceptual explanation of why stable memory works this way, see [Orthogonal Persistence](../../concepts/orthogonal-persistence.md).
+
+## Motoko: persistent actor
+
+In Motoko, use `persistent actor`. All `let` and `var` declarations inside the actor body are automatically persisted across upgrades — no `stable` keyword, no upgrade hooks.
+
+```motoko
+import Map "mo:core/Map";
+import Nat "mo:core/Nat";
+import Text "mo:core/Text";
+import Time "mo:core/Time";
+
+persistent actor {
+
+  // Custom type — defined inside the actor body
+  type User = {
+    id : Nat;
+    name : Text;
+    created : Int;
+  };
+
+  // Automatically persisted across upgrades — no "stable" keyword needed
+  let users = Map.empty<Nat, User>();
+  var userCounter : Nat = 0;
+
+  // Transient data — resets to 0 on every upgrade
+  transient var requestCount : Nat = 0;
+
+  public func addUser(name : Text) : async Nat {
+    let id = userCounter;
+    Map.add(users, Nat.compare, id, {
+      id;
+      name;
+      created = Time.now();
+    });
+    userCounter += 1;
+    requestCount += 1;
+    id
+  };
+
+  public query func getUser(id : Nat) : async ?User {
+    Map.get(users, Nat.compare, id)
+  };
+
+  public query func getUserCount() : async Nat {
+    Map.size(users)
+  };
+
+  // Resets to 0 after every upgrade — use transient for ephemeral state
+  public query func getRequestCount() : async Nat {
+    requestCount
+  };
+}
+```
+
+**Rules for Motoko persistent actors:**
+
+- `let` for collections (`Map`, `List`, `Set`) — auto-persisted, no serialization needed
+- `var` for simple values (`Nat`, `Text`, `Bool`) — auto-persisted
+- `transient var` for caches or counters that should reset on upgrade
+- No `pre_upgrade` / `post_upgrade` hooks needed — the runtime handles persistence
+- Do not write `stable let` or `stable var` — redundant in `persistent actor` and produces compiler warnings
+
+### mops.toml
+
+```toml
+[package]
+name = "my-project"
+version = "0.1.0"
+
+[dependencies]
+core = "2.0.0"
+```
+
+### Schema evolution rules
+
+When upgrading a Motoko canister, the type of every persistent field must be compatible with its stored value. Violating this traps the upgrade and data is unrecoverable.
+
+**Safe changes (always OK):**
+- Add new `let` or `var` fields with initial values
+- Add new optional record fields (e.g., change `{ name : Text }` to `{ name : Text; email : ?Text }`)
+
+**Unsafe changes (will trap on upgrade):**
+- Remove or rename a persistent field
+- Change a field's type (e.g., `Nat` → `Int`)
+- Change a non-optional field to a different type
+
+## Rust: stable structures
+
+Rust canisters use [`ic-stable-structures`](https://docs.rs/ic-stable-structures/latest/ic_stable_structures/) for persistent storage. The `MemoryManager` partitions stable memory into virtual memories, each backing a separate data structure. Data lives in stable memory from the start — no serialization on upgrade.
+
+### Cargo.toml
+
+```toml
+[package]
+name = "stable_memory_backend"
+version = "0.1.0"
+edition = "2021"
+
+[lib]
+crate-type = ["cdylib"]
+
+[dependencies]
+ic-cdk = "0.19"
+ic-stable-structures = "0.7"
+candid = "0.10"
+serde = { version = "1", features = ["derive"] }
+ciborium = "0.2"
+```
+
+### Implementing Storable for custom types
+
+`StableBTreeMap` keys must implement `Storable + Ord`, values must implement `Storable`. Primitive types (`u64`, `bool`, `String`, `Vec<u8>`, `Principal`) already implement `Storable`. For custom structs, implement it manually using CBOR serialization:
+
+```rust
+use ic_stable_structures::storable::{Bound, Storable};
+use candid::CandidType;
+use serde::{Deserialize, Serialize};
+use std::borrow::Cow;
+
+#[derive(CandidType, Serialize, Deserialize, Clone)]
+struct User {
+    id: u64,
+    name: String,
+    created: u64,
+}
+
+impl Storable for User {
+    // Prefer Unbounded — avoids breakage when adding new fields.
+    // Bounded requires a fixed max_size; exceeding it after a schema change
+    // breaks deserialization of existing data.
+    const BOUND: Bound = Bound::Unbounded;
+
+    fn to_bytes(&self) -> Cow<'_, [u8]> {
+        let mut buf = vec![];
+        ciborium::into_writer(self, &mut buf).expect("Failed to encode User");
+        Cow::Owned(buf)
+    }
+
+    fn from_bytes(bytes: Cow<'_, [u8]>) -> Self {
+        ciborium::from_reader(bytes.as_ref()).expect("Failed to decode User")
+    }
+}
+```
+
+### MemoryManager and stable structures
+
+```rust
+use ic_stable_structures::{
+    memory_manager::{MemoryId, MemoryManager, VirtualMemory},
+    storable::{Bound, Storable},
+    DefaultMemoryImpl, StableBTreeMap, StableCell,
+};
+use ic_cdk::{init, post_upgrade, query, update};
+use candid::CandidType;
+use serde::{Deserialize, Serialize};
+use std::borrow::Cow;
+use std::cell::RefCell;
+
+type Memory = VirtualMemory<DefaultMemoryImpl>;
+
+// Each structure gets its own MemoryId — NEVER reuse IDs across structures
+const USERS_MEM_ID: MemoryId = MemoryId::new(0);
+const COUNTER_MEM_ID: MemoryId = MemoryId::new(1);
+
+thread_local! {
+    static MEMORY_MANAGER: RefCell<MemoryManager<DefaultMemoryImpl>> =
+        RefCell::new(MemoryManager::init(DefaultMemoryImpl::default()));
+
+    static USERS: RefCell<StableBTreeMap<u64, User, Memory>> =
+        RefCell::new(StableBTreeMap::init(
+            MEMORY_MANAGER.with(|m| m.borrow().get(USERS_MEM_ID))
+        ));
+
+    // StableCell for a single value (counter, config, etc.)
+    static COUNTER: RefCell<StableCell<u64, Memory>> =
+        RefCell::new(StableCell::init(
+            MEMORY_MANAGER.with(|m| m.borrow().get(COUNTER_MEM_ID)),
+            0u64,
+        ).expect("Failed to init counter"));
+}
+
+#[init]
+fn init() {
+    // One-time initialization — stable structures auto-initialize from above
+}
+
+#[post_upgrade]
+fn post_upgrade() {
+    // Stable structures auto-restore — no deserialization needed here.
+    // Re-initialize timers or other transient state if needed.
+}
+
+#[update]
+fn add_user(name: String) -> u64 {
+    let id = COUNTER.with(|c| {
+        let mut cell = c.borrow_mut();
+        let current = *cell.get();
+        cell.set(current + 1).expect("Counter update failed");
+        current
+    });
+
+    USERS.with(|users| {
+        users.borrow_mut().insert(id, User {
+            id,
+            name,
+            created: ic_cdk::api::time(),
+        });
+    });
+
+    id
+}
+
+#[query]
+fn get_user(id: u64) -> Option<User> {
+    USERS.with(|users| users.borrow().get(&id))
+}
+
+#[query]
+fn get_user_count() -> u64 {
+    USERS.with(|users| users.borrow().len())
+}
+
+ic_cdk::export_candid!();
+```
+
+### Multiple stable structures
+
+When using more than one stable structure, give each a unique `MemoryId`. `StableLog` requires two memory regions (index + data):
+
+```rust
+use ic_stable_structures::{StableBTreeMap, StableCell, StableLog};
+
+// Assign one MemoryId per structure — never reuse
+const USERS_MEM_ID: MemoryId = MemoryId::new(0);
+const POSTS_MEM_ID: MemoryId = MemoryId::new(1);
+const COUNTER_MEM_ID: MemoryId = MemoryId::new(2);
+const LOG_INDEX_MEM_ID: MemoryId = MemoryId::new(3); // StableLog needs two
+const LOG_DATA_MEM_ID: MemoryId = MemoryId::new(4);
+
+thread_local! {
+    static MEMORY_MANAGER: RefCell<MemoryManager<DefaultMemoryImpl>> =
+        RefCell::new(MemoryManager::init(DefaultMemoryImpl::default()));
+
+    static USERS: RefCell<StableBTreeMap<u64, User, Memory>> =
+        RefCell::new(StableBTreeMap::init(
+            MEMORY_MANAGER.with(|m| m.borrow().get(USERS_MEM_ID))
+        ));
+
+    static POSTS: RefCell<StableBTreeMap<u64, Post, Memory>> =
+        RefCell::new(StableBTreeMap::init(
+            MEMORY_MANAGER.with(|m| m.borrow().get(POSTS_MEM_ID))
+        ));
+
+    static COUNTER: RefCell<StableCell<u64, Memory>> =
+        RefCell::new(StableCell::init(
+            MEMORY_MANAGER.with(|m| m.borrow().get(COUNTER_MEM_ID)),
+            0u64,
+        ).expect("Failed to init counter"));
+
+    static AUDIT_LOG: RefCell<StableLog<Vec<u8>, Memory, Memory>> =
+        RefCell::new(StableLog::init(
+            MEMORY_MANAGER.with(|m| m.borrow().get(LOG_INDEX_MEM_ID)),
+            MEMORY_MANAGER.with(|m| m.borrow().get(LOG_DATA_MEM_ID)),
+        ).expect("Failed to init audit log"));
+}
+```
+
+**Rules for Rust stable structures:**
+
+- Each structure gets a unique `MemoryId` — reusing IDs corrupts both structures
+- `StableBTreeMap` for keyed collections; keys need `Storable + Ord`
+- `StableCell` for single values (counters, config flags)
+- `StableLog` for append-only logs — requires two `MemoryId`s (index + data)
+- `thread_local! { RefCell<StableBTreeMap<...>> }` is the correct pattern — `RefCell` wraps the stable structure, not a heap `HashMap`
+- No `pre_upgrade`/`post_upgrade` serialization needed — data is already in stable memory
+
+## Anti-pattern: pre_upgrade serialization (Rust)
+
+Avoid serializing heap data to stable memory in `pre_upgrade` hooks. This pattern is fragile and will brick the canister under load:
+
+```rust
+// DO NOT DO THIS
+#[pre_upgrade]
+fn pre_upgrade() {
+    // If STATE is large, this hits the instruction limit and traps.
+    // A trapped pre_upgrade bricks the canister — the upgrade cannot complete
+    // and the canister is stuck on the old code with no way to recover.
+    let state = STATE.with(|s| s.borrow().clone());
+    ic_cdk::storage::stable_save((state,)).unwrap();
+}
+
+#[post_upgrade]
+fn post_upgrade() {
+    let (state,) = ic_cdk::storage::stable_restore().unwrap();
+    STATE.with(|s| *s.borrow_mut() = state);
+}
+```
+
+Use `StableBTreeMap` and other stable structures instead. Data lives in stable memory from the start, so no serialization step is needed on upgrade.
+
+## Idempotency for safe data mutation
+
+When an update call's result is unknown (network interruption, ingress expiry), callers may retry. Without idempotency, retries can cause double-writes, double-spends, or duplicate records. Two patterns handle this:
+
+### Sequence numbers
+
+Track a per-caller counter. A call is only accepted if it carries the next expected sequence number:
+
+```motoko
+// Motoko example — persistent actor
+var callerSeq = Map.empty<Principal, Nat>();
+
+public shared(msg) func transferWithSeq(amount : Nat, seq : Nat) : async Bool {
+  let caller = msg.caller;
+  let expected = switch (Map.get(callerSeq, Principal.compare, caller)) {
+    case null 0;
+    case (?n) n;
+  };
+  if (seq != expected) return false; // reject out-of-order or duplicate calls
+  // ... perform transfer ...
+  Map.add(callerSeq, Principal.compare, caller, seq + 1);
+  true
+};
+```
+
+Best for low-throughput, per-account flows (similar to Ethereum nonces). Limits concurrency to one in-flight call per caller.
+
+### ID deduplication
+
+Callers attach a unique ID per operation. The canister rejects duplicates within a time window:
+
+```motoko
+// Motoko example — persistent actor
+import Time "mo:core/Time";
+
+type DedupeEntry = { executed_at : Int };
+let executed = Map.empty<Text, DedupeEntry>();
+let WINDOW_NS : Int = 24 * 60 * 60 * 1_000_000_000; // 24 hours in nanoseconds
+
+public func transferWithId(amount : Nat, idempotency_key : Text) : async Bool {
+  let now = Time.now();
+  switch (Map.get(executed, Text.compare, idempotency_key)) {
+    case (?entry) {
+      if (now - entry.executed_at < WINDOW_NS) return true; // already done
+    };
+    case null {};
+  };
+  // ... perform transfer ...
+  Map.add(executed, Text.compare, idempotency_key, { executed_at = now });
+  true
+};
+```
+
+Supports higher throughput and concurrent callers. Requires bounded storage — expire entries after the deduplication window.
+
+## Verify persistence across upgrades
+
+The definitive test: deploy, write data, upgrade, confirm data survived.
+
+```bash
+icp network start -d
+icp deploy backend
+
+# Write some data
+icp canister call backend addUser '("Alice")'
+icp canister call backend addUser '("Bob")'
+
+# Record the count
+icp canister call backend getUserCount '()'
+# Returns: (2 : nat)
+
+# Upgrade the canister (redeploy with code change)
+icp deploy backend
+
+# Data must still be there
+icp canister call backend getUserCount '()'
+# Must still return: (2 : nat)
+
+icp canister call backend getUser '(0)'
+# Returns: (opt record { id = 0 : nat; name = "Alice"; ... })
+
+# Transient state resets
+icp canister call backend getRequestCount '()'
+# Returns: (0 : nat) — expected, transient var resets on upgrade
+```
+
+If the count drops to 0 after upgrade, the data is not in stable memory. Review your storage declarations.
+
+## Related
+
+- [Orthogonal Persistence](../../concepts/orthogonal-persistence.md) — conceptual explanation of heap vs. stable memory
+- [Canister Lifecycle](../canister-management/lifecycle.md) — upgrade hooks and canister lifecycle
+- [Stable Structures (Rust)](../../languages/rust/stable-structures.md) — deep dive into `ic-stable-structures`
+- [Motoko](../../languages/motoko/index.md) — Motoko language overview and persistence model
+
+

From 32a6fcc3b8d5644af1c845a3585f5a89c949e35f Mon Sep 17 00:00:00 2001
From: Marco Walz
Date: Tue, 17 Mar 2026 18:07:44 +0100
Subject: [PATCH 2/2] fix: address review feedback on data persistence guide

- Rephrase intro to correctly distinguish heap behavior: wiped on upgrade
  in Rust, automatically preserved in Motoko persistent actor
- Fix schema evolution wording: upgrade traps leave canister on old Wasm
  with data intact; not "unrecoverable"
- Fix Bounded vs Unbounded comment: exceeding max_size traps writes, does
  not break deserialization of existing data
- Mention skip_pre_upgrade as emergency recovery option for trapped
  pre_upgrade hooks
- Make "Multiple stable structures" snippet self-contained with full
  imports, Memory alias, and Post struct definition
- Split CLI persistence test into Motoko (camelCase) and Rust (snake_case)
  variants to match each backend's exported method names
- Add missing imports to both idempotency snippets (Map, Principal, Nat
  for sequence numbers; Map, Text for ID deduplication)

---
 docs/guides/backends/data-persistence.md | 78 +++++++++++++++++++++++++++---
 1 file changed, 70 insertions(+), 8 deletions(-)

diff --git a/docs/guides/backends/data-persistence.md b/docs/guides/backends/data-persistence.md
index df429270..aad5bebf 100644
--- a/docs/guides/backends/data-persistence.md
+++ b/docs/guides/backends/data-persistence.md
@@ -6,7 +6,7 @@ sidebar:
 icskills: [stable-memory]
 ---

-Canister state lives in two places: **heap memory** (fast, temporary, wiped on upgrade) and **stable memory** (persistent, survives upgrades). Any data you care about must end up in stable memory, or it will be lost the next time the canister is deployed.
+Canister state lives in two places: **heap memory** and **stable memory** (persistent, survives upgrades). In Rust and most languages, heap memory is wiped on upgrade — any data you care about must be stored in stable memory. In Motoko, the `persistent actor` pattern automatically preserves all actor state across upgrades without any additional work.

 This guide shows how to store data durably in both Motoko and Rust. For a conceptual explanation of why stable memory works this way, see [Orthogonal Persistence](../../concepts/orthogonal-persistence.md).

@@ -84,7 +84,7 @@ core = "2.0.0"

 ### Schema evolution rules

-When upgrading a Motoko canister, the type of every persistent field must be compatible with its stored value. Violating this traps the upgrade and data is unrecoverable.
+When upgrading a Motoko canister, the type of every persistent field must be compatible with its stored value. Violating this causes the upgrade to trap — the canister continues running on the old Wasm with its data intact, but cannot be upgraded until the type conflict is resolved.

 **Safe changes (always OK):**
 - Add new `let` or `var` fields with initial values
@@ -137,8 +137,10 @@ struct User {

 impl Storable for User {
     // Prefer Unbounded — avoids breakage when adding new fields.
-    // Bounded requires a fixed max_size; exceeding it after a schema change
-    // breaks deserialization of existing data.
+    // Bounded requires a fixed max_size; if the encoded size of a value
+    // exceeds max_size after a schema change, writes will trap.
+    // Existing stored data is unaffected, but no new or updated records
+    // can be written until the type fits within the declared max_size.
     const BOUND: Bound = Bound::Unbounded;

     fn to_bytes(&self) -> Cow<'_, [u8]> {
@@ -236,10 +238,26 @@ ic_cdk::export_candid!();

 ### Multiple stable structures

-When using more than one stable structure, give each a unique `MemoryId`. `StableLog` requires two memory regions (index + data):
+When using more than one stable structure, give each a unique `MemoryId`. `StableLog` requires two memory regions (index + data).
+
+This example extends the [MemoryManager and stable structures](#memorymanager-and-stable-structures) snippet above — it reuses the same `Memory` type alias, `MemoryManager`, `DefaultMemoryImpl`, `RefCell`, and `User` struct defined there, and adds `Post` and `AUDIT_LOG`:

 ```rust
-use ic_stable_structures::{StableBTreeMap, StableCell, StableLog};
+use ic_stable_structures::{
+    memory_manager::{MemoryId, MemoryManager, VirtualMemory},
+    DefaultMemoryImpl, StableBTreeMap, StableCell, StableLog,
+};
+use candid::CandidType;
+use serde::{Deserialize, Serialize};
+use std::cell::RefCell;
+
+type Memory = VirtualMemory<DefaultMemoryImpl>;
+
+#[derive(CandidType, Serialize, Deserialize, Clone)]
+struct Post {
+    id: u64,
+    content: String,
+}

 // Assign one MemoryId per structure — never reuse
 const USERS_MEM_ID: MemoryId = MemoryId::new(0);
@@ -294,8 +312,11 @@ Avoid serializing heap data to stable memory in `pre_upgrade` hooks. This patter
 #[pre_upgrade]
 fn pre_upgrade() {
     // If STATE is large, this hits the instruction limit and traps.
-    // A trapped pre_upgrade bricks the canister — the upgrade cannot complete
-    // and the canister is stuck on the old code with no way to recover.
+    // A trapped pre_upgrade prevents the upgrade from completing —
+    // the canister is stuck on the old code. Recovery is possible via
+    // the skip_pre_upgrade flag (which bypasses the hook at the cost of
+    // losing any state it would have serialized), but it's an emergency
+    // measure. Avoid this pattern entirely.
     let state = STATE.with(|s| s.borrow().clone());
     ic_cdk::storage::stable_save((state,)).unwrap();
 }
@@ -319,6 +340,12 @@ Track a per-caller counter. A call is only accepted if it carries the next expec

 ```motoko
 // Motoko example — persistent actor
+import Map "mo:core/Map";
+import Nat "mo:core/Nat";
+import Principal "mo:core/Principal";
+
+persistent actor {
+
 var callerSeq = Map.empty<Principal, Nat>();

 public shared(msg) func transferWithSeq(amount : Nat, seq : Nat) : async Bool {
@@ -332,6 +359,8 @@ public shared(msg) func transferWithSeq(amount : Nat, seq : Nat) : async Bool {
   Map.add(callerSeq, Principal.compare, caller, seq + 1);
   true
 };
+
+}
 ```

 Best for low-throughput, per-account flows (similar to Ethereum nonces). Limits concurrency to one in-flight call per caller.
@@ -342,8 +371,12 @@ Callers attach a unique ID per operation. The canister rejects duplicates within

 ```motoko
 // Motoko example — persistent actor
+import Map "mo:core/Map";
+import Text "mo:core/Text";
 import Time "mo:core/Time";

+persistent actor {
+
 type DedupeEntry = { executed_at : Int };
 let executed = Map.empty<Text, DedupeEntry>();
 let WINDOW_NS : Int = 24 * 60 * 60 * 1_000_000_000; // 24 hours in nanoseconds
@@ -360,6 +393,8 @@ public func transferWithId(amount : Nat, idempotency_key : Text) : async Bool {
   Map.add(executed, Text.compare, idempotency_key, { executed_at = now });
   true
 };
+
+}
 ```

 Supports higher throughput and concurrent callers. Requires bounded storage — expire entries after the deduplication window.
@@ -368,6 +403,8 @@ Supports higher throughput and concurrent callers. Requires bounded storage —

 The definitive test: deploy, write data, upgrade, confirm data survived.

+**Motoko backend** (method names are camelCase):
+
 ```bash
 icp network start -d
 icp deploy backend
@@ -395,6 +432,31 @@ icp canister call backend getRequestCount '()'
 # Returns: (0 : nat) — expected, transient var resets on upgrade
 ```

+**Rust backend** (method names are snake_case):
+
+```bash
+icp network start -d
+icp deploy backend
+
+# Write some data
+icp canister call backend add_user '("Alice")'
+icp canister call backend add_user '("Bob")'
+
+# Record the count
+icp canister call backend get_user_count '()'
+# Returns: (2 : nat64)
+
+# Upgrade the canister (redeploy with code change)
+icp deploy backend
+
+# Data must still be there
+icp canister call backend get_user_count '()'
+# Must still return: (2 : nat64)
+
+icp canister call backend get_user '(0 : nat64)'
+# Returns: (opt record { id = 0 : nat64; name = "Alice"; created = ... })
+```
+
 If the count drops to 0 after upgrade, the data is not in stable memory. Review your storage declarations.

 ## Related
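Follow-up note on the ID deduplication section: the guide says entries should expire after the deduplication window, but the Motoko snippet never prunes them. Below is a minimal Rust sketch of that pruning step, under stated assumptions. `Deduper` and `try_begin` are hypothetical names, not part of `ic-cdk` or `ic-stable-structures`, and the sketch uses a plain heap `BTreeMap` with the current time passed in as a parameter so it runs outside a canister. In a real canister the map would presumably live in a stable structure, and the pruning would need to be adapted to that API.

```rust
use std::collections::BTreeMap;

/// Deduplication window: 24 hours in nanoseconds
/// (same value as WINDOW_NS in the Motoko snippet).
const WINDOW_NS: u64 = 24 * 60 * 60 * 1_000_000_000;

/// Hypothetical helper for illustration only.
/// Maps an idempotency key to the timestamp (ns) at which it was executed.
#[derive(Default)]
struct Deduper {
    executed: BTreeMap<String, u64>,
}

impl Deduper {
    /// Returns true if the caller should perform the operation now,
    /// false if the key was already executed inside the window.
    fn try_begin(&mut self, key: &str, now_ns: u64) -> bool {
        // Drop expired entries first so storage stays bounded.
        // This is a full scan, which is fine for a sketch but O(n) per call.
        self.executed
            .retain(|_, executed_at| now_ns.saturating_sub(*executed_at) < WINDOW_NS);

        if self.executed.contains_key(key) {
            return false; // duplicate within the window
        }
        self.executed.insert(key.to_string(), now_ns);
        true
    }

    /// Number of keys currently remembered.
    fn len(&self) -> usize {
        self.executed.len()
    }
}
```

Unlike `transferWithId`, which returns `true` for a duplicate to signal "already done", `try_begin` reports whether the operation should run, which keeps the pruning and the duplicate check in one place. Scanning the whole map on every call is a simplification. A production version might keep a second index ordered by timestamp, or prune from a timer.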