Skip to content

test: DEFI-2100: Migrate scenario-1 e2e test from dfx to PocketIC#524

Merged
mbjorkqvist merged 25 commits into
masterfrom
mathias/DEFI-2100-migrate-e2e-scenario-1-test-to-pocketic
May 19, 2026
Merged

test: DEFI-2100: Migrate scenario-1 e2e test from dfx to PocketIC#524
mbjorkqvist merged 25 commits into
masterfrom
mathias/DEFI-2100-migrate-e2e-scenario-1-test-to-pocketic

Conversation

@mbjorkqvist
Copy link
Copy Markdown
Contributor

@mbjorkqvist mbjorkqvist commented May 13, 2026

Replace the dfx-based e2e-tests/scenario-1.sh shell script with a Rust integration test driven by PocketIC.

Why. The shell test was slow (full dfx start per run), brittle (assertions done by substring-matching Candid text output), and excluded from typed checks. Moving to PocketIC gives us a faster local feedback loop and type-safe interactions with the canister via ic-btc-interface.

What changes.

  • New e2e-tests/scenario-1/tests/integration.rs reproduces every assertion from the deleted shell script: balance & UTXO queries on three addresses, the 1000-UTXO response cap, replicated-mode rejection of *_query methods, fee-percentile smoke calls, and byte-exact comparison of all six block headers.
  • Shared address constants moved from the source canister's main.rs into a new e2e-tests/scenario-1/src/lib.rs so the canister binary and the integration test share one definition.
  • The e2e-scenario-1 CI job drops dfx and dfinity/setup-dfx, installs PocketIC, builds the source canister WASM with scripts/build-canister.sh scenario-1, and runs cargo test --release -p scenario-1. WASM paths are passed via IC_BTC_CANISTER_WASM_PATH and E2E_SCENARIO_1_WASM_PATH env vars; the ic-btc-canister WASM still comes from the canister-build-reproducibility artifact.
  • scenario-1 is excluded from the main cargo-tests matrix (it needs the built WASM artifacts, which only the dedicated job has) and from its needs: list (no longer dependent on cargo-build).
  • The stable-block check reads stable_height from the canister's /metrics HTTP endpoint, matching the original script's approach rather than blindly ticking. The value is parsed as f64 so a future encoder change to "3.0" doesn't break the test, and both stable_height N and stable_height{...} N line forms are accepted in case labels are added.
  • Tests run on an application subnet via PocketIcBuilder::with_application_subnet() so behaviour reflects production configuration.

mbjorkqvist and others added 17 commits May 13, 2026 11:36
candid is already listed under [dependencies] so the entry under
[dev-dependencies] is a no-op.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Retaining the block-source canister's principal in the Setup struct
costs nothing and avoids having to refactor if a future test needs to
inspect or interact with that canister.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously a 500 from the metrics endpoint caused get_stable_height to
silently return None, making tick_until_stable_height time out with a
generic message. Assert 200 explicitly so the canister's error text is
surfaced immediately.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The metric is encoded via encode_gauge which takes an f64. The current
ic-metrics-encoder serialises 3.0 as "3", so parse::<u32>() works
today, but parsing as f64 first makes the code robust against any
future encoder that emits "3.0".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
500 ticks for main chain height and 200 for stable height were magic
numbers. Add comments explaining why each ceiling is generous enough
in practice.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The previous job invoked the deleted scenario-1.sh script. Mirror the
e2e-cdk-bitcoin-canister job: install PocketIC, download the
ic-btc-canister WASM artifact, build the scenario-1 source canister, and
run `cargo test -p scenario-1` with the matching env vars. Exclude
scenario-1 from cargo-tests so it is not run twice.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The field was stored but only used during install_canister; nothing reads
it back. Removing it eliminates the #[allow(dead_code)] escape hatch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The integration test redeclared ADDRESS_1/2/5 alongside the canister's
own const definitions, which would silently drift if the addresses ever
changed. Add a small src/lib.rs that exposes them as pub const so both
main.rs and the integration test import from the same source.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The bitcoin canister never invokes the IC Bitcoin adapter system API in
this test — it only makes inter-canister calls to its configured
blocks_source. with_bitcoin_subnet() implied an adapter dependency the
test does not have; with_application_subnet() reflects the actual
topology and matches how the canister is used here.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
If labels are ever added to the metric, the prefix match would silently
miss the line and the tick loop would time out with a confusing error
rather than report the format change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous .ok().and_then(...) chain silently swallowed both transport
and decode errors, so a renamed method or schema change would manifest as
a timeout rather than report the underlying failure. Delegating to
get_blockchain_info means call/decode failures panic immediately with a
useful message.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This is what rustfmt would produce for the import block — move
serde_bytes after scenario_1 so the order matches the rest of the
workspace.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Make it explicit that these calls match the original scenario-1.sh
intent of exercising the endpoint for profiling, so a future reader
doesn't add an unrelated assertion thinking the check was forgotten.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 13, 2026

canbench 🏋 (dir: .) 933bd87 2026-05-18 20:31:49 UTC

./canbench_results.yml is up to date
📦 canbench_results_benchmark.csv available in artifacts

~/work/bitcoin-canister/bitcoin-canister/scripts ~/work/bitcoin-canister/bitcoin-canister
~/work/bitcoin-canister/bitcoin-canister
---------------------------------------------------

Benchmark: insert_300_blocks
  total:
    instructions: 441.23 M (no change)
    heap_increase: 9 pages (no change)
    stable_memory_increase: 0 pages (no change)

  validate (scope):
    calls: 300 (no change)
    instructions: 368.90 M (no change)
    heap_increase: 1 pages (no change)
    stable_memory_increase: 0 pages (no change)

  validate_block (scope):
    calls: 300 (no change)
    instructions: 18.32 M (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

  validate_block/check_merkle_root (scope):
    calls: 300 (no change)
    instructions: 8.00 M (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

  validate_block/ensure_unique_transactions (scope):
    calls: 300 (no change)
    instructions: 8.48 M (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

  validate_header (scope):
    calls: 300 (no change)
    instructions: 349.46 M (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: get_metrics
  total:
    instructions: 3.36 M (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: insert_block_headers
  total:
    instructions: 2.84 B (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

  validate_header (scope):
    calls: 100 (no change)
    instructions: 2.81 B (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: insert_block_headers_multiple_times
  total:
    instructions: 10.30 B (no change)
    heap_increase: 7 pages (no change)
    stable_memory_increase: 0 pages (no change)

  validate_header (scope):
    calls: 1000 (no change)
    instructions: 8.99 B (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: insert_block_with_10k_transactions
  total:
    instructions: 1.40 B (no change)
    heap_increase: 74 pages (no change)
    stable_memory_increase: 0 pages (no change)

  validate (scope):
    calls: 1 (no change)
    instructions: 149.72 M (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

  validate_block (scope):
    calls: 1 (no change)
    instructions: 149.61 M (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

  validate_block/check_merkle_root (scope):
    calls: 1 (no change)
    instructions: 70.70 M (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

  validate_block/ensure_unique_transactions (scope):
    calls: 1 (no change)
    instructions: 78.91 M (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

  validate_header (scope):
    calls: 1 (no change)
    instructions: 102.87 K (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: insert_block_with_1k_transactions
  total:
    instructions: 116.14 M (no change)
    heap_increase: 7 pages (no change)
    stable_memory_increase: 0 pages (no change)

  validate (scope):
    calls: 1 (no change)
    instructions: 15.15 M (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

  validate_block (scope):
    calls: 1 (no change)
    instructions: 15.04 M (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

  validate_block/check_merkle_root (scope):
    calls: 1 (no change)
    instructions: 7.12 M (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

  validate_block/ensure_unique_transactions (scope):
    calls: 1 (no change)
    instructions: 7.92 M (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

  validate_header (scope):
    calls: 1 (no change)
    instructions: 102.87 K (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: pre_upgrade_with_many_unstable_blocks
  total:
    instructions: 3.38 B (no change)
    heap_increase: 1538 pages (no change)
    stable_memory_increase: 1024 pages (no change)

  serialize_blocktree_flatten (scope):
    calls: 1 (no change)
    instructions: 229.67 K (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

  serialize_blocktree_serialize_seq (scope):
    calls: 1 (no change)
    instructions: 36.57 M (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: get_blockchain_info_single_chain
  total:
    instructions: 345.64 K (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: get_blockchain_info_with_forks
  total:
    instructions: 374.77 K (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: get_blockchain_info_many_branches
  total:
    instructions: 827.63 K (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: bitcoin_get_balance_baseline
  total:
    instructions: 6.19 M (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: bitcoin_get_balance_stress
  total:
    instructions: 254.31 M (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: bitcoin_get_utxos_baseline
  total:
    instructions: 7.81 M (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: bitcoin_get_utxos_stress
  total:
    instructions: 4.69 B (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: bitcoin_get_current_fee_percentiles
  total:
    instructions: 44.14 K (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: bitcoin_get_block_headers_baseline
  total:
    instructions: 366.46 K (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: bitcoin_get_block_headers_stress
  total:
    instructions: 2.57 M (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Summary:
  instructions:
    status:   No significant changes 👍
    counts:   [total 17 | regressed 0 | improved 0 | new 0 | unchanged 17]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

  heap_increase:
    status:   No significant changes 👍
    counts:   [total 17 | regressed 0 | improved 0 | new 0 | unchanged 17]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

  stable_memory_increase:
    status:   No significant changes 👍
    counts:   [total 17 | regressed 0 | improved 0 | new 0 | unchanged 17]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

---------------------------------------------------
CSV results saved to canbench_results.csv

@mbjorkqvist mbjorkqvist marked this pull request as ready for review May 13, 2026 12:16
@mbjorkqvist mbjorkqvist requested a review from a team as a code owner May 13, 2026 12:16
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR migrates the scenario-1 end-to-end test from a dfx-driven shell script to a Rust integration test powered by PocketIC, improving test speed and making assertions type-safe via ic-btc-interface.

Changes:

  • Replaced e2e-tests/scenario-1.sh with a PocketIC-based Rust integration test that reproduces the prior script’s assertions (balances/UTXOs, query-vs-update rejection, fee-percentiles smoke calls, and block-header equality).
  • Moved shared regtest address constants into a new scenario-1 library target so both the canister binary and the integration test share the same definitions.
  • Updated CI to stop installing/running dfx for scenario-1, build the scenario-1 WASM, and run cargo test --release -p scenario-1, while excluding scenario-1 from the main workspace test matrix.

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
e2e-tests/scenario-1/tests/integration.rs Adds PocketIC-driven integration test implementing the former shell script’s assertions.
e2e-tests/scenario-1/src/main.rs Switches address constants to shared scenario_1 library exports.
e2e-tests/scenario-1/src/lib.rs Introduces shared address constants for reuse by bin + tests.
e2e-tests/scenario-1/Cargo.toml Adds dev-dependencies needed for PocketIC integration tests.
e2e-tests/scenario-1.sh Removes the legacy dfx-based shell e2e script.
Cargo.lock Updates lockfile for new scenario-1 dev-dependencies.
.github/workflows/workflow.yml Updates CI to run scenario-1 via PocketIC and adjusts workspace test exclusions/job dependencies.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread e2e-tests/scenario-1/tests/integration.rs Outdated
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@gregorydemay gregorydemay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @mbjorkqvist for starting this! Just a couple of minor comments but otherwise LGTM!

Comment thread e2e-tests/scenario-1/tests/integration.rs Outdated
Comment thread e2e-tests/scenario-1/tests/tests.rs
Comment thread e2e-tests/scenario-1/tests/integration.rs Outdated
Comment thread .github/workflows/workflow.yml Outdated
Comment thread e2e-tests/scenario-1/tests/integration.rs Outdated
Comment thread e2e-tests/scenario-1/tests/integration.rs Outdated
mbjorkqvist and others added 4 commits May 18, 2026 12:59
The bitcoin canister runs on a bitcoin (system) subnet in production. Use
`with_bitcoin_subnet()` so the test reflects that; keep the source canister on a
separate application subnet since it is a test fixture, not part of the canister
under test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The two unchecked calls and the "for profiling purposes" comment were inherited
from scenario-1.sh, which itself inherited them from PR #50 (Sep 2022). That PR
recorded each call's instruction count into e2e-tests/instructions_count.txt as
a hand-maintained profiling baseline.

That baseline file has since been deleted, and the endpoint is now covered with
much higher rigour by the canbench benchmark `bitcoin_get_current_fee_percentiles`
(visible in every CI run via canbench_results.yml) and by dedicated unit tests
in canister/src/api/fee_percentiles.rs (including caching behaviour). The two
unchecked e2e calls produce no signal today.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The action handles OS/arch detection, downloads the requested version, and
exports POCKET_IC_BIN so subsequent steps pick it up automatically. Pinned to
the same SHA used by dfinity/sol-rpc-canister.

Scoped to the e2e-scenario-1 job; the other manual install blocks are
pre-existing and out of scope for this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mbjorkqvist and others added 2 commits May 18, 2026 13:19
stable_height is unconditionally encoded as a gauge in the /metrics output
(see canister/src/api/metrics.rs), so every step in get_stable_height must
succeed: the http_request query, the candid decode, the UTF-8 conversion, and
locating + parsing the metric line. The previous .ok()? chain silently mapped
genuine infrastructure failures to "value not at target yet", which would have
surfaced as an opaque "timed out waiting for stable height" timeout rather than
a precise panic.

Replace .ok()? with .expect(...) at each step and change the return type from
Option<u32> to u32. tick_until_stable_height simplifies to a direct numeric
comparison.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mbjorkqvist mbjorkqvist requested a review from gregorydemay May 18, 2026 13:45
Comment thread e2e-tests/scenario-1/tests/tests.rs Outdated
The source canister stands in for the bitcoin adapter, which in production is a
node-level service co-located with the bitcoin canister rather than a separate
canister on another subnet. Putting it on an application subnet introduced
topology that does not exist in production; install both on the bitcoin subnet
instead.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mbjorkqvist mbjorkqvist merged commit a2c246e into master May 19, 2026
36 checks passed
@mbjorkqvist mbjorkqvist deleted the mathias/DEFI-2100-migrate-e2e-scenario-1-test-to-pocketic branch May 19, 2026 05:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants