test: DEFI-2100: Migrate scenario-1 e2e test from dfx to PocketIC#524
Merged
mbjorkqvist merged 25 commits intoMay 19, 2026
Merged
Conversation
candid is already listed under [dependencies] so the entry under [dev-dependencies] is a no-op. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Retaining the block-source canister's principal in the Setup struct costs nothing and avoids having to refactor if a future test needs to inspect or interact with that canister. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously a 500 from the metrics endpoint caused get_stable_height to silently return None, making tick_until_stable_height time out with a generic message. Assert 200 explicitly so the canister's error text is surfaced immediately. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The metric is encoded via encode_gauge which takes an f64. The current ic-metrics-encoder serialises 3.0 as "3", so parse::<u32>() works today, but parsing as f64 first makes the code robust against any future encoder that emits "3.0". Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
500 ticks for main chain height and 200 for stable height were magic numbers. Add comments explaining why each ceiling is generous enough in practice. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The previous job invoked the deleted scenario-1.sh script. Mirror the e2e-cdk-bitcoin-canister job: install PocketIC, download the ic-btc-canister WASM artifact, build the scenario-1 source canister, and run `cargo test -p scenario-1` with the matching env vars. Exclude scenario-1 from cargo-tests so it is not run twice. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The field was stored but only used during install_canister; nothing reads it back. Removing it eliminates the #[allow(dead_code)] escape hatch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The integration test redeclared ADDRESS_1/2/5 alongside the canister's own const definitions, which would silently drift if the addresses ever changed. Add a small src/lib.rs that exposes them as pub const so both main.rs and the integration test import from the same source. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The bitcoin canister never invokes the IC Bitcoin adapter system API in this test — it only makes inter-canister calls to its configured blocks_source. with_bitcoin_subnet() implied an adapter dependency the test does not have; with_application_subnet() reflects the actual topology and matches how the canister is used here. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
If labels are ever added to the metric, the prefix match would silently miss the line and the tick loop would time out with a confusing error rather than report the format change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous .ok().and_then(...) chain silently swallowed both transport and decode errors, so a renamed method or schema change would manifest as a timeout rather than report the underlying failure. Delegating to get_blockchain_info means call/decode failures panic immediately with a useful message. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This is what rustfmt would produce for the import block — move serde_bytes after scenario_1 so the order matches the rest of the workspace. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Make it explicit that these calls match the original scenario-1.sh intent of exercising the endpoint for profiling, so a future reader doesn't add an unrelated assertion thinking the check was forgotten. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
|
There was a problem hiding this comment.
Pull request overview
This PR migrates the scenario-1 end-to-end test from a dfx-driven shell script to a Rust integration test powered by PocketIC, improving test speed and making assertions type-safe via ic-btc-interface.
Changes:
- Replaced
e2e-tests/scenario-1.shwith a PocketIC-based Rust integration test that reproduces the prior script’s assertions (balances/UTXOs, query-vs-update rejection, fee-percentiles smoke calls, and block-header equality). - Moved shared regtest address constants into a new
scenario-1library target so both the canister binary and the integration test share the same definitions. - Updated CI to stop installing/running
dfxfor scenario-1, build the scenario-1 WASM, and runcargo test --release -p scenario-1, while excludingscenario-1from the main workspace test matrix.
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| e2e-tests/scenario-1/tests/integration.rs | Adds PocketIC-driven integration test implementing the former shell script’s assertions. |
| e2e-tests/scenario-1/src/main.rs | Switches address constants to shared scenario_1 library exports. |
| e2e-tests/scenario-1/src/lib.rs | Introduces shared address constants for reuse by bin + tests. |
| e2e-tests/scenario-1/Cargo.toml | Adds dev-dependencies needed for PocketIC integration tests. |
| e2e-tests/scenario-1.sh | Removes the legacy dfx-based shell e2e script. |
| Cargo.lock | Updates lockfile for new scenario-1 dev-dependencies. |
| .github/workflows/workflow.yml | Updates CI to run scenario-1 via PocketIC and adjusts workspace test exclusions/job dependencies. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
gregorydemay
left a comment
There was a problem hiding this comment.
Thanks a lot @mbjorkqvist for starting this! Just a couple of minor comments but otherwise LGTM!
The bitcoin canister runs on a bitcoin (system) subnet in production. Use `with_bitcoin_subnet()` so the test reflects that; keep the source canister on a separate application subnet since it is a test fixture, not part of the canister under test. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The two unchecked calls and the "for profiling purposes" comment were inherited from scenario-1.sh, which itself inherited them from PR #50 (Sep 2022). That PR recorded each call's instruction count into e2e-tests/instructions_count.txt as a hand-maintained profiling baseline. That baseline file has since been deleted, and the endpoint is now covered with much higher rigour by the canbench benchmark `bitcoin_get_current_fee_percentiles` (visible in every CI run via canbench_results.yml) and by dedicated unit tests in canister/src/api/fee_percentiles.rs (including caching behaviour). The two unchecked e2e calls produce no signal today. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The action handles OS/arch detection, downloads the requested version, and exports POCKET_IC_BIN so subsequent steps pick it up automatically. Pinned to the same SHA used by dfinity/sol-rpc-canister. Scoped to the e2e-scenario-1 job; the other manual install blocks are pre-existing and out of scope for this PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
stable_height is unconditionally encoded as a gauge in the /metrics output (see canister/src/api/metrics.rs), so every step in get_stable_height must succeed: the http_request query, the candid decode, the UTF-8 conversion, and locating + parsing the metric line. The previous .ok()? chain silently mapped genuine infrastructure failures to "value not at target yet", which would have surfaced as an opaque "timed out waiting for stable height" timeout rather than a precise panic. Replace .ok()? with .expect(...) at each step and change the return type from Option<u32> to u32. tick_until_stable_height simplifies to a direct numeric comparison. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
gregorydemay
approved these changes
May 18, 2026
The source canister stands in for the bitcoin adapter, which in production is a node-level service co-located with the bitcoin canister rather than a separate canister on another subnet. Putting it on an application subnet introduced topology that does not exist in production; install both on the bitcoin subnet instead. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Replace the
dfx-basede2e-tests/scenario-1.shshell script with a Rust integration test driven by PocketIC.Why. The shell test was slow (full
dfx startper run), brittle (assertions done by substring-matching Candid text output), and excluded from typed checks. Moving to PocketIC gives us a faster local feedback loop and type-safe interactions with the canister viaic-btc-interface.What changes.
e2e-tests/scenario-1/tests/integration.rsreproduces every assertion from the deleted shell script: balance & UTXO queries on three addresses, the 1000-UTXO response cap, replicated-mode rejection of*_querymethods, fee-percentile smoke calls, and byte-exact comparison of all six block headers.main.rsinto a newe2e-tests/scenario-1/src/lib.rsso the canister binary and the integration test share one definition.e2e-scenario-1CI job dropsdfxanddfinity/setup-dfx, installs PocketIC, builds the source canister WASM withscripts/build-canister.sh scenario-1, and runscargo test --release -p scenario-1. WASM paths are passed viaIC_BTC_CANISTER_WASM_PATHandE2E_SCENARIO_1_WASM_PATHenv vars; theic-btc-canisterWASM still comes from thecanister-build-reproducibilityartifact.scenario-1is excluded from the maincargo-testsmatrix (it needs the built WASM artifacts, which only the dedicated job has) and from itsneeds:list (no longer dependent oncargo-build).stable_heightfrom the canister's/metricsHTTP endpoint, matching the original script's approach rather than blindly ticking. The value is parsed asf64so a future encoder change to"3.0"doesn't break the test, and bothstable_height Nandstable_height{...} Nline forms are accepted in case labels are added.PocketIcBuilder::with_application_subnet()so behaviour reflects production configuration.