Separate state retention settings from consensus data by jbearer · Pull Request #4279 · EspressoSystems/espresso-network

jbearer · 2026-05-08T22:02:43Z

This allows state to be pruned more aggressively than consensus data. This may be desired because state takes up a lot of space, archival state is rarely accessed, and old state can always be re-derived from headers. Thus, pruning state aggressively may enable a node to keep more history of consensus data while using less space.

The logic for pruning is substantially reworked to separate (a) choosing which batch to prune based on retention settings and usage from (b) actually pruning a batch. This makes it much simpler to implement a complex policy such as having different retention policies for state vs consensus data.

gemini-code-assist

Code Review

This pull request introduces separate retention policies and pruning logic for Merklized state and consensus data, adding new configuration options and updating storage traits to support state-specific pruned height tracking. The SQL storage pruner has been refactored to handle these categories independently, with state pruning prioritized. Review feedback points out an inconsistency in timestamp usage during pruner initialization and suggests enhancing error messages by including the data category to improve debuggability.

jbearer · 2026-05-11T20:54:41Z

+                })?;

-        if pruned_height.is_some_and(|h| height <= h as usize) {
+        if pruned_height.is_some_and(|h| created <= h as i64) {


Unless I'm misunderstanding, this was a latent bug. We were considering the requested state snapshot to exist as long as height (the height of the most recently created snapshot) was not pruned. This would be a false positive whenever we were querying for an older snapshot that had been pruned.

Ultimately this ends up being ok, because if we load a path that is not actually present, we will end up computing a different commitment than the actual root of the tree, but it leads to an unintuitive edge case in the get_path verification logic and a confusing error message.

claude · 2026-05-11T20:59:50Z

Claude finished @jbearer's task in 3m 48s —— View job

Review of PR #4279: Separate state retention settings from consensus data

Gather context (existing comments, diff)
Review changed files in detail
Post inline comments for specific issues
Post summary review

Overall Assessment

This is a well-structured refactor that cleanly separates state pruning from consensus data pruning. The decomposition of the old monolithic prune() into PruneState / Pruner / PruneCategory / prune_batch() is a significant improvement in readability and maintainability. The priority ordering (always prune state before the corresponding consensus data) correctly respects the dependency between the two.

The new test test_separate_state_data_pruning validates the key scenario well — aggressive state pruning while preserving consensus data.

Issues Found

Bug: Archive mode doesn't clear state pruned height

In hotshot-query-service/src/data_source/storage/sql.rs:745, archive mode resets the pruned height to allow refetching of pruned data:

query("DELETE FROM pruned_height WHERE id = 1")
    .execute(conn.as_mut())
    .await?;

This only clears id = 1 (consensus PRUNED_HEIGHT_ID), but doesn't clear id = 2 (STATE_PRUNED_HEIGHT_ID). An archive node that previously had state pruned would still report that state as pruned and wouldn't reconstruct it.

This line isn't in the diff so I couldn't leave an inline comment — suggest changing to:

query("DELETE FROM pruned_height")
    .execute(conn.as_mut())
    .await?;

Fix this →

Minor: Stale comment in `save_state_pruned_height`

See inline comment — the comment says "id is set to 1" but the actual id is 2.

Suggestion: Richer tracing in `prune_batch`

See inline comment — including category and target height would improve production debugging.

Design Observations (non-blocking)

Shared batch_size: State and consensus data share a single batch_size configuration. This is fine for now, but if state tables are significantly larger (as motivated in the PR description), it might be worth having an independent batch size for state in the future.
Snapshot::Commit path and pruned state: In queries/state.rs, the Snapshot::Commit variant looks up a height from the header table (which may not be pruned) and then checks against state_pruned_height. This is correct — even if the header exists, the state nodes may have been pruned independently. The bug fix changing from height to created in the pruned check is also correct and well-explained in the self-review comment.
Defaults match: The new state_minimum_retention and state_target_retention default to the same values as their consensus counterparts (1 day and 7 days), making this a backwards-compatible, opt-in feature.

This allows state to be pruned more aggressively than consensus data. This may be desired because state takes up a lot of space, archival state is rarely accessed, and old state can always be re-derived from headers. Thus, pruning state aggressively may enable a node to keep more history of consensus data while using less space. The logic for pruning is substantially reworked to separate (a) choosing which batch to prune based on retention settings and usage from (b) actually pruning a batch. This makes it much simpler to implement a complex policy such as having different retention policies for state vs consensus data.

gemini-code-assist Bot reviewed May 8, 2026

View reviewed changes

Comment thread hotshot-query-service/src/data_source/storage/sql.rs

Comment thread hotshot-query-service/src/data_source/storage/sql.rs Outdated

Comment thread hotshot-query-service/src/data_source/storage/sql.rs Outdated

jbearer force-pushed the jb/state-pruning branch from d9f3c11 to fad3ee7 Compare May 11, 2026 16:41

jbearer commented May 11, 2026

View reviewed changes

jbearer marked this pull request as ready for review May 11, 2026 20:59

jbearer requested review from bfish713, imabdulbasit, lukaszrzasik, ss-es and sveitser as code owners May 11, 2026 20:59

claude Bot reviewed May 11, 2026

View reviewed changes

Comment thread hotshot-query-service/src/data_source/storage/sql/transaction.rs Outdated

claude Bot reviewed May 11, 2026

View reviewed changes

Comment thread hotshot-query-service/src/data_source/storage/sql.rs

jbearer added 5 commits May 12, 2026 10:30

Better error messages

1c7f934

Extra tracing

311fee3

Add test for state-only pruning settings

faf1b14

Fix handling of state pruned height

7d0b1f8

jbearer force-pushed the jb/state-pruning branch from 4ec58d1 to 7d0b1f8 Compare May 12, 2026 17:30

imabdulbasit approved these changes May 14, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Separate state retention settings from consensus data#4279

Separate state retention settings from consensus data#4279
jbearer wants to merge 5 commits into
mainfrom
jb/state-pruning

jbearer commented May 8, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jbearer May 11, 2026

Uh oh!

claude Bot commented May 11, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

jbearer commented May 8, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jbearer May 11, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot commented May 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review of PR #4279: Separate state retention settings from consensus data

Overall Assessment

Issues Found

Bug: Archive mode doesn't clear state pruned height

Minor: Stale comment in save_state_pruned_height

Suggestion: Richer tracing in prune_batch

Design Observations (non-blocking)

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

claude Bot commented May 11, 2026 •

edited

Loading

Minor: Stale comment in `save_state_pruned_height`

Suggestion: Richer tracing in `prune_batch`