forked from firedancer-io/radiance
feat: remove runtime manifest dependency + streaming rewards #197
Open: 7layermagik wants to merge 12 commits into dev from remove-manifest-runtime-dependency
Replay now reads all seed data from state file instead of manifest at runtime. This eliminates the need for the manifest file after AccountsDB is built.

Changes:
- Add manifest_* fields to MithrilState schema (v2)
- Create PopulateManifestSeed() to copy manifest data at build time
- Update configureInitialBlock/FromResume to use state file
- Update newReplayCtx to prefer state file over manifest
- Update buildInitialEpochStakesCache to use ManifestEpochStakes
- Clear ManifestEpochStakes after first replayed slot past snapshot
- Backwards compat: fall back to manifest for old state files

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Require state schema version 2 (no v0/v1 migration, error instead)
- Remove manifest fallback in configureInitialBlock (fatal if missing)
- Remove manifest fallback in buildInitialEpochStakesCache (fatal if missing)
- Remove manifest fallback in newReplayCtx (fatal if missing)
- Remove snapshotManifest parameter from ReplayBlocks signature
- Remove snapshotManifest parameter from configureInitialBlock signature
- Add ManifestTransactionCount and ManifestEpochAuthorizedVoters fields
- Add proper error handling for corrupted state file decoding
- Clean up unused imports (epochstakes, snapshot from replay package)

This ensures replay NEVER reads from manifest after AccountsDB build. Old state files will error with "delete AccountsDB and rebuild from snapshot".

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Make ManifestEvictedBlockhash required in strict v2 mode (affects transaction age validation for first block)
- Switch manifest_seed.go from mr-tron/base58 to internal pkg/base58 for consistency with block.go decode path

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
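The strict v2 behaviour described in the two commits above (reject older schema versions, treat a missing required field as fatal) could look roughly like the following. This is a minimal sketch, not the PR's code: `stateFile`, `validateState`, and `errRebuild` are hypothetical names, and the real MithrilState type has many more fields.

```go
package main

import (
	"errors"
	"fmt"
)

// requiredSchemaVersion reflects the "require state schema version 2" rule.
// The constant name is an assumption made for this sketch.
const requiredSchemaVersion = 2

// stateFile is a hypothetical stand-in for MithrilState with only the
// fields this check needs.
type stateFile struct {
	SchemaVersion            int
	ManifestEvictedBlockhash string
}

// errRebuild carries the remediation text quoted in the commit message.
var errRebuild = errors.New("delete AccountsDB and rebuild from snapshot")

// validateState fails instead of migrating or falling back to the manifest.
func validateState(s *stateFile) error {
	if s.SchemaVersion != requiredSchemaVersion {
		return fmt.Errorf("state schema version %d, want %d: %w",
			s.SchemaVersion, requiredSchemaVersion, errRebuild)
	}
	if s.ManifestEvictedBlockhash == "" {
		return fmt.Errorf("manifest evicted blockhash missing: %w", errRebuild)
	}
	return nil
}

func main() {
	// An old (v1) state file is rejected with the rebuild hint.
	fmt.Println(validateState(&stateFile{SchemaVersion: 1}))
}
```

Wrapping a single sentinel error keeps the operator-facing remediation identical for every failure mode while still reporting which check failed.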
…e timestamps

- EpochAuthorizedVoters: change from map[string]string to map[string][]string to support multiple authorized voters per vote account (matches original manifest behavior where PutEntry appends to a slice)
- VoteTimestamps: populate from ALL vote accounts in cache, not just those with non-zero stake (matches original manifest behavior where all vote accounts from Bank.Stakes.VoteAccounts had timestamps populated)

Note: This changes the state file schema for manifest_epoch_authorized_voters. Existing state files will require rebuild from snapshot.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
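To illustrate why the value type matters, here is a small sketch of append-style insertion into a map[string][]string. The `authorizedVoters` type and `put` method are hypothetical; only the map shapes come from the commit message.

```go
package main

import "fmt"

// authorizedVoters maps a vote account pubkey to every authorized voter
// recorded for it. With map[string]string a second voter for the same
// vote account would overwrite the first.
type authorizedVoters map[string][]string

// put appends rather than overwrites, mirroring the append-to-slice
// behaviour the commit attributes to PutEntry.
func (m authorizedVoters) put(voteAcct, voter string) {
	m[voteAcct] = append(m[voteAcct], voter)
}

func main() {
	m := authorizedVoters{}
	m.put("voteA", "voter1")
	m.put("voteA", "voter2") // kept alongside voter1
	m.put("voteB", "voter3")
	fmt.Println(len(m["voteA"]), len(m["voteB"]))
}
```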
Stop populating global.StakeCache at startup. Instead, build an aggregate map of vote pubkey → total stake directly from AccountsDB scan.

Changes:
- Add voteStakeTotals map + mutex + helpers in global_ctx.go
- Modify setupInitialVoteAcctsAndStakeAccts to build aggregate only:
  - Remove PutStakeCacheItemBulk calls
  - Each batch worker builds local map, merges into shared under mutex
  - Call SetVoteStakeTotals() to store aggregate for later use
- Full stake cache no longer populated at startup (memory savings)

Note: Epoch boundary functions still use StakeCache and will fail until Step 2 adds on-demand cache rebuild. This is expected.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
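The "local map per batch, merge under mutex" step above can be sketched as follows. The `stakeAccount` type and `aggregateVoteStakes` function are assumptions for illustration; the PR's actual worker pool and AccountsDB scan are not shown.

```go
package main

import (
	"fmt"
	"sync"
)

// stakeAccount is a hypothetical, minimal view of a decoded stake account.
type stakeAccount struct {
	VotePubkey string
	Stake      uint64
}

// aggregateVoteStakes sums stake per vote pubkey. Each worker fills a
// private map and takes the lock once per batch, not once per account.
func aggregateVoteStakes(batches [][]stakeAccount) map[string]uint64 {
	var (
		mu     sync.Mutex
		wg     sync.WaitGroup
		totals = make(map[string]uint64)
	)
	for _, batch := range batches {
		wg.Add(1)
		go func(batch []stakeAccount) {
			defer wg.Done()
			local := make(map[string]uint64, len(batch))
			for _, acct := range batch {
				local[acct.VotePubkey] += acct.Stake
			}
			mu.Lock()
			for vote, stake := range local {
				totals[vote] += stake
			}
			mu.Unlock()
		}(batch)
	}
	wg.Wait()
	return totals
}

func main() {
	totals := aggregateVoteStakes([][]stakeAccount{
		{{"voteA", 10}, {"voteB", 5}},
		{{"voteA", 7}},
	})
	fmt.Println(totals["voteA"], totals["voteB"])
}
```

Only the aggregate survives the scan, so memory scales with the number of vote accounts rather than the number of stake accounts.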
Replace full stake cache with streaming from AccountsDB for epoch boundary processing and rewards calculation.

Key changes:
- Add StreamStakeAccounts() for parallel stake account streaming
- Add spool file infrastructure (SpoolWriter/SpoolReader) for binary serialization of stake rewards during calculation
- Modify updateStakeHistorySysvar() to stream from AccountsDB
- Modify updateEpochStakesAndRefreshVoteCache() to stream in single pass
- Add CalculateRewardsStreaming() for two-pass streaming rewards:
  - Pass 1: Calculate total points
  - Pass 2: Calculate rewards and write to spool file
- Add DistributeStakingRewardsFromSpool() for partition distribution
- Update recordStakeDelegation() to only track new pubkeys for index

Architecture:
- No full stake cache held in memory
- All stake data streamed directly from AccountsDB
- Rewards written to binary spool file during calculation
- Distribution reads from spool file partition-by-partition
- RAM stays flat across epoch boundaries

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
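The two-pass shape described above can be sketched without any per-account cache: stream once to total the points, then stream again to turn each account's points into a reward. Everything here is a simplified assumption: `stakeEntry`, `streamFn`, and `calculateRewardsTwoPass` are hypothetical names, points are uint64 and assumed not to overflow when summed, and the proportional formula stands in for the real reward calculation.

```go
package main

import (
	"fmt"
	"math/bits"
)

// stakeEntry is a hypothetical per-account result of the points calculation.
type stakeEntry struct {
	Pubkey string
	Points uint64
}

// streamFn visits every stake account once; calling it twice models two
// independent scans of AccountsDB.
type streamFn func(visit func(stakeEntry))

// calculateRewardsTwoPass never holds more than one entry in memory.
func calculateRewardsTwoPass(stream streamFn, totalRewards uint64, emit func(pubkey string, lamports uint64)) {
	// Pass 1: total points only.
	var totalPoints uint64
	stream(func(e stakeEntry) { totalPoints += e.Points })
	if totalPoints == 0 {
		return
	}
	// Pass 2: recompute each share and hand it to the sink (a spool file
	// in the PR, a callback here).
	stream(func(e stakeEntry) {
		// 128-bit intermediate avoids overflow in totalRewards * points.
		// Div64 is safe because points <= totalPoints implies hi < totalPoints.
		hi, lo := bits.Mul64(totalRewards, e.Points)
		reward, _ := bits.Div64(hi, lo, totalPoints)
		if reward > 0 {
			emit(e.Pubkey, reward)
		}
	})
}

func main() {
	entries := []stakeEntry{{"a", 1}, {"b", 3}}
	stream := func(visit func(stakeEntry)) {
		for _, e := range entries {
			visit(e)
		}
	}
	calculateRewardsTwoPass(stream, 100, func(pk string, lamports uint64) {
		fmt.Println(pk, lamports)
	})
}
```

The cost of this design is that the points work runs twice; the benefit is that memory use does not grow with the number of stake accounts.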
Addresses issues identified in ChatGPT review of ce71c54:

1. Per-partition spool files (O(N) → O(1) reads per partition)
   - Write to reward_spool_<slot>_p<partition>.bin during calculation
   - Sequential read per partition during distribution (no indexing)
   - Eliminates O(N × partitions) file scanning
2. Channel-based single writer for error capture
   - All spool writes routed through one goroutine
   - Write errors captured in atomic.Value and propagated
   - Cleanup on failure
3. Silent account error tracking
   - Track failed account reads/unmarshals with atomic counters
   - Log warning with count and first error when failures occur
4. stakePointsCache elimination (already in ce71c54)
   - Two-pass streaming: Pass 1 for total points, Pass 2 recomputes + writes
   - Trades ~2x CPU for 0 extra RAM (~140MB saved)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
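Items 1 and 2 above (one spool file per partition, all writes funnelled through a single goroutine) can be sketched as below. This is an illustrative assumption, not the PR's spool.go: the 40-byte record layout, `rewardRecord`, `writeSpool`, and `demo` are hypothetical, and the sketch returns the first error from the writer goroutine over a channel rather than storing it in an atomic value.

```go
package main

import (
	"bufio"
	"encoding/binary"
	"fmt"
	"os"
	"path/filepath"
)

// rewardRecord is a hypothetical spool entry: 32-byte pubkey + lamports.
type rewardRecord struct {
	Partition int
	Pubkey    [32]byte
	Lamports  uint64
}

// spoolPath follows the reward_spool_<slot>_p<partition>.bin naming.
func spoolPath(dir string, slot uint64, partition int) string {
	return filepath.Join(dir, fmt.Sprintf("reward_spool_%d_p%d.bin", slot, partition))
}

// writeSpool is the only code that touches the files. After a failure it
// keeps draining the channel so producers never block, then cleans up.
func writeSpool(dir string, slot uint64, partitions int, records <-chan rewardRecord) error {
	var firstErr error
	files := make([]*os.File, 0, partitions)
	writers := make([]*bufio.Writer, 0, partitions)
	for p := 0; p < partitions; p++ {
		f, err := os.Create(spoolPath(dir, slot, p))
		if err != nil {
			firstErr = err
			break
		}
		files = append(files, f)
		writers = append(writers, bufio.NewWriterSize(f, 1<<20))
	}
	var buf [40]byte
	for rec := range records {
		if firstErr != nil {
			continue // discard, but keep the producers unblocked
		}
		if rec.Partition < 0 || rec.Partition >= len(writers) {
			firstErr = fmt.Errorf("partition %d out of range", rec.Partition)
			continue
		}
		copy(buf[:32], rec.Pubkey[:])
		binary.LittleEndian.PutUint64(buf[32:], rec.Lamports)
		if _, err := writers[rec.Partition].Write(buf[:]); err != nil {
			firstErr = err
		}
	}
	for i, f := range files {
		if err := writers[i].Flush(); err != nil && firstErr == nil {
			firstErr = err
		}
		if err := f.Close(); err != nil && firstErr == nil {
			firstErr = err
		}
	}
	if firstErr != nil {
		for p := range files {
			os.Remove(spoolPath(dir, slot, p)) // cleanup on failure
		}
	}
	return firstErr
}

// demo spools three records into two partitions and returns the file sizes.
func demo() ([]int64, error) {
	dir, err := os.MkdirTemp("", "spool")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	records := make(chan rewardRecord)
	done := make(chan error, 1)
	go func() { done <- writeSpool(dir, 7, 2, records) }()
	records <- rewardRecord{Partition: 0, Lamports: 5}
	records <- rewardRecord{Partition: 1, Lamports: 6}
	records <- rewardRecord{Partition: 0, Lamports: 7}
	close(records)
	if err := <-done; err != nil {
		return nil, err
	}
	sizes := make([]int64, 2)
	for p := range sizes {
		info, err := os.Stat(spoolPath(dir, 7, p))
		if err != nil {
			return nil, err
		}
		sizes[p] = info.Size()
	}
	return sizes, nil
}

func main() {
	fmt.Println(demo())
}
```

Because each partition has its own file, distribution for a partition reads only that file sequentially, with no need to scan or index records that belong to other partitions.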
- Clean PartitionedRewardDistributionInfo: remove unused Credits, RewardPartitions, StakingRewards, WorkerPool fields
- Remove dead code: rewardDistributionTask, rewardDistributionWorker, InitWorkerPool, ReleaseWorkerPool, DistributeStakingRewardsForPartition
- DistributeStakingRewardsFromSpool now streams with reader.Next() loop instead of ReadAll() for flat RAM during distribution
- Use dynamic slices with append (no pre-allocated arrays with nils)
- Implement strict error policy: any account read/unmarshal/marshal failure returns error immediately for consensus correctness
- Add buffered I/O (1MB) to PartitionReader and partitionWriter
- Remove all legacy single-file spool types from spool.go

Net reduction: ~294 lines removed

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
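A Next()-style loop over a buffered reader, as opposed to a ReadAll() that materialises every record, might look like this. The record layout, `partitionReader`, `encodeRecord`, and `sumLamports` are hypothetical stand-ins; the sketch reads from an in-memory byte slice so it runs without files.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// recordSize assumes a fixed layout: 32-byte pubkey + 8-byte lamports.
const recordSize = 40

type spoolRecord struct {
	Pubkey   [32]byte
	Lamports uint64
}

// partitionReader decodes one record at a time through a 1MB buffer.
type partitionReader struct {
	r   *bufio.Reader
	buf [recordSize]byte
}

func newPartitionReader(src io.Reader) *partitionReader {
	return &partitionReader{r: bufio.NewReaderSize(src, 1<<20)}
}

// Next returns io.EOF at a clean end of file and io.ErrUnexpectedEOF if
// the file stops in the middle of a record.
func (p *partitionReader) Next() (spoolRecord, error) {
	var rec spoolRecord
	if _, err := io.ReadFull(p.r, p.buf[:]); err != nil {
		return rec, err
	}
	copy(rec.Pubkey[:], p.buf[:32])
	rec.Lamports = binary.LittleEndian.Uint64(p.buf[32:])
	return rec, nil
}

// encodeRecord builds one record for the demo; tag fills the first pubkey byte.
func encodeRecord(tag byte, lamports uint64) []byte {
	out := make([]byte, recordSize)
	out[0] = tag
	binary.LittleEndian.PutUint64(out[32:], lamports)
	return out
}

// sumLamports streams a partition and applies the strict policy: any
// error other than a clean EOF aborts immediately.
func sumLamports(data []byte) (uint64, int, error) {
	rd := newPartitionReader(bytes.NewReader(data))
	var total uint64
	n := 0
	for {
		rec, err := rd.Next()
		if err == io.EOF {
			return total, n, nil
		}
		if err != nil {
			return total, n, fmt.Errorf("spool record %d: %w", n, err)
		}
		total += rec.Lamports
		n++
	}
}

func main() {
	data := append(encodeRecord(1, 10), encodeRecord(2, 32)...)
	fmt.Println(sumLamports(data))
}
```

Only one record is live at any time, so memory during distribution is bounded by the buffer size rather than by the partition size.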
- Replace atomic.Value with atomic.Pointer[error] to avoid potential issues with uninitialized atomic.Value (CompareAndSwap on zero value)
- Check workerPool.Invoke() return value and fail fast if pool rejects task (balance wg.Done since worker won't run)
- Proper pointer dereference (*werr, *ferr) for error formatting

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Delete pkg/rewards/partitions.go (old in-memory partition tracking)
- Delete pkg/rewards/points.go (CalculatedStakePointsAccumulator)
- Remove CalculateStakeRewardsAndPartitions, CalculateStakePoints, delegationAndPubkey, idxAndReward from rewards.go
- Remove unused rpc import
- Fix workerPool.Invoke error handling in StreamStakeAccounts

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
StreamStakeAccounts now waits for all already-queued batches to complete before returning an error, preventing workers from running against shared state after the caller has moved on.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
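The error-handling pattern described in the recent commits (first-error capture in an atomic.Pointer[error], balancing the WaitGroup when submission fails, and waiting for queued batches before returning) can be sketched as below. The `submit` callback is a hypothetical stand-in for workerPool.Invoke(), and `runBatches` and `errRejected` are names invented for this example. It requires Go 1.19 or later for atomic.Pointer.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

// errRejected models a pool that refuses a task (hypothetical).
var errRejected = errors.New("pool rejected task")

// runBatches submits each batch, remembers only the first error, and never
// returns while a queued task may still be running.
func runBatches(batches []int, submit func(task func()) error, process func(batch int) error) error {
	var (
		wg       sync.WaitGroup
		firstErr atomic.Pointer[error]
	)
	// CompareAndSwap from nil keeps the first error and ignores later ones.
	record := func(err error) { firstErr.CompareAndSwap(nil, &err) }

	for _, b := range batches {
		b := b
		wg.Add(1)
		err := submit(func() {
			defer wg.Done()
			if err := process(b); err != nil {
				record(err)
			}
		})
		if err != nil {
			wg.Done() // the task will never run, so balance the Add
			record(err)
			break // fail fast: stop queueing new work
		}
	}

	// Wait for everything already queued before reporting the error.
	wg.Wait()
	if p := firstErr.Load(); p != nil {
		return *p
	}
	return nil
}

func main() {
	goSubmit := func(task func()) error { go task(); return nil }
	err := runBatches([]int{1, 2, 3}, goSubmit, func(b int) error {
		if b == 2 {
			return fmt.Errorf("batch %d failed", b)
		}
		return nil
	})
	fmt.Println(err)
}
```

Waiting before returning matters because the workers share state with the caller; returning early would let them keep mutating it after the caller has moved on.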
The function and its call referenced the removed snapshot.SnapshotManifest parameter, causing build failures.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary
This PR eliminates the runtime dependency on the snapshot manifest file for replay and replaces the memory-intensive full stake cache with streaming stake account processing from AccountsDB.
Key Architectural Changes
1. Remove Manifest Runtime Dependency
2. Streaming Rewards (Zero Memory Spike)
3. Stake Pubkey Index
- … (stake_pubkeys.idx) tracks all known stake accounts
Changes by Component
State File (pkg/state/)
- ManifestEpochStakes, ManifestEpochAuthorizedVoters, ManifestTransactionCount, ManifestEpochAcctsHash fields
Global Context (pkg/global/)
- StreamStakeAccounts() for parallel batched streaming from AccountsDB
- TrackNewStakePubkey() for incremental index updates
- voteStakeTotals aggregate map (replaces full stake cache at startup)
Epoch Processing (pkg/replay/epoch.go)
- updateStakeHistorySysvar() → stream from AccountsDB
- updateEpochStakesAndRefreshVoteCache() → single streaming pass builds both vote totals and effective stakes
Rewards (pkg/rewards/)
- spool.go with per-partition binary spool files:
  - PartitionedSpoolWriters for buffered writes (1MB buffer)
  - PartitionReader for sequential reads per partition
- CalculateRewardsStreaming() for two-pass streaming calculation
- DistributeStakingRewardsFromSpool() for partition-based distribution
Snapshot (pkg/snapshot/)
- manifest_seed.go for extracting epoch stakes from manifest at parse time
Performance Characteristics
File Changes
🤖 Generated with Claude Code