feat: Add production-ready compute unit benchmarking framework with statistical analysis#5

Merged
levicook merged 3 commits into main from feat/cu-bench-framework on Jun 11, 2025

Conversation

@levicook (Owner) commented Jun 11, 2025

Dual Benchmarking Paradigms

🔬 Instruction Benchmarking - Pure CU measurement

// Measures exactly what you ask for - no hidden overhead
let result = benchmark_instruction(sol_transfer_bench, 100);
// Result: 150 CU (0% variance) - perfectly consistent

🔄 Transaction Benchmarking - Complete workflow analysis

// Real-world multi-program scenarios
let result = benchmark_transaction(token_setup_bench, 100);
// Result: 28,322-38,822 CU range - realistic variance

Statistical Analysis Engine

Percentile-Based Estimates (inspired by Helius Priority Fee API):

{
  "cu_estimate": {
    "min": 28322,           // 0th percentile - absolute minimum
    "conservative": 30145,  // 25th percentile - safe for most cases
    "balanced": 32891,      // 50th percentile - good default  
    "safe": 35123,         // 75th percentile - high reliability
    "very_high": 37456,    // 95th percentile - very reliable
    "unsafe_max": 38822,   // 100th percentile - maximum observed
    "sample_size": 100
  }
}
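The tiers above can be derived with simple nearest-rank percentile selection over the collected samples. The following std-only sketch is illustrative; the function names and the exact rounding/interpolation method are assumptions, not the crate's actual implementation:

```rust
// Illustrative percentile-based CU estimation (nearest-rank selection).
// Sorting first handles unsorted input; duplicates fall out naturally.
fn percentile(sorted: &[u64], pct: f64) -> u64 {
    assert!(!sorted.is_empty());
    // Nearest-rank index; pct = 100.0 maps to the last element.
    let idx = ((pct / 100.0) * (sorted.len() - 1) as f64).round() as usize;
    sorted[idx]
}

/// Returns (min, conservative, balanced, safe, very_high, unsafe_max),
/// mirroring the JSON tiers above.
fn cu_estimate(samples: &mut Vec<u64>) -> (u64, u64, u64, u64, u64, u64) {
    samples.sort_unstable();
    (
        percentile(samples, 0.0),
        percentile(samples, 25.0),
        percentile(samples, 50.0),
        percentile(samples, 75.0),
        percentile(samples, 95.0),
        percentile(samples, 100.0),
    )
}
```

Nearest-rank is the simplest scheme that handles the boundary conditions called out below (duplicates, unsorted input, single-sample runs); linear interpolation between ranks is a reasonable alternative.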

🔧 Major Technical Achievements

1. Context Discovery System

  • Two-phase measurement: Simulation for context + execution for statistics
  • Rich execution context: SVM state, program details, CPI analysis
  • Address book system: Human-readable program names vs raw pubkeys

2. Statistical Rigor

  • Fixed percentile calculation bugs that were showing incorrect variance
  • Comprehensive unit tests (7 test cases covering edge cases)
  • Proper handling of duplicates, unsorted input, boundary conditions

3. Clean Architecture

  • Modular design: Separate concerns across focused modules
  • Type-safe domain modeling: StatType enum for instruction vs transaction distinction
  • Professional tooling: env_logger integration, clean JSON serialization

4. Framework Design Excellence

  • No hidden overhead: Removed automatic ComputeBudgetInstruction for transparency
  • SVM state accumulation: Realistic measurements vs isolated tests
  • User control: Benchmark authors control SVM configuration completely

📊 Working Examples & Living Documentation

Benchmarks as Primary Documentation

Comprehensive Documentation

  • 📖 Complete Guide: BENCHMARKING.md - 274 lines of practical documentation
  • 🎯 Enhanced README: Repositions project as testing + benchmarking platform
  • 💡 Learning Path: README → BENCHMARKING.md → working benchmark files

🚀 Key Design Decisions & Evolution

Problems Solved During Development

  1. Multiple SVM Issue: Fixed benchmark runner creating different SVM instances
  2. Account Collision Errors: Generate fresh keypairs to avoid conflicts
  3. Insufficient Funds: Increased funding for long measurement runs
  4. Framework Overhead: Removed hidden ComputeBudgetInstruction for transparency
  5. Percentile Calculation Bug: Fixed incorrect indexing showing false consistency
  6. Domain Modeling: Created proper instruction vs transaction distinction

Architecture Choices

  • Living Documentation: Benchmark files serve as guaranteed-working examples
  • Statistical Approach: Multiple samples → confidence intervals vs single measurements
  • User Control: Framework measures exactly what users ask for, nothing more
  • Professional UX: Quiet by default, rich logging via RUST_LOG=info

📈 Impact & Results

Ecosystem Positioning

Elevates litesvm-testing from "another testing framework" to a unique dual-purpose toolkit:

  • Testing: Comprehensive error assertions and log verification
  • Benchmarking: Systematic CU analysis with statistical confidence

Concrete Value Delivered

| Capability | Before | After |
| --- | --- | --- |
| CU Measurement | Manual, ad-hoc | Systematic, statistical |
| Fee Estimation | Guesswork | Data-driven with confidence intervals |
| Instruction Analysis | None | Pure measurement without overhead |
| Transaction Analysis | None | Multi-program workflow insights |
| Reproducibility | Inconsistent | Professional methodology |

Technical Metrics

  • +1,665 lines, -506 lines across 15 files
  • 324 lines of comprehensive unit tests
  • 274 lines of documentation
  • 3 working benchmarks demonstrating different paradigms

🔄 Migration & Integration

Existing Users: Zero breaking changes - all existing testing functionality preserved

New Capabilities: Opt-in via --features cu_bench for benchmarking functionality
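A minimal Cargo.toml sketch of the opt-in (the version number is illustrative; the `cu_bench` feature name comes from this PR):

```toml
[dev-dependencies]
# Benchmarking is gated behind the `cu_bench` feature; testing-only
# users omit the feature and see no change. Version is hypothetical.
litesvm-testing = { version = "0.1", features = ["cu_bench"] }
```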

Production Integration:

// Load benchmark results for fee estimation.
// ComputeBudgetInstruction comes from solana-sdk; load_benchmark_result
// is the framework's helper for reading saved benchmark JSON.
use solana_sdk::compute_budget::ComputeBudgetInstruction;

let cu_estimate = load_benchmark_result("sol_transfer")?.cu_estimate.conservative;
let compute_budget_ix = ComputeBudgetInstruction::set_compute_unit_limit(cu_estimate);

This PR establishes litesvm-testing as the definitive toolkit for Solana program development - combining comprehensive testing utilities with production-ready performance analysis capabilities not available elsewhere in the ecosystem.

Ready for review! 🎯

- Add InstructionBenchmark trait for clean separation of concerns
  - Benchmark owns: SVM setup, keypairs, signing
  - Framework owns: unsigned tx building, CU measurement, statistics
- Implement benchmark_instruction() runner with SVM state accumulation
- Convert SOL and SPL token transfer benchmarks to use new framework
- Add solana-message dependency for unsigned transaction creation
- Simplify benchmark output to single summary line + JSON data
- Eliminate 150+ lines of boilerplate from benchmark implementations
- Maintain identical CU measurements (300 CU SOL, 4794 CU SPL)

The framework enables measuring any Solana instruction's compute unit
usage with minimal code while providing structured estimates similar
to Helius Priority Fee API.
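The ownership split described above (benchmark owns setup and signing, framework owns repetition and statistics) might look roughly like this std-only sketch. Trait and function names are assumptions for illustration; the real trait works with Solana instruction and SVM types rather than a bare `u64`:

```rust
// Hypothetical shape of the ownership split; not the crate's actual API.
trait InstructionBenchmark {
    /// Benchmark-owned: prepare SVM state, keypairs, funding.
    fn setup(&mut self);
    /// Benchmark-owned: execute one signed instruction and report
    /// the compute units it consumed.
    fn execute_once(&mut self) -> u64;
}

/// Framework-owned runner: reuses one benchmark instance so SVM state
/// accumulates across iterations instead of resetting between samples.
fn benchmark_instruction<B: InstructionBenchmark>(bench: &mut B, iterations: usize) -> Vec<u64> {
    bench.setup();
    (0..iterations).map(|_| bench.execute_once()).collect()
}
```

The collected samples then feed the percentile-based estimator to produce the tiered CU estimates.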
@levicook self-assigned this Jun 11, 2025
levicook added 2 commits June 11, 2025 12:56
Add comprehensive CU benchmarking framework with dual instruction/transaction paradigms:

Framework Features:
- InstructionBenchmark: Pure instruction CU measurement (no framework overhead)
- TransactionBenchmark: Complete workflow measurement with multi-program context
- Rich execution context discovery through simulation
- Percentile-based CU estimates (min/conservative/balanced/safe/very_high/unsafe_max)
- Professional logging with env_logger integration
- Clean JSON output with proper domain modeling

Key Design Decisions:
- Remove automatic ComputeBudgetInstruction from instruction benchmarks for transparency
- Two-phase measurement: simulation for context + execution for statistics
- SVM state accumulation across measurements for realism
- StatType enum for clean instruction vs transaction distinction
- Comprehensive unit tests for percentile calculations

Benchmarks:
- SOL transfer: 150 CU (pure instruction)
- SPL token transfer: instruction-level benchmark
- Token setup workflow: 28,322-38,822 CU transaction benchmark

This provides systematic, reproducible CU analysis for both research and production planning.
Transform project positioning from "testing framework" to "testing and benchmarking framework" with comprehensive documentation:

Documentation Additions:
- Add BENCHMARKING.md: Complete guide with living examples and best practices
- Enhance README: Prominently feature CU benchmarking alongside testing
- Create clear learning path: README → BENCHMARKING.md → benchmark files

Key Documentation Features:
- Dual paradigm explanation (instruction vs transaction benchmarking)
- Statistical output interpretation (percentile-based estimates)
- Production integration patterns for fee estimation
- Troubleshooting guide for common benchmark issues
- Living documentation that references actual working benchmark files

Project Positioning:
- README hero section now highlights both testing AND benchmarking capabilities
- CU benchmarking quick start with concrete examples (SOL transfer: 150 CU, Token setup: 28K-38K CU)
- Updated roadmap showing completed benchmarking framework
- Enhanced examples section showcasing benchmark files as primary documentation

This establishes systematic CU analysis as a unique differentiator alongside the existing comprehensive testing utilities.
@levicook changed the title from "feat: add compute unit benchmarking framework with trait-based design" to "feat: Add comprehensive compute unit benchmarking framework with statistical analysis" Jun 11, 2025
@levicook changed the title from "feat: Add comprehensive compute unit benchmarking framework with statistical analysis" to "feat: Add production-ready compute unit benchmarking framework with statistical analysis" Jun 11, 2025
@levicook marked this pull request as ready for review June 11, 2025 19:13
@levicook merged commit d58a041 into main Jun 11, 2025
2 checks passed