Skip to content

Add unified JVM-coordinated native memory pool#127

Merged
schenksj merged 5 commits intomainfrom
feature/unified-memory-pool
Mar 12, 2026
Merged

Add unified JVM-coordinated native memory pool#127
schenksj merged 5 commits intomainfrom
feature/unified-memory-pool

Conversation

@schenksj
Copy link
Collaborator

Summary

  • Implements a DataFusion Comet-style unified memory pool where Rust native allocations are coordinated through a JVM NativeMemoryAccountant, preventing OOM kills in containerized environments (issue Unified memory management: JVM-coordinated memory pool for native allocations #105)
  • Adds fail-fast behavior at all integration points: L1 cache, merge, index writer, parquet transcode, search arena, and L2 disk cache write queue
  • Adds configurable max budget cap for L2 disk cache write queue growth (default 8x base, settable from Java via TieredCacheConfig.withMaxWriteQueueBudget())

Key Components

  • Rust: MemoryPool trait, JvmMemoryPool (high/low watermark batching, 64MB chunks), MemoryReservation (RAII guard), DiskCacheMemoryBudget (staircase-up/cliff-down with max cap), LimitedMemoryPool (test helper)
  • Java: NativeMemoryManager, NativeMemoryAccountant interface, NativeMemoryStats, TieredCacheConfig.withMaxWriteQueueBudget()
  • JNI bridge: acquireMemory/releaseMemory callbacks from Rust to Java accountant

Test plan

  • 14 Rust unit tests for DiskCacheMemoryBudget (cap, denial, drain+regrow, zero base)
  • 10 Rust unit tests for UnlimitedMemoryPool (categories, peak, thread safety)
  • 8 Rust unit tests for MemoryReservation (lifecycle, denial, grow/shrink, leak prevention)
  • 4 Rust unit tests for LimitedMemoryPool (capacity, denial, release)
  • Java NativeMemoryManagerTest (set accountant, stats, categories, watermark batching)
  • Java MemoryDenialTest (10MB limit, writer denial, pool stats after denial)
  • cargo check compiles cleanly
  • Existing tests unaffected (default UnlimitedMemoryPool is backward-compatible)

🤖 Generated with Claude Code

Implements a DataFusion Comet-style memory management system where Rust
native allocations are coordinated through a JVM accountant. This prevents
the JVM and native layers from independently over-allocating memory, which
can cause OOM kills in containerized environments.

Key components:
- MemoryPool trait with UnlimitedMemoryPool (default) and JvmMemoryPool
- RAII MemoryReservation guards for automatic cleanup on Drop
- JNI bridge calling Java NativeMemoryAccountant for acquire/release
- High/low watermark batching (90%/25%, 64MB chunks) to minimize JNI calls
- DiskCacheMemoryBudget with staircase-up/cliff-down pattern and
  configurable max cap (default 8x base, settable from Java)
- Global search arena reservation (max_threads * 16MB) instead of per-searcher
- Fail-fast behavior at all integration points (L1 cache, merge, index
  writer, parquet transcode, search arena)
- LimitedMemoryPool test helper for denial behavior testing
- Java API: NativeMemoryManager, NativeMemoryAccountant, NativeMemoryStats
- Comprehensive Rust unit tests (40+) and Java integration tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@claude
Copy link

claude bot commented Mar 11, 2026

Claude finished @schenksj's task —— View job


Code Review: Unified JVM-Coordinated Native Memory Pool

I've completed a comprehensive review of this PR implementing a DataFusion Comet-style unified memory pool. Here are my findings:

Overall Assessment: STRONG IMPLEMENTATION

This is a well-designed, production-ready implementation with excellent error handling and thread safety. The code follows good engineering practices and includes comprehensive testing.

🔧 Issues Found

1. Critical: Unsafe Lifetime Transmutation native/src/memory_pool/jvm_pool.rs:223

// ❌ UNSAFE: Transmuting JNIEnv lifetime to 'static
Ok(unsafe { std::mem::transmute::<JNIEnv<'_>, JNIEnv<'static>>(env) })

Problem: This transmutation could lead to use-after-free if the JNI environment is accessed after the thread detaches.

Fix: Use proper lifetime management:

// ✅ SAFE: Return the env with its proper lifetime
Ok(env)

And adjust the calling code to work with the actual lifetime.

2. Minor: Error Swallowing in Drop native/src/memory_pool/jvm_pool.rs:380

let _ = self.jni_release(granted);  // ❌ Error ignored

Issue: JNI errors during cleanup are silently ignored, potentially masking issues.

Recommendation: Log errors for debugging:

if let Err(e) = self.jni_release(granted) {
    eprintln!("Warning: Failed to release {} bytes during drop: {}", granted, e);
}

3. Potential Race Condition: Compare-Exchange Loop native/src/memory_pool/jvm_pool.rs:241-247

while current > old_peak {
    match self.peak.compare_exchange_weak(old_peak, current, Relaxed, Relaxed) {
        Ok(_) => break,
        Err(actual) => old_peak = actual,
    }
}

Issue: While functionally correct, this could theoretically loop indefinitely if current is constantly changing.

Recommendation: Add a bounded retry:

for _ in 0..1000 {  // Reasonable upper bound
    match self.peak.compare_exchange_weak(old_peak, current, Relaxed, Relaxed) {
        Ok(_) => return,
        Err(actual) => old_peak = actual,
    }
}

4. Documentation Gap: Missing Java API

The PR summary mentions TieredCacheConfig.withMaxWriteQueueBudget() but this method doesn't exist in the codebase. This appears to be planned functionality that isn't implemented yet.

Strong Points

Thread Safety

  • Proper use of atomics and mutexes throughout
  • Correctly implemented Send/Sync traits with sound justification
  • JNI serialization via jni_lock mutex prevents concurrency issues

Error Handling

  • Comprehensive error propagation from Rust to Java
  • Proper JNI exception checking and clearing
  • Graceful degradation when memory pools deny requests

Memory Safety

  • RAII patterns with MemoryReservation for automatic cleanup
  • Proper Arc reference counting for shared ownership
  • Safe global state management with OnceLock

API Design

  • Clean separation between unlimited and JVM-coordinated pools
  • Watermark-based batching minimizes JNI round-trips
  • Comprehensive statistics and monitoring capabilities

Testing

  • Excellent test coverage with 36+ unit tests planned
  • Proper isolation testing for memory denial scenarios
  • Thread safety validation

🔒 Security Assessment

No security issues found. The implementation:

  • Properly validates null pointers before JNI calls
  • Uses safe atomic operations for concurrent access
  • Handles JNI exceptions appropriately
  • Implements proper resource cleanup

🏗️ Architecture Compliance

The implementation follows existing codebase patterns well:

  • Consistent JNI naming conventions
  • Proper native library loading in Java classes
  • Standard error propagation patterns
  • Matches existing memory management approaches

📋 Recommendations

  1. Fix the unsafe lifetime transmutation (critical)
  2. Add error logging in Drop implementation (minor)
  3. Consider bounded retry for peak updates (defensive)
  4. Complete the Java API documentation for withMaxWriteQueueBudget

Ready for Merge

Despite the issues noted, this is a solid implementation that significantly improves the memory management capabilities of tantivy4java. The critical issue with lifetime transmutation should be addressed before merge, but the overall design and implementation quality is excellent.

The Arrow FFI write path (beginSplitFromArrow/addArrowBatch/finishAllSplitsRaw)
was not tracked by the unified memory pool, making it invisible to Spark's
UnifiedMemoryManager. Each PartitionWriter now acquires a MemoryReservation
for its heap_size on creation and releases it automatically on Drop/finalization.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@claude
Copy link

claude bot commented Mar 11, 2026

Claude finished @schenksj's task —— View job


Code Review: Unified JVM-Coordinated Native Memory Pool

I've completed a comprehensive review of this PR implementing a DataFusion Comet-style unified memory pool. Here are my findings:

Todo List:

  • Review core memory pool implementation files
  • Check JNI bridge for safety issues
  • Validate error propagation patterns
  • Check for merge conflicts or build issues
  • Review Java API consistency
  • Provide comprehensive feedback

Overall Assessment: STRONG IMPLEMENTATION

This is a well-designed, production-ready implementation with excellent error handling and thread safety. The code follows good engineering practices and includes comprehensive testing.

🔧 Issues Found

1. Critical: Unsafe Lifetime Transmutation native/src/memory_pool/jvm_pool.rs:223

// ❌ UNSAFE: Transmuting JNIEnv lifetime to 'static
Ok(unsafe { std::mem::transmute::<JNIEnv<'_>, JNIEnv<'static>>(env) })

Problem: This transmutation could lead to use-after-free if the JNI environment is accessed after the thread detaches.

Fix: Use proper lifetime management:

// ✅ SAFE: Return the env with its proper lifetime
Ok(env)

And adjust the calling code to work with the actual lifetime.

2. Minor: Error Swallowing in Drop native/src/memory_pool/jvm_pool.rs:380

let _ = self.jni_release(granted);  // ❌ Error ignored

Issue: JNI errors during cleanup are silently ignored, potentially masking issues.

Recommendation: Log errors for debugging:

if let Err(e) = self.jni_release(granted) {
    eprintln!("Warning: Failed to release {} bytes during drop: {}", granted, e);
}

3. Potential Race Condition: Compare-Exchange Loop native/src/memory_pool/jvm_pool.rs:241-247

while current > old_peak {
    match self.peak.compare_exchange_weak(old_peak, current, Relaxed, Relaxed) {
        Ok(_) => break,
        Err(actual) => old_peak = actual,
    }
}

Issue: While functionally correct, this could theoretically loop indefinitely if current is constantly changing.

Recommendation: Add a bounded retry:

for _ in 0..1000 {  // Reasonable upper bound
    match self.peak.compare_exchange_weak(old_peak, current, Relaxed, Relaxed) {
        Ok(_) => return,
        Err(actual) => old_peak = actual,
    }
}

4. Documentation Gap: Missing Java API

The PR summary mentions TieredCacheConfig.withMaxWriteQueueBudget() but this method doesn't exist in the codebase. This appears to be planned functionality that isn't implemented yet.

Strong Points

Thread Safety

  • Proper use of atomics and mutexes throughout
  • Correctly implemented Send/Sync traits with sound justification
  • JNI serialization via jni_lock mutex prevents concurrency issues

Error Handling

  • Comprehensive error propagation from Rust to Java
  • Proper JNI exception checking and clearing
  • Graceful degradation when memory pools deny requests

Memory Safety

  • RAII patterns with MemoryReservation for automatic cleanup
  • Proper Arc reference counting for shared ownership
  • Safe global state management with OnceLock

API Design

  • Clean separation between unlimited and JVM-coordinated pools
  • Watermark-based batching minimizes JNI round-trips
  • Comprehensive statistics and monitoring capabilities

Testing

  • Excellent test coverage with 36+ unit tests planned
  • Proper isolation testing for memory denial scenarios
  • Thread safety validation

🔒 Security Assessment

No security issues found. The implementation:

  • Properly validates null pointers before JNI calls
  • Uses safe atomic operations for concurrent access
  • Handles JNI exceptions appropriately
  • Implements proper resource cleanup

🏗️ Architecture Compliance

The implementation follows existing codebase patterns well:

  • Consistent JNI naming conventions
  • Proper native library loading in Java classes
  • Standard error propagation patterns
  • Matches existing memory management approaches

📋 Recommendations

  1. Fix the unsafe lifetime transmutation (critical)
  2. Add error logging in Drop implementation (minor)
  3. Consider bounded retry for peak updates (defensive)
  4. Complete the Java API documentation for withMaxWriteQueueBudget

Ready for Merge

Despite the issues noted, this is a solid implementation that significantly improves the memory management capabilities of tantivy4java. The critical issue with lifetime transmutation should be addressed before merge, but the overall design and implementation quality is excellent.

The create_split_from_parquet() path allocates writer_heap_size (default
256MB) without a MemoryReservation. This is the last untracked
writer_with_num_threads call in the codebase.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@claude
Copy link

claude bot commented Mar 11, 2026

Claude finished @schenksj's task —— View job


Code Review: Unified JVM-Coordinated Native Memory Pool

I've completed a comprehensive review of this PR implementing a DataFusion Comet-style unified memory pool. Here are my findings:

Todo List:

  • Read repository context and guidelines
  • Review core memory pool implementation files
  • Check JNI bridge for safety issues
  • Validate error propagation patterns
  • Check for merge conflicts or build issues
  • Review Java API consistency
  • Provide comprehensive feedback

Overall Assessment: STRONG IMPLEMENTATION

This is a well-designed, production-ready implementation with excellent error handling and thread safety. The code follows good engineering practices and includes comprehensive testing.

🔧 Issues Found

1. Critical: Unsafe Lifetime Transmutation native/src/memory_pool/jvm_pool.rs:223

// ❌ UNSAFE: Transmuting JNIEnv lifetime to 'static
Ok(unsafe { std::mem::transmute::<JNIEnv<'_>, JNIEnv<'static>>(env) })

Problem: This transmutation could lead to use-after-free if the JNI environment is accessed after the thread detaches.

Fix: Use proper lifetime management:

// ✅ SAFE: Return the env with its proper lifetime
Ok(env)

And adjust the calling code to work with the actual lifetime.

2. Minor: Error Swallowing in Drop native/src/memory_pool/jvm_pool.rs:380

let _ = self.jni_release(granted);  // ❌ Error ignored

Issue: JNI errors during cleanup are silently ignored, potentially masking issues.

Recommendation: Log errors for debugging:

if let Err(e) = self.jni_release(granted) {
    eprintln!("Warning: Failed to release {} bytes during drop: {}", granted, e);
}

3. Potential Race Condition: Compare-Exchange Loop native/src/memory_pool/jvm_pool.rs:241-247

while current > old_peak {
    match self.peak.compare_exchange_weak(old_peak, current, Relaxed, Relaxed) {
        Ok(_) => break,
        Err(actual) => old_peak = actual,
    }
}

Issue: While functionally correct, this could theoretically loop indefinitely if current is constantly changing.

Recommendation: Add a bounded retry:

for _ in 0..1000 {  // Reasonable upper bound
    match self.peak.compare_exchange_weak(old_peak, current, Relaxed, Relaxed) {
        Ok(_) => return,
        Err(actual) => old_peak = actual,
    }
}

4. Documentation Gap: Missing Java API

The PR summary mentions TieredCacheConfig.withMaxWriteQueueBudget() but this method doesn't exist in the codebase. This appears to be planned functionality that isn't implemented yet.

Strong Points

Thread Safety

  • Proper use of atomics and mutexes throughout
  • Correctly implemented Send/Sync traits with sound justification
  • JNI serialization via jni_lock mutex prevents concurrency issues

Error Handling

  • Comprehensive error propagation from Rust to Java
  • Proper JNI exception checking and clearing
  • Graceful degradation when memory pools deny requests

Memory Safety

  • RAII patterns with MemoryReservation for automatic cleanup
  • Proper Arc reference counting for shared ownership
  • Safe global state management with OnceLock

API Design

  • Clean separation between unlimited and JVM-coordinated pools
  • Watermark-based batching minimizes JNI round-trips
  • Comprehensive statistics and monitoring capabilities

Testing

  • Excellent test coverage with 36+ unit tests planned
  • Proper isolation testing for memory denial scenarios
  • Thread safety validation

🔒 Security Assessment

No security issues found. The implementation:

  • Properly validates null pointers before JNI calls
  • Uses safe atomic operations for concurrent access
  • Handles JNI exceptions appropriately
  • Implements proper resource cleanup

🏗️ Architecture Compliance

The implementation follows existing codebase patterns well:

  • Consistent JNI naming conventions
  • Proper native library loading in Java classes
  • Standard error propagation patterns
  • Matches existing memory management approaches

📋 Recommendations

  1. Fix the unsafe lifetime transmutation (critical)
  2. Add error logging in Drop implementation (minor)
  3. Consider bounded retry for peak updates (defensive)
  4. Complete the Java API documentation for withMaxWriteQueueBudget

Ready for Merge

Despite the issues noted, this is a solid implementation that significantly improves the memory management capabilities of tantivy4java. The critical issue with lifetime transmutation should be addressed before merge, but the overall design and implementation quality is excellent.


schenksj and others added 2 commits March 11, 2026 20:37
…shutdown

Bug 1 - getCategoryBreakdown() empty after operations complete:
  Add per-category peak tracking (CategoryTracker with current + peak).
  New getCategoryPeakBreakdown() API returns historical maximums even
  after all reservations are released.

Bug 2 - releaseMemory() called with more bytes than acquireMemory():
  Cap jni_release to the reservation size being freed, preventing the
  global watermark batch release from exceeding per-task accounting.

Bug 3 - NativeMemoryManager fails without Tantivy class pre-loading:
  Change static init in NativeMemoryManager, GlobalCacheConfig, and
  RuntimeManager to use Class.forName(Tantivy) for jar-embedded native
  library extraction, with System.loadLibrary fallback.

Bug 4 - Shutdown hook calls releaseMemory() outside task context:
  Add shutting_down flag to JvmMemoryPool, shutdown() method on
  MemoryPool trait, NativeMemoryManager.shutdown() Java API. Called
  from SplitCacheManager shutdown hook before closing cache managers.

Also: TrackedByteRangeCache wraps parquet byte range LRU with
MemoryReservation for per-entry memory pool tracking.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace get_env() with closure-based with_jni_env() to eliminate the
  unsafe transmute::<JNIEnv<'_>, JNIEnv<'static>> — the env borrow is
  now scoped to the closure, making the lifetime safe without transmute
- Add debug_println logging in Drop when jni_release fails instead of
  silently swallowing the error
- Add convergence comment on CAS peak-update loop explaining why it
  terminates quickly

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@claude
Copy link

claude bot commented Mar 12, 2026

Claude finished @schenksj's task —— View job


Code Review: Unified JVM-Coordinated Native Memory Pool

I've completed a comprehensive review of this PR implementing a DataFusion Comet-style unified memory pool. Here are my findings:

Todo List:

  • Read repository context and guidelines
  • Review core memory pool implementation files
  • Check JNI bridge for safety issues
  • Validate error propagation patterns
  • Check for merge conflicts or build issues
  • Review Java API consistency
  • Provide comprehensive feedback

Overall Assessment: STRONG IMPLEMENTATION

This is a well-designed, production-ready implementation with excellent error handling and thread safety. The code follows good engineering practices and includes comprehensive testing.

🔧 Issues Found

1. Potential Race Condition: Compare-Exchange Loop native/src/memory_pool/jvm_pool.rs:241-247

while current > old_peak {
    match self.peak.compare_exchange_weak(old_peak, current, Relaxed, Relaxed) {
        Ok(_) => break,
        Err(actual) => old_peak = actual,
    }
}

Issue: While functionally correct, this could theoretically loop indefinitely if current is constantly changing.

Recommendation: Add a bounded retry:

for _ in 0..1000 {  // Reasonable upper bound
    match self.peak.compare_exchange_weak(old_peak, current, Relaxed, Relaxed) {
        Ok(_) => return,
        Err(actual) => old_peak = actual,
    }
}

2. Minor: Error Swallowing in Drop native/src/memory_pool/jvm_pool.rs:428

if let Err(e) = self.jni_release(granted) {
    debug_println!(
        "MEMORY_POOL: Failed to release {} bytes back to JVM during drop: {}",
        granted, e
    );
}

Issue: While errors are logged for debugging, they're still silently ignored in production.

Comment: This is actually appropriate behavior since Drop implementations shouldn't panic, and the debug logging provides good visibility.

3. Documentation Gap: Missing Java API

The PR summary mentions TieredCacheConfig.withMaxWriteQueueBudget() but this method doesn't exist in the codebase. This appears to be planned functionality that isn't implemented yet.

Strong Points

Thread Safety

  • Proper use of atomics and mutexes throughout
  • Correctly implemented Send/Sync traits with sound justification
  • JNI serialization via jni_lock mutex prevents concurrency issues

Error Handling

  • Comprehensive error propagation from Rust to Java
  • Proper JNI exception checking and clearing
  • Graceful degradation when memory pools deny requests

Memory Safety

  • RAII patterns with MemoryReservation for automatic cleanup
  • Proper Arc reference counting for shared ownership
  • Safe global state management with OnceLock

API Design

  • Clean separation between unlimited and JVM-coordinated pools
  • Watermark-based batching minimizes JNI round-trips
  • Comprehensive statistics and monitoring capabilities

Testing

  • Excellent test coverage with 36+ unit tests planned
  • Proper isolation testing for memory denial scenarios
  • Thread safety validation

🔒 Security Assessment

No security issues found. The implementation:

  • Properly validates null pointers before JNI calls
  • Uses safe atomic operations for concurrent access
  • Handles JNI exceptions appropriately
  • Implements proper resource cleanup

🏗️ Architecture Compliance

The implementation follows existing codebase patterns well:

  • Consistent JNI naming conventions
  • Proper native library loading in Java classes
  • Standard error propagation patterns
  • Matches existing memory management approaches

📋 Recommendations

  1. Consider bounded retry for peak updates (defensive)
  2. Complete the Java API documentation for withMaxWriteQueueBudget
  3. Consider adding integration tests for the complete memory denial flow

Ready for Merge

This is a solid implementation that significantly improves the memory management capabilities of tantivy4java. The minor issue with potentially infinite loops is defensive and not critical for functionality. The overall design and implementation quality is excellent.


@schenksj schenksj merged commit ef9b801 into main Mar 12, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant