Add unified JVM-coordinated native memory pool#127
Conversation
Implements a DataFusion Comet-style memory management system where Rust native allocations are coordinated through a JVM accountant. This prevents the JVM and native layers from independently over-allocating memory, which can cause OOM kills in containerized environments. Key components: - MemoryPool trait with UnlimitedMemoryPool (default) and JvmMemoryPool - RAII MemoryReservation guards for automatic cleanup on Drop - JNI bridge calling Java NativeMemoryAccountant for acquire/release - High/low watermark batching (90%/25%, 64MB chunks) to minimize JNI calls - DiskCacheMemoryBudget with staircase-up/cliff-down pattern and configurable max cap (default 8x base, settable from Java) - Global search arena reservation (max_threads * 16MB) instead of per-searcher - Fail-fast behavior at all integration points (L1 cache, merge, index writer, parquet transcode, search arena) - LimitedMemoryPool test helper for denial behavior testing - Java API: NativeMemoryManager, NativeMemoryAccountant, NativeMemoryStats - Comprehensive Rust unit tests (40+) and Java integration tests Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Claude finished @schenksj's task —— View job Code Review: Unified JVM-Coordinated Native Memory PoolI've completed a comprehensive review of this PR implementing a DataFusion Comet-style unified memory pool. Here are my findings: ✅ Overall Assessment: STRONG IMPLEMENTATIONThis is a well-designed, production-ready implementation with excellent error handling and thread safety. The code follows good engineering practices and includes comprehensive testing. 🔧 Issues Found1. Critical: Unsafe Lifetime Transmutation
|
The Arrow FFI write path (beginSplitFromArrow/addArrowBatch/finishAllSplitsRaw) was not tracked by the unified memory pool, making it invisible to Spark's UnifiedMemoryManager. Each PartitionWriter now acquires a MemoryReservation for its heap_size on creation and releases it automatically on Drop/finalization. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Claude finished @schenksj's task —— View job Code Review: Unified JVM-Coordinated Native Memory Pool
|
The create_split_from_parquet() path allocates writer_heap_size (default 256MB) without a MemoryReservation. This is the last untracked writer_with_num_threads call in the codebase. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Claude finished @schenksj's task —— View job Code Review: Unified JVM-Coordinated Native Memory PoolI've completed a comprehensive review of this PR implementing a DataFusion Comet-style unified memory pool. Here are my findings: Todo List:
✅ Overall Assessment: STRONG IMPLEMENTATIONThis is a well-designed, production-ready implementation with excellent error handling and thread safety. The code follows good engineering practices and includes comprehensive testing. 🔧 Issues Found1. Critical: Unsafe Lifetime Transmutation
|
…shutdown Bug 1 - getCategoryBreakdown() empty after operations complete: Add per-category peak tracking (CategoryTracker with current + peak). New getCategoryPeakBreakdown() API returns historical maximums even after all reservations are released. Bug 2 - releaseMemory() called with more bytes than acquireMemory(): Cap jni_release to the reservation size being freed, preventing the global watermark batch release from exceeding per-task accounting. Bug 3 - NativeMemoryManager fails without Tantivy class pre-loading: Change static init in NativeMemoryManager, GlobalCacheConfig, and RuntimeManager to use Class.forName(Tantivy) for jar-embedded native library extraction, with System.loadLibrary fallback. Bug 4 - Shutdown hook calls releaseMemory() outside task context: Add shutting_down flag to JvmMemoryPool, shutdown() method on MemoryPool trait, NativeMemoryManager.shutdown() Java API. Called from SplitCacheManager shutdown hook before closing cache managers. Also: TrackedByteRangeCache wraps parquet byte range LRU with MemoryReservation for per-entry memory pool tracking. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace get_env() with closure-based with_jni_env() to eliminate the unsafe transmute::<JNIEnv<'_>, JNIEnv<'static>> — the env borrow is now scoped to the closure, making the lifetime safe without transmute - Add debug_println logging in Drop when jni_release fails instead of silently swallowing the error - Add convergence comment on CAS peak-update loop explaining why it terminates quickly Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Claude finished @schenksj's task —— View job Code Review: Unified JVM-Coordinated Native Memory PoolI've completed a comprehensive review of this PR implementing a DataFusion Comet-style unified memory pool. Here are my findings: Todo List:
✅ Overall Assessment: STRONG IMPLEMENTATIONThis is a well-designed, production-ready implementation with excellent error handling and thread safety. The code follows good engineering practices and includes comprehensive testing. 🔧 Issues Found1. Potential Race Condition: Compare-Exchange Loop
|

Summary
NativeMemoryAccountant, preventing OOM kills in containerized environments (issue Unified memory management: JVM-coordinated memory pool for native allocations #105)TieredCacheConfig.withMaxWriteQueueBudget())Key Components
MemoryPooltrait,JvmMemoryPool(high/low watermark batching, 64MB chunks),MemoryReservation(RAII guard),DiskCacheMemoryBudget(staircase-up/cliff-down with max cap),LimitedMemoryPool(test helper)NativeMemoryManager,NativeMemoryAccountantinterface,NativeMemoryStats,TieredCacheConfig.withMaxWriteQueueBudget()acquireMemory/releaseMemorycallbacks from Rust to Java accountantTest plan
DiskCacheMemoryBudget(cap, denial, drain+regrow, zero base)UnlimitedMemoryPool(categories, peak, thread safety)MemoryReservation(lifecycle, denial, grow/shrink, leak prevention)LimitedMemoryPool(capacity, denial, release)NativeMemoryManagerTest(set accountant, stats, categories, watermark batching)MemoryDenialTest(10MB limit, writer denial, pool stats after denial)cargo checkcompiles cleanlyUnlimitedMemoryPoolis backward-compatible)🤖 Generated with Claude Code