Native transaction log state reader with Arrow FFI export #100

@schenksj

Summary

Add a new TransactionLogReader component to tantivy4java that reads the IndexTables transaction log state natively in Rust and exports it via Arrow FFI, following the same pattern as DeltaTableReader.readCheckpointPartArrowFfi().

Priority: P0

Motivation

Currently, every read operation starts by reading the transaction log state on the JVM side. For large tables this involves:

  • Reading potentially thousands of Avro manifest files, each GZIP-compressed
  • Allocating GenericDatumWriter/GenericDatumReader per manifest file
  • GZIP decompression with 8KB buffers allocated per file
  • JSON schema parsing for deduplication per entry
  • Parallel reads via Future.sequence with JVM thread pool overhead

This is the single largest driver-side bottleneck for cold query startup on large tables.
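
To make the per-file cost concrete, the decompression step listed above looks roughly like the following on the JVM. This is a minimal sketch, not code from indextables_spark: the class and method names are hypothetical, and the Avro decoding that would follow is omitted. It only shows that each manifest pays for its own stream objects and 8 KB buffer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration only; not taken from indextables_spark.
final class ManifestGunzip {
    private static final int BUFFER_SIZE = 8 * 1024;

    // Decompresses one GZIP-compressed manifest. A fresh buffer and fresh
    // stream objects are allocated on every call, i.e. once per manifest file.
    static byte[] gunzip(byte[] compressed) {
        byte[] buffer = new byte[BUFFER_SIZE];
        try (GZIPInputStream in =
                 new GZIPInputStream(new ByteArrayInputStream(compressed), BUFFER_SIZE);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a compressed payload so the sketch can be exercised without files.
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

With thousands of manifests, `gunzip` runs once per file, so the allocations above scale linearly with manifest count before any Avro or JSON work starts.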

Proposed Approach

  1. New Rust component: TransactionLogReader that understands the IndexTables Avro manifest format (V4 state format)
  2. Operations to handle natively:
    • GZIP decompression (Rust's flate2 is significantly faster than Java's GZIPInputStream)
    • Avro manifest deserialization (using apache-avro crate)
    • Schema deduplication (currently done via JSON parsing per-entry)
    • Partition filter evaluation during reading (reuse existing PartitionFilter infrastructure)
  3. Output: Arrow columnar batches via the existing Arrow C Data Interface FFI (same pattern as docBatchArrowFfi and readCheckpointPartArrowFfi)
  4. JNI interface (strawman):
    // Read all state from a transaction log directory, optionally filtered
    int readStateArrowFfi(
        String transactionLogPath,
        long[] arrayAddrs,
        long[] schemaAddrs,
        PartitionFilter partitionFilter  // optional, for partition pruning during read
    );
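
One way the Java side of this strawman could be declared is sketched below. Everything beyond the method name and parameter list is an assumption made for illustration: the class name, the `static native` modifiers, the placeholder `PartitionFilter` interface, and the convention that `arrayAddrs[i]` and `schemaAddrs[i]` hold the addresses of one pre-allocated ArrowArray/ArrowSchema struct pair per output column. The helper only validates arguments before the call crosses the JNI boundary, where a bad address would crash the process rather than throw.

```java
// Hypothetical binding sketch; not the actual tantivy4java API.
final class TransactionLogReaderSketch {

    // Placeholder standing in for the existing PartitionFilter type.
    interface PartitionFilter {}

    // Strawman signature from the proposal. Declared native, so it can only be
    // called once a native library providing the symbol has been loaded.
    static native int readStateArrowFfi(
            String transactionLogPath,
            long[] arrayAddrs,
            long[] schemaAddrs,
            PartitionFilter partitionFilter); // may be null: no pruning (assumed)

    // Checks the FFI address arrays before handing them to native code.
    // Assumes one ArrowArray/ArrowSchema address pair per output column.
    static void checkFfiArgs(long[] arrayAddrs, long[] schemaAddrs) {
        if (arrayAddrs == null || schemaAddrs == null) {
            throw new IllegalArgumentException("address arrays must not be null");
        }
        if (arrayAddrs.length != schemaAddrs.length) {
            throw new IllegalArgumentException(
                "expected one schema address per array address, got "
                    + arrayAddrs.length + " and " + schemaAddrs.length);
        }
        for (int i = 0; i < arrayAddrs.length; i++) {
            if (arrayAddrs[i] == 0L || schemaAddrs[i] == 0L) {
                throw new IllegalArgumentException("null struct address at index " + i);
            }
        }
    }
}
```

A caller would run `checkFfiArgs(arrayAddrs, schemaAddrs)` first and then invoke `readStateArrowFfi`. The meaning of the returned `int` is left open here, since the proposal does not define it.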

Expected Impact

  • 2-5x faster cold query startup on large tables (1000+ splits)
  • Eliminates per-manifest JVM object allocation overhead
  • Enables native partition pruning during state materialization

Dependencies

  • Avro V4 state format specification (see docs/reference/protocol.md in indextables_spark)
  • GZIP compression codec
  • Existing PartitionFilter infrastructure
  • Arrow C Data Interface FFI (already implemented)

Related

  • indextables/indextables_spark integration issue (to be linked)
  • Existing pattern: DeltaTableReader.readCheckpointPartArrowFfi()

Labels: enhancement
