Block-based immutable list implementation (GSoC proposal) #809
Open
Zayd-R wants to merge 5 commits into typelevel:master
Introduces BlockedList (copy-on-write) and BlockedLostCopy (write-direct) as proposed in typelevel#634. Includes JMH benchmarks comparing both implementations against scala.List across prepend, uncons, foldLeft, and foreach.
Author
I just noticed I named the implementation that writes directly with |
Collaborator
@Zayd-R thank you for working on this. There are a few things that I'd like you to fix in your changeset:
…eaders. Please let me know if anything else needs attention
Summary
This is an early-stage implementation of the block-based immutable list
proposed in #634, submitted as part of GSoC work. The goal of this PR is to share the implementation and benchmark data to explore the design space before committing to a final approach.
Two implementations explored
BlockedList: copy-on-write
Every prepend into dead space copies the valid portion of the block before writing. Fully persistent and safe for branching use cases.
FastBlockedList: write-direct
Prepend writes directly into dead space (offset - 1) without copying, since that slot has never been pointed at by any existing node and is invisible to all observers. When the block is full, a fresh block is allocated and the current node becomes the tail.
Both implementations store BlockSize per node so that the benchmark file can test many sizes without recompilation.
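The two prepend strategies described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's code: the names `BList`, `Node`, `prependCopy`, `prependDirect`, and `toList`, the `Int` element type, and passing `blockSize` as a parameter are all hypothetical choices made for the example.

```scala
// Illustrative sketch only; every name here is an assumption, not the PR's API.
object BlockSketch {
  sealed trait BList
  case object Empty extends BList
  // A node views block(offset until block.length), followed by tail.
  // Slots below offset are "dead space" that this node does not expose.
  final case class Node(block: Array[Int], offset: Int, tail: BList) extends BList

  // Allocate a fresh block holding only `a` in its last slot.
  private def freshBlock(a: Int, tail: BList, blockSize: Int): Node = {
    require(blockSize > 0, "blockSize must be positive")
    val b = new Array[Int](blockSize)
    b(blockSize - 1) = a
    Node(b, blockSize - 1, tail)
  }

  // Copy-on-write: copy the valid portion into a new array, then write the new head.
  def prependCopy(a: Int, xs: BList, blockSize: Int): BList = xs match {
    case Node(block, offset, tail) if offset > 0 =>
      val b = new Array[Int](block.length)
      System.arraycopy(block, offset, b, offset, block.length - offset)
      b(offset - 1) = a
      Node(b, offset - 1, tail)
    case _ =>
      // Empty list or full block: the current list becomes the tail.
      freshBlock(a, xs, blockSize)
  }

  // Write-direct: write into the dead slot at offset - 1 of the shared array.
  def prependDirect(a: Int, xs: BList, blockSize: Int): BList = xs match {
    case Node(block, offset, tail) if offset > 0 =>
      block(offset - 1) = a
      Node(block, offset - 1, tail)
    case _ =>
      freshBlock(a, xs, blockSize)
  }

  // Collect the elements in list order, for inspection.
  def toList(xs: BList): List[Int] = {
    val out = List.newBuilder[Int]
    @annotation.tailrec
    def loop(cur: BList): Unit = cur match {
      case Empty => ()
      case Node(block, offset, tail) =>
        var i = offset
        while (i < block.length) { out += block(i); i += 1 }
        loop(tail)
    }
    loop(xs)
    out.result()
  }
}
```

In this sketch, calling `prependDirect` twice on the same node writes the same slot twice, so the first result observes the second value; `prependCopy` keeps the two results independent. Whether the PR's write-direct implementation guards against that case is not stated in the description.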
Results
All times in ns/op. Lower is better.
prepend (build a list of 10k elements from empty)
Copy-on-write prepend scales linearly with blockSize due to the arraycopy cost. Write-direct prepend is flat across block sizes and only ~30% slower than scala.List.
foreach (visit every element)
Both implementations beat scala.List by ~4x at blockSize=64. This is the cache locality benefit the proposal predicted: larger blocks mean longer tight array loops with fewer pointer jumps.
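The "tight array loop with fewer pointer jumps" shape can be sketched as below. This is an assumption-laden illustration rather than the PR's traversal code: `TraversalSketch`, `BList`, `Node`, and the `Int` element type are hypothetical names chosen for the example.

```scala
// Illustrative sketch only; names and layout are assumptions, not the PR's code.
object TraversalSketch {
  sealed trait BList
  case object Empty extends BList
  // A node views block(offset until block.length), followed by tail.
  final case class Node(block: Array[Int], offset: Int, tail: BList) extends BList

  // One pointer hop per block, then a loop over contiguous array slots.
  @annotation.tailrec
  def foreach(xs: BList)(f: Int => Unit): Unit = xs match {
    case Empty => ()
    case Node(block, offset, tail) =>
      var i = offset
      while (i < block.length) { f(block(i)); i += 1 }
      foreach(tail)(f)
  }
}
```

With a block size of 64, a traversal like this follows one `tail` reference per 64 elements at most, whereas a cons list follows one reference per element.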
foldLeft (sum all elements)
FastBlockedList.foldLeft ties scala.List at blockSize=64 (28,918 vs 28,865 ns/op). The larger error margins suggest JIT variance; more iterations would tighten these numbers.
uncons (element-by-element traversal)
uncons is slower than scala.List as expected: each call allocates one Some and one Tuple2. As noted in the proposal, uncons is not the intended traversal API. The foreach/foldLeft results above are the relevant comparison.
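To make the allocation difference concrete, here is a hedged sketch that sums a list once through `uncons` and once through a block-wise `foldLeft`. The names (`UnconsSketch`, `BList`, `Node`, `sumViaUncons`) and signatures are assumptions for illustration and may not match the PR. The sketch also assumes every `Node` is non-empty (`offset < block.length`).

```scala
// Illustrative sketch only; names and signatures are assumptions, not the PR's API.
object UnconsSketch {
  sealed trait BList
  case object Empty extends BList
  // Invariant assumed here: offset < block.length, so a Node is never empty.
  final case class Node(block: Array[Int], offset: Int, tail: BList) extends BList

  // Each successful call allocates a Some and a Tuple2.
  // This sketch also allocates a new Node view while staying inside a block.
  def uncons(xs: BList): Option[(Int, BList)] = xs match {
    case Empty => None
    case Node(block, offset, tail) =>
      val rest = if (offset + 1 < block.length) Node(block, offset + 1, tail) else tail
      Some((block(offset), rest))
  }

  // Element-by-element traversal through uncons.
  def sumViaUncons(xs: BList): Int = {
    @annotation.tailrec
    def go(cur: BList, acc: Int): Int = uncons(cur) match {
      case Some((h, t)) => go(t, acc + h)
      case None         => acc
    }
    go(xs, 0)
  }

  // Block-wise fold: no per-element wrapper objects are created by the traversal itself.
  def foldLeft[B](xs: BList, z: B)(f: (B, Int) => B): B = {
    @annotation.tailrec
    def go(cur: BList, acc: B): B = cur match {
      case Empty => acc
      case Node(block, offset, tail) =>
        var a = acc
        var i = offset
        while (i < block.length) { a = f(a, block(i)); i += 1 }
        go(tail, a)
    }
    go(xs, z)
  }
}
```

Both functions compute the same result; the difference is only in how many short-lived objects the traversal creates along the way.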
map (apply a function to every element)
BlockedList.map beats scala.List by ~47% at blockSize=64 (30,755 ns vs 58,367 ns). The improvement scales with blockSize: larger blocks mean more elements processed per block, confirming the cache locality advantage. scala.List is flat across all block sizes as expected since it has no block structure.
Key findings
foreach validates the proposal's cache locality claim: ~4x faster than scala.List at blockSize=64 for both implementations
foldLeft ties scala.List at larger block sizes
linear arraycopy cost
scala.List
Benchmark methodology
Tool: JMH (Java Microbenchmark Harness) via sbt-jmh plugin
Mode: Average time (AverageTime)
Units: nanoseconds per operation (ns/op), lower is better
Warmup: 5 iterations
Measurement: 10 iterations
Forks: 1
Threads: 1
Environment: JVM openjdk 25.0.2 2026-01-20 LTS, CPU Intel Core 5 210H, RAM 16GB, OS Ubuntu 22.04
Lists are pre-built in @Setup(Level.Trial) so construction cost is excluded from traversal measurements. The benchmark suite is included in bench/src/main/scala/cats/bench/BlockedListBenchmark.scala and can be reproduced with:
Questions
FastBlockedList acceptable, or should only the copy-on-write version be pursued?
Transparency note
English is not my first language. I used an LLM to help
with grammar and formatting in this PR description, and to generate the
initial benchmark boilerplate code. All implementation decisions, the
identification of bugs, the analysis of benchmark results, and the core
data structure logic were worked out by me. The AI was used as a writing
and tooling aid, not as a substitute for understanding.