Optimize LCC reader for large scenes (~5.6x faster) #188
Merged
slimbuck merged 3 commits into playcanvas:main (Mar 21, 2026)
willeastcott approved these changes on Mar 21, 2026
Pull request overview
Optimizes the XGrids LCC reader hot path to significantly improve load time and memory usage on very large scenes by decoding directly into preallocated output buffers, using typed-array views for faster access, and concurrently dispatching unit reads.
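The difference between `DataView` reads and typed-array views mentioned above can be sketched roughly as follows. The record layout (three little-endian float32 values per record) and both function names are hypothetical and are not taken from the LCC reader. The typed-view version also assumes a little-endian host and a 4-byte-aligned byte offset, which `DataView` does not require.

```typescript
// Hypothetical layout for illustration only: each record is three
// little-endian float32 values (12 bytes). Not the real LCC format.
const RECORD_FLOATS = 3;

// Baseline: one DataView call per value, with an explicit endianness flag.
function readWithDataView(buffer: ArrayBuffer, count: number): Float32Array {
    const view = new DataView(buffer);
    const out = new Float32Array(count * RECORD_FLOATS);
    for (let i = 0; i < out.length; i++) {
        out[i] = view.getFloat32(i * 4, true);
    }
    return out;
}

// Alternative: a Float32Array view over the same bytes, read by plain indexing.
// Assumes host byte order matches the file (little-endian) and that the
// byte offset (0 here) is a multiple of 4.
function readWithTypedView(buffer: ArrayBuffer, count: number): Float32Array {
    const src = new Float32Array(buffer, 0, count * RECORD_FLOATS);
    const out = new Float32Array(count * RECORD_FLOATS);
    for (let i = 0; i < src.length; i++) {
        out[i] = src[i];
    }
    return out;
}
```

Both functions return the same values on a little-endian machine; the second avoids a method call per value, which is the kind of overhead the description attributes to `DataView.get*()`.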
Changes:
- Decode rotations/SH coefficients without per-splat temporary allocations (write directly into output arrays).
- Remove per-unit intermediate buffers by writing unit data straight into shared global arrays.
- Add bounded-concurrency unit decoding and pre-combine selected LODs into a single `DataTable` (plus environment table).
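As a rough illustration of the bounded-concurrency idea in the last item, the sketch below runs an async worker over a list with at most `limit` calls in flight. The helper name `mapWithConcurrency` and its signature are hypothetical; the PR's actual scheduling code may look quite different.

```typescript
// Hypothetical helper (not from the PR): run `worker` over `items` with at
// most `limit` invocations in flight at any time.
async function mapWithConcurrency<T>(
    items: readonly T[],
    limit: number,
    worker: (item: T, index: number) => Promise<void>
): Promise<void> {
    let next = 0;

    // Each lane repeatedly claims the next unclaimed index until none remain.
    // JavaScript is single-threaded, so `next++` needs no locking.
    const lane = async (): Promise<void> => {
        while (next < items.length) {
            const index = next++;
            await worker(items[index], index);
        }
    };

    // Start at least one lane, and never more lanes than items.
    const laneCount = Math.max(1, Math.min(limit, items.length));
    const lanes: Promise<void>[] = [];
    for (let i = 0; i < laneCount; i++) {
        lanes.push(lane());
    }
    await Promise.all(lanes);
}
```

A caller could pass the list of units and a worker that reads and decodes one unit; the limit keeps the number of outstanding reads (and their input buffers) bounded, unlike an unbounded `Promise.all` over every unit.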
Summary
Optimizes the LCC (XGrids) reader to handle large scenes efficiently. On a 139M-splat scene (9GB on disk), load time drops from 200s to 36s (~5.6x speedup), with significantly reduced peak memory usage.
Changes
- `decodeRotation` no longer creates a temporary array per call; SH decoding no longer creates 15 `Vec3` objects per splat. New `decodeRotationInto` writes directly to output arrays.
- `processUnit` writes directly into the shared output arrays instead of allocating per-unit intermediate typed arrays and copying with `.set()`.
- Unit reads are dispatched with bounded concurrency instead of one sequential `await` per unit. No measurable impact for local disk, but benefits network-based file systems (browser URL loading).
- Selected LODs are pre-combined into a single `DataTable`, eliminating the expensive post-read `combine()` step, which was allocating ~35GB of intermediate buffers and copies for large scenes.
- Reads use `Float32Array`, `Uint16Array`, and `Uint8Array` views over the input buffer instead of `DataView.get*()` calls, avoiding per-access bounds checks and endianness handling overhead.
Performance
Benchmarked on a 139,722,883 splat LCC scene (Grow-House.lcc):
Memory
- Removed the `combine()` step (per-unit intermediate arrays + final column merge).
- Per-unit `Float32Array` allocations removed (previously allocated and copied per unit; now written directly to global arrays).
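The "write directly to global arrays" pattern can be sketched as below. The `Unit` shape, the uint16 quantization, and the name `decodeUnitInto` are assumptions made for illustration; they are not the real LCC encoding or the PR's `processUnit` signature.

```typescript
// Hypothetical unit for illustration: `encoded` holds x, y, z per splat as
// uint16 values mapped linearly onto [min, max]; `offset` is the index of the
// unit's first splat in the shared output arrays.
interface Unit {
    offset: number;
    encoded: Uint16Array;
}

// Decodes one unit straight into preallocated output columns. No per-unit
// Float32Array is allocated and no `.set()` copy is needed afterwards.
function decodeUnitInto(
    unit: Unit,
    min: number,
    max: number,
    outX: Float32Array,
    outY: Float32Array,
    outZ: Float32Array
): void {
    const count = unit.encoded.length / 3;
    const range = max - min;
    for (let i = 0; i < count; i++) {
        const dst = unit.offset + i;
        outX[dst] = min + (unit.encoded[i * 3] / 65535) * range;
        outY[dst] = min + (unit.encoded[i * 3 + 1] / 65535) * range;
        outZ[dst] = min + (unit.encoded[i * 3 + 2] / 65535) * range;
    }
}
```

In this sketch the caller allocates `outX`, `outY`, and `outZ` once for the total splat count and gives each unit a disjoint `offset`, so units can be decoded in any order without a merge step.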