Release 0.15.0. #258

Open
cfallin wants to merge 2 commits into bytecodealliance:main from cfallin:release-0.15.0

cfallin commented Feb 15, 2026

Includes #257.

(Stacked on top of #257 until that one merges.)

This addresses several longstanding issues we've had in Cranelift where
very large inputs can cause the frontend or mid-end to run out of VRegs
when generating or mutating IR.

For example, in a recent Wasmtime meeting we agreed that the
implementation limits (e.g., 7654321 bytes for a single Wasm function
body) are really minimum requirements for us to support as well as
maximum bounds for producers to make use of; any input that conforms to
those limits should be compilable (given enough CPU time and memory) by
the Cranelift backend.

The main limiter on the number of virtual registers we could support in
one function body was this crate, the register allocator, and its choice
to pack an `Operand` into 32 bits. The operand bundles together a
virtual register index with other dimensions, like def/use, early/late,
and constraints (fixed reg, reg, stack, reuse-input, etc.). We got the
other fields to fit into 11 bits, leaving 21 for the virtual register
index: 2^21, or 2M (2097152), virtual registers to use.
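
To make the arithmetic concrete, here is a minimal sketch of a 32-bit packed operand with a 21-bit index field. The type name `PackedOperand`, the constants, and the field order (index in the low bits, everything else lumped into one 11-bit field above it) are assumptions for illustration only and are not taken from the crate's actual layout.

```rust
// Hypothetical layout, for illustration only: the low 21 bits hold the
// virtual register index; the high 11 bits hold all the other fields
// (def/use, early/late, constraint), treated here as one opaque value.
const VREG_BITS: u32 = 21;
const OTHER_BITS: u32 = 32 - VREG_BITS; // 11
const MAX_VREGS: u32 = 1 << VREG_BITS; // 2^21 = 2097152

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct PackedOperand(u32);

impl PackedOperand {
    /// Packs an index and the other fields; returns `None` if either
    /// value does not fit in its bit field.
    fn new(vreg: u32, other: u32) -> Option<Self> {
        if vreg >= MAX_VREGS || other >= (1 << OTHER_BITS) {
            return None;
        }
        Some(PackedOperand((other << VREG_BITS) | vreg))
    }

    /// Extracts the virtual register index from the low 21 bits.
    fn vreg(self) -> u32 {
        self.0 & (MAX_VREGS - 1)
    }

    /// Extracts the remaining 11 bits.
    fn other(self) -> u32 {
        self.0 >> VREG_BITS
    }
}
```

With this layout, `PackedOperand::new(2_097_151, 0)` succeeds, while index 2097152 is rejected: that is the ceiling a large function body can hit.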

This was a performance optimization, as we had found that the memory
traffic and cache pressure caused by a larger `Operand` were measurable
and we wanted to avoid this overhead.

Nevertheless, as the limit continues to come up and the need is clear,
it seems to be time to re-evaluate. This PR thus bumps the limit to 2^30
VRegs by expanding `Operand` to 64 bits. The `Operand` bitpacking itself
allows for 2^32 VRegs, but `VReg` has to carry its register class around
(2 bits) and encodes into a `u32`, so 30 bits it is. A billion VRegs
should be enough for anybody. (Let's hope!)
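
The 30-bit figure follows from the `VReg` encoding rather than from `Operand` itself. A minimal sketch, assuming the class occupies the low 2 bits of the `u32` and the index the remaining 30 (the bit order, the constant names, and the three class names here are assumptions for illustration, not a statement of the crate's exact layout):

```rust
// Illustrative register classes; 2 bits can encode up to four of them.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum RegClass {
    Int = 0,
    Float = 1,
    Vector = 2,
}

const CLASS_BITS: u32 = 2;
const INDEX_BITS: u32 = 32 - CLASS_BITS; // 30
const MAX_INDEX: u32 = (1 << INDEX_BITS) - 1; // 2^30 - 1

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct VReg(u32);

impl VReg {
    /// Packs index and class into one `u32`; returns `None` if the
    /// index needs more than 30 bits.
    fn new(index: u32, class: RegClass) -> Option<Self> {
        if index > MAX_INDEX {
            return None;
        }
        Some(VReg((index << CLASS_BITS) | class as u32))
    }

    /// Extracts the 30-bit index from the high bits.
    fn index(self) -> u32 {
        self.0 >> CLASS_BITS
    }

    /// Extracts the class from the low 2 bits.
    fn class(self) -> RegClass {
        match self.0 & 0b11 {
            0 => RegClass::Int,
            1 => RegClass::Float,
            2 => RegClass::Vector,
            _ => unreachable!("`new` never stores class value 3"),
        }
    }
}
```

Under these assumptions, even a full 32-bit index field in a 64-bit `Operand` cannot be used beyond index 2^30 - 1, because the `VReg` it is built from cannot represent anything larger.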

Performance impact: I measured on spidermonkey-json.wasm and bz2.wasm
with Sightglass+Wasmtime+Cranelift. I found 0-2% compile time regression
on bz2 (mean ~1%), and 0-1% regression on SpiderMonkey (mean ~0.4%).
That's not nothing, but it's worth it to solve this issue, IMHO.

alexcrichton left a comment

for when this one is ready (agreed the other one should land before this though)