Operand/VReg encoding: widen VReg index by doubling Operand size. #257
Conversation
This addresses several longstanding issues we've had in Cranelift where very large inputs can cause the frontend or mid-end to run out of VRegs when generating or mutating IR. For example, in a recent Wasmtime meeting we agreed that the implementation limits (e.g., 7654321 bytes for a single Wasm function body) are really minimum requirements for us to support as well as maximum bounds for producers to make use of; any input that conforms to those limits should be compilable (given enough CPU time and memory) by the Cranelift backend.

The main limiter on the number of virtual registers we could support in one function body was this crate, the register allocator, and its choice to pack an `Operand` into 32 bits. The operand bundles together a virtual register index with other dimensions, like def/use, early/late, and constraints (fixed reg, reg, stack, reuse-input, etc.). We got the other fields to fit into 11 bits, leaving 21 for the virtual register: 2^21, or 2M (2097152), virtual registers to use.

This was a performance optimization, as we had found that the memory traffic and cache pressure caused by a larger `Operand` were measurable and we wanted to avoid this overhead. Nevertheless, as the limit continues to come up and the need is clear, it seems to be time to re-evaluate. This PR thus bumps the limit to 2^30 VRegs by expanding `Operand` to 64 bits. The `Operand` bitpacking itself allows for 2^32 VRegs, but `VReg` has to carry its register class around (2 bits) and encodes into a `u32`, so 30 bits it is. A billion VRegs should be enough for anybody. (Let's hope!)

Performance impact: I measured on spidermonkey-json.wasm and bz2.wasm with Sightglass+Wasmtime+Cranelift. I found 0-2% compile time regression on bz2 (mean ~1%), and 0-1% regression on SpiderMonkey (mean ~0.4%). That's not nothing, but it's worth it to solve this issue, IMHO.
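To make the encoding concrete, here is a minimal sketch of the widened packing in Rust. The field names, the exact bit positions, and the single `other_fields` catch-all are illustrative assumptions, not the actual regalloc2 layout; the real `Operand` splits those upper bits into separate def/use, position, and constraint fields.

```rust
// Minimal sketch of the widened packing; not the actual regalloc2 layout.

/// A VReg still encodes into a u32: 2 bits of register class + 30 bits of index.
#[derive(Clone, Copy)]
struct VReg(u32);

impl VReg {
    const INDEX_BITS: u32 = 30;
    const INDEX_MASK: u32 = (1 << Self::INDEX_BITS) - 1;

    fn new(index: u32, class: u32) -> Self {
        assert!(index <= Self::INDEX_MASK, "VReg index must fit in 30 bits");
        assert!(class < 4, "register class is 2 bits");
        VReg((class << Self::INDEX_BITS) | index)
    }

    fn index(self) -> u32 {
        self.0 & Self::INDEX_MASK
    }

    fn class(self) -> u32 {
        self.0 >> Self::INDEX_BITS
    }
}

/// With Operand widened to 64 bits, the full 32-bit VReg encoding fits in the
/// low half, and the ~11 bits of def/use, position, and constraint data sit in
/// the bits above it, with room to spare.
#[derive(Clone, Copy)]
struct Operand(u64);

impl Operand {
    fn new(vreg: VReg, other_fields: u64) -> Self {
        Operand((other_fields << 32) | vreg.0 as u64)
    }

    fn vreg(self) -> VReg {
        VReg(self.0 as u32)
    }
}

fn main() {
    // A VReg index well beyond the old 2^21 limit now round-trips cleanly.
    let v = VReg::new(5_000_000, 1);
    let op = Operand::new(v, 0b101);
    assert_eq!(op.vreg().index(), 5_000_000);
    assert_eq!(op.vreg().class(), 1);
}
```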
alexcrichton
left a comment
Code-wise I'm pretty unfamiliar with all of this, so I can give a review from a style/Rust/etc. perspective, but I can't review much from a broader semantics perspective. It makes sense that this widens the `Operand` type, but I would basically just be rubber-stamping your own judgement on whether it's the best way to achieve this (not that I have any better ideas myself). It might be good to have @fitzgen look at this too, as I think he's more familiar with the register allocator than I am.
Review-wise, though, there are a lot of magic numbers that get updated here as things are shifted around. I'd personally find it easier if things had symbolic names that are all derived from one another, to make it more obvious what's being affected, but I understand that might be best done as a follow-up rather than here.
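Concretely, the pattern I have in mind is something like the following (hypothetical field names and widths, not taken from the actual source), where each field's shift is derived from the previous field's shift and width:

```rust
// Hypothetical field widths/names purely to illustrate derived constants;
// not the actual regalloc2 encoding.
const VREG_INDEX_BITS: u32 = 30;
const CLASS_BITS: u32 = 2;
const VREG_SHIFT: u32 = 0;
const KIND_SHIFT: u32 = VREG_SHIFT + VREG_INDEX_BITS + CLASS_BITS;
const KIND_BITS: u32 = 1;
const POS_SHIFT: u32 = KIND_SHIFT + KIND_BITS;
const POS_BITS: u32 = 1;
const CONSTRAINT_SHIFT: u32 = POS_SHIFT + POS_BITS;
// ...and so on: widening one field only touches its own `*_BITS` constant,
// and every later shift follows automatically.

fn main() {
    // Sanity-check that the derived layout fits in a 64-bit Operand word.
    assert!(CONSTRAINT_SHIFT < 64);
    println!("constraint field starts at bit {CONSTRAINT_SHIFT}");
}
```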