Skip to content

Migration to allocations#349

Open
LuckyIYI wants to merge 42 commits intomainfrom
buffers-to-allocations
Open

Migration to allocations#349
LuckyIYI wants to merge 42 commits intomainfrom
buffers-to-allocations

Conversation

@LuckyIYI
Copy link
Copy Markdown
Contributor

No description provided.

Copy link
Copy Markdown

@norpadon norpadon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accidentally stumbled upon this PR and found a few things to nitpick

Comment thread crates/backend-uzu/src/encodable_block/linear/rht_wrapper.rs
Comment thread crates/uzu/src/encodable_block/mlp/dense.rs Outdated
Comment thread crates/backend-uzu/src/encodable_block/mlp/moe.rs Outdated
Comment thread crates/uzu/src/encodable_block/mlp/moe.rs Outdated
Comment thread crates/uzu/src/encodable_block/attention.rs Outdated
@LuckyIYI LuckyIYI requested a review from norpadon April 20, 2026 22:08
@LuckyIYI LuckyIYI changed the title Migration to allocations [wip] Migration to allocations Apr 20, 2026
@LuckyIYI LuckyIYI marked this pull request as ready for review April 20, 2026 23:27
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 70ed067262

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread crates/backend-uzu/src/backends/common/encoder.rs Outdated
Comment thread crates/backend-uzu/src/array.rs Outdated
Comment thread crates/backend-uzu/src/backends/common/allocator/allocator.rs
Comment thread crates/backend-uzu/src/backends/common/allocator/allocator.rs Outdated
Comment thread crates/backend-uzu/src/backends/common/kernel/attention.rs Outdated
Comment thread crates/backend-uzu/src/backends/common/kernel/attention.rs Outdated
Comment thread crates/backend-uzu/src/backends/common/kernel/mlp_gate_act_mul.rs Outdated
Comment thread crates/backend-uzu/src/backends/common/encoder.rs Outdated
Comment thread crates/backend-uzu/src/encodable_block/tensor_add_swap.rs Outdated
Comment thread crates/backend-uzu/src/array.rs Outdated

let src = &self.suffix_state.as_bytes()[src_start..src_end];
let dst = self.conv_state.as_bytes_mut();
dst.copy_from_slice(src);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

 dst.copy_from_slice(src);

This used to panic when length mismatch I think.

And now it silently clamps the len, is this behavior expected?

Comment thread crates/backend-uzu/src/encodable_block/linear/mod.rs Outdated
Comment thread crates/backend-uzu/src/backends/common/encoder.rs Outdated
Comment thread crates/backend-uzu/src/encodable_block/decoder.rs Outdated
# Conflicts:
#	crates/backend-uzu/build/common/traitgen.rs
#	crates/backend-uzu/build/cpu/compiler.rs
#	crates/backend-uzu/src/backends/cpu/kernel/ssm/ssd_prefill.rs
#	crates/backend-uzu/src/backends/metal/kernel/matmul/mod.rs
#	crates/backend-uzu/src/encodable_block/classifier_layer.rs
#	crates/backend-uzu/src/encodable_block/layer/executables.rs
#	crates/backend-uzu/src/encodable_block/rope.rs
@LuckyIYI LuckyIYI requested review from CC-Yeh and uuuvn April 22, 2026 09:07
Comment thread crates/backend-uzu/src/backends/common/kernel/matmul.rs
Comment thread crates/backend-uzu/src/backends/common/kernel/sampling.rs Outdated
Comment thread crates/backend-uzu/src/backends/common/command_buffer.rs Outdated
Comment thread crates/backend-uzu/src/backends/common/encoder.rs Outdated
Comment thread crates/backend-uzu/src/backends/common/encoder.rs Outdated
Comment thread crates/backend-uzu/src/encodable_block/decoder.rs
Comment thread crates/backend-uzu/src/encodable_block/delta_net_mixer.rs Outdated
Comment thread crates/backend-uzu/src/encodable_block/embedding.rs Outdated
Comment thread crates/backend-uzu/src/encodable_block/embedding.rs Outdated
Comment thread crates/backend-uzu/src/encodable_block/layer_norm.rs Outdated
@LuckyIYI LuckyIYI requested a review from uuuvn May 1, 2026 23:11
Comment on lines -5 to 10
pub(super) fn read_float_array<B: Backend, const RANK: usize>(
tree: &BackendParameterTree<'_, B>,
pub(super) fn read_float_allocation<B: Backend, const RANK: usize>(
tree: &BackendParameterTree<B>,
name: &str,
expected_data_type: DataType,
) -> AudioResult<([usize; RANK], Array<B>)> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you renaming read_float_array to read_float_allocation when this function still returns array?

Comment on lines +34 to +35
let (shape, array) = read_float_array::<B, 1>(tree, name, expected_data_type)?;
let (shape, allocation) = read_float_allocation::<B, 1>(tree, name, expected_data_type)?;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as previous one:

Image

This is array, not an allocation

I won't flag more changes that are almost exactly like that, please read the diff before re-requesting review. A change that renames a bunch of functions and variables that were and are still arrays into allocations should never make it's way into a supposedly ready to review pull request.

Comment on lines +31 to +30
pub hidden: &'a mut B::Buffer,
pub hidden: &'a mut Allocation<B>,
/// Final output [d_model]
pub y: &'a mut B::Buffer,
pub y: &'a mut Allocation<B>,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hidden is not an argument but an intermediate (the comment literally says this + it's fully overwritten in pass a without being read), y should similarly be a return of MoeExpertsSingleDecodeKernels::encode

Also looking at lsp references for this it looks like it's just a dead code (nothing outside of tests encodes this kernel), so it should be just deleted (in a separate prereq pr).

type Completed: CommandBufferCompleted<CommandBuffer = Self>;
}

type CommandBufferBackend<C> = <C as CommandBuffer>::Backend;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not have "getter types", things like Allocation<CommandBufferBackend<Self::CommandBuffer>> are much more confusing than Allocation<<Self::CommandBuffer as CommandBuffer>::Backend>

Comment on lines 82 to 107
pub fn encode_copy(
&mut self,
src: &B::Buffer,
src_range: Range<usize>,
dst: &mut B::Buffer,
dst_range: Range<usize>,
src: &Allocation<B>,
src_range: impl RangeBounds<usize>,
dst: &mut Allocation<B>,
dst_range: impl RangeBounds<usize>,
) {
let size = src_range.end - src_range.start;
debug_assert_eq!(size, dst_range.end - dst_range.start);
let (src_buffer, src_allocation_range) = src.as_buffer_range();
let (dst_buffer, dst_allocation_range) = dst.as_buffer_range();
let src_range = resolve_copy_range(src_range, src_allocation_range.len(), "source");
let dst_range = resolve_copy_range(dst_range, dst_allocation_range.len(), "destination");
let byte_len = src_range.len();
assert_eq!(byte_len, dst_range.len(), "copy range lengths must match");
assert!(byte_len > 0, "zero-sized copies are not allowed");
let src_access_range = src_allocation_range.start + src_range.start..src_allocation_range.start + src_range.end;
let dst_access_range = dst_allocation_range.start + dst_range.start..dst_allocation_range.start + dst_range.end;
self.access(&[
Access {
range: src.gpu_address_subrange(src_range.clone()),
range: src_buffer.gpu_address_subrange(src_access_range),
flags: AccessFlags::copy_read(),
},
Access {
range: dst.gpu_address_subrange(dst_range.clone()),
range: dst_buffer.gpu_address_subrange(dst_access_range),
flags: AccessFlags::copy_write(),
},
]);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very confusing and it implements the same range add thing both here and in backend's implementation, this is not the right way to do things.

@LuckyIYI LuckyIYI requested a review from uuuvn May 6, 2026 04:02
pub down_biases: &'a Allocation<B>,
/// Hidden buffer [K, d_ff] - intermediate storage (f32)
pub hidden: &'a mut B::DenseBuffer,
pub hidden: &'a mut Allocation<B>,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should not be an argument any more and but a part of MoeExpertsSingleDecodeKernels::encode

pub hidden: &'a mut Allocation<B>,
/// Final output [d_model]
pub y: &'a mut B::DenseBuffer,
pub y: &'a mut Allocation<B>,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be returned by the encode

Comment on lines +159 to +160
pub hidden: &'a mut Allocation<B>,
pub output: &'a mut Allocation<B>,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hidden & the output should be created inside

Comment on lines +356 to +362
let mut x = encoder.allocate_scratch(size_for_shape(
&[active_row_count, self.config.num_heads, self.config.head_dim],
self.data_type,
))?;
let bc_shape = [active_row_count, self.config.num_groups, self.config.state_dim];
let mut b = encoder.allocate_scratch(size_for_shape(&bc_shape, self.data_type))?;
let mut c = encoder.allocate_scratch(size_for_shape(&bc_shape, self.data_type))?;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe x, b, c should be allocated inside run_conv_scan and returned bound in a struct

Comment on lines +345 to +353
let in_proj = self.in_projection.encode(input, active_row_count, encoder)?;
let mut conv_inputs =
encoder.allocate_scratch(size_for_shape(&[active_row_count, self.config.conv_dim()], self.data_type))?;
let mut z = encoder.allocate_scratch(size_for_shape(
&[active_row_count, self.config.num_heads, self.config.head_dim],
self.data_type,
))?;
let mut dt =
encoder.allocate_scratch(size_for_shape(&[active_row_count, self.config.num_heads], self.data_type))?;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like in_proj, z and dt should be allocated inside self.run_split_inproj and returned as a struct

ArrayId::AttentionOutput,
ArrayId::Main,
)
.expect("Failed to create out-projection kernel");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace expect with error

model_dim: usize,
decoder_layer_loader: &ParameterTree<B::Context>,
) -> (Self, Option<B::DenseBuffer>) {
) -> (Self, Option<Allocation<B>>) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should return a Result

) -> (Self, Option<B::DenseBuffer>) {
) -> (Self, Option<Allocation<B>>) {
if !matches!(layer_type, DecoderLayerType::ShortConv { .. }) {
panic!("Layer {} marked as non-ShortConv but ShortConv config provided", layer_index);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace panic! with error

ArrayId::Main,
ArrayId::SsmInProj,
)
.expect("Failed to create in-projection kernel");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace expect with error

ArrayId::AttentionOutput,
ArrayId::Main,
)
.expect("Failed to create out-projection kernel");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace expect with error

@LuckyIYI LuckyIYI requested a review from eugenebokhan May 8, 2026 11:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants