Sparse heaps #392
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: dbc3b55b61
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
@codex review
💡 Codex Review
Reviewed commit: b6d5015647
CC-Yeh left a comment:
What do you think about keeping a warm pool of heaps to reduce TTFT? We could trade some memory for latency. Might be useful for short inputs.
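A warm pool along those lines could be sketched as below. This is only an illustration of the idea: `Heap`, `WarmHeapPool`, and their methods are placeholder names, not types from this PR; the real pool would hold Metal heaps and pre-allocate them at startup so the first request never pays the allocation cost.

```rust
// Hypothetical sketch: `Heap` stands in for a Metal heap; only the
// pooling logic is real here. Pre-allocating trades resident memory
// for lower time-to-first-token.
struct Heap {
    capacity: usize,
}

struct WarmHeapPool {
    free: Vec<Heap>,
}

impl WarmHeapPool {
    /// Eagerly allocate `count` heaps up front.
    fn prewarm(count: usize, capacity: usize) -> Self {
        let free = (0..count).map(|_| Heap { capacity }).collect();
        WarmHeapPool { free }
    }

    /// Hand out a warm heap if one is available; fall back to a
    /// cold allocation otherwise.
    fn acquire(&mut self, capacity: usize) -> Heap {
        self.free.pop().unwrap_or(Heap { capacity })
    }

    /// Return a heap to the pool for reuse.
    fn release(&mut self, heap: Heap) {
        self.free.push(heap);
    }
}
```

The memory/latency trade-off is the `prewarm` count: each pooled heap stays resident even when idle.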
```rust
let device_capabilities = MetalDeviceCapabilities::from_device(&device);

let page_size = MTLSparsePageSize::KB256;
let heap_capacity = 64 * 4 * page_size.byte_size().as_u64() as usize;
```
Maybe we could make this configurable: smaller for iPhones, bigger for Macs. Or we could even have different pools for different use cases.
Sizes will be set up together with the first usage of sparse buffers. For now I've just added a placeholder value.
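For reference, a hedged sketch of what "configurable per device class" might look like. The `DeviceClass` enum and the page multipliers are invented for illustration; only the 256 KB page size and the `64 * 4`-page Mac-side total come from the snippet above.

```rust
// Hypothetical sketch: pick a sparse-heap capacity per device class
// instead of a single hard-coded value. Multipliers are placeholders.
const PAGE_SIZE_BYTES: usize = 256 * 1024; // MTLSparsePageSize::KB256

#[derive(Clone, Copy)]
enum DeviceClass {
    Phone,
    Mac,
}

fn heap_capacity(class: DeviceClass) -> usize {
    // Fewer pages on memory-constrained phones, more on Macs.
    let pages = match class {
        DeviceClass::Phone => 64,
        DeviceClass::Mac => 64 * 4,
    };
    pages * PAGE_SIZE_BYTES
}
```

Separate pools per use case would then just be separate `heap_capacity`-style policies keyed by workload instead of device.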
```rust
    })
    .collect();

cmd_queue.update_buffer_mappings(buffer, Some(&self.heap), &mtl_operations);
```
Potential race? Do we need a fence/barrier for the intra-queue and inter-queue cases?
The caller must be responsible for synchronization, because with many maps, for example, it's not a good idea to synchronize on every map.
Thanks. How can we enforce this? Maybe tests plus documentation can help.
```rust
#[derive(Clone, PartialEq)]
pub(super) struct MetalSparseHeapBufferMapping {
    gpu_address: u64,
```
Could gpu_address be brittle if Drop fails or finishes only partially? Maybe store a pointer to the buffer? Or we could store the mapping on the buffer side?
> Could gpu_address be brittle if Drop fails or finishes only partially?

How would that be possible? In any case, after drop() is called the object is no longer usable. Or maybe I misunderstood what you mean.
The pool's bookkeeping is keyed by gpu_address, a value Metal can reassign to a new buffer later, so any orphan entry left behind by a partial Drop becomes indistinguishable from a live entry.
But in the case of duplicate entries, the first unmap would probably release the heap and the second would be a no-op.
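A sketch of the aliasing concern and one possible mitigation: key the bookkeeping by a monotonically increasing mapping id rather than by `gpu_address`, so an entry orphaned by a partial Drop can never collide with a later buffer that happens to receive the same GPU address. `MappingPool` and its methods are placeholders, not this PR's types.

```rust
use std::collections::HashMap;

// Hypothetical sketch: unique ids make orphaned entries harmless and
// make double-unmap an explicit no-op.
struct MappingPool {
    next_id: u64,
    live: HashMap<u64, u64>, // mapping id -> gpu_address
}

impl MappingPool {
    fn new() -> Self {
        MappingPool { next_id: 0, live: HashMap::new() }
    }

    /// Register a mapping and return its unique id. Two mappings may
    /// share a gpu_address over time but never an id.
    fn insert(&mut self, gpu_address: u64) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        self.live.insert(id, gpu_address);
        id
    }

    /// Idempotent removal: the first unmap releases the entry, a
    /// second unmap of the same id is a no-op.
    fn remove(&mut self, id: u64) -> bool {
        self.live.remove(&id).is_some()
    }
}
```

Storing the mapping on the buffer side, as suggested above, achieves the same thing by tying the entry's lifetime to the buffer instead of to a reusable address.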
```rust
    }
}

heap.execute(buffer, &context.command_queue4, &mappings, true);
```
Early return when there is no overlap, to skip the expensive op?
There is nothing expensive inside if mappings are empty.
My understanding is that the CPU will still talk to the GPU even if mappings is empty; maybe that's cheap, I'm not sure.
Planned to do in the future.
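The suggested guard is tiny either way; a sketch with a stub standing in for the real encode/commit path (names are illustrative, not this PR's API):

```rust
// Hypothetical sketch: skip the CPU->GPU round trip entirely when
// there is nothing to remap.
struct Mapping;

fn submit_to_gpu(submissions: &mut usize, _mappings: &[Mapping]) {
    *submissions += 1; // stands in for a command-buffer commit
}

/// Returns true when work was actually submitted.
fn execute(submissions: &mut usize, mappings: &[Mapping]) -> bool {
    if mappings.is_empty() {
        return false; // early return: no command buffer is even encoded
    }
    submit_to_gpu(submissions, mappings);
    true
}
```

Even if an empty submission is cheap on the GPU side, the guard also skips the CPU-side encode, so it costs nothing to add.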
No description provided.