Implement AdvancedIndexingTask using thrust::copy_if #1226

@manopapad

See #1215 (comment). Quoting relevant part here:

I also have a quick question about memory usage. Since it worked fine on the other cluster I wanted to see the maximum size of array that will work. I noticed that the operation a_train = a[train_mask] requires peak memory of about 2.5 times the size of the original matrix a, when the mask train_mask selects 50% of the rows (train_mask is 1D array).

My expectation was a memory footprint of about 1.5x (1x for a + 0.5x for a_train). Can you confirm if this 2.5x peak memory usage is expected for boolean mask indexing? If so, could you briefly explain why the temporary memory requirement is so high?

The current implementation of the "masked copy" operation first calculates the offsets of the non-zeroes (which uses an array of size equal to that of the original mask array), then creates the output array and uses the offsets to fill it in. Therefore, the memory overhead you quote is expected, under the existing code. See https://github.com/nv-legate/cupynumeric/blob/main/src/cupynumeric/index/advanced_indexing.cu#L120.
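
To make the overhead concrete, here is a minimal host-side sketch of the two-pass scheme described above. It is not the cuPyNumeric kernel. The function name `masked_copy_two_pass`, the flat 1D layout, and the `int64_t` offset type are all assumptions made for illustration. The point is only that the `offsets` temporary has one entry per mask element and stays alive while the output is allocated and filled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not cuPyNumeric code): two-pass masked copy on the host.
std::vector<double> masked_copy_two_pass(const std::vector<double>& values,
                                         const std::vector<bool>& mask)
{
  assert(values.size() == mask.size());
  const std::size_t n = values.size();

  // Pass 1: exclusive prefix sum of the mask. offsets[i] is the output
  // position of values[i] when mask[i] is set. This temporary has as many
  // entries as the mask itself.
  std::vector<std::int64_t> offsets(n);
  std::int64_t count = 0;
  for (std::size_t i = 0; i < n; ++i) {
    offsets[i] = count;
    count += mask[i] ? 1 : 0;
  }

  // Pass 2: the output size is now known, so allocate it and scatter the
  // selected elements to their precomputed positions.
  std::vector<double> out(static_cast<std::size_t>(count));
  for (std::size_t i = 0; i < n; ++i) {
    if (mask[i]) out[static_cast<std::size_t>(offsets[i])] = values[i];
  }
  return out;
}
```

With this structure, peak memory is the input, plus the full-size offsets array, plus the output, all live at the same time.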

But possibly for the case of a dense array we could use thrust::copy_if, which has lower memory requirements.
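
As a rough illustration of the alternative, the sketch below does the compaction without materializing a per-element offsets array. It is a host-side stand-in written with the C++ standard library, not Thrust code. On the GPU the corresponding call would be the `thrust::copy_if` overload that takes a separate stencil sequence, with the mask as the stencil. The name `masked_copy_single_pass` and the 1D layout are assumptions for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not cuPyNumeric code): stencil-style compaction that
// keeps values[i] wherever mask[i] is set, without an offsets temporary.
std::vector<double> masked_copy_single_pass(const std::vector<double>& values,
                                            const std::vector<bool>& mask)
{
  assert(values.size() == mask.size());

  // Count the selected elements so the output can be sized exactly.
  const auto selected = std::count(mask.begin(), mask.end(), true);

  std::vector<double> out;
  out.reserve(static_cast<std::size_t>(selected));

  // Append selected elements in order. Only the output buffer is allocated
  // in addition to the inputs.
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (mask[i]) out.push_back(values[i]);
  }
  return out;
}
```

A device implementation still needs some scratch space for its internal scan, so the saving would have to be measured. The expectation stated in the issue is only that it is smaller than a full-size offsets array.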
