Skip to content

Added floating point types support for work_group_scan_exclusive tests#2533

Open
shajder wants to merge 4 commits into
KhronosGroup:mainfrom
shajder:add_work_group_scan_exc_fp
Open

Added floating point types support for work_group_scan_exclusive tests#2533
shajder wants to merge 4 commits into
KhronosGroup:mainfrom
shajder:add_work_group_scan_exc_fp

Conversation

@shajder
Copy link
Copy Markdown
Contributor

@shajder shajder commented Sep 24, 2025

Related to #1364

-adapted from KhronosGroup#2525 due to code review
@shajder shajder requested a review from rjodinchr June 3, 2026 10:34
Comment on lines +25 to +26
constexpr cl_half g_half_min = 0xFC00;
constexpr cl_half g_half_max = 0x7C00;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do do not keep the maximum finite values?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, I've implemented that sep 2025, I don't recall the details right now and I will have to dive deeper. I will get back later.

static void generate_input_values(Type *inptr, size_t n_elems,
size_t max_wg_size,
const Type *max_err = nullptr)
Type *const max_err = nullptr)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand why we want to add a const, but no need to remove the one already there, right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The two consts aren't the same: the old const Type * protects the pointee, but max_err is now written to (it's an out-param), so that const has to go; the new Type *const only locks the pointer, not the data.

Comment on lines +495 to +502
// to prevent overflow limit range of randomization
float max_range = 99.0;
float min_range = -99.0;
// generate reference values for one work group
for (size_t j = 0; j < max_wg_size; j++)
ref_vals[j] = cl_half_from_float(
get_random_float(min_range, max_range, d),
gHalfRoundingMode);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to select in that specific range? we do not have any range for integers.

Copy link
Copy Markdown
Contributor Author

@shajder shajder Jun 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Integer add wraps deterministically and is order-independent, so reference and device match even on overflow, no range needed. FP isn't: scan order is unspecified, and once a partial sum saturates to positive or negative INF the value is lost and INF (or INF-INF -> NaN) can appear at different positions on device vs reference, causing false mismatches. The range just keeps every partial sum finite.

Comment on lines +504 to +509
// populate reference data across all work groups
for (size_t i = 0; i < (size_t)n_elems; i += max_wg_size)
{
size_t wg_size = std::min(max_wg_size, n_elems - i);
memcpy(&inptr[i], ref_vals.data(), sizeof(Type) * wg_size);
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we duplicate the data? we generate random values for integers for each workgroup it seems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants