conversions: Refactor tests to reduce dataset size #2711

Open: rjodinchr wants to merge 2 commits into KhronosGroup:main from rjodinchr:conversions

@rjodinchr
Collaborator

This commit reduces the overall dataset size for conversion tests
while maintaining high confidence in edge-case coverage. By replacing
hardcoded arrays with a dynamically generated set of problematic
corner cases, the test execution time and memory footprint are
optimized without sacrificing strict conformance validation.

It also introduces an exhaustive testing mode (`-a` flag) to test 2^32
inputs when full coverage is explicitly desired, and corrects the
workload reduction math for Wimpy and Embedded modes. Thread safety is
guaranteed via a global `std::recursive_mutex`, and exact bit-level
deduplication using `std::unordered_set` prevents redundant
floating-point testing (e.g., collapsing NaNs or signed zeros).

Special Value Selection Strategies:

  • Integer Types (Base and Conversions)
    Focuses on a dense cluster of known boundary triggers rather than
    sparse randoms. The generator populates a baseline pool (0 to 255),
    calculates powers of 2, 3, 5, 7, and 10 up to the sign bit, and
    creates bitwise combinations of repeating patterns and shifting
    masks. Crucially, to catch off-by-one and alignment errors, every
    base value generates its immediate neighbors (offsets from -3 to
    +3), along with their bitwise NOT and sign-bit XORs.
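
The integer strategy above can be sketched for 32-bit values as follows. This is an illustration under stated assumptions, not the generator from this PR: the function names are hypothetical, the specific repeating patterns and mask shifts are arbitrary choices, "up to the sign bit" is read as "not exceeding 2^31", and the NOT and sign-bit XOR are applied to each neighbor.

```cpp
#include <cstdint>
#include <initializer_list>
#include <set>

// Hypothetical helper: insert the neighbours v-3 .. v+3 of a base value,
// plus the bitwise NOT and the sign-bit XOR of each neighbour.
static void add_with_variants(std::set<uint32_t>& pool, uint32_t v)
{
    const uint32_t sign_bit = 0x80000000u;
    for (int d = -3; d <= 3; ++d)
    {
        // Unsigned arithmetic wraps modulo 2^32, so 0 - 1 becomes 0xFFFFFFFF.
        const uint32_t n = v + static_cast<uint32_t>(d);
        pool.insert(n);
        pool.insert(~n);
        pool.insert(n ^ sign_bit);
    }
}

// Build a deduplicated pool of 32-bit patterns (std::set keeps them sorted).
std::set<uint32_t> make_int_special_values()
{
    const uint32_t sign_bit = 0x80000000u;
    std::set<uint32_t> pool;

    // Baseline pool: 0 to 255.
    for (uint32_t v = 0; v <= 255; ++v) add_with_variants(pool, v);

    // Powers of 2, 3, 5, 7 and 10 that do not exceed the sign bit (2^31).
    // The 64-bit counter prevents overflow in the loop condition.
    for (uint64_t base : { 2, 3, 5, 7, 10 })
        for (uint64_t p = base; p <= sign_bit; p *= base)
            add_with_variants(pool, static_cast<uint32_t>(p));

    // Repeating patterns combined with shifting masks (illustrative choice).
    for (uint32_t pattern :
         { 0x55555555u, 0x33333333u, 0x0F0F0F0Fu, 0x00FF00FFu })
        for (int shift = 0; shift < 32; shift += 8)
            add_with_variants(pool, pattern & (0xFFFFFFFFu << shift));

    return pool;
}
```

A `std::set` is used here only to get deterministic, duplicate-free output; the description mentions `std::unordered_set` for deduplication, which would work equally well if ordering does not matter.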

  • Floating-Point Types
    Focuses strictly on precision loss and rounding boundaries. It seeds
    a core set of known difficult values (-/+ INF, NaN, subnormals, and
    type limits). To test exact rounding thresholds, the generator
    calculates the +/- 1 ULP neighbors for every seeded value via
    integer bitcasting. Finally, to guarantee coverage of saturation and
    overflow thresholds, cross-type boundary values corresponding to the
    specific destination type's domain limits (e.g., injecting exact
    char/short limits into float inputs) are dynamically added to the
    test set.
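
The floating-point strategy above can be sketched as follows. This is a minimal illustration, assuming 32-bit IEEE 754 floats; `to_bits`, `from_bits`, `add_with_ulp_neighbors`, and `make_float_inputs` are hypothetical names, and the core seed list is a small stand-in for whatever the PR actually seeds. Stepping the raw bit pattern by one yields the adjacent representable value in magnitude for finite, non-zero inputs.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>
#include <vector>

static uint32_t to_bits(float f)
{
    uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

static float from_bits(uint32_t u)
{
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

// Append x, then the values whose bit patterns are one above and one below.
// For finite non-zero x these are the 1-ULP neighbours in magnitude. At
// zeros and infinities one stepped pattern is a NaN; a real generator may
// want to special-case those seeds.
void add_with_ulp_neighbors(std::vector<float>& out, float x)
{
    const uint32_t u = to_bits(x);
    out.push_back(x);
    out.push_back(from_bits(u + 1)); // one step up in magnitude
    out.push_back(from_bits(u - 1)); // one step down in magnitude
}

// Seed a small core set, then inject the destination type's limits
// (for example 127 and -128 when converting float to char).
std::vector<float> make_float_inputs(const std::vector<float>& dest_limits)
{
    using lim = std::numeric_limits<float>;
    std::vector<float> seeds = { lim::infinity(), -lim::infinity(),
                                 lim::quiet_NaN(), lim::denorm_min(),
                                 lim::min(),       lim::max(),
                                 lim::lowest() };
    seeds.insert(seeds.end(), dest_limits.begin(), dest_limits.end());

    std::vector<float> out;
    for (float s : seeds) add_with_ulp_neighbors(out, s);
    return out;
}
```

For a float-to-char conversion, `make_float_inputs({127.0f, -128.0f})` would place the exact saturation thresholds and their immediate neighbors in the input set. The output may still contain repeated bit patterns, so a bit-level deduplication pass would normally follow.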

The primary goal of this commit is to improve CI tracking by
introducing a new golden format that can differentiate test results
based on command-line arguments. To cleanly extract and pass these
arguments into the JSON result outputs, the command-line parsing
infrastructure across the CTS required a significant refactoring.

Key changes include:
* Enhanced CI Tracking: Updates `ci/compare_results.py`,
`ci/pocl/golden.json`, and `saveResultsToJson` to include and evaluate
an `args` key. The golden JSON now uses a nested format mapping
specific argument strings (e.g., `--wimpy -1`) to their expected
results, allowing the CI to validate the same binary run under
different parameters.
* Centralized Parsing Infrastructure: Introduces the `ParseArgsFn`
callback and `runTestHarnessWithCheckAndParse`. This offloads custom
argument parsing from individual test `main()` functions and safely
extracts the arguments used so they can be logged by the test harness.
* Help Text Consolidation: Replaces fragmented `printUsage()`
functions with unified `help` string references populated directly by
the standard parsing callbacks.
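
The idea behind the centralized parsing can be sketched as follows. The actual `ParseArgsFn` signature and the behavior of `runTestHarnessWithCheckAndParse` are not shown in this description, so every name below (`ParseArgsSketchFn`, `ParsedArgs`, `split_args`, `join_args`) is hypothetical. The sketch only illustrates the general pattern: a callback claims the arguments it understands, and the harness records them so they can be joined into a string such as the `args` key.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical callback type: returns true if it recognised (and would
// handle) the given argument. Not the real ParseArgsFn signature.
using ParseArgsSketchFn = std::function<bool(const std::string&)>;

struct ParsedArgs
{
    std::vector<std::string> consumed;  // arguments claimed by the callback
    std::vector<std::string> remaining; // left for generic harness handling
};

// Route each argument either to the test-specific callback or to the
// harness, remembering which ones the callback claimed.
ParsedArgs split_args(const std::vector<std::string>& args,
                      const ParseArgsSketchFn& parse)
{
    ParsedArgs result;
    for (const std::string& a : args)
        (parse(a) ? result.consumed : result.remaining).push_back(a);
    return result;
}

// Join arguments with single spaces, e.g. to build a lookup key for a
// results file keyed by argument string.
std::string join_args(const std::vector<std::string>& args)
{
    std::string key;
    for (const std::string& a : args)
    {
        if (!key.empty()) key += ' ';
        key += a;
    }
    return key;
}
```

Keeping the claimed arguments in a separate list is what lets a harness log exactly the options that influenced a run, independent of test names or other positional arguments.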

[run-test: test_computeinfo]
[run-test: test_bruteforce -1 -w]
[run-test: test_cl_copy_images small_images --num-worker-threads 2 1D]
[run-test: test_image_streams 1D --num-worker-threads 2 CL_R CL_FILTER_NEAREST]
@rjodinchr
Collaborator Author

Depends on #2706

@ahesham-arm
Collaborator

I would be interested to know if you have any metrics that you can share, e.g. execution times and peak memory usage before and after your change.

@rjodinchr
Collaborator Author

I don't have memory usage numbers, but execution is about 40 times faster.
