Fix: stop consuming fuzzer bytes for single-choice .values / .pointers on arrays#8

Closed
N3ur0sis wants to merge 1 commit into main from fix/single-value-domain-no-fuzzer-bytes

@N3ur0sis
Contributor

For a field with .values or .pointers, the sampler always read one byte from the fuzzer and did off += 1 to pick an index, even when there was only one allowed choice (data[off] % 1 is always 0).

For a large global array (e.g. uint8_t big_pool[16000] with .values = .{"\x00"}), the generated C read one input byte for every element, so the fuzzer prefix grew by 16 000 useless bytes instead of 0. That inflated ABSOLUTION_GLOBALS_SIZE and the seed size, and wasted mutation budget on bytes that have no effect before the harness sees useful data (e.g. other globals).

Before (generated sample_invariant excerpt):

memset(big_pool, 0, sizeof(big_pool));
for (size_t i0 = 0; i0 < 16000; i0++) {
    size_t idx_FM_VAL_0 = data[off] % 1;
    memcpy(&big_pool[0 + i0 * 1], &FM_VAL_0[idx_FM_VAL_0 * 1], 1);
    off += 1;
}

ABSOLUTION_GLOBALS_SIZE was 1600.

After:

memset(big_pool, 0, sizeof(big_pool));
for (size_t i0 = 0; i0 < 16000; i0++) {
    memcpy(&big_pool[0 + i0 * 1], FM_VAL_0, 1);
}

See tests/big_pool/
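
To make the difference in consumed input concrete, here is a minimal, self-contained sketch of both shapes. It is not the generated code: the function names `sample_before` and `sample_after`, the `POOL_LEN` macro, and the return-the-offset convention are assumptions made for illustration, and `FM_VAL_0` is redefined locally to hold the single allowed byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POOL_LEN 16000  /* hypothetical name for the array length */

uint8_t big_pool[POOL_LEN];
const uint8_t FM_VAL_0[] = { 0x00 };  /* the only allowed value */

/* Old shape: reads data[off] for every element even though the index
 * is always 0. Returns the number of fuzzer bytes consumed. */
size_t sample_before(const uint8_t *data, size_t size) {
    size_t off = 0;
    assert(size >= POOL_LEN);  /* the caller must supply the full prefix */
    memset(big_pool, 0, sizeof(big_pool));
    for (size_t i0 = 0; i0 < POOL_LEN; i0++) {
        size_t idx = data[off] % 1;  /* always 0 */
        memcpy(&big_pool[i0], &FM_VAL_0[idx], 1);
        off += 1;
    }
    return off;
}

/* New shape: the index is fixed, so no input is read at all.
 * Returns the number of fuzzer bytes consumed. */
size_t sample_after(void) {
    size_t off = 0;
    memset(big_pool, 0, sizeof(big_pool));
    for (size_t i0 = 0; i0 < POOL_LEN; i0++) {
        memcpy(&big_pool[i0], FM_VAL_0, 1);
    }
    return off;
}
```

Both functions leave `big_pool` in the same state; only the returned offset differs (one byte per element versus none), which is the quantity the prefix size depends on.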

…ains

When .values or .pointers has at most one choice, the index is fixed;
skip data[off] and off += 1 in the sampler and charge 0 bytes in
neededBytesFromGlobals. Add tests/big_pool regression (see README).
Copilot AI review requested due to automatic review settings March 25, 2026 15:24

Copilot AI left a comment

Pull request overview

This PR fixes a seed-size / input-consumption inefficiency in the C sampler generation by avoiding consuming a fuzzer byte when a .values or .pointers domain has only a single allowed choice, and adds an integration test that would previously inflate ABSOLUTION_GLOBALS_SIZE for large arrays.

Changes:

  • Update seed byte accounting so .values / .pointers domains with len <= 1 contribute 0 consumed bytes.
  • Update C sampler emission to skip data[off] reads and off += 1 when there is only one possible .values / .pointers choice.
  • Add a regression test (tests/big_pool/) covering a large array with a single-value .values domain plus an additional global to validate the prefix stays small.
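
The accounting rule in the first bullet can be sketched as follows. This is a hypothetical C mirror of the idea, not the Zig code in src/seed.zig: the `domain_field` struct and the functions `field_bytes` and `needed_bytes` are invented for illustration, and the sketch assumes one index byte per element, as in the generated excerpt in the PR description.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one global constrained by .values or .pointers. */
typedef struct {
    size_t n_elems;    /* number of array elements (1 for a scalar) */
    size_t n_choices;  /* number of entries in the domain */
} domain_field;

/* Bytes the sampler reads for one field: one index byte per element,
 * but only when there is an actual choice to make (len > 1). */
size_t field_bytes(const domain_field *f) {
    return f->n_choices <= 1 ? 0 : f->n_elems;
}

/* Total prefix bytes charged for all fields. */
size_t needed_bytes(const domain_field *fields, size_t n_fields) {
    size_t total = 0;
    assert(fields != NULL || n_fields == 0);
    for (size_t i = 0; i < n_fields; i++) {
        total += field_bytes(&fields[i]);
    }
    return total;
}
```

The point of keeping this rule identical on both sides is that the byte count charged here and the `off` advanced by the emitted sampler must agree; if only one of them skipped single-choice domains, the prefix size and the actual reads would drift apart.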

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/seed.zig | Makes neededBytesFromGlobals return 0 bytes for single/empty .values and .pointers domains to match sampler behavior. |
| src/cgen/emit.zig | Adjusts emitted sampler code to avoid consuming input bytes when the domain has only one choice. |
| tests/big_pool/target.c | Adds a large global array and a secondary global used by the regression test. |
| tests/big_pool/target.c.in | Provides an invariant constraining the large array to a single allowed value. |
| tests/big_pool/target.c.zon | Adds the golden expected .zon output for the new test. |
| tests/big_pool/README.md | Documents the regression scenario the test is intended to catch. |

@0pendev
Collaborator

0pendev commented Mar 30, 2026

I am wondering if our current emission strategy is just wrong and should be reworked.
In our invariant we can express values using byte arrays.
So, given a list of possible values, it should always consume 1 byte and no more (it only needs one byte to choose which value to apply).

The current behavior is to apply values to every element of an array.
I think that is wrong, and this strategy only makes sense for arrays of arrays.

Wdyt?
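
One possible reading of this proposal, sketched in C: each candidate value is a byte array covering the whole field, and a single input byte selects which candidate to copy. This is only an illustration under stated assumptions, not the implementation that was later proposed. The `candidate` struct and `sample_whole_field` are hypothetical names, and zero-filling the remainder of the field and falling back to candidate 0 on empty input are choices made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: one allowed value, expressed as a byte array. */
typedef struct {
    const uint8_t *bytes;
    size_t len;
} candidate;

/* Picks one candidate with at most one input byte and copies it into dst.
 * Returns the number of input bytes consumed (0 or 1). */
size_t sample_whole_field(uint8_t *dst, size_t dst_len,
                          const candidate *vals, size_t n_vals,
                          const uint8_t *data, size_t size) {
    size_t idx = 0;
    size_t consumed = 0;
    assert(n_vals > 0 && n_vals <= 256);  /* one byte can index 256 values */
    if (n_vals > 1 && size > 0) {
        idx = data[0] % n_vals;
        consumed = 1;
    }
    /* Copy at most dst_len bytes; zero the rest (assumption of this sketch). */
    size_t n = vals[idx].len < dst_len ? vals[idx].len : dst_len;
    memset(dst, 0, dst_len);
    memcpy(dst, vals[idx].bytes, n);
    return consumed;
}
```

With this shape the cost of a field no longer depends on its element count: a 16 000-byte array with two candidate values costs one input byte, and with a single candidate it costs none.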

@0pendev
Collaborator

0pendev commented Mar 30, 2026

I tried an alternative approach here.
Let me know if that works for you 🤔

I've added a field type "whole_field" to keep the current behavior, which can be useful in some cases.

@0pendev
Collaborator

0pendev commented Apr 2, 2026

Closing in favor of #10

0pendev closed this Apr 2, 2026
0pendev deleted the fix/single-value-domain-no-fuzzer-bytes branch April 9, 2026 14:07