Use CU2BYTES for byte sizing in two allocation sites#909
Conversation
|
Thank you! This looks legitimate. If you used AI, would you mind sharing what tool (and prompt) you used? I will confirm it manually myself. I'm trying to get a bit better on hygiene around integer overflows - so ideally we would improve the code further: (In fact, I clearly need a helper for allocate-with-multiply-check. But that's for another PR.) Would you be able to add the check to both sites you've improved? |
|
Yeah, I'll take a look. AI was used to draft PR & validate finding when looking at something else. For reference: https://github.com/iliaal/whetstone/tree/main/plugins/whetstone/skills/ia-code-review Were used across multiple LLM (Claude, Codex, Deepseek v4 Pro) cross checks |
Two allocation sites multiplied by PCRE2_CODE_UNIT_WIDTH (the bit width: 8, 16, or 32) where the CU2BYTES(x) byte-count helper is intended. The result over-allocates by the code-unit byte width: 8x in 8-bit mode, 16x in 16-bit, 32x in 32-bit. Subsequent memcpy calls already use CU2BYTES correctly, so no out-of-bounds write occurs; the over-allocation is leaked until the buffer is freed. Also guard each site against integer overflow in sizeof(pcre2_memctl) + CU2BYTES(N + 1) by rejecting N greater than (PCRE2_SIZE_MAX - sizeof(pcre2_memctl)) / CU2BYTES(1) - 1.
4d9acdd to
9763f95
Compare
|
Added the guard at both sites. |
|
Those are some very interesting skills! I have access to Claude & GPT. I would be interested in doing a more thorough audit of all of PCRE2... at some point. I'll bear in mind that your tool might be useful. |
Two allocation sites multiply by
PCRE2_CODE_UNIT_WIDTH(the bit width: 8, 16, or 32) where the byte-count helperCU2BYTES(x) = (x) * (PCRE2_CODE_UNIT_WIDTH/8)is intended. The result over-allocates by the code-unit byte width: 8x in 8-bit mode, 16x in 16-bit, 32x in 32-bit.Subsequent
memcpycalls already useCU2BYTEScorrectly, so no out-of-bounds write occurs. The over-allocation leaks the difference until the buffer is freed, on every call.Sites:
src/pcre2_substring.c:214(pcre2_substring_get_bynumber)src/pcre2_convert.c:1219(pcre2_pattern_convert)Repo-wide grep for
* PCRE2_CODE_UNIT_WIDTHoutside preprocessor conditionals turned up only these two sites. Found via internal audit of PCRE2 10.47.Verified clean against the existing test suite via
cmakebuild +RunTest(29 tests, all pass). Tests 24 and 25 exercisepcre2_pattern_convert; substring tests in test 1 exercisepcre2_substring_get_bynumber.