Skip to content

fix: security hardening across crypto, storage, and index#30

Merged
jaywyawhare merged 2 commits into
masterfrom
fix/security-hardening
Apr 27, 2026
Merged

fix: security hardening across crypto, storage, and index#30
jaywyawhare merged 2 commits into
masterfrom
fix/security-hardening

Conversation

@jaywyawhare
Copy link
Copy Markdown
Owner

Summary

  • Replace weak rand() fallback with getrandom//dev/urandom in auth.c and crypto.c; fail hard instead of silently using insecure RNG
  • Add integer overflow guards before capacity * 2 and count * dimension * sizeof(float) in soa_storage.c, database.c, bm25.c, metadata_index.c, lsh.c
  • Check all fwrite return values in bm25_save via BM25_FWRITE macro; abort and clean up on partial write
  • Replace strcat chains in memory_consolidation.c with explicit memcpy + length tracking

Test plan

  • cmake -DBUILD_TESTS=ON -DENABLE_SANITIZERS=ON .. && ctest passes with ASAN+UBSAN
  • cppcheck --enable=warning,performance,portability src/ reports no new errors
  • Encryption key generation fails cleanly when /dev/urandom is unavailable instead of producing predictable keys

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 27, 2026

Greptile Summary

This PR hardens security and reliability across crypto/auth (replacing rand() with getrandom//dev/urandom and hard-failing), adds integer overflow guards before capacity multiplications in five files, validates all fwrite return values in bm25_save, and replaces unsafe strcat chains in memory_consolidation.c with memcpy + position tracking.

  • P1 – BM25_FWRITE leaves corrupt file on disk: on any write failure the macro closes but does not unlink the file, so a subsequent bm25_load call will silently read a partially written index. Adding remove(filepath) inside the macro before returning fixes this.
  • P2 – Division by zero in new overflow guards: SIZE_MAX / dimension / sizeof(float) in soa_storage_create and db_compact_soa_storage is undefined behaviour when dimension == 0; a prior dimension == 0 check should be added.
  • P2 – Partial key in memory on crypto_generate_key failure: if the first RNG call succeeds but the second fails, key->key contains live material while the function returns -1; calling crypto_wipe_key(key) before returning prevents accidental reuse.

Confidence Score: 4/5

Safe to merge after fixing the partial-file issue in BM25_FWRITE; the remaining findings are low-risk quality improvements.

One P1 finding remains: bm25_save silently leaves a corrupted file on disk when any write fails, which can cause silent data corruption on the next load. The two P2 findings (division-by-zero on dimension=0 and partial key on RNG failure) are defensive hardening that won't trigger in normal usage but are worth addressing. The RNG hardening, overflow guards, and strcat replacement are all correct and well-implemented.

src/multimodal/bm25.c (partial file on write failure), src/storage/soa_storage.c and src/storage/database.c (division-by-zero guard).

Important Files Changed

Filename Overview
src/multimodal/bm25.c Adds BM25_FWRITE macro for write-error detection and overflow guard in add_posting; macro correctly unlocks and closes on failure but leaves a partial file on disk (P1).
src/security/crypto.c Replaces weak rand() fallback with getrandom//dev/urandom and hard-fails; partial key leakage on second RNG failure in crypto_generate_key is a minor concern.
src/security/auth.c Mirrors crypto.c RNG hardening; return-value check propagated to callers correctly with lock released on failure.
src/storage/soa_storage.c Adds stdint.h and overflow guards for capacity doubling and initial allocation; SIZE_MAX / dimension can divide by zero if dimension == 0.
src/storage/database.c Adds overflow guard before compaction malloc; same dimension == 0 division-by-zero risk as soa_storage.c.
src/storage/memory_consolidation.c Replaces strcat chains with explicit memcpy + position tracking; correct and safe.
src/multimodal/metadata_index.c Adds overflow guard before capacity doubling in metadata_index_add; correct.
src/index/lsh.c Adds overflow guard in bucket_add with a proper capacity > 0 check before the multiplication; correct.

Reviews (1): Last reviewed commit: "fix: eliminate weak RNG, integer overflo..." | Re-trigger Greptile

Comment thread src/multimodal/bm25.c Outdated
Comment on lines +618 to +621
#define BM25_FWRITE(ptr, sz, n) \
do { if (fwrite((ptr), (sz), (n), fp) != (n)) { \
pthread_rwlock_unlock((pthread_rwlock_t *)&index->rwlock); \
fclose(fp); return -1; } } while (0)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Partial file left on disk after write failure

When BM25_FWRITE detects a failed write, it closes the file and returns -1 but never unlinks it. A partial file at filepath is then left on disk; the next call to bm25_load will successfully open the file, read the corrupt header/body, and silently return bad state (or crash). The fix is to unlink the file before returning on error.

Suggested change
#define BM25_FWRITE(ptr, sz, n) \
do { if (fwrite((ptr), (sz), (n), fp) != (n)) { \
pthread_rwlock_unlock((pthread_rwlock_t *)&index->rwlock); \
fclose(fp); return -1; } } while (0)
#define BM25_FWRITE(ptr, sz, n) \
do { if (fwrite((ptr), (sz), (n), fp) != (n)) { \
pthread_rwlock_unlock((pthread_rwlock_t *)&index->rwlock); \
fclose(fp); remove(filepath); return -1; } } while (0)

Comment thread src/storage/soa_storage.c
Comment on lines 18 to +21
storage->count = 0;
storage->capacity = (initial_capacity > 0) ? initial_capacity : 1024;

if (storage->capacity > SIZE_MAX / dimension / sizeof(float)) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Division by zero when dimension == 0

The new guard SIZE_MAX / dimension / sizeof(float) performs integer division with dimension before checking whether it is non-zero. If a caller passes dimension = 0, this is undefined behaviour. Add a dimension == 0 check before this guard (the same applies to the analogous check added in database.c).

Suggested change
storage->count = 0;
storage->capacity = (initial_capacity > 0) ? initial_capacity : 1024;
if (storage->capacity > SIZE_MAX / dimension / sizeof(float)) {
if (dimension == 0 || storage->capacity > SIZE_MAX / dimension / sizeof(float)) {
free(storage);
return NULL;
}

Comment thread src/security/crypto.c
Comment on lines 335 to 338
return 0;
}

int crypto_generate_key(GV_CryptoKey *key) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Partial key left in memory on second RNG failure

If generate_random_bytes(key->key, 32) succeeds but generate_random_bytes(key->iv, 16) fails, the function returns -1 with key->key populated with 32 bytes of live keying material and key->iv uninitialized. Callers that don't check the return value (or that log the struct on error) may inadvertently expose or misuse the partial key. Wiping the key before returning is safer:

if (generate_random_bytes(key->key, 32) != 0 ||
    generate_random_bytes(key->iv, 16) != 0) {
    crypto_wipe_key(key);
    return -1;
}

@jaywyawhare jaywyawhare force-pushed the fix/security-hardening branch from 03e316d to 1e5e928 Compare April 27, 2026 19:42
@jaywyawhare jaywyawhare merged commit d0ac85c into master Apr 27, 2026
5 checks passed
@jaywyawhare jaywyawhare deleted the fix/security-hardening branch April 27, 2026 19:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant