Skip to content

Commit 0a9ffc1

Browse files
fix(posting): defensive-copy reused key buffers in ReadPostingList + addKeyToBatch
The 21million live-load systest failed with `~genre @filter(uid(F))` returning wrong films for some genres — corruption pattern was that one genre's reverse list contained another genre's films verbatim, while a third genre's reverse list was missing thousands of entries. Different genres were affected on each load, but always: legacy runMutation produced correct reverse counts; the per-predicate pipeline produced corrupt ones. Root cause: the pipeline's ProcessCount per-uid loop allocates one key buffer per ProcessCount call and mutates its trailing 8 bytes each iteration via `binary.BigEndian.PutUint64(dataKey[len-8:], uid)`. Two distinct sites then captured the slice header rather than the bytes: 1. ReadPostingList's `l.key = key` aliased the caller's buffer. saveInCache stores a copyList of the freshly-read list, and copyList sets `key: l.key` — also an alias. The cached list's key field therefore points at a buffer that the pipeline keeps mutating; by the time the async rollup path retrieves the cached list and runs `kv.Key = alloc.Copy(l.key)` to build the rolled-up KV, the bytes are whatever the LAST iteration left behind. Rollup then writes a BitCompletePosting with WithDiscard() to the wrong key, overwriting an unrelated reverse list with this list's (rolled-up) contents. 2. ReadPostingList's defer `IncrRollup.addKeyToBatch(key, ...)` appended the slice header verbatim to the rollup queue. Every queued entry from one ProcessCount goroutine ended up pointing at the same shared dataKey buffer; by the time the rollup goroutine processed the batch the bytes had collapsed to the final iteration's uid, redirecting many distinct rollup targets to the same key. Both sites fixed by taking ownership of the bytes (`append nil` / explicit `make+copy`). Legacy runMutation hits ReadPostingList too but allocates a fresh key per call, so it never aliased anything; the bug is only visible when a caller deliberately reuses one buffer across many uids the way ProcessCount does. Verified end-to-end on the systest/21million/live load with mutations-pipeline-threshold=1: a fresh load + sweep across all 764 Genre entities now reports `count(~genre) == count(forward edges)` for every genre. Pre-fix the same sweep showed 4 mismatched genres with thousands of stale or missing reverse entries; on a different load it showed Documentary missing 23,325 of its 31,370 reverse entries while Children's/Family had 1,435 stale Crime-Thriller entries. Unit tests in posting/ and worker/ still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent de90d97 commit 0a9ffc1

1 file changed

Lines changed: 22 additions & 2 deletions

File tree

posting/mvcc.go

Lines changed: 22 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -137,9 +137,20 @@ func (ir *incrRollupi) rollUpKey(writer *TxnWriter, key []byte) error {
137137
// TODO: When the opRollup is not running the keys from keysPool of ir are dropped. Figure out some
138138
// way to handle that.
139139
func (ir *incrRollupi) addKeyToBatch(key []byte, priority int) {
140+
// Defensive copy. Callers (notably ReadPostingList's defer and the
141+
// per-predicate mutation pipeline's ProcessCount loop) reuse a single
142+
// key buffer across iterations and mutate its last 8 bytes per uid;
143+
// keeping a slice header that aliases that buffer means every queued
144+
// rollup target collapses to whatever uid was processed last. The
145+
// async rollup then writes a BitCompletePosting to the wrong key,
146+
// overwriting an unrelated posting list with WithDiscard. Take a copy
147+
// here so the rollup queue owns its bytes.
148+
keyCopy := make([]byte, len(key))
149+
copy(keyCopy, key)
150+
140151
rki := ir.priorityKeys[priority]
141152
batch := rki.keysPool.Get().(*[][]byte)
142-
*batch = append(*batch, key)
153+
*batch = append(*batch, keyCopy)
143154
if len(*batch) < 16 {
144155
rki.keysPool.Put(batch)
145156
return
@@ -678,7 +689,16 @@ func ReadPostingList(key []byte, it *badger.Iterator) (*List, error) {
678689
}
679690

680691
l := new(List)
681-
l.key = key
692+
// Copy the key bytes. The caller (mutation pipeline ProcessCount and
693+
// similar paths) reuses a single key buffer across loop iterations and
694+
// mutates its trailing 8 bytes to scan many uids; without copying here
695+
// the list's l.key field aliases that buffer, so by the time the list
696+
// is read back from the in-memory cache (saveInCache stashes a
697+
// copyList that shares the key alias) the bytes have changed. The
698+
// async rollup path then takes alloc.Copy(l.key) to build the rolled-
699+
// up KV's key and ends up writing a BitCompletePosting (with
700+
// WithDiscard) to a different list's key — corrupting that list.
701+
l.key = append([]byte(nil), key...)
682702
l.plist = new(pb.PostingList)
683703
l.mutationMap = newMutableLayer()
684704
l.minTs = 0

0 commit comments

Comments
 (0)