Skip to content

fix: rename loop variable to avoid shadowing outer data in CriticWorker.train_step#375

Open
dubin555 wants to merge 1 commit intoalibaba:mainfrom
dubin555:oss-scout/verify-critic-train-data-shadow
Open

fix: rename loop variable to avoid shadowing outer data in CriticWorker.train_step#375
dubin555 wants to merge 1 commit intoalibaba:mainfrom
dubin555:oss-scout/verify-critic-train-data-shadow

Conversation

@dubin555
Copy link

Problem

In CriticWorker.train_step (roll/pipeline/base_worker.py) and ActorWorker.train_step (roll/pipeline/diffusion/reward_fl/actor_worker.py), the for-loop variable data shadows the outer data parameter:

data = data.to(current_platform.device_type)  # outer data = full batch on GPU
dataloader = data.make_iterator(...)

for batch_idx, data in tqdm(enumerate(dataloader)):  # shadows outer data
    self.strategy.train_step(batch=data, ...)

data.to("cpu")  # BUG: only moves last mini-batch, full batch stays on GPU

After the loop, data points to the last yielded mini-batch instead of the original full batch. The subsequent data.to("cpu") only moves the last mini-batch to CPU, leaving the original full batch allocated on GPU — causing a memory leak that accumulates over training steps.

Fix

Rename the loop variable from data to mini_batch so the outer data reference is preserved:

for batch_idx, mini_batch in tqdm(enumerate(dataloader)):
    self.strategy.train_step(batch=mini_batch, ...)

data.to("cpu")  # now correctly moves the full original batch

Files Changed

  • roll/pipeline/base_worker.py — CriticWorker.train_step
  • roll/pipeline/diffusion/reward_fl/actor_worker.py — ActorWorker.train_step
  • tests/test_train_step_variable_shadow.py — AST-based regression test

In CriticWorker.train_step and diffusion ActorWorker.train_step, the
for-loop used `data` as both the outer batch variable and the loop
iteration variable. After the loop, `data.to("cpu")` only moved the
last mini-batch to CPU instead of the full batch, causing GPU memory
to not be properly released.

Rename the loop variable from `data` to `mini_batch` so the outer
`data` variable remains intact for proper cleanup after the loop.
@CLAassistant
Copy link

CLAassistant commented Mar 14, 2026

CLA assistant check
All committers have signed the CLA.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants