
Make unroll_length schedulable #1833

Open
QuantuMope wants to merge 1 commit into pytorch from PR/andrew/schedulable-unroll-length



QuantuMope commented Mar 30, 2026

This PR makes unroll_length schedulable when running synchronous off-policy RL training, i.e. with both of the following settings:

  1. async_unroll=False
  2. whole_replay_buffer_training=False

A scheduled value of 0 is also allowed. In that case the unroll is skipped and the iteration trains from the replay buffer only.

This lets us "simulate" quite diverse training strategies, e.g. collecting a dataset once, training offline on it, and then continuing with online RL:

unroll_length = StepScheduler("iterations", [
    # unroll to collect an "offline" dataset
    (1, int(initial_collect_steps / num_para_envs)),
    # perform offline training iterations (no unroll)
    (offline_training_iters, 0),
    # continue with online RL
    (offline_training_iters + 1, desired_unroll_length),
])
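To make the intended phases concrete, here is a rough, self-contained sketch of how such a schedule could map iterations to unroll lengths. It does not use StepScheduler itself: `toy_step_value` is a hypothetical stand-in, and it assumes each `(boundary, value)` pair applies while the iteration count is below `boundary`, with the last value persisting afterwards. That boundary convention and all the numbers are assumptions for illustration, so check them against the real scheduler before relying on them.

```python
from bisect import bisect_right


def toy_step_value(iteration, schedule):
    """Hypothetical lookup, NOT the real StepScheduler.

    Assumes value_i applies while iteration < boundary_i, and that the
    last value is kept once the final boundary has been passed.
    """
    boundaries = [boundary for boundary, _ in schedule]
    values = [value for _, value in schedule]
    # Index of the first boundary strictly greater than the iteration.
    idx = bisect_right(boundaries, iteration)
    return values[min(idx, len(values) - 1)]


# Illustrative numbers only; none of these come from the PR.
initial_collect_steps = 10000
num_para_envs = 10
offline_training_iters = 5
desired_unroll_length = 8

schedule = [
    (1, initial_collect_steps // num_para_envs),          # one big collection unroll
    (offline_training_iters, 0),                          # offline phase, no unroll
    (offline_training_iters + 1, desired_unroll_length),  # online RL
]

unroll_lengths = [toy_step_value(i, schedule) for i in range(8)]
print(unroll_lengths)
```

Under these assumptions, iteration 0 performs the large collection unroll, iterations 1 to 4 skip unrolling, and iteration 5 onward uses the regular unroll length.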

Codex makes this a minimal, fully backward-compatible change by adding the following property and setter:

    @property
    def unroll_length(self):
        return self._unroll_length()

    @unroll_length.setter
    def unroll_length(self, value):
        self._unroll_length = as_scheduler(value)
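The backward compatibility comes from the property/setter pair: callers keep reading and assigning `unroll_length` as if it were a plain number, while the stored object is always a callable. A minimal sketch of that pattern follows. `ToyConstantScheduler`, `toy_as_scheduler`, and `ToyConfig` are hypothetical stand-ins written for this example (they only mimic what `as_scheduler` is presumed to do, namely pass callables through and wrap plain values), not the project's own classes.

```python
class ToyConstantScheduler:
    """Hypothetical wrapper that turns a plain value into a callable."""

    def __init__(self, value):
        self._value = value

    def __call__(self):
        return self._value


def toy_as_scheduler(value):
    """Hypothetical stand-in for as_scheduler: callables pass through."""
    return value if callable(value) else ToyConstantScheduler(value)


class ToyConfig:
    def __init__(self, unroll_length):
        # Plain attribute assignment is routed through the setter below.
        self.unroll_length = unroll_length

    @property
    def unroll_length(self):
        # Evaluate the schedule on every read.
        return self._unroll_length()

    @unroll_length.setter
    def unroll_length(self, value):
        self._unroll_length = toy_as_scheduler(value)


# Existing usage with a plain int is unchanged.
constant = ToyConfig(8)
constant_value = constant.unroll_length
constant.unroll_length = 4
reassigned_value = constant.unroll_length

# A scheduled value: here driven by a counter we advance by hand.
progress = {"iteration": 0}
scheduled = ToyConfig(lambda: 0 if progress["iteration"] < 3 else 8)
before = scheduled.unroll_length
progress["iteration"] = 3
after = scheduled.unroll_length
```

Because reads always go through the property, code that previously used the integer attribute sees an integer in both cases.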

self.unroll_with_grad = unroll_with_grad
self.use_root_inputs_for_after_train_iter = use_root_inputs_for_after_train_iter
self.async_unroll = async_unroll
if not isinstance(self._unroll_length, ConstantScheduler):

Instead of testing for ConstantScheduler, should this check against a base class, e.g. Scheduler?
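A small sketch of how the two kinds of check can differ, using toy classes only. The hierarchy below (a constant scheduler that derives from the base class) is an assumption made for illustration; the thread does not show how the real ConstantScheduler and Scheduler are related, and all `Toy*` names and `is_non_constant_schedule` are hypothetical.

```python
class ToyScheduler:
    """Toy base class, assumed for this example."""

    def __call__(self):
        raise NotImplementedError


class ToyConstantScheduler(ToyScheduler):
    def __init__(self, value):
        self._value = value

    def __call__(self):
        return self._value


class ToyVaryingScheduler(ToyScheduler):
    """Returns a different value on each call."""

    def __init__(self, values):
        self._values = iter(values)

    def __call__(self):
        return next(self._values)


def is_non_constant_schedule(obj):
    # Positive check against the base class, excluding the constant case.
    return isinstance(obj, ToyScheduler) and not isinstance(
        obj, ToyConstantScheduler)


const = ToyConstantScheduler(8)
varying = ToyVaryingScheduler([1000, 0, 8])
plain_callable = lambda: 8

# Negative check, in the style of the line under review:
# any non-constant object counts, including an arbitrary callable.
negative = [not isinstance(x, ToyConstantScheduler)
            for x in (const, varying, plain_callable)]

# Base-class check alone: under this hierarchy it also matches constants.
base_only = [isinstance(x, ToyScheduler)
             for x in (const, varying, plain_callable)]

# Combined check: only the varying scheduler qualifies.
combined = [is_non_constant_schedule(x)
            for x in (const, varying, plain_callable)]
```

Under a different hierarchy, where the constant class does not derive from the base class, the base-class check alone would already separate the two cases.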
