Merged
2 changes: 1 addition & 1 deletion docs/content/docs/configuration/config.mdx
@@ -587,7 +587,7 @@ fully_async:
```

- `fully_async.max_staleness_steps`: Maximum off-policy steps allowed. If a trajectory group is scheduled at step *i* and trained at step *j*, then `j - i <= max_staleness_steps`. Larger values increase throughput but also make training more off-policy.
-- `fully_async.num_parallel_generation_workers`: Number of generation workers to spawn. Should be >= `policy_mini_batch_size` and <= `policy_mini_batch_size * (max_staleness_steps + 1)`.
+- `fully_async.num_parallel_generation_workers`: Number of generation workers to spawn. Should be \>= `policy_mini_batch_size` and \<= `policy_mini_batch_size * (max_staleness_steps + 1)`.
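The documented bounds can be sanity-checked with a small sketch (the variable values here are hypothetical stand-ins for the real config fields):

```python
# Hypothetical example values; the real fields live on the trainer config.
policy_mini_batch_size = 4
max_staleness_steps = 2
num_parallel_generation_workers = 8

lower = policy_mini_batch_size                              # the >= bound
upper = policy_mini_batch_size * (max_staleness_steps + 1)  # the <= bound
assert lower <= num_parallel_generation_workers <= upper
```

With these values the valid range is 4 to 12 workers, so 8 passes the check.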

## Generator Configuration

4 changes: 3 additions & 1 deletion skyrl-gym/skyrl_gym/envs/search/env.py
@@ -67,7 +67,9 @@ def _is_done(self, action: str) -> bool:

     def _validate_action(self, action: str):
         stop_tags = ["</search>", "</answer>"]
-        action = action.rstrip("\n")  # strip out any trailing newlines
+        # TODO (sumanthrh): This assertion should really be that the *last token* generated contains <answer>.
+        # The last token generated can have additional punctuation characters like periods, etc.
+        action = action.rstrip("\n").rstrip(".")  # strip out any trailing newlines and periods
         for tag in stop_tags:
             if tag in action:
                 assert action.split(tag, 1)[1] == "", (
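The validation logic in this hunk can be sketched as a standalone function (a hypothetical re-creation for illustration, not the library's actual API):

```python
def validate_action(action: str) -> None:
    """Assert that any stop tag present terminates the action string."""
    stop_tags = ["</search>", "</answer>"]
    # Strip trailing newlines and periods, mirroring the change in this diff.
    action = action.rstrip("\n").rstrip(".")
    for tag in stop_tags:
        if tag in action:
            trailing = action.split(tag, 1)[1]
            assert trailing == "", f"unexpected text after {tag}: {trailing!r}"

# A trailing period and newline no longer trip the assertion:
validate_action("<answer>42</answer>.\n")
```

Note that `rstrip(".")` removes all trailing periods, not just one, and that it runs after the newline strip, so `".\n"` endings are handled in two passes.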
@@ -29,6 +29,9 @@ def get_test_actor_config() -> SkyRLTrainConfig:
     cfg.generator.inference_engine.async_engine = True
     cfg.generator.inference_engine.num_engines = 1
     cfg.generator.inference_engine.run_engines_locally = True
+    # NOTE: We reduce the gpu memory used by vLLM because of the colocated tests
+    # that can OOM on L4s. For more details, see: https://github.com/NovaSky-AI/SkyRL/pull/1221
+    cfg.generator.inference_engine.gpu_memory_utilization = 0.7
Review comment (severity: medium): To improve readability and maintainability, consider defining a constant for the magic number `0.7` at the module level, for example `VLLM_GPU_MEMORY_UTILIZATION_FOR_CI = 0.7`. This makes the purpose of the value clearer and simplifies future modifications.
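If the reviewer's suggestion were applied, the assignment could look as follows (a sketch using a stand-in namespace object rather than the project's real `SkyRLTrainConfig`):

```python
from types import SimpleNamespace

# Constant named per the review suggestion; the 0.7 value comes from the diff above.
VLLM_GPU_MEMORY_UTILIZATION_FOR_CI = 0.7

# SimpleNamespace stands in for the real config object in this sketch.
cfg = SimpleNamespace(generator=SimpleNamespace(inference_engine=SimpleNamespace()))
cfg.generator.inference_engine.gpu_memory_utilization = VLLM_GPU_MEMORY_UTILIZATION_FOR_CI
```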

return cfg

