- Add chemcot dataset to DATASETS registry using new DatasetConfig structure
- Keep CoT quality guidelines from chemcot branch in prompts.yaml
- Migrate chemcot from old dict-based interface to DatasetConfig
- Remove legacy consolidation logic (datasets lib handles this)
Fall back to common miniconda paths when conda is not in PATH. Fixes B200 pod startup failure (conda: command not found). Made-with: Cursor
No more conda detection logic. Just set TRAINING_PYTHON in .env. Fall back to conda only if it is not set. Made-with: Cursor
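The lookup order described in this commit can be sketched as follows. This is a minimal illustration, not the code from the PR: the function name `resolve_training_python` and the conda environment name `training` are hypothetical, and the real start scripts may resolve the interpreter differently.

```python
import shutil
from typing import List, Mapping, Optional


def resolve_training_python(
    env: Mapping[str, str], conda_env: str = "training"
) -> Optional[List[str]]:
    """Return the command prefix used to launch the training interpreter.

    Hypothetical helper: an explicit TRAINING_PYTHON wins; conda is only
    consulted when the variable is unset or empty.
    """
    explicit = env.get("TRAINING_PYTHON", "").strip()
    if explicit:
        return [explicit]
    # Fallback: look for a conda executable on the PATH given in `env`.
    conda = shutil.which("conda", path=env.get("PATH", ""))
    if conda:
        # `conda_env` is an assumed environment name, not taken from the PR.
        return [conda, "run", "-n", conda_env, "python"]
    return None
```

A caller would typically pass `os.environ` (after loading `.env`) and treat a `None` result as a configuration error rather than guessing further paths.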
start.sh now uses OPENHANDS_PYTHON for main.py execution, since the parent process may be in a different conda env. Made-with: Cursor
- Add agents/opencode/ with config.yaml, start.sh, README.md
- Include opencode-rl pipeline code (pipeline/, runner_fsm/, benchmarks/)
- Merge opencode-rl dependencies into autorl_bench requirements.txt
- Remove separate venv requirement, share main environment
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Sync opencode-rl runner_fsm with latest simplifications
- Add smith benchmarks integration
- Update opencompass configs and server with GPU support + error handling

- Document external repo architecture (opencode-rl as independent plugin)
- Add setup instructions for cloning and configuring opencode-rl
- Add architecture diagram showing RD-Agent ↔ opencode-rl interaction
- Document OPENCODE_RL_ROOT for custom paths

- Add smith/ module for dynamic benchmark discovery from rl-smith
- Add PerSampleEvaluator for per-sample scoring via vLLM
- Update utils.py to support script-based data download for smith benchmarks
- Update opencode agent config

- instructions.md: prohibit SFT, require RL (GRPO/PPO) for all benchmarks
- remove agents/opencode/opencode-rl/ (runtime uses external OPENCODE_RL_ROOT)
Made-with: Cursor
openai, httpx, python-dotenv, tenacity are for OpenCode agent's separate environment. Keep peft and pydantic as shared deps. Made-with: Cursor
- run.py: replace 2x nested 3-level try/except with shared _kill_process_group() using loop + specific exceptions
- server.py: except Exception → except (RuntimeError, ValueError, OSError)
- utils.py: except Exception → except requests.ConnectionError
Made-with: Cursor
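The "loop + specific exceptions" shape mentioned for run.py could look roughly like the sketch below. It is an assumption about the pattern, not the PR's actual `_kill_process_group()`: the signature, the grace period, and the `demo_returncode` helper are hypothetical, and the code is POSIX-only because it relies on `os.killpg`.

```python
import os
import signal
import subprocess
import sys


def kill_process_group(proc: subprocess.Popen, grace_seconds: float = 5.0) -> None:
    """Escalate from SIGTERM to SIGKILL on the child's process group."""
    for sig in (signal.SIGTERM, signal.SIGKILL):
        try:
            os.killpg(os.getpgid(proc.pid), sig)
        except (ProcessLookupError, PermissionError):
            return  # group already gone, or not ours to signal
        try:
            proc.wait(timeout=grace_seconds)
            return  # child exited after this signal
        except subprocess.TimeoutExpired:
            continue  # still running: escalate to the next signal


def demo_returncode() -> int:
    """Spawn a sleeping child in its own session, stop it, return its exit code."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"],
        start_new_session=True,  # gives the child its own process group
    )
    kill_process_group(child)
    return child.returncode
```

Catching only `ProcessLookupError`, `PermissionError`, and `subprocess.TimeoutExpired` keeps unexpected failures visible instead of swallowing them with a bare `except Exception`.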
Extract from run.py into core/utils.py so other runners can also use it. Exported via core/__init__.py. Made-with: Cursor
Use relative paths, forbid cd outside workspace, ignore symlink targets. Made-with: Cursor
…CLI, remove unsupported args Made-with: Cursor
Ensures OpenCode-FSM-Runner writes outputs into the workspace prepared by AutoRL-Bench instead of creating its own runs/ directory. Made-with: Cursor
Ensures LLM agent bash calls (e.g. python3 -c "from trl import ...") resolve to the correct training environment, instead of relying on parent shell conda activation. Made-with: Cursor
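One way to get this effect is to put the training interpreter's directory first on `PATH` in the environment handed to the agent's subprocesses. The sketch below assumes that mechanism; the commit text does not say how the PR achieves it, and `env_with_training_python` is a hypothetical name.

```python
import os
from pathlib import Path
from typing import Dict, Mapping


def env_with_training_python(
    base_env: Mapping[str, str], training_python: str
) -> Dict[str, str]:
    """Return a copy of base_env whose PATH starts with the interpreter's directory."""
    bin_dir = str(Path(training_python).parent)
    env = dict(base_env)  # copy, so the caller's environment is left untouched
    old_path = env.get("PATH", "")
    env["PATH"] = bin_dir + (os.pathsep + old_path if old_path else "")
    return env
```

Passing the result as `env=` to `subprocess.run` or `subprocess.Popen` lets a bare `python3` resolve to the training environment whenever that directory contains a `python3` executable, without depending on which conda env the parent shell activated.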
…ode-rl Made-with: Cursor
📚 Documentation preview 📚: https://RDAgent--1337.org.readthedocs.build/en/1337/