Add modular RLSSM simulator framework by krishnbera · Pull Request #278 · lnccbrown/ssm-simulators

krishnbera · 2026-05-19T23:29:57Z

Addresses #279

Summary

- Add a modular ssms.rl simulator framework that composes learning processes, task environments, and existing SSM decision processes.
- Add Rescorla-Wagner learning rules, generic Bernoulli/Gaussian bandit environments, response-label to action-index mapping, and HSSM config export support.
- Add RLSSM tests and a rendered MkDocs tutorial under Core Tutorials.
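The response-label to action-index mapping mentioned in the summary can be illustrated with a minimal sketch. It assumes a two-choice SSM that reports choices as -1 and 1, and a bandit that indexes arms from 0. The helper names `build_response_map` and `to_action_indices` are hypothetical and are not taken from the PR.

```python
import numpy as np


def build_response_map(response_labels):
    """Map each SSM response label to a zero-based action index (hypothetical helper)."""
    labels = list(response_labels)
    if len(set(labels)) != len(labels):
        raise ValueError("response labels must be unique")
    return {label: idx for idx, label in enumerate(labels)}


def to_action_indices(responses, response_map):
    """Translate observed response labels into bandit arm indices."""
    try:
        return np.array([response_map[int(r)] for r in responses], dtype=int)
    except KeyError as err:
        raise ValueError(f"unmapped response label: {err.args[0]}") from None


# Assumed label convention: -1 is the first arm, 1 is the second arm.
response_map = build_response_map([-1, 1])
actions = to_action_indices([1, -1, -1, 1], response_map)
```

Keeping the mapping explicit means the learning rule and the environment only ever see arm indices, whatever labels the decision process emits.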

Why

This fills the simulation-side gap for RLSSM workflows by letting users generate HSSM-compatible trial-wise RLSSM datasets without adding new Cython simulator code.

Validation

uv run pre-commit run --all-files
uv run pytest tests/rl tests/test_hssm_support.py tests/test_simulator.py -q --no-cov
MPLCONFIGDIR=/tmp/.mpl uv run --extra docs mkdocs build

Notes

The docs build currently emits pre-existing MkDocs/autorefs warnings unrelated to the new RLSSM tutorial.
Untracked local instruction files (AGENTS.md, CONTEXT.md) were left out of this PR.

Milestone 1, Commit 1: Defines the LearningProcess protocol (the handshake contract between learning and decision processes) and the first built-in implementation — RescorlaWagnerDeltaRule — which is numerically equivalent to HSSM's compute_v_trial_wise(). Includes 13 unit tests covering Q-value trajectories, drift ordering, HSSM numerical equivalence, and protocol compliance. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
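As a rough illustration of the delta rule described in this commit, the sketch below applies the standard Rescorla-Wagner update Q ← Q + α(r − Q) to the chosen arm and then derives a drift rate from the Q-value difference. The function names and the `scaler * (q[1] - q[0])` drift mapping are assumptions made for the example; the PR's actual `RescorlaWagnerDeltaRule` interface is not shown on this page.

```python
import numpy as np


def rw_delta_update(q, action, reward, alpha):
    """Return a copy of q with the chosen action's value moved toward the reward."""
    q = np.asarray(q, dtype=float).copy()
    q[action] += alpha * (reward - q[action])  # prediction-error update
    return q


def drift_from_q(q, scaler):
    """Assumed handshake: scale the Q-value difference into an SSM drift rate."""
    return scaler * (q[1] - q[0])


q = np.array([0.5, 0.5])
q = rw_delta_update(q, action=1, reward=1.0, alpha=0.2)  # q[1] moves from 0.5 to 0.6
v = drift_from_q(q, scaler=2.0)                          # 2.0 * (0.6 - 0.5)
```

Only the chosen arm is updated, so the unchosen Q-value stays at its previous level until that arm is sampled again.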

Milestone 1, Commit 2: Defines the TaskEnvironment protocol for reward generation, TwoArmedBandit (Bernoulli bandit with configurable per-arm probabilities), and TaskConfig convenience dataclass for common paradigms. Includes 14 unit tests covering reward statistics, reproducibility, input validation, protocol compliance, and TaskConfig builder. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
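A protocol-based environment of the kind this commit describes might look like the following sketch. `TaskEnvironmentSketch` and `BernoulliBanditSketch` are hypothetical stand-ins; the method name `reward`, the `n_actions` attribute, and the choice to pass a NumPy `Generator` explicitly are assumptions, not the PR's real signatures.

```python
from typing import Protocol, runtime_checkable

import numpy as np


@runtime_checkable
class TaskEnvironmentSketch(Protocol):
    """Hypothetical contract: an environment turns an action into a reward."""

    n_actions: int

    def reward(self, action: int, rng: np.random.Generator) -> float: ...


class BernoulliBanditSketch:
    """Bandit whose arms pay 1.0 with a fixed per-arm probability, else 0.0."""

    def __init__(self, probs):
        probs = np.asarray(probs, dtype=float)
        if probs.ndim != 1 or probs.size < 2:
            raise ValueError("probs must be a 1-D sequence with at least two arms")
        if np.any((probs < 0) | (probs > 1)):
            raise ValueError("each reward probability must lie in [0, 1]")
        self.probs = probs
        self.n_actions = probs.size

    def reward(self, action, rng):
        return float(rng.random() < self.probs[action])


env = BernoulliBanditSketch([0.2, 0.8])
rng = np.random.default_rng(0)
rewards = [env.reward(1, rng) for _ in range(1000)]
```

Passing the generator in, rather than storing global random state, is one simple way to get the reproducibility that the commit's tests check for.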

Milestone 1, Commit 3: Defines RLSSMModelConfig — the structural model specification that resolves the handshake between learning process and decision process (SSM). Auto-derives list_params, bounds, and defaults from components. Includes validate() for config consistency checking and to_hssm_config_dict() for bridging to HSSM's RLSSMConfig. 13 tests cover auto-derivation, handshake validation, computed_param_mapping, TaskConfig auto-build, and HSSM dict contract. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
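The auto-derivation idea can be sketched as follows: free parameters are the learning parameters plus every decision-process parameter that the learning process does not compute trial by trial. `ConfigSketch`, its field layout, and the parameter names and bounds used here are illustrative assumptions, not the fields of the PR's `RLSSMModelConfig`.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigSketch:
    # Each spec maps a parameter name to (default, (lower, upper)).
    learning_params: dict
    decision_params: dict
    computed_params: tuple  # decision params supplied trial-by-trial by learning

    list_params: list = field(init=False)
    params_default: list = field(init=False)
    bounds: dict = field(init=False)

    def __post_init__(self):
        self.validate()
        free = dict(self.learning_params)
        free.update(
            (name, spec)
            for name, spec in self.decision_params.items()
            if name not in self.computed_params
        )
        self.list_params = list(free)
        self.params_default = [spec[0] for spec in free.values()]
        self.bounds = {name: spec[1] for name, spec in free.items()}

    def validate(self):
        unknown = [p for p in self.computed_params if p not in self.decision_params]
        if unknown:
            raise ValueError(f"computed params missing from decision process: {unknown}")
        clash = set(self.learning_params) & set(self.decision_params)
        if clash:
            raise ValueError(f"parameter names defined twice: {sorted(clash)}")


cfg = ConfigSketch(
    learning_params={"rl_alpha": (0.2, (0.0, 1.0)), "scaler": (2.0, (0.0, 10.0))},
    decision_params={
        "v": (0.0, (-3.0, 3.0)),
        "a": (1.0, (0.3, 3.0)),
        "z": (0.5, (0.1, 0.9)),
        "t": (0.3, (0.0, 2.0)),
    },
    computed_params=("v",),
)
```

Here `v` drops out of `list_params` because it is computed from the learning state, which is the kind of handshake the commit message describes.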

Milestone 1, Commit 4: Implements the core RLSSMSimulator class that runs the trial-by-trial interleaved loop: compute SSM params from learning state, simulate one SSM trial, observe choice, generate reward, update learning. Reuses ssm-simulators' existing simulator() with n_samples=1 — all 40+ SSM models work as decision processes. Includes 15 tests covering DataFrame output shape, balanced panel, reproducibility, theta validation, edge cases, and omission handling. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
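The interleaved loop can be sketched end to end with a toy decision process. To stay self-contained, the example replaces the real single-sample `simulator()` call with `toy_ddm_trial`, a hypothetical Euler random walk between two bounds. The omission handling (skip the update and record no reward) is likewise an assumption, since the PR's actual behaviour is not shown here.

```python
import numpy as np


def toy_ddm_trial(v, a, t, rng, dt=0.001, max_t=5.0):
    """Stand-in for one SSM sample: random walk between 0 and a, starting at a/2."""
    x, elapsed = a / 2.0, 0.0
    while 0.0 < x < a and elapsed < max_t:
        x += v * dt + np.sqrt(dt) * rng.standard_normal()
        elapsed += dt
    if 0.0 < x < a:
        return t + elapsed, None  # no bound reached in time: treat as an omission
    return t + elapsed, (1 if x >= a else 0)


def simulate_rl_trials(n_trials, alpha, scaler, a, t, probs, seed):
    """Run the five-step loop once per trial and return (rows, final Q-values)."""
    rng = np.random.default_rng(seed)
    q = np.full(2, 0.5)
    rows = []
    for trial in range(n_trials):
        v = scaler * (q[1] - q[0])                     # 1. SSM param from learning state
        rt, action = toy_ddm_trial(v, a, t, rng)       # 2-3. one decision, observe choice
        if action is None:                             # assumed: omissions skip learning
            rows.append((trial, rt, None, None))
            continue
        reward = float(rng.random() < probs[action])   # 4. environment generates reward
        q[action] += alpha * (reward - q[action])      # 5. learning update
        rows.append((trial, rt, action, reward))
    return rows, q


rows, q_final = simulate_rl_trials(
    n_trials=100, alpha=0.2, scaler=2.0, a=1.5, t=0.3, probs=[0.2, 0.8], seed=1
)
```

Because every random draw comes from one seeded generator, repeating the call with the same seed reproduces the same trial sequence.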

Milestone 1, Commit 5: Adds the preset registry (register/get/list_rlssm_preset) with rlssm1 preset (RW delta rule + angle SSM + two-armed bandit). Wires up the ssms.rl public API with full __all__ exports and adds `from . import rl` to ssms/__init__.py. Fixes circular import in rl_simulator.py (OMISSION_SENTINEL). Includes 13 contract tests for HSSM compatibility (output dtypes, no NaNs, config dict schema) and registry smoke tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
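A register/get/list registry of the kind described here is often just a guarded dictionary. The sketch below uses hypothetical function names and a plain dict as the preset payload; only the `rlssm1` composition (RW delta rule, angle SSM, two-armed bandit) comes from the commit message.

```python
_PRESETS = {}


def register_preset(name, factory, overwrite=False):
    """Store a zero-argument factory under a preset name."""
    if name in _PRESETS and not overwrite:
        raise ValueError(f"preset {name!r} is already registered")
    _PRESETS[name] = factory


def list_presets():
    """Return the registered preset names in sorted order."""
    return sorted(_PRESETS)


def get_preset(name):
    """Build a fresh preset object so callers cannot mutate shared state."""
    try:
        factory = _PRESETS[name]
    except KeyError:
        raise KeyError(f"unknown preset {name!r}; available: {list_presets()}") from None
    return factory()


# Illustrative payload; the keys are assumptions made for this example.
register_preset(
    "rlssm1",
    lambda: {"learning": "rw_delta", "decision": "angle", "task": "two_armed_bandit"},
)
preset = get_preset("rlssm1")
```

Storing factories rather than instances gives each caller an independent copy of the preset.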


codecov · 2026-05-19T23:35:53Z

Codecov Report

❌ Patch coverage is 94.35666% with 25 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ssms/rl/config.py | 92.56% | 11 Missing ⚠️ |
| ssms/rl/env.py | 94.69% | 7 Missing ⚠️ |
| ssms/rl/learning.py | 90.27% | 7 Missing ⚠️ |

| Flag | Coverage Δ |
|---|---|
| unittests | `92.71% <94.35%> (+0.41%)` ⬆️ |


| Files with missing lines | Coverage Δ |
|---|---|
| ssms/rl/preset.py | `100.00% <100.00%> (ø)` |
| ssms/rl/simulator.py | `100.00% <100.00%> (ø)` |
| ssms/rl/env.py | `94.69% <94.69%> (ø)` |
| ssms/rl/learning.py | `90.27% <90.27%> (ø)` |
| ssms/rl/config.py | `92.56% <92.56%> (ø)` |

... and 1 file with indirect coverage changes


Copilot

Pull request overview

This PR introduces a new modular RLSSM simulation framework under ssms.rl, designed to interleave trial-wise reinforcement learning updates with existing SSM decision simulators and to export HSSM-compatible configuration/data.

Changes:

- Added ssms.rl core components: ModelConfig, Simulator, task environments (bandits), learning rules (Rescorla–Wagner variants), and a preset registry.
- Added comprehensive RLSSM-focused tests (learning, env, simulator behavior, and HSSM compatibility/contract checks).
- Added an MkDocs tutorial entry for the new RLSSM simulator workflow.

Reviewed changes

Copilot reviewed 13 out of 15 changed files in this pull request and generated 7 comments.


| File | Description |
|---|---|
| `ssms/rl/config.py` | Defines the structural RLSSM configuration and HSSM export support. |
| `ssms/rl/simulator.py` | Implements the interleaved learning + SSM simulation loop and output formatting. |
| `ssms/rl/env.py` | Adds a task environment protocol plus Bernoulli/Gaussian bandit implementations and task registry. |
| `ssms/rl/learning.py` | Adds the learning process protocol and Rescorla–Wagner learning rules. |
| `ssms/rl/preset.py` | Adds an RLSSM preset registry and a built-in `rlssm1` preset. |
| `ssms/rl/__init__.py` | Exposes the public `ssms.rl` API surface. |
| `ssms/__init__.py` | Re-exports the `rl` module at the package top level. |
| `tests/rl/test_task_environment.py` | Tests bandit environment behavior, validation, and task config building. |
| `tests/rl/test_learning_process.py` | Tests RW learning rules’ numerical behavior and protocol compliance. |
| `tests/rl/test_rl_config.py` | Tests config auto-derivation, handshake validation, response mapping, and HSSM dict export. |
| `tests/rl/test_rl_simulator.py` | Tests simulation output schema, reproducibility, omission handling, and response/action mapping. |
| `tests/rl/test_hssm_compatibility.py` | Contract tests for HSSM consumability and preset registry behavior. |
| `tests/rl/__init__.py` | Initializes the RL test package. |
| `mkdocs.yml` | Adds the RLSSM tutorial notebook to the docs nav. |


Copilot

Pull request overview

Copilot reviewed 13 out of 15 changed files in this pull request and generated 2 comments.

+        # list_params / params_default consistency
+        if self.list_params and self.params_default:
+            if len(self.list_params) != len(self.params_default):
+                raise ValueError(
+                    f"list_params length ({len(self.list_params)}) != "


+
+
+def _build_bandit(reward: str | None, options: dict) -> TaskEnvironment:
+    reward = reward or "bernoulli"
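The diff fragment above shows only that `_build_bandit` falls back to `"bernoulli"` when no reward type is given. A hedged sketch of how such a dispatch could continue is given below. `build_bandit_sketch`, the option keys `probs`, `means`, and `sd`, and the returned callables are all assumptions for illustration and do not reproduce the PR's code.

```python
import numpy as np


def build_bandit_sketch(reward=None, options=None):
    """Return a reward function (action, rng) -> float for the requested bandit type."""
    reward = reward or "bernoulli"  # same default as in the diff fragment
    options = dict(options or {})
    if reward == "bernoulli":
        probs = np.asarray(options["probs"], dtype=float)  # assumed option key
        return lambda action, rng: float(rng.random() < probs[action])
    if reward == "gaussian":
        means = np.asarray(options["means"], dtype=float)  # assumed option key
        sd = float(options.get("sd", 1.0))                 # assumed option key
        return lambda action, rng: float(rng.normal(means[action], sd))
    raise ValueError(f"unknown reward type {reward!r}; expected 'bernoulli' or 'gaussian'")


rng = np.random.default_rng(0)
bernoulli_arm = build_bandit_sketch(None, {"probs": [0.0, 1.0]})
gaussian_arm = build_bandit_sketch("gaussian", {"means": [1.0, 3.0], "sd": 0.0})
sure_win = bernoulli_arm(1, rng)   # arm 1 always pays in this configuration
fixed_mean = gaussian_arm(1, rng)  # sd of 0.0 returns the arm's mean
```

Raising on unknown reward types, rather than silently defaulting, keeps typos in a task specification from producing a different environment than intended.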


krishnbera and others added 10 commits May 13, 2026 14:45

Rename RLSSM simulator API

aca4eb5

Add RLSSM dual-alpha and Gaussian bandits

f992516

Add generic RL bandit response mapping

7fc4d37

Add RLSSM simulator docs tutorial

9ae73c3

Fix RLSSM mypy typing

ea189b2

krishnbera marked this pull request as ready for review May 19, 2026 23:36

Copilot AI review requested due to automatic review settings May 19, 2026 23:36

Copilot started reviewing on behalf of krishnbera May 19, 2026 23:36

Copilot AI reviewed May 19, 2026


Comment thread ssms/rl/env.py Outdated

Comment thread ssms/rl/env.py Outdated

Comment thread ssms/rl/config.py

Comment thread ssms/rl/config.py Outdated

Comment thread ssms/rl/simulator.py

Comment thread ssms/rl/simulator.py

Comment thread ssms/rl/learning.py

krishnbera self-assigned this May 20, 2026

krishnbera added the linear-ssm-simulators label May 20, 2026

Address RLSSM simulator review validation

53448d8

krishnbera requested a review from Copilot May 20, 2026 02:13

Copilot started reviewing on behalf of krishnbera May 20, 2026 02:14

krishnbera requested review from AlexanderFengler, cpaniaguam and digicosmos86 May 20, 2026 02:14

Copilot AI reviewed May 20, 2026


krishnbera wants to merge 11 commits into main from feature/rlssm-simulator

2 participants


