suriyasureshok/LAM_Reproducible_ML_Workflows

Large Action Model Reproducible Scientific Workflows with R-LAM

Experimental evaluation repository demonstrating three execution paradigms for scientific workflows using the Breast Cancer Wisconsin dataset.

Overview

Large Action Models (LAMs) are LLM-driven systems that select and execute high-level actions ("load data," "train model") rather than generating code line-by-line.

R-LAM extends LAMs with reproducibility constraints:

  • Complete execution tracing
  • Deterministic replay without re-execution
  • Controlled workflow forking
  • Full provenance and auditability
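The first three constraints can be illustrated with a small sketch. This is not the rlam framework's API: `Step`, `Trace`, `execute`, `replay`, and `fork` are hypothetical names, the two toy actions stand in for the real ones, and state is assumed to pass linearly from one action to the next.

```python
import copy
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class Step:
    action: str
    params: dict
    output: object  # recorded result of the action


@dataclass
class Trace:
    steps: list = field(default_factory=list)

    def fingerprint(self) -> str:
        """Stable hash of the recorded action names and parameters."""
        payload = [(s.action, s.params) for s in self.steps]
        return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()


def execute(actions, plan, trace=None):
    """Run each (name, params) pair in plan and record every result."""
    trace = trace if trace is not None else Trace()
    state = copy.deepcopy(trace.steps[-1].output) if trace.steps else None
    for name, params in plan:
        state = actions[name](state, **params)
        trace.steps.append(Step(name, params, copy.deepcopy(state)))
    return trace


def replay(trace):
    """Return the recorded outputs in order; no action function is called."""
    return [copy.deepcopy(s.output) for s in trace.steps]


def fork(trace, at):
    """Copy the first `at` steps so a new branch can continue from there."""
    return Trace(steps=copy.deepcopy(trace.steps[:at]))


# Toy actions (hypothetical); `calls` shows which functions really ran.
calls = []


def load(state):
    calls.append("load")
    return [3, 1, 2]


def scale(state, factor=1):
    calls.append("scale")
    return [x * factor for x in state]


ACTIONS = {"load": load, "scale": scale}

base = execute(ACTIONS, [("load", {}), ("scale", {"factor": 2})])
replayed = replay(base)  # reads the trace only
branch = execute(ACTIONS, [("scale", {"factor": 10})], trace=fork(base, at=1))
```

Replaying `base` leaves `calls` unchanged, and the fork reuses the recorded `load` output, so only `scale` runs again on the branch.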

Three Execution Paradigms

Feature      Script       Naive LAM    R-LAM
Control      Hard-coded   LLM-driven   LLM-driven
Tracing      No           No           Yes
Replay       No           No           Yes
Fork         No           No           Yes
Provenance   Code only    None         Full DAG

1. Script-Based Pipeline (pipelines/naive_pipeline.py)

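A conventional script with fixed, hard-coded control flow and no LLM involvement. Despite its file name, naive_pipeline.py is this script baseline; the Naive LAM lives in lam_pipeline.py.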

python -m pipelines.naive_pipeline

2. Naive LAM Pipeline (pipelines/lam_pipeline.py)

An LLM plans the actions dynamically, and each action is executed directly, without any reproducibility infrastructure.

export OPENROUTER_API_KEY="your_key"
python -m pipelines.lam_pipeline
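The plan-then-execute loop described above might look roughly like the sketch below. Everything here is an assumption for illustration: `stub_planner` is a rule-based stand-in for the LLM call, the toy actions replace the real ones in actions/, and `run_naive_lam` is a hypothetical name rather than a function from this repository.

```python
def load_dataset(state, params):
    # Toy data instead of the Breast Cancer Wisconsin dataset.
    return {"X": [1.0, 2.0, 8.0, 9.0], "y": [0, 0, 1, 1]}


def preprocess_data(state, params):
    mean = sum(state["X"]) / len(state["X"])
    return {"X_processed": [x - mean for x in state["X"]]}


def train_model(state, params):
    # Toy "model": a fixed threshold on centered data; C is accepted but unused.
    return {"model": {"threshold": 0.0}}


def evaluate_model(state, params):
    threshold = state["model"]["threshold"]
    preds = [int(x > threshold) for x in state["X_processed"]]
    correct = sum(p == y for p, y in zip(preds, state["y"]))
    return {"accuracy": correct / len(preds)}


ACTIONS = {
    "load_dataset": load_dataset,
    "preprocess_data": preprocess_data,
    "train_model": train_model,
    "evaluate_model": evaluate_model,
}


def stub_planner(state):
    """Stand-in for the LLM planner: choose the next action from the state keys."""
    if "X" not in state:
        return "load_dataset", {}
    if "X_processed" not in state:
        return "preprocess_data", {}
    if "model" not in state:
        return "train_model", {"C": 1.0}
    if "accuracy" not in state:
        return "evaluate_model", {}
    return None  # nothing left to do


def run_naive_lam(planner, actions, max_steps=10):
    """Ask the planner for an action, run it immediately, keep no record."""
    state = {}
    for _ in range(max_steps):
        choice = planner(state)
        if choice is None:
            break
        name, params = choice
        state.update(actions[name](state, params))
    return state


final_state = run_naive_lam(stub_planner, ACTIONS)
```

Only the final state survives the run. The sequence of decisions and the intermediate results are gone, which is the gap the R-LAM pipeline addresses.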

3. R-LAM Pipeline (pipelines/rlam_pipeline.py)

LLM-driven execution with full tracing, replay, and forking support via the rlam framework.

export OPENROUTER_API_KEY="your_key"
python -m pipelines.rlam_pipeline

Repository Structure

├── actions/              # Pure action functions
│   ├── load.py          # load_dataset
│   ├── analyze.py       # analyze_data
│   ├── preprocess.py    # preprocess_data
│   ├── train.py         # train_model
│   └── evaluate.py      # evaluate_model
├── lam/                 # LAM components
│   ├── action_space.py  # Action registry
│   ├── planner.py       # LLM-based planner
│   └── state.py         # Workflow state
├── pipelines/           # Three execution modes
│   ├── naive_pipeline.py
│   ├── lam_pipeline.py
│   └── rlam_pipeline.py
├── experiments/         # Evaluation harness
│   ├── run_all.py       # Execute all pipelines
│   ├── metrics.py       # Metric computation
│   └── results_table.py # Result formatting
└── config.py            # Configuration

Action Space

Each action is a pure function with the signature (inputs, params) → outputs.

Action            Inputs                  Parameters           Outputs
load_dataset      -                       -                    X, y
analyze_data      X                       -                    stats
preprocess_data   X                       -                    X_processed
train_model       X_processed, y          C (regularization)   model
evaluate_model    model, X_processed, y   -                    accuracy
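One way to encode this table as data is sketched below, so that a planner or executor can check a proposed action before running it. The action names, inputs, parameters, and outputs are copied from the table; `ActionSpec`, `runnable`, and `validate_call` are hypothetical names and may not match what lam/action_space.py actually defines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionSpec:
    name: str
    inputs: tuple
    params: tuple
    outputs: tuple


# Rows taken from the action table; the registry layout itself is an assumption.
ACTION_SPACE = {
    spec.name: spec
    for spec in (
        ActionSpec("load_dataset", (), (), ("X", "y")),
        ActionSpec("analyze_data", ("X",), (), ("stats",)),
        ActionSpec("preprocess_data", ("X",), (), ("X_processed",)),
        ActionSpec("train_model", ("X_processed", "y"), ("C",), ("model",)),
        ActionSpec("evaluate_model", ("model", "X_processed", "y"), (), ("accuracy",)),
    )
}


def runnable(state_keys):
    """Actions whose inputs are all present and whose outputs are still missing."""
    have = set(state_keys)
    return [
        spec.name
        for spec in ACTION_SPACE.values()
        if set(spec.inputs) <= have and not set(spec.outputs) <= have
    ]


def validate_call(name, state_keys, params):
    """Reject a planned call that lacks inputs or passes undeclared parameters."""
    spec = ACTION_SPACE[name]
    missing = set(spec.inputs) - set(state_keys)
    unknown = set(params) - set(spec.params)
    if missing:
        raise ValueError(f"{name}: missing inputs {sorted(missing)}")
    if unknown:
        raise ValueError(f"{name}: unknown parameters {sorted(unknown)}")
    return spec
```

For example, `runnable([])` yields only `load_dataset`, and `validate_call("train_model", ["X_processed", "y"], {"C": 1.0})` passes, while any other parameter name is rejected.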

Installation

pip install -r requirements.txt
# or with uv:
uv pip install -r requirements.txt

Running Experiments

# Individual pipelines
python -m pipelines.naive_pipeline
python -m pipelines.lam_pipeline     # Requires OPENROUTER_API_KEY
python -m pipelines.rlam_pipeline    # Requires OPENROUTER_API_KEY

# All experiments with metrics
python -m experiments.run_all

Key Design Choices

Actions over code generation: Semantic units that are traceable, replayable, and auditable by design.

Execution constraints: R-LAM enforces determinism and provenance at the execution layer, not by modifying the LLM.
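As a rough illustration of enforcing these constraints at the execution layer, the sketch below wraps each action call so that the random seed is fixed and a provenance edge is recorded, regardless of how the action was chosen. `ProvenanceExecutor` and the two toy actions are hypothetical, and seeding Python's global `random` module is a simplification; the real actions may control randomness differently.

```python
import random


class ProvenanceExecutor:
    """Runs actions with a fixed per-step seed and records what produced what."""

    def __init__(self, seed=0):
        self.seed = seed
        self.artifacts = {}    # artifact name -> value
        self.produced_by = {}  # artifact name -> (action, params, input names)

    def run(self, action, fn, inputs, output, **params):
        step_index = len(self.produced_by)
        random.seed(self.seed + step_index)  # same seed on every run of this step
        args = [self.artifacts[name] for name in inputs]
        self.artifacts[output] = fn(*args, **params)
        self.produced_by[output] = (action, params, tuple(inputs))
        return self.artifacts[output]

    def lineage(self, artifact):
        """Upstream actions of an artifact, in dependency order."""
        action, _, inputs = self.produced_by[artifact]
        chain = []
        for name in inputs:
            for upstream in self.lineage(name):
                if upstream not in chain:
                    chain.append(upstream)
        return chain + [action]


# Toy actions (hypothetical stand-ins for the real ones).
def load_indices():
    return list(range(10))


def sample_test_set(indices, test_size):
    shuffled = indices[:]
    random.shuffle(shuffled)  # nondeterministic unless the executor seeds it
    return shuffled[:test_size]


def run_once(seed):
    executor = ProvenanceExecutor(seed=seed)
    executor.run("load_indices", load_indices, [], "indices")
    executor.run("sample_test_set", sample_test_set, ["indices"], "test_idx", test_size=3)
    return executor


first, second = run_once(seed=42), run_once(seed=42)
```

Both runs produce the same `test_idx` because the executor, not the planner, controls the seed, and `first.lineage("test_idx")` walks the recorded edges back to `load_indices`.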

Minimal scope: A small action space (5 actions), a single dataset, and a single model, intentionally constrained for research clarity.

Research Context

This repository provides the experimental evaluation for:

"R-LAM: Reproducibility-Constrained Large Action Models for Scientific Workflow Automation"

Core contributions:

  1. Formal action schema for LAM-driven workflows
  2. Deterministic execution engine with provenance capture
  3. Replay and forking semantics for iterative experimentation
  4. Experimental validation on a representative ML workflow

This is a research artifact, not a production system.

What This Is Not

  • A production ML platform
  • An autonomous discovery system
  • A claim of novel ML algorithms

Contribution: Reproducibility infrastructure for LAM-driven scientific automation, not the science itself.

Limitations

  • Simple LLM planner (not optimized)
  • Minimal error handling (for clarity)
  • Single dataset and model (Breast Cancer Wisconsin + Logistic Regression)
  • Small action space (5 actions)

These are intentional constraints for isolating execution semantics.

Citation

@article{rlam2026,
  title={R-LAM: Reproducibility-Constrained Large Action Models for Scientific Workflow Automation},
  author={Suriya Sureshkumar},
  year={2026}
}

License

MIT License
