12 changes: 12 additions & 0 deletions README.md
@@ -25,6 +25,18 @@ Then tell the agent what you have to start from: a research question
(`/lc-new`), existing code (`/lc-from-code`), or a paper to reproduce
(`/lc-from-paper`).

Experimental Codex support is also available:

```bash
lc init --agent codex my-analysis
cd my-analysis
codex
```

Codex projects use `AGENTS.md` and `.agents/skills/` instead of `.claude/`
and `CLAUDE.md`. The Codex skills are a smaller experimental bundle, not a
complete port of every Claude workflow.

→ [Full getting-started guide](https://docs.lightconeresearch.org/user/getting-started/)

## Skills
92 changes: 92 additions & 0 deletions codex/lightcone/skills/astra/SKILL.md
@@ -0,0 +1,92 @@
---
name: astra
description: Reference for ASTRA project structure and astra.yaml authoring.
---

# ASTRA Reference

ASTRA projects describe an analysis in `astra.yaml`. Treat that file as the
durable specification for inputs, outputs, decisions, findings, sub-analyses,
containers, and recipes. The code implements the spec; it should not hide
extra analytical choices that are absent from `astra.yaml`.

## What Belongs In The Spec

- Inputs: data, files, external analysis outputs, or other material
dependencies.
- Outputs: concrete artifacts, metrics, tables, plots, reports, or datasets.
- Decisions: methodological choices where multiple defensible options could
affect the result.
- Recipes: commands that materialize outputs through `lc run`.
- Containers: the environment needed to run recipes reproducibly.

Keep output ids stable once results exist. If an output changes meaning, prefer
adding a new output or clearly updating the spec and rerunning the affected
recipes.

## Decisions

A decision is an analytical choice, not a general implementation detail.
Include choices such as thresholds, statistical methods, filtering criteria,
model families, binning, smoothing, priors, convergence criteria, and data
selection rules. Skip choices that should not change the scientific answer,
such as ordinary refactors, plotting style, file formats, or library choices
that produce equivalent results.

Every decision used by code should be parameterized. The recipe should pass it
with `{decisions.<id>}`, and the script should accept the corresponding command
line argument or config value. Do not leave decision values hardcoded in code
while claiming they are represented in `astra.yaml`.

## Outputs And Recipes

Each output should represent one concrete result. Avoid bundling unrelated
metrics or plots into one output just because one script can create them.

Recipes live under outputs and describe how to materialize the output:

```yaml
outputs:
  - id: accuracy
    type: metric
    inputs: [training_data]
    decisions: [model_family, threshold]
    recipe:
      command: >-
        python src/evaluate.py
        --data {inputs.training_data}
        --model {decisions.model_family}
        --threshold {decisions.threshold}
        --output {output}
```

The output should be written under `{output}`. Do not write final artifacts to
untracked ad hoc paths and then copy them into `results/`.
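
As a hedged illustration of the script side of this recipe, the sketch below accepts the four flags from the example command and writes its result inside the `{output}` location. It assumes `{output}` resolves to a directory. The file name `accuracy.json` and the `evaluate` placeholder are hypothetical and not part of ASTRA or lightcone-cli.

```python
import argparse
import json
from pathlib import Path


def parse_args(argv=None):
    # Flag names mirror the recipe command: --data, --model, --threshold, --output.
    parser = argparse.ArgumentParser(description="Evaluate a model (illustrative sketch).")
    parser.add_argument("--data", type=Path, required=True)
    parser.add_argument("--model", required=True)
    parser.add_argument("--threshold", type=float, required=True)
    parser.add_argument("--output", type=Path, required=True)
    return parser.parse_args(argv)


def evaluate(data_path, model_family, threshold):
    # Hypothetical placeholder: a real script would load the data and score the model.
    # The decision values are echoed so the artifact records what produced it.
    return {"model_family": model_family, "threshold": threshold, "accuracy": None}


def main(argv=None):
    args = parse_args(argv)
    # Assumption: {output} is a directory; create it and write only inside it.
    args.output.mkdir(parents=True, exist_ok=True)
    result = evaluate(args.data, args.model, args.threshold)
    out_file = args.output / "accuracy.json"  # hypothetical file name
    out_file.write_text(json.dumps(result, indent=2))
    return out_file

# In a real script, call main() under an `if __name__ == "__main__":` guard.
```

Because every decision arrives as a required flag, the script cannot silently fall back to a value that is absent from `astra.yaml`.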

## Spec-Code Invariant

`astra.yaml` and code must move together:

- New script argument or analytical parameter: add or update the matching
decision and recipe.
- New result: add an output and recipe.
- Changed input data path: update the input declaration.
- Removed or renamed output: update the spec, universes, code, and any
downstream references.
- Changed default behavior: update the baseline universe or default option.

## Validation And Checks

If available, run:

```bash
astra validate astra.yaml
lc status
lc run <changed_output>
lc verify
```

If `astra validate` is unavailable in the environment, say that explicitly and
continue with structural review plus `lc status` / `lc verify` where possible.
Do not ignore validation failures. Fix the spec or explain the remaining
blocker.
96 changes: 96 additions & 0 deletions codex/lightcone/skills/lc-cli/SKILL.md
@@ -0,0 +1,96 @@
---
name: lc-cli
description: Reference for lightcone-cli execution commands and project checks.
---

# lightcone-cli Reference

Use `lc` as the execution surface for ASTRA projects. It ties together
recipes, containers, universes, manifests, and provenance checks. Scripts
can be run directly while debugging, but final analysis outputs must be
materialized through `lc run` so the result has a traceable manifest.

## Commands

```bash
lc init [DIR] # scaffold a project
lc run [OUTPUTS...] # materialize all or selected outputs
lc run output_id --universe NAME # materialize one output in one universe
lc status # show ok, missing, stale, alias, invalid
lc status --json # machine-readable status
lc verify # recompute hashes and validate provenance
lc build # build project containers
lc export wrroc # export a Workflow Run RO-Crate
```

`lc run` is quiet by default. If a run fails, inspect the actual error and fix
the cause; do not hide failures by writing placeholder outputs, weakening
recipes, or bypassing `lc`.

## Core Invariants

- `astra.yaml` is the source of truth for the analysis structure.
- The spec-code invariant must hold: when code, inputs, parameters, outputs,
or recipe commands change, update `astra.yaml` in the same change.
- Results under `results/<universe>/<output_id>/` are not hand-authored
deliverables. They are materialized outputs with `.lightcone-manifest.json`
provenance.
- Do not patch results in place to make status look clean. Update the spec or
code, rerun `lc run`, then verify.
- Do not mask execution failures. A failed run is information that should lead
to a concrete fix.
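
To illustrate why patching a result in place cannot pass verification, here is a generic sketch of a content-hash comparison. It is not Lightcone's manifest format or hashing scheme: the use of SHA-256 and the `recorded` mapping of relative paths to digests are assumptions made only for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(recorded, root):
    """List files that are missing or whose digest differs from the recorded one.

    `recorded` is a hypothetical mapping of relative path -> hex digest
    captured when the output was materialized.
    """
    root = Path(root)
    return sorted(
        rel
        for rel, digest in recorded.items()
        if not (root / rel).is_file() or sha256_of(root / rel) != digest
    )
```

Any edit to a recorded file, however small, changes its digest, which is why the only clean fix is to update the spec or code and rerun `lc run`.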

## Development Flow

1. Edit code and `astra.yaml` together.
2. If the ASTRA CLI is available, run:

```bash
astra validate astra.yaml
```

3. Check what changed:

```bash
lc status
```

4. Materialize the smallest useful target:

```bash
lc run output_id --universe baseline
```

5. After relevant outputs are materialized, run:

```bash
lc verify
```

For multi-output projects, prefer running one upstream output at a time while
debugging. It is usually easier to inspect a direct failure than to run the
whole DAG and debug from a downstream error.
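
The flow above could be scripted roughly as follows. This is a sketch, not part of lightcone-cli: the helper names `plan_checks` and `run_checks` are hypothetical, and it only strings together the commands already listed in this document, skipping `astra validate` when the `astra` executable is not on `PATH`.

```python
import shutil
import subprocess


def plan_checks(output_id, universe="baseline", astra_available=None):
    """Return the development-flow commands in order for one output."""
    if astra_available is None:
        # Detect the ASTRA CLI instead of assuming it is installed.
        astra_available = shutil.which("astra") is not None
    commands = []
    if astra_available:
        commands.append(["astra", "validate", "astra.yaml"])
    commands.append(["lc", "status"])
    commands.append(["lc", "run", output_id, "--universe", universe])
    commands.append(["lc", "verify"])
    return commands


def run_checks(commands):
    """Run each command in order and stop at the first failure."""
    for cmd in commands:
        # check=True raises CalledProcessError on a nonzero exit,
        # so a failed step is surfaced rather than masked.
        subprocess.run(cmd, check=True)
```

Calling `run_checks(plan_checks("accuracy"))` would target a single output, which matches the advice to debug one upstream output at a time.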

## Status Meanings

- `ok`: output has a manifest that matches the current spec and inputs.
- `stale`: spec, code, decisions, or upstream data changed after the last run.
- `missing`: recipe exists, but no current manifest is present.
- `alias`: output is produced by another recipe or references another output.
- `invalid` or verification failures: inspect with `lc verify`, fix the cause,
then rerun the affected recipe.

## Failure Handling

Common causes:

- Recipe command references `{decisions.foo}` or `{inputs.bar}` but the output
does not declare that decision or input.
- Script argument names do not match the recipe command.
- A script writes outside `{output}`, leaving the output directory empty or
incomplete.
- Existing files were copied into `results/` without a manifest.
- A result file was edited after materialization, causing a hash mismatch.

Fix the recipe, script, or spec. Then run `astra validate astra.yaml` if
available, `lc run ...`, `lc status`, and `lc verify`.
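
The first cause in the list can be caught before running anything. The sketch below is a hypothetical pre-flight check, not a lightcone-cli or ASTRA feature (`astra validate` may already cover this). It assumes an output has been loaded into a dict shaped like the `outputs` entries in `astra.yaml`, and that ids are identifier-like.

```python
import re

# Matches {decisions.<id>} and {inputs.<id>}; a bare {output} is ignored.
PLACEHOLDER = re.compile(r"\{(decisions|inputs)\.([A-Za-z_][A-Za-z0-9_]*)\}")


def undeclared_placeholders(output):
    """Return placeholders used in the recipe command but not declared on the output."""
    declared = {
        "decisions": set(output.get("decisions", [])),
        "inputs": set(output.get("inputs", [])),
    }
    command = output.get("recipe", {}).get("command", "")
    # A set removes duplicates when a placeholder appears more than once.
    return sorted(
        {
            f"{kind}.{name}"
            for kind, name in PLACEHOLDER.findall(command)
            if name not in declared[kind]
        }
    )
```

An empty list means every `{decisions.*}` and `{inputs.*}` reference in the command has a matching declaration. A non-empty list names exactly what to add to the output or remove from the command.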
89 changes: 89 additions & 0 deletions codex/lightcone/skills/lc-from-code/SKILL.md
@@ -0,0 +1,89 @@
---
name: lc-from-code
description: Guide for wrapping an existing codebase or script as an ASTRA project.
---

# Existing Code to ASTRA

Use this skill when a project already has scripts, notebooks, or data
processing code and needs an ASTRA specification around it. The goal is to
preserve the existing behavior while making inputs, outputs, decisions, and
recipes explicit in `astra.yaml`.

## Workflow

1. Inventory the existing code before editing it.
2. Identify real analytical outputs and the files or commands that create
them.
3. Identify inputs, dependencies, containers, and hardcoded analytical choices.
4. Draft or augment `astra.yaml`.
5. Parameterize code only as much as needed to match the spec.
6. Run through `lc run`, then inspect with `lc status` and `lc verify`.

## Scan First

For every script or notebook, determine:

- what it reads;
- what it writes;
- how it is invoked;
- which choices are hardcoded;
- which outputs are analytical results versus temporary files;
- which dependencies and environment assumptions it has.

Do not guess from filenames. Read the relevant code. If behavior is unclear,
say what is unclear and inspect the call sites, config files, or notebooks.
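
As a first-pass aid for spotting hardcoded choices, one could list module-level constants with Python's `ast` module. This is only a sketch with a hypothetical helper name: it sees plain `NAME = <number or string>` assignments at module level and misses literals inside functions, call arguments, and negative numbers, so it supplements reading the code rather than replacing it.

```python
import ast


def candidate_constants(source):
    """List module-level NAME = <number|string> assignments as (name, value, line)."""
    found = []
    for node in ast.parse(source).body:
        # Only simple assignments whose right-hand side is a literal constant.
        if not isinstance(node, ast.Assign) or not isinstance(node.value, ast.Constant):
            continue
        if not isinstance(node.value.value, (int, float, str)):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name):
                found.append((target.id, node.value.value, node.lineno))
    return found
```

Each hit is a candidate to review, not automatically a decision: a value such as a threshold likely belongs in `astra.yaml`, while a log format string does not.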

## Draft The Spec

Use `astra.yaml` as the source of truth. For each output, declare:

- `id` and `type`;
- upstream inputs;
- methodological decisions that parameterize it;
- a `recipe.command` that can run from the project root;
- a container at analysis or recipe level when needed.

Use current hardcoded behavior as the baseline default unless the user asks to
change it. If a baseline universe exists, keep it consistent with those
defaults.

## Parameterize Carefully

Make minimal code changes. Do not refactor, rename, or reorganize existing
logic unless it is necessary to make the recipe executable.

Common patterns:

- Add `argparse` flags for hardcoded decision values.
- Pass `{decisions.<id>}` and `{inputs.<id>}` from the recipe command.
- Pass `{output}` as an output directory and write artifacts inside it.
- Convert notebooks into small scripts only when notebook-only execution blocks
reproducibility.
- Add missing dependencies to `requirements.txt` or the project environment
file if imports require them.
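
The first pattern can be applied without changing baseline behavior by making the old hardcoded value the flag's default. The sketch below assumes a hypothetical script that previously filtered rows with a module-level `MIN_SNR = 5.0`; the flag name `--min-snr` and the `select_rows` helper are illustrative only.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Select rows (illustrative sketch).")
    # The default repeats the value that used to be hardcoded, so running the
    # script without the flag behaves exactly as it did before.
    parser.add_argument(
        "--min-snr",
        type=float,
        default=5.0,
        help="Minimum signal-to-noise ratio required to keep a row.",
    )
    return parser


def select_rows(rows, min_snr):
    # Same filtering logic as before; only the source of the value changed.
    return [row for row in rows if row["snr"] >= min_snr]
```

The recipe should still pass the value explicitly, for example with `--min-snr {decisions.min_snr}` for a hypothetical `min_snr` decision, so that the default in code never becomes the only record of the choice.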

Keep the spec-code invariant intact. If a recipe changes, update `astra.yaml`.
If a script gains a new parameter, represent it in the spec when it affects the
analysis.

## Run And Verify

After spec or code changes, run the following, skipping `astra validate` if the
ASTRA CLI is unavailable:

```bash
astra validate astra.yaml
lc status
lc run <changed_output>
lc status
lc verify
```

Do not copy old outputs into `results/` to make the project appear complete.
Do not edit generated result files directly. Do not suppress failing commands
with shell tricks that hide nonzero exits. If `lc run` fails, read the error,
fix the recipe/code/spec, and rerun.

When migration is complete, the baseline run should reproduce the original
behavior as closely as possible, and all final outputs should be traceable
through Lightcone manifests.
77 changes: 77 additions & 0 deletions codex/lightcone/skills/lc-new/SKILL.md
@@ -0,0 +1,77 @@
---
name: lc-new
description: Guide for scoping a new ASTRA analysis from a research question.
---

# New ASTRA Analysis

Use this skill when the user asks to scope a new analysis from a research
question, loose idea, or desired scientific output. The goal is to turn the
conversation into a precise `astra.yaml` before implementation work starts.

Do not treat scoping as a coding task. First establish the analysis structure:
what question is being answered, what data can be used, what outputs count as
answers, and which methodological choices should be explicit decisions.

## Workflow

1. Clarify the research question in the user's terms.
2. Ask what a satisfactory answer would look like.
3. Identify allowed inputs and any data that must not influence the result.
4. Define outputs as concrete artifacts, metrics, tables, plots, or reports.
5. Identify methodological decisions that could affect the result.
6. Draft or update `astra.yaml`.
7. Generate or update a baseline universe when defaults are clear.
8. Validate the spec and hand off implementation only after the structure is
coherent.

## Analysis Structure

Prefer one output per concrete result. Do not make one broad output like
`performance_metrics` if the project actually produces accuracy, calibration,
and a ROC plot. Each independently interpreted artifact should be its own
output.

Split into sub-analyses only when stages have genuinely different inputs,
outputs, or scientific roles. If a training step and evaluation step together
produce one model assessment, they may belong in one analysis. If the stages
are independently meaningful products, sub-analyses may be appropriate.

## Decisions

Probe for decisions beyond obvious method choices:

- data inclusion and exclusion criteria;
- thresholds and quality cuts;
- statistical estimators and uncertainty methods;
- model families or algorithm choices;
- priors, smoothing, binning, and convergence criteria;
- operational definitions of measured quantities.

Skip implementation details that should not affect scientific interpretation.
When unsure, include the candidate decision in `astra.yaml` for review rather
than burying it in code.

## Spec-Code Invariant

During scoping, `astra.yaml` is the source of truth. Once implementation
starts, the code must follow it. If the user asks for code before the spec is
coherent, state the gap and finish the spec first.

After drafting or changing the spec, run the following, skipping
`astra validate` if the ASTRA CLI is unavailable:

```bash
astra validate astra.yaml
lc status
```

If outputs or recipes were implemented as part of the work, continue with:

```bash
lc run <changed_output>
lc verify
```

Do not produce final results by manual edits. Do not hide failed validation or
execution. Surface unresolved questions clearly and keep them in the spec or
project notes where appropriate.
21 changes: 21 additions & 0 deletions codex/lightcone/templates/AGENTS.md
@@ -0,0 +1,21 @@
# AGENTS.md

This is an ASTRA analysis project orchestrated by `lightcone-cli`.

The source of truth is `astra.yaml`. Keep the spec, recipes, scripts, and
recorded outputs in sync. Do not edit or replace files under `results/`
without updating the relevant `astra.yaml` recipe or decisions and
rematerializing the output through `lc run`.

Use the `lc` commands for execution and checks:

```bash
lc run # materialize outputs in the default universe
lc run output_id # materialize one output
lc status # inspect missing, stale, and current outputs
lc verify # validate manifests and provenance integrity
```

When changing analysis code, update `astra.yaml` in the same change if the
inputs, outputs, decisions, parameters, or command recipes changed. Run
`lc status` and `lc verify` before considering the project state complete.