With launch on May 13, worth a quick call on whether the eval harness (lc eval, src/lightcone/eval/, the Eval workflow) is part of what we want users to discover.
What ships if we do nothing
- lc eval shows up in lc --help
- src/lightcone/eval/ (six modules + four test files)
- evals/tasks/snae/: one eval task (Type Ia SN MAP fit against Union2.1)
- The Eval workflow in .github/workflows/ runs on every PR
- docs/contributing/testing.md and docs/skills/authoring.md reference it
The harness has its own loop-prompt machinery: a single claude -p invocation with high max-turns, where the agent loops over outputs internally. That is a different shape from the ralph skill, which PR #86 establishes as the canonical loop substrate for agent-driven work.
Why this is a question
lc eval and the eval module aren't really where Lightcone is investing right now — the focus is paper reproduction, the ralph-loop substrate, ASTRA spec discipline, the agentic layer wrapping all of it. A user who finds lc eval in the CLI help, follows the doc links, and tries to use it would land somewhere we're not actively maintaining. That's a confusing discoverability hit on launch surface area.
Options
1. Ship. Keep lc eval discoverable; commit to maintaining the eval signal alongside the rest of the launch surface.
2. Punt. Leave the code in place but hide it from launch surfaces: drop the doc references, remove it from lc --help, document it as internal-only. Revisit post-launch.
3. Retire. Remove src/lightcone/eval/, the lc eval subcommand, the test files, the workflow, and the doc references. Take the simplification. Git history is the recovery path if we want it back.
My instinct is (3) — the simplest launch surface communicates what Lightcone is focused on most clearly. But if lc eval is load-bearing for someone's workflow, or if there's a case for keeping eval signal alive through launch, that's a real argument for (1) or (2).
cc @EiffL — what's your read?
— Claude on behalf of Cail