Chunk Elicitation Replication

This folder is a standalone replication bundle for the chunk-elicitation simulation and analysis pipeline. Simulation outputs are saved as local JSON files, and the analysis script reads those files to regenerate LaTeX tables and figures.

Repository Layout

src/db_ops/: local JSON database layer with a small PyMongo-like API.
scripts/01_Process_Benchmark.py: rebuilds processed benchmark JSON from raw benchmark files.
scripts/02_Run_Experiment_1.py: runs Experiment 1 into data/exp1/.
scripts/03_Run_Experiment_2.py: runs Experiment 2 into data/exp2/.
scripts/04_Run_Experiment_3.py: runs Experiment 3 into data/exp3/.
scripts/05_Run_Analysis.py: reads local JSON and writes tex/ artifacts.
data/raw/: duplicated raw benchmark inputs used by the benchmark processor.
data/benchmark/benchmarks.json: generated benchmark records.
data/exp*/simulations.json: simulation-level records.
data/exp*/simulation_sessions.json: session-level model outputs.
tex/tables/ and tex/figs/: analysis outputs for paper tables/figures.

Setup

From this folder:

uv sync
cp .env.example .env

Fill in OPENROUTER_API_KEY in .env if you plan to run new simulations. The included runner scripts default to OpenRouter model IDs.

Rebuild Benchmarks From Raw Files

The raw benchmark inputs are duplicated inside this replication folder under data/raw/. The benchmark processor does not read from the main project and does not use MongoDB. It rebuilds the processed benchmark JSON from scratch.

uv run python scripts/01_Process_Benchmark.py

This overwrites:

data/benchmark/benchmarks.json
data/benchmark/benchmark_manifest.json

The JSON shape matches the main project benchmark documents:

{
  "_id": "uuid",
  "game_type": "Dictator",
  "decisions": [[50], [0], [20]]
}
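
As a quick sanity check after rebuilding, a record of this shape can be validated with a few type checks. The following is a minimal sketch, not part of the bundled scripts: the helper name is_benchmark_record is hypothetical, and it assumes each entry in decisions is a list of numbers, as in the example above.

```python
from typing import Any


def is_benchmark_record(record: Any) -> bool:
    """Return True if `record` has the three fields shown above with plausible types."""
    if not isinstance(record, dict):
        return False
    if not isinstance(record.get("_id"), str):
        return False
    if not isinstance(record.get("game_type"), str):
        return False
    decisions = record.get("decisions")
    if not isinstance(decisions, list):
        return False
    # Assumption: each decision is itself a list of numbers, e.g. [50].
    for decision in decisions:
        if not isinstance(decision, list):
            return False
        for value in decision:
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                return False
    return True


example = {"_id": "uuid", "game_type": "Dictator", "decisions": [[50], [0], [20]]}
print(is_benchmark_record(example))
```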

Running Simulations

First inspect each plan without calling any LLM APIs:

uv run python scripts/02_Run_Experiment_1.py --dry-run
uv run python scripts/03_Run_Experiment_2.py --dry-run
uv run python scripts/04_Run_Experiment_3.py --dry-run

Then run a script, passing --yes to skip the confirmation prompt:

uv run python scripts/02_Run_Experiment_1.py --yes --max-workers 1

Each runner saves to its experiment folder. For example, Experiment 1 writes data/exp1/simulations.json and data/exp1/simulation_sessions.json.

The top-level output folder is controlled by --data-root:

uv run python scripts/03_Run_Experiment_2.py --data-root data --yes
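
The relationship between --data-root and the per-experiment output files can be sketched as follows. This is an illustration of the folder layout described above, not the runners' actual code; the function name experiment_paths is hypothetical.

```python
from pathlib import Path


def experiment_paths(data_root: str, experiment: int) -> dict[str, Path]:
    """Build the two output paths for one experiment under a given data root."""
    # Layout taken from this README: <data-root>/exp<N>/<collection>.json
    folder = Path(data_root) / f"exp{experiment}"
    return {
        "simulations": folder / "simulations.json",
        "simulation_sessions": folder / "simulation_sessions.json",
    }


paths = experiment_paths("data", 2)
print(paths["simulations"].as_posix())          # data/exp2/simulations.json
print(paths["simulation_sessions"].as_posix())  # data/exp2/simulation_sessions.json
```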

Running Analysis

After rebuilding benchmarks and running simulations:

uv run python scripts/05_Run_Analysis.py

The analysis script reads data/exp1, data/exp2, data/exp3, and data/benchmark, then writes:

tex/tables/*.tex
tex/figs/*.png
tex/result.tex

The generated tex/result.tex is a compact article-style wrapper that inputs the tables and figures.

Local JSON Contract

Each experiment folder stores two collections:

simulations.json: one record per simulation configuration.
simulation_sessions.json: one record per LLM call/session.

The simulation record keeps references to session IDs:

{
  "_id": "simulation uuid",
  "phase_name": "phase_2",
  "simulation_config": {"game_type": "Dictator"},
  "instruction_config": {"explain_reasoning": true},
  "llm_config": {"model": "openai/gpt-5.2"},
  "simulation_sessions": ["session uuid"],
  "failed_sessions": [],
  "completed": true
}
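
Following those references from a simulation to its session records might look like the sketch below. It assumes that session records carry an "_id" field matching the IDs listed in simulation_sessions, which this README does not state explicitly; the helper name sessions_for and the "output" field are hypothetical.

```python
def sessions_for(simulation: dict, sessions: list[dict]) -> list[dict]:
    """Return the session records referenced by one simulation, in listed order."""
    # Assumption: each session record has an "_id" equal to the referenced ID.
    by_id = {session["_id"]: session for session in sessions}
    return [
        by_id[session_id]
        for session_id in simulation["simulation_sessions"]
        if session_id in by_id  # skip references with no stored record
    ]


simulation = {
    "_id": "simulation uuid",
    "simulation_sessions": ["s1", "s3"],
    "failed_sessions": [],
    "completed": True,
}
sessions = [
    {"_id": "s1", "output": "placeholder"},
    {"_id": "s2", "output": "placeholder"},
]
print([s["_id"] for s in sessions_for(simulation, sessions)])  # ['s1']
```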

The local database layer supports the small subset of queries used by the simulation and analysis code: find, find_one, dotted keys, $in, $or, and $exists.
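
To make that query subset concrete, here is a minimal sketch of how such filters could be evaluated against a list of in-memory records. It is not the code in src/db_ops/: the function signatures are hypothetical, and the semantics are simplified relative to MongoDB (for example, $in does not match inside array-valued fields).

```python
from typing import Any, Optional

_MISSING = object()  # sentinel for "key not present"


def get_dotted(doc: Any, key: str) -> Any:
    """Resolve a dotted key such as 'simulation_config.game_type'."""
    current = doc
    for part in key.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


def matches(doc: dict, query: dict) -> bool:
    """Return True if `doc` satisfies every condition in `query`."""
    for key, cond in query.items():
        if key == "$or":
            # At least one sub-query must match.
            if not any(matches(doc, sub) for sub in cond):
                return False
            continue
        value = get_dotted(doc, key)
        if isinstance(cond, dict) and any(k.startswith("$") for k in cond):
            for op, arg in cond.items():
                if op == "$in":
                    if value is _MISSING or value not in arg:
                        return False
                elif op == "$exists":
                    if (value is not _MISSING) != bool(arg):
                        return False
                else:
                    raise ValueError(f"unsupported operator: {op}")
        elif value is _MISSING or value != cond:
            # Plain equality on the (possibly nested) field.
            return False
    return True


def find(docs: list[dict], query: dict) -> list[dict]:
    return [doc for doc in docs if matches(doc, query)]


def find_one(docs: list[dict], query: dict) -> Optional[dict]:
    return next((doc for doc in docs if matches(doc, query)), None)


# Illustrative records; "OtherGame" and "phase_1" are placeholder values.
docs = [
    {"_id": "a", "phase_name": "phase_1",
     "simulation_config": {"game_type": "Dictator"}, "completed": True},
    {"_id": "b", "phase_name": "phase_2",
     "simulation_config": {"game_type": "OtherGame"}},
]
print(find_one(docs, {"simulation_config.game_type": "Dictator"})["_id"])   # a
print([d["_id"] for d in find(docs, {"completed": {"$exists": False}})])    # ['b']
```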
