lit-builder

Pull ML conference paper lists, filter them through the Sutro Group lens (energy-efficient training, plus training efficiency more broadly), and produce an annotated, browsable literature base.

Live site: https://0bserver07.github.io/iclr-lit-builder/

Source: papercopilot/paperlists. Starts with ICLR 2026 (~20K records); generalizes to NeurIPS, ICML, etc.

Latest run — ICLR 2026

Metric	Count
Papers ingested	19,813
Keyword-filtered (sent to LLM)	4,842
Score 3 (directly Sutro-relevant)	40
Score 2 (relevant)	25
Score 1 (tangential)	134
Score 0 (rejected)	4,643
Markdown rendered	4,196 files

→ Browse the score-3 list

Pipeline

fetch  →  ingest  →  filter (keywords)  →  score (LLM)  →  deepen (on demand)  →  render

Each stage is a CLI subcommand and writes to a SQLite database (data/db/lit.sqlite). The markdown / MkDocs site is generated from the DB.
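
As a rough illustration of how results written by one stage might be read back from the database, the sketch below tallies papers by score. The table and column names (`papers`, `id`, `title`, `score`) and the sample rows are assumptions made for illustration, not the project's actual schema (that lives in models.py). It uses an in-memory database so it runs on its own.

```python
import sqlite3


def score_counts(conn: sqlite3.Connection) -> dict[int, int]:
    """Return {score: number_of_papers} for rows that have been scored."""
    rows = conn.execute(
        "SELECT score, COUNT(*) FROM papers "
        "WHERE score IS NOT NULL "
        "GROUP BY score ORDER BY score DESC"
    ).fetchall()
    return dict(rows)


# Hypothetical schema and rows; the real DB is data/db/lit.sqlite.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE papers (id TEXT PRIMARY KEY, title TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO papers VALUES (?, ?, ?)",
    [
        ("a", "Paper A", 3),
        ("b", "Paper B", 0),
        ("c", "Paper C", 0),
        ("d", "Paper D", None),  # filtered but not yet scored
    ],
)
counts = score_counts(conn)
print(counts)
```

Keeping every stage's output in one SQLite file means a query like this can be run at any point in the pipeline, independent of the rendered site.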

LLM provider — pick one

The scoring stage uses an LLM. Two providers are supported, controlled by LIT_PROVIDER. Override the model at any time with LIT_MODEL=<name>.

# Anthropic (default)
export LIT_PROVIDER=anthropic
export ANTHROPIC_API_KEY=sk-ant-...
# default model: claude-haiku-4-5-20251001
# override:  export LIT_MODEL=claude-sonnet-4-6

# Ollama Cloud
export LIT_PROVIDER=ollama
export OLLAMA_API_KEY=...
# default model: deepseek-v4-pro:cloud

# Ollama local (no key needed)
export LIT_PROVIDER=ollama
export OLLAMA_HOST=http://localhost:11434
export LIT_MODEL=llama3.1:8b
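
The environment variables above could be resolved along the following lines. This is a minimal sketch, not the project's code: the function name `resolve_llm` and the error handling are hypothetical, while the provider names and default models are the ones listed in this README.

```python
import os

# Default model per provider, as stated in this README.
DEFAULT_MODELS = {
    "anthropic": "claude-haiku-4-5-20251001",
    "ollama": "deepseek-v4-pro:cloud",
}


def resolve_llm(env=None) -> tuple[str, str]:
    """Return (provider, model) from LIT_PROVIDER / LIT_MODEL (hypothetical helper)."""
    env = os.environ if env is None else env
    # Anthropic is the documented default provider.
    provider = env.get("LIT_PROVIDER", "anthropic").strip().lower()
    if provider not in DEFAULT_MODELS:
        raise ValueError(f"unsupported LIT_PROVIDER: {provider!r}")
    # LIT_MODEL, when set and non-empty, overrides the provider default.
    model = env.get("LIT_MODEL") or DEFAULT_MODELS[provider]
    return provider, model


provider, model = resolve_llm({"LIT_PROVIDER": "ollama"})
print(provider, model)
```

Passing a plain dict instead of reading `os.environ` directly keeps the helper easy to test without touching the real environment.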

Solid Ollama Cloud models

Verified to work with the scoring prompt (lit score) and the deepen prompt (lit deepen). Swap with LIT_MODEL=<name>.

Model	Notes
`deepseek-v4-pro:cloud`	Default. Reasoning model; ~5s per paper at 200-token limit. Best price/quality.
`deepseek-v4-flash:cloud`	Faster, lower latency, slightly less robust on edge cases.
`gpt-oss:120b`	Strong general scorer. Slightly heavier than deepseek-v4-pro.
`qwen3:235b-cloud`	Largest. Best for the deepen stage on borderline papers.
`llama3.1:70b`	Solid baseline; available locally too.

Solid local models (Ollama, no API key)

LIT_MODEL=llama3.1:8b      # 4.7 GB, fast, decent
LIT_MODEL=qwen3:14b        # 8 GB, better reasoning
LIT_MODEL=deepseek-v4:7b   # 4 GB, distilled reasoning model

Quickstart

pip install -e .
# or with uv: uv sync && uv run lit ...

lit fetch  iclr2026
lit ingest iclr2026
lit filter iclr2026                       # keyword pre-filter
lit score  iclr2026 --limit 200           # LLM triage on survivors (0–3 + reason)
lit list   iclr2026 --min-score 2         # browse high-relevance
lit deepen iclr2026 <paper_id>            # structured digest on demand
lit render iclr2026                       # write markdown + mkdocs nav
lit serve                                 # local mkdocs preview

Real example output

Scoring 5 ICLR 2026 candidates on deepseek-v4-pro:cloud (Ollama Cloud), ~5s per paper:

score	title	reason
2	PersonalQ: Select, Quantize, and Serve Personalized Diffusion	Quantization technique for personalized diffusion models that reduces inference memory, aligning with low-precision research.
2	Reassessing Layer Pruning in LLMs	Layer pruning to reduce computation, directly addressing efficiency and model compression.
1	Toward Unifying Group Fairness Evaluation from a Sparsity Perspective	References sparsity but only as a lens for fairness evaluation, not as a contribution to training efficiency.
1	Early Layer Readouts for Robust Knowledge Distillation	Domain generalization via adaptive distillation, only tangential efficiency link.
0	Concept Alignment for Autonomous Distillation	Robustness and bias mitigation, not energy-efficient training.

CLI as a tool

The CLI is designed to be called by other coding agents (Codex, Claude Code). Every command takes positional args, exits non-zero on error, and prints structured key=value output. See lit --help.
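
A calling agent might wrap the CLI roughly as follows. This is a sketch under assumptions: the exact key names and quoting of the output are not documented here, so the sample line and the helper names `run_lit` and `parse_kv` are hypothetical. Only the general contract (positional args, non-zero exit on error, key=value output) comes from this README.

```python
import shlex
import subprocess


def parse_kv(output: str) -> dict[str, str]:
    """Parse whitespace-separated key=value tokens; other tokens are ignored."""
    result = {}
    for line in output.splitlines():
        for token in shlex.split(line):  # honours quoted values
            key, sep, value = token.partition("=")
            if sep and key:
                result[key] = value
    return result


def run_lit(*args: str) -> dict[str, str]:
    """Run `lit <args>`; raises CalledProcessError on a non-zero exit."""
    proc = subprocess.run(
        ["lit", *args], capture_output=True, text=True, check=True
    )
    return parse_kv(proc.stdout)


# Hypothetical output line, for illustration only.
sample = 'venue=iclr2026 scored=200 status="ok"'
parsed = parse_kv(sample)
print(parsed)
```

With `check=True`, a failing command surfaces as an exception, so an agent can call `run_lit("fetch", "iclr2026")` and handle errors without inspecting the output text.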

Layout

src/lit_builder/
  models.py        # shared dataclasses + SQLite DDL
  config.py        # paths, venue registry
  data/            # papercopilot fetch + SQLite ingest
  filter/          # keyword matcher
  score/           # LLM scorer + deepener (Anthropic / Ollama)
  render/          # markdown + mkdocs export
  cli/             # typer commands
configs/keywords.yaml   # editable keyword groups
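
The keyword pre-filter can be pictured as matching each paper's text against named groups of terms. The sketch below assumes the groups form a mapping from group name to phrases. The group names, phrases, and the function `matched_groups` are hypothetical and do not reflect the actual contents of configs/keywords.yaml or the matcher in filter/.

```python
import re

# Hypothetical keyword groups; the real ones are edited in configs/keywords.yaml.
KEYWORD_GROUPS = {
    "quantization": ["quantization", "low-precision"],
    "pruning": ["pruning", "sparsity"],
}


def matched_groups(text: str, groups: dict[str, list[str]]) -> list[str]:
    """Return the names of groups with at least one whole-word phrase in text."""
    hits = []
    for name, phrases in groups.items():
        # Escape phrases so characters like '-' are matched literally.
        pattern = r"\b(?:" + "|".join(re.escape(p) for p in phrases) + r")\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            hits.append(name)
    return hits


print(matched_groups("Reassessing Layer Pruning in LLMs", KEYWORD_GROUPS))
```

A paper that matches no group would be dropped before the LLM scoring stage, which is what keeps the number of scored papers well below the number ingested.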

Status

Stage	iclr2026
fetch	done — 19,813 raw records (93 MB)
ingest	done — 19,813 in DB
filter	done — 4,842 keyword candidates
score	done — 4,842 / 4,842 LLM-scored via `deepseek-v4-pro:cloud` (40 at score 3, 25 at 2, 134 at 1, 4,643 at 0)
deepen	implemented; on-demand per paper
render	done — 4,196 markdown pages + index
publish	live at https://0bserver07.github.io/iclr-lit-builder/

Tests

pip install pytest
PYTHONPATH=src python3 -m pytest tests -q       # 33 tests; every LLM call is mocked
