0bserver07/iclr-lit-builder

lit-builder

Pull ML conference paper lists, filter them through the Sutro Group lens (energy-efficient training plus broader training efficiency), and produce an annotated, browsable literature base.

Live site: https://0bserver07.github.io/iclr-lit-builder/

Source: papercopilot/paperlists. Starts with ICLR 2026 (~20K records); generalizes to NeurIPS, ICML, etc.

Latest run — ICLR 2026

| Metric | Count |
| --- | --- |
| Papers ingested | 19,813 |
| Keyword-filtered (sent to LLM) | 4,842 |
| Score 3 (directly Sutro-relevant) | 40 |
| Score 2 (relevant) | 25 |
| Score 1 (tangential) | 134 |
| Score 0 (rejected) | 4,643 |
| Markdown rendered | 4,196 files |
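
As a quick sanity check on the run above, the score buckets should add up to the number of keyword-filtered papers. The sketch below only re-does that arithmetic in Python; the variable names are illustrative, and the time estimate assumes papers are scored one at a time at the roughly 5 s per paper quoted for deepseek-v4-pro:cloud.

```python
# Figures copied from the "Latest run" table; names are illustrative only.
ingested = 19_813
keyword_filtered = 4_842
score_counts = {3: 40, 2: 25, 1: 134, 0: 4_643}

# Every keyword candidate received exactly one score.
assert sum(score_counts.values()) == keyword_filtered

filter_rate = keyword_filtered / ingested      # share that survives the keyword filter
relevant = score_counts[3] + score_counts[2]   # papers at score 2 or higher
relevant_rate = relevant / keyword_filtered    # share of candidates judged relevant

# Assumption: sequential scoring at about 5 seconds per paper.
est_hours = keyword_filtered * 5 / 3600

print(f"keyword filter keeps {filter_rate:.1%} of ingested papers")   # 24.4%
print(f"{relevant} papers ({relevant_rate:.1%}) score 2 or higher")   # 65 papers (1.3%)
print(f"estimated sequential scoring time: {est_hours:.1f} h")        # 6.7 h
```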

→ Browse the score-3 list

Pipeline

fetch  →  ingest  →  filter (keywords)  →  score (LLM)  →  deepen (on demand)  →  render

Each stage is a CLI subcommand and writes to a SQLite database (data/db/lit.sqlite). The markdown / MkDocs site is generated from the DB.
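
Because every stage reads and writes the same SQLite file, results can also be inspected with plain SQL. The sketch below is an assumption-heavy illustration: the real schema lives in models.py and is not shown here, so the `papers` table, its columns, and the `p1`…`p4` IDs are hypothetical, and an in-memory database stands in for data/db/lit.sqlite.

```python
import sqlite3

def papers_at_or_above(conn, venue, min_score):
    """Return (id, title, score) rows for one venue at or above min_score."""
    rows = conn.execute(
        "SELECT id, title, score FROM papers "
        "WHERE venue = ? AND score >= ? "
        "ORDER BY score DESC, title",
        (venue, min_score),
    )
    return rows.fetchall()

# Hypothetical schema and IDs; the titles and scores come from the example output.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE papers (id TEXT PRIMARY KEY, venue TEXT, title TEXT, score INTEGER)"
)
conn.executemany(
    "INSERT INTO papers VALUES (?, ?, ?, ?)",
    [
        ("p1", "iclr2026", "Reassessing Layer Pruning in LLMs", 2),
        ("p2", "iclr2026", "PersonalQ: Select, Quantize, and Serve Personalized Diffusion", 2),
        ("p3", "iclr2026", "Early Layer Readouts for Robust Knowledge Distillation", 1),
        ("p4", "iclr2026", "Concept Alignment for Autonomous Distillation", 0),
    ],
)

for paper_id, title, score in papers_at_or_above(conn, "iclr2026", 2):
    print(score, paper_id, title)
```

This mirrors what `lit list iclr2026 --min-score 2` is described as doing, without claiming to reproduce its implementation.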

LLM provider — pick one

The scoring stage uses an LLM. Two providers are supported, anthropic and ollama, selected with LIT_PROVIDER; Ollama can run against Ollama Cloud or a local server. Override the model at any time with LIT_MODEL=<name>.

# Anthropic (default)
export LIT_PROVIDER=anthropic
export ANTHROPIC_API_KEY=sk-ant-...
# default model: claude-haiku-4-5-20251001
# override:  export LIT_MODEL=claude-sonnet-4-6

# Ollama Cloud
export LIT_PROVIDER=ollama
export OLLAMA_API_KEY=...
# default model: deepseek-v4-pro:cloud

# Ollama local (no key needed)
export LIT_PROVIDER=ollama
export OLLAMA_HOST=http://localhost:11434
export LIT_MODEL=llama3.1:8b
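
The precedence described above (provider from LIT_PROVIDER, model from LIT_MODEL, otherwise the provider's default) can be sketched as follows. This is a minimal illustration, not the project's actual code: `resolve_llm` and `DEFAULT_MODELS` are hypothetical names, and only the environment variables and default model names are taken from this README.

```python
import os

# Default model per provider, as listed in the snippets above.
DEFAULT_MODELS = {
    "anthropic": "claude-haiku-4-5-20251001",
    "ollama": "deepseek-v4-pro:cloud",
}

def resolve_llm(env=None):
    """Return (provider, model) from an environment mapping.

    Hypothetical helper: LIT_PROVIDER falls back to anthropic, and
    LIT_MODEL, when set and non-empty, overrides the provider default.
    """
    env = os.environ if env is None else env
    provider = env.get("LIT_PROVIDER", "anthropic").strip().lower()
    if provider not in DEFAULT_MODELS:
        raise ValueError(f"unsupported LIT_PROVIDER: {provider!r}")
    model = env.get("LIT_MODEL") or DEFAULT_MODELS[provider]
    return provider, model

print(resolve_llm({}))
print(resolve_llm({"LIT_PROVIDER": "ollama", "LIT_MODEL": "llama3.1:8b"}))
```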

Solid Ollama Cloud models

These models are verified to work with the scoring prompt (lit score) and the deepen prompt (lit deepen). Swap between them with LIT_MODEL=<name>.

| Model | Notes |
| --- | --- |
| deepseek-v4-pro:cloud | Default. Reasoning model; ~5s per paper at 200-token limit. Best price/quality. |
| deepseek-v4-flash:cloud | Faster, lower latency, slightly less robust on edge cases. |
| gpt-oss:120b | Strong general scorer. Slightly heavier than deepseek-v4-pro. |
| qwen3:235b-cloud | Largest. Best for the deepen stage on borderline papers. |
| llama3.1:70b | Solid baseline; available locally too. |

Solid local models (Ollama, no API key)

LIT_MODEL=llama3.1:8b      # 4.7 GB, fast, decent
LIT_MODEL=qwen3:14b        # 8 GB, better reasoning
LIT_MODEL=deepseek-v4:7b   # 4 GB, distilled reasoning model

Quickstart

pip install -e .
# or with uv: uv sync && uv run lit ...

lit fetch  iclr2026
lit ingest iclr2026
lit filter iclr2026                       # keyword pre-filter
lit score  iclr2026 --limit 200           # LLM triage on survivors (0–3 + reason)
lit list   iclr2026 --min-score 2         # browse high-relevance
lit deepen iclr2026 <paper_id>            # structured digest on demand
lit render iclr2026                       # write markdown + mkdocs nav
lit serve                                 # local mkdocs preview

Real example output

Scoring 5 ICLR 2026 candidates on deepseek-v4-pro:cloud (Ollama Cloud), ~5s per paper:

| score | title | reason |
| --- | --- | --- |
| 2 | PersonalQ: Select, Quantize, and Serve Personalized Diffusion | Quantization technique for personalized diffusion models that reduces inference memory, aligning with low-precision research. |
| 2 | Reassessing Layer Pruning in LLMs | Layer pruning to reduce computation, directly addressing efficiency and model compression. |
| 1 | Toward Unifying Group Fairness Evaluation from a Sparsity Perspective | References sparsity but only as a lens for fairness evaluation, not as a contribution to training efficiency. |
| 1 | Early Layer Readouts for Robust Knowledge Distillation | Domain generalization via adaptive distillation, only tangential efficiency link. |
| 0 | Concept Alignment for Autonomous Distillation | Robustness and bias mitigation, not energy-efficient training. |

CLI as a tool

The CLI is designed to be called by other coding agents (Codex, Claude Code). Every command takes positional args, exits non-zero on error, and prints structured key=value output. See lit --help.
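
For an agent consuming that output, a small parser is usually enough. The sketch below assumes one key=value pair per line, which this README does not spell out; `parse_kv` and the sample keys (`venue`, `scored`) are hypothetical and only illustrate the idea.

```python
def parse_kv(output):
    """Parse key=value lines into a dict, skipping lines without '='.

    Assumption: one pair per line; values may themselves contain '='.
    """
    result = {}
    for line in output.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, _, value = line.partition("=")  # split on the first '=' only
        result[key.strip()] = value.strip()
    return result

# Hypothetical sample; real keys depend on the subcommand.
sample = "venue=iclr2026\nscored=200\n"
print(parse_kv(sample))
```

In practice the text would come from something like `subprocess.run(["lit", ...], capture_output=True, text=True)`, checking `returncode` first since the commands exit non-zero on error.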

Layout

src/lit_builder/
  models.py        # shared dataclasses + SQLite DDL
  config.py        # paths, venue registry
  data/            # papercopilot fetch + SQLite ingest
  filter/          # keyword matcher
  score/           # LLM scorer + deepener (Anthropic / Ollama)
  render/          # markdown + mkdocs export
  cli/             # typer commands
configs/keywords.yaml   # editable keyword groups
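
To make the keyword pre-filter concrete, here is a minimal sketch of group-based matching. It does not reproduce the code in filter/ or the contents of configs/keywords.yaml: the group names, keywords, and `matched_groups` function are hypothetical, and a plain dict stands in for the YAML file.

```python
import re

# Hypothetical keyword groups; the real ones live in configs/keywords.yaml.
KEYWORD_GROUPS = {
    "low_precision": ["quantization", "low-precision"],
    "sparsity": ["pruning", "sparsity"],
}

def matched_groups(text, groups):
    """Return the names of groups with at least one whole-word keyword hit."""
    lowered = text.lower()
    hits = []
    for name, keywords in groups.items():
        for keyword in keywords:
            pattern = r"\b" + re.escape(keyword.lower()) + r"\b"
            if re.search(pattern, lowered):
                hits.append(name)
                break  # one hit is enough for this group
    return hits

print(matched_groups("Reassessing Layer Pruning in LLMs", KEYWORD_GROUPS))
print(matched_groups("Concept Alignment for Autonomous Distillation", KEYWORD_GROUPS))
```

A paper with no matching group would be dropped before the LLM stage, which is what keeps the scored set to a fraction of the ingested records.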

Status

| Stage | iclr2026 |
| --- | --- |
| fetch | done: 19,813 raw records (93 MB) |
| ingest | done: 19,813 in DB |
| filter | done: 4,842 keyword candidates |
| score | done: 4,842 / 4,842 LLM-scored via deepseek-v4-pro:cloud (40 at score 3, 25 at 2, 134 at 1, 4,643 at 0) |
| deepen | implemented; on-demand per paper |
| render | done: 4,196 markdown pages + index |
| publish | live at https://0bserver07.github.io/iclr-lit-builder/ |

Tests

pip install pytest
PYTHONPATH=src python3 -m pytest tests -q       # 33 tests; every LLM call is mocked

About

ICLR/NeurIPS literature builder with keyword + LLM relevance scoring (Anthropic or Ollama). Targets the Sutro Group energy-efficient training lens.
