Turn a rough idea into a usable, verifiable pack.
One idea. One pack. Every claim backed by evidence.
Open this repo in your favorite AI coding tool, then say:
Build a pack for: "Should I invest in a coffee shop franchise?"
That's it. The agent reads the skill, executes 15 pipeline stages, and delivers a complete, evidence-backed pack.
IdeaClaw is agent-first: it runs inside AI coding tools as a skill. No API key is needed, because the IDE's built-in LLM is the execution engine.
| IDE | Entry Point | How to Use |
|---|---|---|
| Claude Code | `.claude/skills/ideaclaw/SKILL.md` | Auto-detected when you open this repo |
| Codex (OpenAI) | `AGENTS.md` | Auto-detected by Codex agents |
| Cursor / Windsurf / Others | `CLAUDE.md` | Point the agent to this file |
| Standalone CLI (BYOK) | `ideaclaw run` | Bring your own LLM API key |
git clone https://github.com/StartripAI/ideaClaw.git
cd ideaClaw
# Claude Code auto-reads .claude/skills/, so just prompt:
# "Build a pack for: Should I switch from React to Vue?"

git clone https://github.com/StartripAI/ideaClaw.git
cd ideaClaw
pip install -e .
# Interactive login (supports OpenAI, Anthropic, DeepSeek, OpenRouter, Groq, Together)
ideaclaw login
# Or bring your own key
export OPENAI_API_KEY="sk-..."
# Run!
ideaclaw run --idea "Should I quit my job?" --auto-approve

You think it. IdeaClaw builds it.
Drop a rough idea, get back a complete, evidence-backed pack with:
- Conclusion summary: the bottom line, in one paragraph
- Reasoning tree: every claim traced to a source
- Counterarguments: what could go wrong
- Uncertainties: what we don't know yet
- Action items: what to do next
- Trust review: PQS score with 7-dimension rubric
- Shareable document: Markdown / DOCX / JSON

Not just another AI writer. IdeaClaw tells you:
- Which claims have strong evidence
- Which claims need your verification
- Which claims lack sufficient evidence
- Full audit trail: every decision is traceable
| Pack | Use Case |
|---|---|
| `decision` | Should I switch jobs? Is this purchase worth it? |
| `proposal` | Startup pitch, project application, partnership proposal |
| `comparison` | Product A vs B, City X vs Y, Framework comparison |
| `brief` | Complaint letter, business memo, learning report |
| `study` | Industry analysis, trend research, market overview |
IdeaClaw packs aren't one-size-fits-all. Each output is evaluated against real-world standards for its domain.
One engine × 122 YAML profiles = every scenario covered.
# Write an ICML paper (auto-detects the cs_ml.icml profile)
ideaclaw run --idea "Write an ICML paper on transformer efficiency"
# Specify a medical profile
ideaclaw run --idea "Systematic review of treatment X" --profile medical.systematic_review
# List all available profiles
ideaclaw profiles
ideaclaw profiles --domain medical

| Domain | Profiles | Standards |
|---|---|---|
| cs_ml | 9 | ICML, NeurIPS, ICLR, ACL, AAAI, CVPR, KDD review forms |
| science | 8 | Nature, IEEE, ACM, PLOS ONE |
| medical | 12 | CONSORT (25 items), PRISMA (27 items), STROBE, CARE, ICH |
| business | 10 | McKinsey MECE + Pyramid, BCG SCQA, Amazon 6-Pager |
| finance | 8 | YC/Sequoia memo, CFA, GRI/SASB/TCFD |
| education | 10 | APA/MLA, Bloom's Taxonomy, Common App |
| grants | 8 | NIH 5-criteria (1-9), NSF Merit + Impacts, EU Horizon (0-5) |
| legal | 8 | IRAC, Bluebook, RAND policy brief |
| professional | 9 | Google SRE postmortem, ADR, IEEE-ISO |
| journalism | 6 | AP Stylebook, Pulitzer, IFCN fact-check |
| government | 6 | CIA ICD 203, NEPA, OECD RIA |
| marketing | 6 | E-E-A-T, AIDA |
| creative | 6 | Flesch-Kincaid, Final Draft |
| hr_ops | 6 | SMART goals, ADDIE |
| general | 10 | IdeaClaw's 5 pack types + 5 new |
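The profile names used in the commands above (`cs_ml.icml`, `medical.systematic_review`) look like `domain.name` pairs. As a minimal sketch under that assumption, the snippet below splits such an ID and checks the domain against the table. `DOMAIN_PROFILE_COUNTS` and `split_profile_id` are hypothetical names for illustration, not part of IdeaClaw's API.

```python
from typing import Tuple

# Profile counts per domain, copied from the table above.
DOMAIN_PROFILE_COUNTS = {
    "cs_ml": 9, "science": 8, "medical": 12, "business": 10, "finance": 8,
    "education": 10, "grants": 8, "legal": 8, "professional": 9,
    "journalism": 6, "government": 6, "marketing": 6, "creative": 6,
    "hr_ops": 6, "general": 10,
}


def split_profile_id(profile_id: str) -> Tuple[str, str]:
    """Split 'domain.name' and check the domain against the table.

    Assumption: profile IDs always have the form 'domain.name'.
    """
    domain, sep, name = profile_id.partition(".")
    if not sep or not name:
        raise ValueError(f"expected 'domain.name', got {profile_id!r}")
    if domain not in DOMAIN_PROFILE_COUNTS:
        raise ValueError(f"unknown domain: {domain!r}")
    return domain, name
```

The per-domain counts in the table sum to 122, which matches the total quoted above.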
Each pack receives a Pack Quality Score (PQS) from 0-10:
| Dimension | Description |
|---|---|
| Evidence Coverage | How well are claims backed by evidence? |
| Claim Accuracy | Are claims accurate and not overreaching? |
| Reasoning Quality | Is reasoning MECE, logical, and gap-free? |
| Actionability | Can the reader take immediate action? |
| Uncertainty Honesty | Are uncertainties explicitly flagged? |
| Structure & Clarity | Is the document scannable and well-organized? |
| Counterargument Depth | Are opposing viewpoints substantive? |
Dimension weights are customized per profile. E.g., medical.rct puts 25% weight on evidence, while business.mckinsey_memo puts 25% on actionability.
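To make the weighting concrete, here is a minimal sketch that assumes PQS is a weighted average of the seven dimension scores. The function name, the dimension keys, and every weight except the 25% on evidence are illustrative assumptions. The actual scorer may combine dimensions differently.

```python
from typing import Dict

# The seven PQS dimensions named in the table above (keys are hypothetical).
DIMENSIONS = (
    "evidence_coverage",
    "claim_accuracy",
    "reasoning_quality",
    "actionability",
    "uncertainty_honesty",
    "structure_clarity",
    "counterargument_depth",
)


def weighted_pqs(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-dimension scores (0-10) into a single 0-10 score.

    Assumption: PQS is a weighted average; weights are normalized here.
    """
    missing = [d for d in DIMENSIONS if d not in scores or d not in weights]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    total_weight = sum(weights[d] for d in DIMENSIONS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[d] * weights[d] for d in DIMENSIONS) / total_weight


# Hypothetical weights: only the 25% on evidence comes from the text above.
EXAMPLE_WEIGHTS = {
    "evidence_coverage": 0.25,
    "claim_accuracy": 0.20,
    "reasoning_quality": 0.15,
    "actionability": 0.10,
    "uncertainty_honesty": 0.10,
    "structure_clarity": 0.10,
    "counterargument_depth": 0.10,
}
```

Under these weights, a pack that scores 10 on evidence coverage and 0 everywhere else would land at 2.5, which shows how much a single heavily weighted dimension can move the total.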
| Level | What It Checks |
|---|---|
| L1 | Structure compliance: all required sections present |
| L2 | Evidence traceability: claims linked to sources |
| L3 | PQS threshold: quality score meets profile's pass mark |
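One way to read the three levels in the table above is as successive checks, each applied only if the previous one passes. The sketch below assumes that cumulative ordering, which the table does not state. `Claim`, `Pack`, and `benchmark_level` are hypothetical types and names, not the real benchmark runner.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Claim:
    text: str
    sources: List[str] = field(default_factory=list)


@dataclass
class Pack:
    sections: List[str]
    claims: List[Claim]
    pqs: float


def benchmark_level(pack: Pack, required_sections: List[str], pass_mark: float) -> int:
    """Return the highest level passed, checked in order (0 means L1 failed)."""
    # L1: every required section is present.
    if not all(name in pack.sections for name in required_sections):
        return 0
    # L2: every claim links to at least one source.
    if not all(claim.sources for claim in pack.claims):
        return 1
    # L3: the quality score meets the profile's pass mark.
    if pack.pqs < pass_mark:
        return 2
    return 3
```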
# Run benchmark on existing packs
ideaclaw benchmark --dir artifacts/

Phase A: Idea Scoping
  1. IDEA_INIT
  2. IDEA_DECOMPOSE

Phase B: Source Discovery
  3. SEARCH_STRATEGY
  4. SOURCE_COLLECT
  5. SOURCE_SCREEN [gate]
  6. EVIDENCE_EXTRACT

Phase C: Evidence Verification
  7. EVIDENCE_GATE [gate]
  8. CLAIM_VERIFY

Phase D: Synthesis & Decision
  9. EVIDENCE_SYNTHESIS
  10. DECISION_TREE
  11. COUNTERARGUMENT_GEN

Phase E: Pack Assembly
  12. PACK_OUTLINE
  13. PACK_DRAFT
  14. TRUST_REVIEW [gate]

Phase F: Export & Archive
  15. EXPORT_PUBLISH
Gate stages (5, 7, 14) pause for human approval or auto-approve with --auto-approve.
Decision loops:
- Stage 10 → insufficient evidence → back to Stage 3 (re-search)
- Stage 14 → trust review fails → back to Stage 13 (revise draft)
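The gates and decision loops above can be sketched as a small control loop. This is an illustration of the described flow, not the project's pipeline runner: `run_pipeline`, `stage_ok`, `approve`, and the `max_loops` retry cap are all hypothetical.

```python
from typing import Callable, Dict, List

# Gate stages and decision loops as listed above.
GATE_STAGES = {5, 7, 14}
LOOP_BACK = {10: 3, 14: 13}  # failing stage -> stage to resume from


def run_pipeline(
    stage_ok: Callable[[int], bool],
    auto_approve: bool = False,
    approve: Callable[[int], bool] = lambda stage: True,
    max_loops: int = 3,  # assumption: cap retries so a loop cannot run forever
) -> List[int]:
    """Walk stages 1..15 and return the order in which stages ran."""
    history: List[int] = []
    loops: Dict[int, int] = {}
    stage = 1
    while stage <= 15:
        history.append(stage)
        ok = stage_ok(stage)
        if not ok and stage in LOOP_BACK:
            # Decision loop: jump back and redo the earlier stages.
            loops[stage] = loops.get(stage, 0) + 1
            if loops[stage] > max_loops:
                raise RuntimeError(f"stage {stage} kept failing")
            stage = LOOP_BACK[stage]
            continue
        if not ok:
            raise RuntimeError(f"stage {stage} failed")
        # Gate: ask for human approval unless auto-approve is on.
        if stage in GATE_STAGES and not auto_approve and not approve(stage):
            raise RuntimeError(f"stage {stage} rejected at gate")
        stage += 1
    return history
```

If Stage 10 fails once and then succeeds, the returned history contains stages 1 through 10 followed by stages 3 through 15, so the re-search path repeats source discovery and verification before synthesis is retried.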
# Run with auto-detect profile
ideaclaw run --idea "Compare React vs Vue" --auto-approve
# Specify profile explicitly
ideaclaw run --idea "NIH R01 grant proposal" --profile grants.nih_r01
# Authentication
ideaclaw login # Interactive login (6 providers)
ideaclaw whoami # Show current auth status
ideaclaw logout # Remove stored credentials
# Quality profiles
ideaclaw profiles # List all 122 profiles
ideaclaw profiles --domain cs_ml # Filter by domain
# Benchmark
ideaclaw benchmark --dir artifacts/
# Resume from checkpoint
ideaclaw resume --run-id ic-20260318-091500-abc123
# With custom config
ideaclaw run --config config.ideaclaw.yaml --idea "Your idea"

| Provider | Setup |
|---|---|
| OpenAI | export OPENAI_API_KEY=sk-... or ideaclaw login |
| Anthropic | export ANTHROPIC_API_KEY=sk-ant-... |
| DeepSeek | export DEEPSEEK_API_KEY=sk-... |
| OpenRouter | export OPENROUTER_API_KEY=sk-or-... |
| Groq | export GROQ_API_KEY=gsk_... |
| Together | export TOGETHER_API_KEY=... |
| Custom | Set base_url in config or via ideaclaw login |
See config.ideaclaw.example.yaml for full configuration.
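For scripts that wrap the CLI, it can help to check which provider key is set before starting a run. The sketch below uses only the environment variable names from the table above. The precedence order and the `detect_provider` helper are assumptions for illustration, not IdeaClaw's own resolution logic.

```python
import os
from typing import Mapping, Optional, Tuple

# Environment variable names taken from the provider table above.
PROVIDER_ENV_VARS = (
    ("openai", "OPENAI_API_KEY"),
    ("anthropic", "ANTHROPIC_API_KEY"),
    ("deepseek", "DEEPSEEK_API_KEY"),
    ("openrouter", "OPENROUTER_API_KEY"),
    ("groq", "GROQ_API_KEY"),
    ("together", "TOGETHER_API_KEY"),
)


def detect_provider(
    env: Optional[Mapping[str, str]] = None,
) -> Optional[Tuple[str, str]]:
    """Return (provider, env var name) for the first key that is set, else None.

    Assumption: providers are tried in table order.
    """
    env = os.environ if env is None else env
    for provider, var in PROVIDER_ENV_VARS:
        if env.get(var, "").strip():
            return provider, var
    return None
```

Passing a plain dictionary as `env` keeps the check testable without touching the real environment.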
ideaClaw/
├── ideaclaw/
│   ├── cli.py                  # CLI entry point (8 subcommands)
│   ├── config.py               # YAML config loader
│   ├── prompts.py              # Prompt engine
│   ├── pipeline/               # 15-stage pipeline runner
│   ├── llm/                    # LLM client + auth (BYOK + OAuth)
│   ├── evidence/               # Evidence extraction & verification
│   ├── quality/                # Quality profile system
│   │   ├── loader.py           # YAML inheritance + auto-detect
│   │   ├── scorer.py           # 7-dim PQS scorer
│   │   ├── reviewer.py         # Structural review
│   │   ├── benchmark.py        # L1/L2/L3 benchmark runner
│   │   └── profiles/           # 136 YAML profile files
│   │       ├── cs_ml/          # 9 profiles
│   │       ├── medical/        # 12 profiles
│   │       ├── business/       # 10 profiles
│   │       └── ...             # 12 more domains
│   ├── pack/                   # Pack assembly + templates
│   │   ├── builder.py          # Jinja2 template rendering
│   │   ├── trust_review.py     # Profile-aware trust review
│   │   └── templates/          # 6 Jinja2 templates
│   ├── export/                 # Markdown, DOCX, JSON export
│   └── knowledge/              # Knowledge base archiver
├── .claude/skills/             # Claude Code skill entry
├── AGENTS.md                   # Codex (OpenAI) entry
├── CLAUDE.md                   # Universal agent entry
└── config.ideaclaw.example.yaml
Inspired by:
- AutoResearchClaw (aiming-lab): pipeline architecture & agent skill pattern
- autoresearch (Karpathy): minimalist agent philosophy
- OpenRevise (StartripAI): evidence gate engine

MIT. See LICENSE for details.