JihongZ/AutoPsychDx

🧠 AutoPsychDx

One command. Three methods. Automatic clinical diagnosis from item-level response data.

You provide the scale. The agent runs cut-off, IRT, and DCM — and writes the report.

Given item-level response data from any psychological scale, the agent applies three complementary diagnostic methods and generates a structured markdown report with prevalence estimates, method comparisons, and plain-language clinical interpretation.

For details of the data analysis, see the OSF preprint "AutoPsychDx: An LLM Agent Framework for Automated Psychometric Diagnosis Using Multi-Method Classification" (Zhang, 2026).


🤔 Why This Tool?

Psychometric diagnosis requires expertise across multiple frameworks. Researchers must manually choose between sum-score cut-offs, IRT, and DCM — each with different software, assumptions, and outputs. Reconciling disagreements between methods takes additional effort, and generating readable reports requires even more.

What if an LLM agent could do all of this automatically?

  • 🚀 Runs all three methods — cut-off, IRT, and DCM in a single command
  • 📋 Validates inputs automatically — checks item IDs, response ranges, and missing data
  • 💬 Flags method disagreements — highlights ambiguous cases where methods diverge
  • 📊 Generates a full report — prevalence tables, item content, and clinical interpretation
  • 🔄 Works with any scale — instrument-agnostic, configured via a simple items.csv

✨ The Result?

You set up the project folder. The agent diagnoses, interprets, and reports.


📐 Diagnostic Methods

📏 Sum Score Cut-off

Sums item responses per person and compares to a validated clinical threshold.

Key parameter: cutoff (from items.csv)

Best for: Quick screening when a published cut-off exists (e.g. PHQ-9 ≥ 10, PCL-5 ≥ 33)
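
As a rough illustration of the cut-off method, the sketch below sums each person's item responses and flags totals at or above the threshold. It is written in Python for brevity (the pipeline itself generates R). The function name `cutoff_diagnosis` is hypothetical, and the "at or above" comparison is an assumption based on the PHQ-9 ≥ 10 example, not a description of the tool's code.

```python
def cutoff_diagnosis(responses, cutoff):
    """Return one (total, flag) pair per person; flag is 1 when total >= cutoff."""
    results = []
    for person in responses:
        total = sum(person)
        results.append((total, int(total >= cutoff)))
    return results


# Two made-up respondents on a 9-item scale scored 0-3 (PHQ-9 style).
sample = [
    [0, 1, 0, 2, 1, 0, 1, 0, 0],  # total 5
    [2, 2, 1, 3, 2, 1, 2, 1, 0],  # total 14
]
print(cutoff_diagnosis(sample, cutoff=10))  # [(5, 0), (14, 1)]
```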

📈 Item Response Theory

Fits a Graded Response Model using mirt. Estimates latent trait θ with standard errors per person.

Key parameter: theta_cutoff (default: 0)

Best for: Scales with items of varying quality; when measurement uncertainty matters
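
To make the Graded Response Model more concrete, the sketch below computes category probabilities for a single item in the classical discrimination/threshold form, where P(X ≥ k | θ) = 1 / (1 + exp(−a(θ − b_k))). This is only an illustration in Python: the parameter values are made up, the function name `grm_category_probs` is hypothetical, and mirt itself reports GRM parameters in a slope-intercept form rather than this one.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under the graded response model.

    theta: latent trait value; a: discrimination;
    thresholds: increasing b_1..b_{K-1} for an item with K categories.
    Returns K probabilities that sum to 1.
    """
    # Cumulative P(X >= k) for k = 1..K-1, bracketed by
    # P(X >= 0) = 1 and P(X >= K) = 0.
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # P(X = k) is the difference between adjacent cumulative probabilities.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.0])
print([round(p, 3) for p in probs])  # [0.182, 0.318, 0.318, 0.182]
```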

🔬 Diagnostic Classification

Fits a general diagnostic model (GDM) using CDM::gdm. Returns the posterior class membership probability per person.

Key parameter: prob_cutoff (default: 0.5)

Best for: When the construct is naturally categorical (present/absent)

A person receives a consensus diagnosis if at least 2 of the 3 methods flag them. Persons flagged by only 1 method are marked ambiguous in the report.
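
The consensus rule can be sketched in a few lines of Python. The function name `consensus` and the returned dictionary layout are hypothetical; only the 2-of-3 rule and the "exactly one flag is ambiguous" rule come from the description above.

```python
def consensus(dx_cutoff, dx_irt, dx_dcm):
    """Combine three 0/1 method flags into a consensus decision."""
    votes = dx_cutoff + dx_irt + dx_dcm
    return {
        "votes": votes,
        "diagnosed": int(votes >= 2),  # at least 2 of 3 methods flag the person
        "ambiguous": votes == 1,       # exactly one method flags the person
    }


print(consensus(1, 1, 0))  # {'votes': 2, 'diagnosed': 1, 'ambiguous': False}
print(consensus(0, 0, 1))  # {'votes': 1, 'diagnosed': 0, 'ambiguous': True}
```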


⚡ Quick Start

Requirements: Claude Code, tmux, R ≥ 4.0, Python ≥ 3.10.

Step 1 — Install pipx

Platform | Command
--- | ---
macOS | brew install pipx && pipx ensurepath
Linux | python3 -m pip install --user pipx && python3 -m pipx ensurepath
Windows | python -m pip install --user pipx && python -m pipx ensurepath

Step 2 — Install R

Download from https://cran.r-project.org or:

Platform | Command
--- | ---
macOS | brew install r
Linux (Debian/Ubuntu) | sudo apt install r-base
Windows | Installer from CRAN

Then install R packages:

install.packages(c("mirt", "CDM"))

Step 3 — Install the diagnosis command

git clone https://github.com/JihongZ/AutoPsychDx
cd AutoPsychDx
pipx install -e .   # use pipx, not pip

Verify:

diagnosis --help

🖥️ CLI Reference

Command | Description
--- | ---
diagnosis compile <folder> | Generate items.csv from responses.csv
diagnosis run <folder> | Run the full diagnosis pipeline (spinner, blocks until done)
diagnosis run <folder> --clear | Delete Output/ then re-run
diagnosis clean <folder> | Remove generated Output/ directory
diagnosis clean <folder> --all | Also remove items.csv (responses.csv is never deleted)
diagnosis attach <name> | Attach to a running tmux session to watch live output
diagnosis ls | List all active diagnosis sessions
diagnosis kill <name> | Stop a running session
diagnosis version | Show installed version

<folder> is the path to your project folder (e.g. Projects/PTSD_Forbes2018). <name> is the folder name only (e.g. PTSD_Forbes2018).

Both compile and run launch the agent in a background tmux session and show a spinner until it finishes. Use diagnosis attach <name> at any time to watch the agent output live.


🛠️ How to Use

The minimum setup is a single file: responses.csv (cleaned item response data). Run diagnosis compile to generate items.csv automatically — this adds item metadata that produces a richer, more interpretable report.

Step 1 — Prepare responses.csv

A CSV where each row is a person and each column is an item (column names = item IDs):

GAD1,GAD2,GAD3,GAD4,GAD5,GAD6,GAD7
0,1,0,2,1,0,1
1,2,1,3,2,1,2
...

Place this file directly in your project folder. If you need to extract items from a larger dataset, write a prepare_responses.R script (see below).

Step 2 — Generate items.csv (three options)

items.csv adds item metadata (full wording, scale name, validated cut-off) that makes the report more meaningful. There are three ways to create it — choose whichever fits your workflow:


Option 1 — diagnosis compile (quickest)

diagnosis compile Projects/your_study

The agent reads responses.csv, infers scale name, response range, and cut-off from the data, and writes items.csv automatically. Review the output — inferred values (especially scale name and cut-off) may need manual correction.


Option 2 — Script (prepare_responses.R or Python)

Write a script that builds both responses.csv and items.csv from your raw dataset. Run it once before diagnosis run. Best when raw data is messy (mixed demographic columns, wide format, or downloaded from an external source).

R — local file

# prepare_responses.R — extract GAD-7 items from a local wide-format dataset

# 1. Load raw data
raw <- read.csv("raw_data.csv")   # e.g. 500 rows × 80 columns (demographics + items)

# 2. Define items
item_ids   <- c("GAD1", "GAD2", "GAD3", "GAD4", "GAD5", "GAD6", "GAD7")
item_texts <- c(
  "Feeling nervous, anxious, or on edge",
  "Not being able to stop or control worrying",
  "Worrying too much about different things",
  "Trouble relaxing",
  "Being so restless that it's hard to sit still",
  "Becoming easily annoyed or irritable",
  "Feeling afraid as if something awful might happen"
)

# 3. Extract and write responses.csv
responses <- raw[, item_ids]
write.csv(responses, "responses.csv", row.names = FALSE)
message("Saved responses.csv: ", nrow(responses), " persons x ", ncol(responses), " items")

# 4. Write items.csv
items <- data.frame(
  item_id      = item_ids,
  item_text    = item_texts,
  scale        = "GAD-7",
  cutoff       = 10,
  response_min = 0,
  response_max = 3
)
write.csv(items, "items.csv", row.names = FALSE)
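message("Saved items.csv: ", nrow(items), " items")

Run the script from inside the project folder so that its relative paths (raw_data.csv, responses.csv, items.csv) resolve there:

(cd Projects/your_study && Rscript prepare_responses.R)

R — download from the internet (e.g. OSF)

# prepare_responses.R — download PHQ-9 data from OSF and extract items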

library(osfr)   # install.packages("osfr") if needed

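# 1. Download raw data
# "abc123" is a placeholder ID; the OSF file is assumed to be named raw_data.csv.
# The native pipe (|>) needs R >= 4.1, so plain calls are used to stay compatible with R 4.0.
osf_file <- osf_retrieve_file("https://osf.io/abc123")
osf_download(osf_file, path = ".", conflicts = "overwrite")
raw <- read.csv("raw_data.csv")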

# 2. Define items
item_ids   <- paste0("PHQ", 1:9)
item_texts <- c(
  "Little interest or pleasure in doing things",
  "Feeling down, depressed, or hopeless",
  "Trouble falling or staying asleep, or sleeping too much",
  "Feeling tired or having little energy",
  "Poor appetite or overeating",
  "Feeling bad about yourself",
  "Trouble concentrating on things",
  "Moving or speaking slowly / being fidgety or restless",
  "Thoughts that you would be better off dead"
)

# 3. Extract and write responses.csv
responses <- raw[, item_ids]
write.csv(responses, "responses.csv", row.names = FALSE)
message("Saved responses.csv: ", nrow(responses), " persons x ", ncol(responses), " items")

# 4. Write items.csv
items <- data.frame(
  item_id      = item_ids,
  item_text    = item_texts,
  scale        = "PHQ-9",
  cutoff       = 10,
  response_min = 0,
  response_max = 3
)
write.csv(items, "items.csv", row.names = FALSE)
message("Saved items.csv: ", nrow(items), " items")

Run the script from inside the project folder so that its relative paths resolve there:

(cd Projects/your_study && Rscript prepare_responses.R)
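
Option 2 also allows a Python script. The sketch below mirrors the local-file R example using only the standard library. It is an illustration under stated assumptions: the function name `prepare`, its parameters, and the file name raw_data.csv are hypothetical, and the item wording is copied from the GAD-7 example above.

```python
import csv
from pathlib import Path

ITEM_IDS = [f"GAD{i}" for i in range(1, 8)]
ITEM_TEXTS = [
    "Feeling nervous, anxious, or on edge",
    "Not being able to stop or control worrying",
    "Worrying too much about different things",
    "Trouble relaxing",
    "Being so restless that it's hard to sit still",
    "Becoming easily annoyed or irritable",
    "Feeling afraid as if something awful might happen",
]


def prepare(raw_path, out_dir, scale="GAD-7", cutoff=10,
            response_min=0, response_max=3):
    """Write responses.csv and items.csv into out_dir; return the number of persons."""
    out_dir = Path(out_dir)

    # 1. Load raw data and check that every item column is present.
    with open(raw_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        missing = [c for c in ITEM_IDS if c not in (reader.fieldnames or [])]
        if missing:
            raise ValueError(f"raw data is missing item columns: {missing}")
        rows = list(reader)

    # 2. Keep only the item columns (demographics and other columns are dropped).
    with open(out_dir / "responses.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=ITEM_IDS, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)

    # 3. Write one metadata row per item.
    with open(out_dir / "items.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["item_id", "item_text", "scale", "cutoff",
                         "response_min", "response_max"])
        for item_id, text in zip(ITEM_IDS, ITEM_TEXTS):
            writer.writerow([item_id, text, scale, cutoff, response_min, response_max])

    return len(rows)
```

A call such as `prepare("Projects/your_study/raw_data.csv", "Projects/your_study")` would write both files into the project folder. Passing the output directory explicitly avoids depending on the current working directory.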

Option 3 — Manual spreadsheet (most control)

Create items.csv directly in Excel, Google Sheets, or any spreadsheet editor. Save as CSV and place it in the project folder.

Required columns:

Column | Description
--- | ---
item_id | Unique ID matching column names in responses.csv (e.g. PCL1)
item_text | Full item wording as shown to respondents
scale | Scale name used to group items and label outputs (e.g. PCL-5)
cutoff | Validated sum-score cut-off for this scale (repeat for all rows in the scale)
response_min | Minimum response value (e.g. 0)
response_max | Maximum response value (e.g. 4)

Example:

item_id,item_text,scale,cutoff,response_min,response_max
PCL1,Repeated disturbing and unwanted memories of the stressful experience,PCL-5,33,0,4
PCL2,Repeated disturbing dreams of the stressful experience,PCL-5,33,0,4
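
Before running the pipeline, a hand-written items.csv can be checked against responses.csv. The sketch below is not the agent's own validation logic. It is a hypothetical standalone check (function name `validate`, treatment of empty cells and "NA" as missing are assumptions) covering the three things the tool is described as checking: item IDs, response ranges, and missing data.

```python
import csv

REQUIRED_COLUMNS = ["item_id", "item_text", "scale", "cutoff",
                    "response_min", "response_max"]


def validate(items_path, responses_path):
    """Return a list of human-readable problems (empty when none are found)."""
    problems = []

    with open(items_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        missing_cols = [c for c in REQUIRED_COLUMNS
                        if c not in (reader.fieldnames or [])]
        if missing_cols:
            return [f"items.csv is missing columns: {missing_cols}"]
        items = {row["item_id"]: row for row in reader}

    with open(responses_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        columns = reader.fieldnames or []

        # Every item_id must appear as a column in responses.csv.
        for item_id in items:
            if item_id not in columns:
                problems.append(f"{item_id} is in items.csv but not in responses.csv")

        # Line 1 is the header, so data rows start at line 2.
        for line_no, row in enumerate(reader, start=2):
            for item_id, meta in items.items():
                if item_id not in columns:
                    continue
                value = row[item_id]
                if value is None or value.strip() in ("", "NA"):
                    problems.append(f"row {line_no}: {item_id} is missing")
                    continue
                try:
                    number = float(value)
                except ValueError:
                    problems.append(f"row {line_no}: {item_id}={value!r} is not numeric")
                    continue
                low, high = float(meta["response_min"]), float(meta["response_max"])
                if not low <= number <= high:
                    problems.append(
                        f"row {line_no}: {item_id}={value} is outside {low:g}-{high:g}")

    return problems
```

Calling `validate("items.csv", "responses.csv")` from the project folder returns an empty list when the two files are consistent.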

Step 3 — Run diagnosis

diagnosis run Projects/your_study

🏗️ How It Works

  responses.csv
       │
       ▼
  diagnosis compile <folder>          ← infers metadata, writes items.csv
       │
       ▼
  items.csv  +  responses.csv
       │
       ▼
  diagnosis run <folder>
       │
       ├─── Method A: Sum Score Cut-off ──► dx_cutoff (0/1)
       │
       ├─── Method B: IRT (Graded Response Model) ──► dx_irt (0/1) + θ ± SE
       │
       └─── Method C: DCM (CDM::gdm) ──► dx_dcm (0/1) + P(diagnosed)
                      │
                      ▼
          Consensus: diagnosed if ≥ 2/3 methods agree
                      │
                      ▼
       Output/
         [scale]_diagnosis.R           ← generated R script
         [scale]_diagnosis_results.csv ← person-level results
         [scale]_diagnosis_output.txt  ← raw report text
         diagnosis_report.md           ← full report with interpretation

Both commands run inside a tmux session in the background and block with a spinner until done. Use diagnosis attach <name> to watch live output at any time. Skill files in diagnosis/skills/ define the agent workflow — edit them to change behaviour for all projects.
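
Once a run has finished, the person-level results file can be summarised outside the agent. The sketch below computes the proportion of persons flagged by each method. It assumes the results CSV contains 0/1 columns named dx_cutoff, dx_irt, and dx_dcm, as the labels in the diagram suggest; the actual column names are not confirmed here, and the function name `method_prevalence` is hypothetical.

```python
import csv


def method_prevalence(results_path, columns=("dx_cutoff", "dx_irt", "dx_dcm")):
    """Return the proportion of persons flagged (value 1) for each method column."""
    with open(results_path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    if not rows:
        return {}
    # float() first so that values written as "1.0" are also accepted.
    return {
        col: sum(int(float(row[col])) for row in rows) / len(rows)
        for col in columns
    }
```

If the column names in your output differ, pass them through the `columns` argument.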


📊 Example: Depression Screening (Forbes 2018)

Projects/PTSD_Forbes2018/ demonstrates the workflow using publicly available PHQ-9 data from Forbes et al. (2018).

diagnosis run Projects/PTSD_Forbes2018

prepare_responses.R downloads the data from OSF automatically on first run. All output is written to Projects/PTSD_Forbes2018/Output/.

Agent running in tmux

[Screenshot: tmux session showing agent output and the PHQ-9 report]

Generated diagnosis_report.md

[Screenshot: diagnosis_report.md showing PHQ-9 results for the Forbes 2018 sample]


📊 Example: Anxiety Screening (Forbes 2018)

Projects/Anxiety_GAD_Forbes2018/ demonstrates the two-command workflow using publicly available GAD-7 data from Forbes et al. (2018). responses.csv is placed directly in the project folder — no prepare_responses.R needed.

Step 1 — Generate items.csv

❯ diagnosis compile Projects/Anxiety_GAD_Forbes2018
Found responses.csv — using it as raw data.
╭─────────────────────────── Generate items.csv ───────────────────────────╮
│ Project: .../Projects/Anxiety_GAD_Forbes2018                              │
│ Raw data: responses.csv                                                   │
╰───────────────────────────────────────────────────────────────────────────╯
  Session: diagnosis-anxiety-gad-forbes2018-compile
  Attach:  tmux attach -t diagnosis-anxiety-gad-forbes2018-compile
  Kill:    diagnosis kill Anxiety_GAD_Forbes2018

  ✓  Agent finished.
items.csv created.

Step 2 — Run diagnosis

❯ diagnosis run Projects/Anxiety_GAD_Forbes2018
╭──────────────────────── Psychometric Diagnosis Agent ────────────────────╮
│ Project: .../Projects/Anxiety_GAD_Forbes2018                              │
│ Version: 0.1.1                                                            │
╰───────────────────────────────────────────────────────────────────────────╯
  Session: diagnosis-anxiety-gad-forbes2018
  Attach:  tmux attach -t diagnosis-anxiety-gad-forbes2018
  Kill:    diagnosis kill Anxiety_GAD_Forbes2018

  ✓  Agent finished.

All output is written to Projects/Anxiety_GAD_Forbes2018/Output/.

Note

Current Limitations

  • Unidimensional only: All three methods currently assume a single latent construct. Multidimensional scales (e.g., instruments with subscales measuring distinct attributes) are not yet supported.
  • DCM: Only the general diagnostic model (CDM::gdm) is supported. LCDM, GDINA, DINA, and DINO are not yet implemented.
  • IRT: Limited to the Graded Response Model. Requires complete responses — handle missing data in prepare_responses.R before running.

📝 Changelog

v0.1.1

  • diagnosis compile — new command that generates items.csv automatically from responses.csv. The agent infers item IDs, scale name, response range, and cut-off from the data without manual setup.
  • diagnosis clean — new command to remove generated outputs. Deletes Output/ by default; --all also removes items.csv. responses.csv is never touched.
  • responses.csv as direct input — drop a response matrix into the project folder and run diagnosis compile directly. prepare_responses.R is now optional (only needed when extracting columns from a larger dataset).
  • Auto-exit tmux sessions — agent sessions now close automatically when the agent finishes. No keypress required.
  • Consistent CLI — both compile and run run in the background with a spinner and block until done. Use diagnosis attach <name> to watch live output at any time.

v0.1.0

  • Initial release: diagnosis run with cut-off, IRT (GRM), and DCM (GDM) methods
  • Consensus diagnosis (≥ 2/3 methods agree)
  • Structured markdown report with prevalence, method comparison, and clinical interpretation
  • tmux-based agent sessions with attach/kill/list commands

🗺️ Roadmap

v0.2: Multidimensional DCMs
  • User-specified Q-matrix in items.csv (item × attribute mapping)
  • LCDM via CDM::gdm with multi-attribute Q-matrix
  • GDINA via GDINA::GDINA for flexible item–attribute interactions
  • DINA / DINO as constrained special cases
  • Attribute-level diagnostic profiles per person

v0.3: Multidimensional IRT
  • MIRT (multidimensional GRM) via mirt with exploratory or confirmatory specification
  • Subscale-level θ estimates with SEs
  • Per-subscale cut-off and consensus diagnosis

v0.4: Additional IRT models
  • 2PL for binary items
  • GPCM / PCM for alternative polytomous models
  • Automatic model selection based on item format

v0.5: Missing data & robustness
  • Full-information maximum likelihood (FIML) for IRT
  • Multiple imputation support
  • Longitudinal multi-timepoint comparison

Future: Multi-backend & validation
  • Support for additional LLM backends (OpenAI, Gemini)
  • External validation against structured clinical interviews
  • Automated sensitivity analysis across cut-off thresholds

📖 Acknowledgements

  • Forbes et al. (2018) — PHQ-9 and GAD-7 community sample dataset used in the example project
  • mirt — R package for Item Response Theory (IRT) models
  • CDM — R package for Cognitive Diagnosis Models (DCM)
  • ClawTeam — architectural inspiration for the tmux-based agent CLI
  • Claude Code — LLM agent runtime

📄 License

MIT License — free to use, modify, and distribute. See LICENSE.


🗂️ Project Structure

.
├── diagnosis/                           # Python package (pipx install -e .)
│   ├── cli.py                           # Typer CLI: compile / run / clean / attach / kill / ls / version
│   ├── tmux.py                          # tmux session management
│   ├── skill_loader.py                  # loads bundled skill files
│   ├── __init__.py
│   ├── __main__.py
│   └── skills/
│       ├── diagnosis.md                 # agent workflow definition
│       ├── generate-items.md            # items.csv generation workflow
│       └── psychometric-diagnosis.md   # R function definitions
├── pyproject.toml                       # package metadata and entry point
├── .claude/
│   └── commands/                        # same skills as Claude Code slash commands
├── Projects/
│   └── PTSD_Forbes2018/                 # example project
│       ├── items.csv                    # item metadata
│       ├── prepare_responses.R          # downloads OSF data → responses.csv
│       └── Output/                      # auto-generated (git-ignored)
├── Screenshots/
│   ├── Diagnosis_PTSD.png
│   └── Diagnosis_Report.png
└── README.md

AutoPsychDx

Cut-off · IRT · DCM · One Command

If you find this project useful, please consider giving it a ⭐
