MHC-TP

Cluster immunopeptidomics peptides by their HLA/MHC binding motif and get a ranked table plus a standalone, interactive HTML report.

mhc-tp takes a GibbsCluster output folder, correlates each cluster's position-specific scoring matrix against a reference of HLA/MHC class I + II binding motifs (human & mouse), and reports the best allele match per cluster.

Tip

📖 Documentation · 🔬 Live example report · 📦 PyPI

Quick start

pip install mhc-tp
mhc-tp fetch -s human                                  # download reference motifs (once)
mhc-tp search <gibbscluster_output_dir> -s human -o results/

Open results/clust_result/mhc-tp-result.html in any browser — see what it looks like in the live example report.

Note

Requirements: Python 3.9–3.11. A virtual environment is recommended (python -m venv .venv && source .venv/bin/activate).

Install

From PyPI (recommended):

pip install mhc-tp

Or install editable from source, so that git pull updates the tool:

git clone https://github.com/PurcellLab/MHC-TP.git
cd MHC-TP
pip install -e .

One-liner without cloning: pip install "git+https://github.com/PurcellLab/MHC-TP.git"

Download the reference data (once)

The reference motifs are fetched from the GitHub release, not bundled:

mhc-tp fetch -s human     # or:  mouse  |  all

Run a search

mhc-tp search <gibbscluster_output_dir> -s human -o results/

<gibbscluster_output_dir> is a GibbsCluster run folder (it must contain a matrices/ subdirectory).
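A quick pre-flight check can catch a wrong path before the search starts. The sketch below tests only for the matrices/ subdirectory mentioned above; the helper name looks_like_gibbscluster_run is hypothetical and not part of mhc-tp, and it does not validate the matrix files themselves.

```python
from pathlib import Path


def looks_like_gibbscluster_run(run_dir):
    """Return True if run_dir contains a matrices/ subdirectory.

    Hypothetical helper, not part of mhc-tp: it checks only the one
    requirement stated in this README and nothing about the file contents.
    """
    return (Path(run_dir) / "matrices").is_dir()


# Example: check the current directory before calling `mhc-tp search`.
print(looks_like_gibbscluster_run("."))
```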

Outputs land in results/clust_result/:

| file | what it is |
| --- | --- |
| `correlations.csv` | every cluster→allele match (`hla` = display name, `formatted` = raw key, `correlation` = PCC) |
| `mhc-tp-result.html` | standalone interactive report; open it in any browser |
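The CSV can be post-processed with any tool. Below is a minimal sketch using Python's standard csv module. It assumes only the three columns named above (hla, formatted, correlation); the real file may contain further columns that the sketch ignores. The function name strong_matches and the sample rows, including the raw-key spelling, are made up for illustration.

```python
import csv
import io


def strong_matches(csv_file, min_pcc=0.70):
    """Return (hla, correlation) pairs with correlation >= min_pcc, best first."""
    rows = csv.DictReader(csv_file)
    hits = [(r["hla"], float(r["correlation"])) for r in rows]
    hits = [h for h in hits if h[1] >= min_pcc]
    return sorted(hits, key=lambda h: h[1], reverse=True)


# Made-up rows standing in for results/clust_result/correlations.csv
sample = io.StringIO(
    "hla,formatted,correlation\n"
    "HLA-A*02:01,HLA_A0201,0.91\n"
    "HLA-B*07:02,HLA_B0702,0.65\n"
    "HLA-A*02:06,HLA_A0206,0.84\n"
)
print(strong_matches(sample))
```

To run it on a real result, open the file with open("results/clust_result/correlations.csv", newline="") and pass the file handle instead of the sample.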

Options

| flag | meaning | default |
| --- | --- | --- |
| `-s, --species` | `human` or `mouse` | `human` |
| `-c, --class` | restrict the reference to MHC class `I`, `II`, or `all` | `all` |
| `-r, --reference` | path to a `<species>.parquet` (otherwise the fetched one is used) | auto |
| `-t, --threshold` | minimum Pearson correlation (PCC) to report | `0.70` |
| `--topNHits` | allele matches to keep per cluster | `3` |
| `--always-top-n` | keep each cluster's top-N even below threshold (flagged in the report) | off |
| `-o, --output` | output directory | `output` |
| `--threads` | max CPU threads (also `$MHC_TP_THREADS`) | `4` |
| `--no-html` | write only the CSV | off |
| `-l, --log` | also save the coloured session log | off |

Run mhc-tp search --help for the full list.

Examples

# Class I only, keep the top 5 matches per cluster
mhc-tp search runs/sampleA -s human -c I --topNHits 5 -o results/

# Guarantee a top-3 for every cluster (weak matches tagged "below cutoff")
mhc-tp search runs/sampleA -s human --always-top-n -o results/

Important

By default a match must score ≥ --threshold, so a cluster can return fewer than --topNHits rows (or none). --always-top-n returns the best N regardless — the threshold then only annotates confidence and nothing is dropped.
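The interaction of --threshold, --topNHits, and --always-top-n can be illustrated in a few lines of Python. This is a sketch of the behaviour described above, not mhc-tp's actual code; select_hits, its arguments, and the PCC values are hypothetical.

```python
def select_hits(scores, threshold=0.70, top_n=3, always_top_n=False):
    """Pick the matches to report for one cluster.

    scores: dict mapping allele name -> PCC.
    Returns a list of (allele, pcc, below_cutoff) tuples, best first.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if always_top_n:
        # Keep the best N regardless; the threshold only flags weak matches.
        kept = ranked[:top_n]
    else:
        # Drop everything under the threshold, then cap at N.
        kept = [kv for kv in ranked if kv[1] >= threshold][:top_n]
    return [(allele, pcc, pcc < threshold) for allele, pcc in kept]


scores = {"A": 0.92, "B": 0.74, "C": 0.55, "D": 0.31}  # made-up PCC values
print(select_hits(scores))                     # two rows: A and B
print(select_hits(scores, always_top_n=True))  # three rows: A, B, and C (flagged)
```

With the default settings only A and B pass the 0.70 cutoff, so the cluster reports two rows even though three were requested. With always_top_n=True the third-best match C is kept and marked as below the cutoff.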

For contributors / developers

Dev environment, tests, and docs

The project uses pixi for a reproducible dev environment (Python 3.11) and a src/ layout packaged with hatchling.

git clone https://github.com/PurcellLab/MHC-TP.git
cd MHC-TP
pixi install            # create the dev env from pixi.lock
pixi run dev-install    # editable-install the package (run once)

pixi run test           # pytest
pixi run lint           # ruff
pixi run fmt            # black

Warning

Always run via pixi run …, because a bare python may pick up a different interpreter without the pinned dependencies. CI enforces black --check, so run pixi run fmt before pushing.

Preview the docs site

pip install -e ".[docs]"
mkdocs serve            # live preview at http://127.0.0.1:8000
mkdocs build            # static site in ./site

Rebuilding the reference data (dev only)

End users never do this. The per-species parquets are built once from the NetMHCpan / NetMHCIIpan packs and uploaded to the release. Embedding the Seq2Logo reference logos (--with-logos) needs a separate Python 2.7 env and is slow:

mhc-tp build-ref <species> <classI_pack> <classII_pack> <out.parquet> --with-logos --workers 16

Layout

src/mhc_tp/
  cli.py            entry point (mhc-tp)
  engine/           numba correlation search
  refdata/          reference parquet read/write, fetch, schema
  report/           HTML report rendering (data, logos, templates)
  db/               DEV-ONLY reference-pack ingestion
  tui/              Rich console banner, logging, results table
tests/              pytest suite
docs/               MkDocs site

How it works

For each GibbsCluster motif, every reference allotype motif is scored by the Pearson correlation of their flattened position-weight matrices, computed only over the informative cells of the cluster motif. Per cluster the allotypes are ranked by PCC (1.0 = identical motif shape). Full method and formula: the API reference.
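The scoring step can be sketched in NumPy. This is an illustration under stated assumptions, not the numba implementation in engine/. The text above does not say how "informative cells" are defined, so the sketch assumes a cell is informative when the absolute value of the cluster matrix entry exceeds a hypothetical min_abs cutoff. It also assumes both matrices have the same shape (positions × amino acids). The name masked_pcc is made up.

```python
import numpy as np


def masked_pcc(cluster_pwm, ref_pwm, min_abs=0.0):
    """Pearson correlation over the informative cells of cluster_pwm.

    ASSUMPTION: "informative" means |cluster value| > min_abs. The real
    criterion used by mhc-tp may differ.
    """
    c = np.asarray(cluster_pwm, dtype=float).ravel()  # flatten both matrices
    r = np.asarray(ref_pwm, dtype=float).ravel()
    if c.shape != r.shape:
        raise ValueError("matrices must have the same shape")
    mask = np.abs(c) > min_abs  # the mask comes from the cluster motif only
    c, r = c[mask], r[mask]
    if c.size < 2 or c.std() == 0 or r.std() == 0:
        return float("nan")  # correlation is undefined here
    return float(np.corrcoef(c, r)[0, 1])


rng = np.random.default_rng(0)
cluster = rng.normal(size=(9, 20))  # made-up 9-position motif
print(masked_pcc(cluster, cluster))          # same motif shape
print(masked_pcc(cluster, 2 * cluster + 1))  # PCC ignores scale and offset
```

Ranking a cluster against a reference then amounts to calling such a function once per allotype and sorting the results in descending order.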

Citation

If you use MHC-TP in your work, please cite:

Munday PR, Krishna SSG, Fehring J, Croft NP, Purcell AW, Li C, Braun A. Immunolyser 2.0: An advanced computational pipeline for comprehensive analysis of immunopeptidomic data. Comput Struct Biotechnol J. 2025;29:296–304. doi:10.1016/j.csbj.2025.10.007. PMID: 41209766; PMCID: PMC12590289.

BibTeX

@article{Munday2025Immunolyser2,
  title   = {Immunolyser 2.0: An advanced computational pipeline for comprehensive analysis of immunopeptidomic data},
  author  = {Munday, Prithvi Raj and Krishna, Sanjay S. G. and Fehring, Joshua and Croft, Nathan P. and Purcell, Anthony W. and Li, Chen and Braun, Asolina},
  journal = {Computational and Structural Biotechnology Journal},
  volume  = {29},
  pages   = {296--304},
  year    = {2025},
  doi     = {10.1016/j.csbj.2025.10.007},
  pmid    = {41209766},
  pmcid   = {PMC12590289}
}
