coppear

Co-appearance scoring for protein/marker panels across experimental runs.

coppear reads a CSV of assay selections grouped by run ID, builds a co-occurrence matrix, and scores every marker pair using two complementary metrics:

Eigenvector centrality — how central a pair is within the overall co-selection network (Perron–Frobenius leading eigenvector).
Log Stability-Weighted Score — log-ratio of observed co-selection frequency to the frequency expected by chance (analogous to pointwise mutual information).

Pairs that score high on both metrics tend to be consistently co-selected across runs and structurally central in the selection network — good candidates for stable, informative marker combinations.

Input format

A CSV file with at least two columns:

Column	Description
`runID`	Identifier for each experimental run or sample
`Assay`	Name of the marker/protein selected in that run

Each row is one marker selected in one run. Multiple rows with the same runID form a panel.

If the columns are not named runID and Assay, the script falls back to positional columns (index 1 and 2).

Installation

pip install numpy pandas matplotlib scipy

Usage

python coppear/coppear.py <input.csv> [--out-csv results.csv] [--out-plot scatter.png]

Argument	Default	Description
`csv`	(required)	Path to the input CSV file
`--out-csv`	`marker_pair_scores.csv`	Output CSV with all pair scores
`--out-plot`	`marker_scores_scatter.png`	Output scatter plot (PNG)

Example

python coppear/coppear.py data/selected_features_cancer_classification.csv \
    --out-csv /tmp/results.csv \
    --out-plot ~/Downloads/scatter.png

Output

CSV (`--out-csv`)

One row per co-appearing marker pair:

Column	Description
`marker_A`	First marker in the pair
`marker_B`	Second marker in the pair
`cooccurrence_count`	Number of runs in which both markers were selected together
`eig_centrality_pair`	Geometric mean of both markers' eigenvector centralities
`log_stability_score`	Log(observed co-selection / expected by chance)

Pairs that never co-appeared, or with undefined expected probability, are excluded.

Scatter plot (`--out-plot`)

A scatter of all co-appearing pairs with:

X-axis: Log Stability-Weighted Score (positive = more co-selected than chance)
Y-axis: Eigenvector Centrality (pair geometric mean)
Colour: Normalised co-occurrence count (plasma colormap)
Annotations: Top 5 pairs by combined rank labelled directly on the plot

Methods

Co-occurrence matrix

For each run, all pairs of markers in the same panel contribute +1 to the symmetric co-occurrence matrix C. Marginal selection probabilities π are estimated as the fraction of runs each marker appears in.

Eigenvector centrality

The leading eigenvector of C is computed via sparse power iteration (scipy.sparse.linalg.eigs). Entries are normalised to [0, 1]. The pair score is the geometric mean of the two marker centralities.

Log Stability-Weighted Score

log_stab(i, j) = log( P(i,j)_observed / (π_i × π_j) )

where P(i,j)_observed = C[i,j] / sum(C). Positive values mean the pair co-appears more often than independence would predict.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
coppear		coppear
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

coppear

Input format

Installation

Usage

Example

Output

CSV (`--out-csv`)

Scatter plot (`--out-plot`)

Methods

Co-occurrence matrix

Eigenvector centrality

Log Stability-Weighted Score

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

coppear

Input format

Installation

Usage

Example

Output

CSV (--out-csv)

Scatter plot (--out-plot)

Methods

Co-occurrence matrix

Eigenvector centrality

Log Stability-Weighted Score

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

CSV (`--out-csv`)

Scatter plot (`--out-plot`)

Packages