A FAIR Nextflow workflow and Streamlit app for integrating public HTS/qHTS assay data, psoriasis omics, proteomics, protein model features, structural evidence, and grounded evidence summaries for target and hit prioritization.
- GitHub repo: https://github.com/Caffeinated-Code/HTS_IL17_Psoriasis
- Published dashboard: https://caffeinated-code.github.io/HTS_IL17_Psoriasis/
- Local Streamlit app:
streamlit run app/streamlit_app.py
This is an employer-neutral portfolio project. It uses only public data and compact example tables, and it makes conservative claims: this is a screening-to-biology prioritization workflow, not a clinical efficacy model.
HTS_IL17_Psoriasis models a realistic early-discovery question:
If a pathway-proximal screening assay nominates compounds or targets around ROR gamma / Th17 biology, which candidates also have psoriasis disease evidence, proteomics support, cell-type context, and interpretable protein-level features?
The project is intentionally transparent. It favors auditable scoring, provenance, and documented limitations over black-box ranking.
| Start here | Markdown | Browser-friendly HTML |
|---|---|---|
| Analysis walkthrough | docs/analysis_walkthrough.md | analysis_walkthrough.html |
| Project plan | PROJECT_PLAN.md | project_plan.html |
| Scientific review and roadmap | DIRECTOR_REVIEW.md | director_review.html |
| HTS/qHTS primer | docs/hts_primer.md | hts_primer.html |
| IL-17 psoriasis primer | docs/il17_psoriasis_primer.md | il17_psoriasis_primer.html |
| Proteomics primer | docs/proteomics_primer.md | proteomics_primer.html |
| Protein language models primer | docs/protein_language_models_primer.md | protein_language_models_primer.html |
| Structure prediction primer | docs/structure_prediction_primer.md | structure_prediction_primer.html |
| Evidence summary primer | docs/llm_evidence_primer.md | llm_evidence_primer.html |
| FAIR Nextflow and AWS primer | docs/fair_nextflow_aws_primer.md | fair_nextflow_aws_primer.html |
Plaque psoriasis is a strong public-data case study because the IL-23 / Th17 / IL-17 axis is well established, skin biopsy datasets are available, and disease activity can be connected to immune activation, keratinocyte response, and inflammatory proteomics.
The HTS component uses public ROR gamma qHTS-style assay data as a pathway-proximal screening analog. ROR gamma t is upstream of Th17 differentiation and IL-17 production. This is not a direct IL-17 peptide screen, and the workflow calls out that limitation.
Requirements:
- Nextflow
- Python 3
- Streamlit, pandas, and plotly for the app
Run the local demo:
nextflow run main.nf -profile test
Launch the app after the workflow completes:
streamlit run app/streamlit_app.py
The app reads generated outputs from results/app_data/. If those files are missing, it falls back to bundled demo tables in demo_data/.
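The fallback between results/app_data/ and demo_data/ could be implemented roughly as follows. This is a minimal sketch, not the app's actual code: the function names `resolve_table` and `load_table` are hypothetical, and it assumes each table is a tab-separated file stored under the same file name in both directories.

```python
import csv
from pathlib import Path

# Directory names come from the README. The function names and the
# one-file-per-table layout are assumptions made for illustration.
PRIMARY_DIR = Path("results/app_data")
FALLBACK_DIR = Path("demo_data")


def resolve_table(name, primary=PRIMARY_DIR, fallback=FALLBACK_DIR):
    """Return the path to `name`, preferring workflow output over demo data."""
    for directory in (primary, fallback):
        candidate = directory / name
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"{name} not found in {primary} or {fallback}")


def load_table(name, primary=PRIMARY_DIR, fallback=FALLBACK_DIR):
    """Read a tab-separated table into a list of dict rows."""
    path = resolve_table(name, primary=primary, fallback=fallback)
    with path.open(newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

Resolving each table separately means a partially completed run still shows real outputs where they exist. The trade-off is that generated and demo tables can be mixed in one session, so an app built this way should label which source each table came from.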
flowchart LR
A["PubChem/ChEMBL HTS-style assay data"] --> B["Screen QC and counterscreen filtering"]
C["Psoriasis transcriptomics"] --> D["Disease omics evidence"]
E["Psoriasis proteomics"] --> F["Protein-level validation"]
G["Single-cell context"] --> H["Cell-type annotation"]
I["UniProt sequences"] --> J["ESM/ProtBERT-style features"]
K["AlphaFold/ESMFold optional structures"] --> L["Structure confidence features"]
B --> M["Transparent candidate ranking"]
D --> M
F --> M
H --> M
J --> M
L --> M
M --> N["Grounded evidence cards"]
N --> O["HTML report and Streamlit app"]
The dashboard and workflow should be read as a prioritization exercise, not an efficacy claim:
- Start with public qHTS-style screening evidence around ROR gamma / Th17 biology.
- Penalize assay artifacts using counterscreen logic.
- Add psoriasis disease transcriptomics to check tissue relevance.
- Add proteomics to test whether RNA-supported candidates also have protein-level support.
- Add single-cell context to identify relevant immune or skin-cell compartments.
- Add protein model and structure features as supporting interpretation layers.
- Generate grounded evidence cards that explain the ranking, limitations, and next experiment.
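The layered logic above can be illustrated with a small transparent scoring sketch. Everything numeric here is an assumption: the layer names, the weights, and the multiplicative counterscreen penalty are illustrative placeholders, not the values the workflow uses. The point is only that each layer's contribution is returned alongside the total, so a ranking can be audited.

```python
from dataclasses import dataclass

# Hypothetical weights: illustrative values only, not taken from the workflow.
WEIGHTS = {
    "screen": 0.30,
    "transcriptomics": 0.25,
    "proteomics": 0.20,
    "single_cell": 0.15,
    "protein_features": 0.05,
    "structure": 0.05,
}
COUNTERSCREEN_PENALTY = 0.5  # assumed multiplicative penalty for flagged hits


@dataclass
class Evidence:
    name: str
    layers: dict  # layer name -> score already scaled to [0, 1]
    counterscreen_active: bool = False


def score(candidate):
    """Return (total, breakdown) so every contribution stays auditable."""
    breakdown = {
        layer: weight * candidate.layers.get(layer, 0.0)
        for layer, weight in WEIGHTS.items()
    }
    if candidate.counterscreen_active:
        # Down-weight only the screening term: activity in the counterscreen
        # suggests the primary-assay signal may be an artifact.
        breakdown["screen"] *= COUNTERSCREEN_PENALTY
    return sum(breakdown.values()), breakdown


def rank(candidates):
    """Sort candidates from highest to lowest total score."""
    return sorted(candidates, key=lambda c: score(c)[0], reverse=True)
```

A missing layer contributes zero rather than raising an error, which keeps sparse candidates rankable but also means absence of data is scored the same as negative evidence. A real implementation might track the two cases separately.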
Full walkthrough: docs/analysis_walkthrough.md or its HTML version, analysis_walkthrough.html.
- results/tables/candidate_rankings.tsv
- results/tables/screen_qc.tsv
- results/tables/disease_omics.tsv
- results/tables/proteomics_validation.tsv
- results/tables/singlecell_context.tsv
- results/tables/protein_features.tsv
- results/tables/structure_features.tsv
- results/evidence_cards/evidence_cards.md
- results/reports/HTS_IL17_Psoriasis_report.html
- results/provenance/run_provenance.json
- results/app_data/: tables used by Streamlit
The demo ships with compact cached example tables shaped like the public sources below. Full retrieval modules are documented as future work.
| Evidence layer | Public source | Why it is used |
|---|---|---|
| HTS/qHTS | PubChem AID 2604 ROR gamma transcriptional activity screen; PubChem AID 2546 VP16 counterscreen | Screening-style pathway-proximal assay data for Th17/IL-17 biology |
| Bioactivity context | ChEMBL ROR gamma / IL-17 pathway assay records | Curated drug discovery assay context |
| Disease transcriptomics | GSE54456 psoriasis lesional vs normal skin RNA-seq | Disease expression support |
| Proteomics | PRIDE PXD021673 psoriasis skin LC-MS/MS | Protein-level validation |
| Single-cell | GSE162183 psoriasis skin scRNA-seq | Cell-type context |
- ROR gamma qHTS is pathway-proximal: it is relevant to Th17 / IL-17 biology but is not a direct IL-17 peptide screen.
- Public screening data are small-molecule assays, used here as an HTS data analog.
- Protein language model features are descriptive unless validated in a predictive task.
- AlphaFold or ESMFold confidence supports structural plausibility, not binding proof.
- Evidence summaries are grounded in workflow outputs and should never create unsupported biological claims.
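One way to keep evidence summaries grounded, as the last limitation requires, is to render each card from a fixed template that can only echo values already present in the workflow tables. The sketch below assumes hypothetical column names (`candidate`, `screen_score`, `disease_log2fc`, `protein_support`, `cell_type`); the real tables may use different ones, and the project's own card generator may work differently.

```python
# Hypothetical field names; the real table columns may differ.
CARD_FIELDS = [
    ("screen_score", "Screening evidence"),
    ("disease_log2fc", "Disease transcriptomics (log2 fold change)"),
    ("protein_support", "Proteomics support"),
    ("cell_type", "Cell-type context"),
]


def build_card(row):
    """Render a Markdown card using only values present in `row`."""
    lines = [f"Candidate: {row['candidate']}"]
    for key, label in CARD_FIELDS:
        value = row.get(key)
        if value in (None, ""):
            # State the gap explicitly instead of inferring a value.
            lines.append(f"- {label}: no evidence in workflow outputs")
        else:
            lines.append(f"- {label}: {value}")
    return "\n".join(lines)
```

Because the template never generates free text, a missing layer shows up as an explicit gap on the card rather than being filled in.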
For a critical staff-scientist-style review and sequential improvement roadmap, see DIRECTOR_REVIEW.md.