diff --git a/code/SoS/enrichment/sldsc_enrichment.ipynb b/code/SoS/enrichment/sldsc_enrichment.ipynb
index d2145a19..e022ec8a 100644
--- a/code/SoS/enrichment/sldsc_enrichment.ipynb
+++ b/code/SoS/enrichment/sldsc_enrichment.ipynb
@@ -10,7 +10,7 @@
     "\n",
     "Minimal working-example driver for the S-LDSC functional-enrichment pipeline. The **Steps** section below gives one ready-to-run `sos run` command per workflow, using the toy inputs symlinked under `input/`.\n",
     "\n",
-    "> **Environment note.** Steps 1–2 (`make_annotation_files_ldscore`, `get_heritability`) wrap the external **polyfun** toolkit (`compute_ldscores.py`, `ldsc.py`, `munge_polyfun_sumstats.py`) and require pre-computed reference-panel files (baseline-LD scores, LD weights, `.frq`, and PLINK `.bed/.bim/.fam`). polyfun is **not installed in this environment** and the reference panel is not shipped with the toy example, so those two steps cannot be executed here; their commands are provided for use on a system where polyfun and a matching panel are available. Steps 3–4 (`postprocess`, `meta_subset`) use `pecotmr::sldsc_postprocessing_pipeline` (available here) and read the `.results`/`.log` files produced by Step 2.\n"
+    "> **Environment note.** Steps 1\u20132 (`make_annotation_files_ldscore`, `get_heritability`) wrap the external **polyfun** toolkit (`compute_ldscores.py`, `ldsc.py`, `munge_polyfun_sumstats.py`) and require pre-computed reference-panel files (baseline-LD scores, LD weights, `.frq`, and PLINK `.bed/.bim/.fam`). polyfun is **not installed in this environment** and the reference panel is not shipped with the toy example, so those two steps cannot be executed here; their commands are provided for use on a system where polyfun and a matching panel are available. Steps 3\u20134 (`postprocess`, `meta_subset`) use `pecotmr::sldsc_postprocessing_pipeline` (available here) and read the `.results`/`.log` files produced by Step 2.\n"
    ]
   },
   {
@@ -137,8 +137,8 @@
     "\n",
     "The pipeline has two computational layers:\n",
     "\n",
-    "- **Regression layer** — the S-LDSC regression itself, performed by the [polyfun](https://github.com/omerwe/polyfun) engine. We do not re-implement this.\n",
-    "- **Post-processing layer** — standardization, differential per-SNP heritability, binary/continuous detection, and random-effects meta-analysis across traits. Implemented in the [`pecotmr`](https://github.com/StatFunGen/pecotmr) R package (`R/sldsc_wrapper.R`).\n",
+    "- **Regression layer** \u2014 the S-LDSC regression itself, performed by the [polyfun](https://github.com/omerwe/polyfun) engine. We do not re-implement this.\n",
+    "- **Post-processing layer** \u2014 standardization, differential per-SNP heritability, binary/continuous detection, and random-effects meta-analysis across traits. Implemented in the [`pecotmr`](https://github.com/StatFunGen/pecotmr) R package (`R/sldsc_wrapper.R`).\n",
     "\n",
     "The notation below tags each modeling quantity as **(polyfun)** or **(pecotmr)**.\n",
     "\n",
@@ -151,7 +151,7 @@
     "\n",
     "#### Reference panel and MAF cutoff\n",
     "\n",
-    "All LD-derived quantities — partitioned LD scores for the 97 baseline annotations and for our $K$ target annotations, the LD-score-regression weights, allele frequencies, and the SNP set — are computed against our own LD reference panel. We do not mix in pre-computed quantities from external panels (e.g. 1000G); $M_{\\mathrm{ref}}$ throughout this notebook denotes the number of common SNPs in our panel.\n",
+    "All LD-derived quantities \u2014 partitioned LD scores for the 97 baseline annotations and for our $K$ target annotations, the LD-score-regression weights, allele frequencies, and the SNP set \u2014 are computed against our own LD reference panel. We do not mix in pre-computed quantities from external panels (e.g. 1000G); $M_{\\mathrm{ref}}$ throughout this notebook denotes the number of common SNPs in our panel.\n",
     "\n",
     "By default we restrict to MAF $> 5\\%$ per the sLDSC recommendation: rare-variant LD is unstable and HapMap3-style regression weights are common-variant by construction. The cutoff is exposed as the SoS parameter `maf_cutoff` (default $0.05$); the regression, the standardized $sd_C$, and $M_{\\mathrm{ref}}$ are all evaluated on the same MAF $>$ cutoff SNP set. If allele-frequency files are not available the pipeline fails; the user must explicitly set `maf_cutoff = 0` to opt out (not recommended).\n",
     "\n",
@@ -159,30 +159,30 @@
     "\n",
     "Solving Equation (2) jointly across annotations, with 200-block genomic jackknife for inference, is performed by polyfun's `ldsc.py`. From each polyfun run we obtain, per annotation:\n",
     "\n",
-    "- $\\tau_C$ and its standard error — **(polyfun)**.\n",
-    "- $\\pi^{h^2}_C$ and $\\pi^{M}_C$ — **(polyfun)**.\n",
-    "- $E_C = \\pi^{h^2}_C / \\pi^{M}_C$ and its standard error — **(polyfun)**.\n",
-    "- The p-value of the differential per-SNP heritability test (defined below) — **(polyfun)**, computed internally with the full coefficient covariance matrix.\n",
+    "- $\\tau_C$ and its standard error \u2014 **(polyfun)**.\n",
+    "- $\\pi^{h^2}_C$ and $\\pi^{M}_C$ \u2014 **(polyfun)**.\n",
+    "- $E_C = \\pi^{h^2}_C / \\pi^{M}_C$ and its standard error \u2014 **(polyfun)**.\n",
+    "- The p-value of the differential per-SNP heritability test (defined below) \u2014 **(polyfun)**, computed internally with the full coefficient covariance matrix.\n",
     "\n",
     "We also obtain, per run:\n",
     "\n",
-    "- The total trait heritability $h^2_g$ — **(polyfun)**.\n",
-    "- The 200-block jackknife delete-values of $\\tau_C$ — **(polyfun)**.\n",
+    "- The total trait heritability $h^2_g$ \u2014 **(polyfun)**.\n",
+    "- The 200-block jackknife delete-values of $\\tau_C$ \u2014 **(polyfun)**.\n",
     "\n",
     "#### Quantities from the post-processing layer (pecotmr)\n",
     "\n",
     "From the polyfun outputs above plus our reference panel, the post-processing layer computes:\n",
     "\n",
-    "- $sd_C$ — per-annotation standard deviation over MAF $>$ cutoff SNPs — **(pecotmr: `compute_sldsc_annot_sd`)**.\n",
-    "- $M_{\\mathrm{ref}}$ — reference SNP count at the MAF cutoff — **(pecotmr: `compute_sldsc_M_ref`)**.\n",
-    "- Whether each annotation is binary or continuous — **(pecotmr: `is_binary_sldsc_annot`)**.\n",
-    "- $\\tau^*_C$ point estimate and per-block $\\tau^*_C$ — **(pecotmr: `standardize_sldsc_trait`)**.\n",
-    "- EnrichStat point estimate and its standard error (formula below) — **(pecotmr: `standardize_sldsc_trait`)**.\n",
-    "- DerSimonian-Laird random-effects meta-analysis of $\\tau^*_C$, $E_C$, or EnrichStat across traits — **(pecotmr: `meta_sldsc_random`)**.\n",
+    "- $sd_C$ \u2014 per-annotation standard deviation over MAF $>$ cutoff SNPs \u2014 **(pecotmr: `compute_sldsc_annot_sd`)**.\n",
+    "- $M_{\\mathrm{ref}}$ \u2014 reference SNP count at the MAF cutoff \u2014 **(pecotmr: `compute_sldsc_M_ref`)**.\n",
+    "- Whether each annotation is binary or continuous \u2014 **(pecotmr: `is_binary_sldsc_annot`)**.\n",
+    "- $\\tau^*_C$ point estimate and per-block $\\tau^*_C$ \u2014 **(pecotmr: `standardize_sldsc_trait`)**.\n",
+    "- EnrichStat point estimate and its standard error (formula below) \u2014 **(pecotmr: `standardize_sldsc_trait`)**.\n",
+    "- DerSimonian-Laird random-effects meta-analysis of $\\tau^*_C$, $E_C$, or EnrichStat across traits \u2014 **(pecotmr: `meta_sldsc_random`)**.\n",
     "\n",
     "The top-level entry point `pecotmr::sldsc_postprocessing_pipeline` orchestrates all of the above.\n",
     "\n",
-    "#### Standardized tau ($\\tau^*$)  —  (pecotmr)\n",
+    "#### Standardized tau ($\\tau^*$)  \u2014  (pecotmr)\n",
     "\n",
     "$\\tau_C$ has units that depend on the scale of the annotation and on the total heritability of the trait, so raw $\\tau$ is not directly comparable across annotations or across traits. We compute the standardized version (Gazal et al. 2017)\n",
     "\n",
@@ -194,7 +194,7 @@
     "$$SE^{\\text{jackknife}}(\\tau^*_C) \\;=\\; \\sqrt{\\,\\tfrac{(B-1)^2}{B}\\, \\mathrm{Var}_b(\\tau^*_{C,(b)})\\,}$$\n",
     "with $B = 200$, used as the per-trait input to cross-trait meta-analysis.\n",
     "\n",
-    "#### Differential per-SNP heritability (\"EnrichStat\")  —  (polyfun + pecotmr)\n",
+    "#### Differential per-SNP heritability (\"EnrichStat\")  \u2014  (polyfun + pecotmr)\n",
     "\n",
     "To test whether the per-SNP heritability *inside* annotation $C$ differs from *outside* it (Finucane et al. 2015):\n",
     "\n",
@@ -206,7 +206,7 @@
     "\n",
     "This per-trait point + SE is the input to cross-trait meta-analysis.\n",
     "\n",
-    "#### Reporting: binary vs. continuous annotations  —  (pecotmr)\n",
+    "#### Reporting: binary vs. continuous annotations  \u2014  (pecotmr)\n",
     "\n",
     "The estimation machinery applies to both annotation types, but the *headline* quantity to report **within each type** differs.\n",
     "\n",
@@ -222,17 +222,17 @@
     ">\n",
     "> The pipeline always passes `--print-coefficients` to polyfun for this reason.\n",
     "\n",
-    "#### Cross-type comparison: always use $\\tau^*_C$  —  (pecotmr)\n",
+    "#### Cross-type comparison: always use $\\tau^*_C$  \u2014  (pecotmr)\n",
     "\n",
-    "For an apple-to-apple comparison **across binary and continuous annotations** — ranking annotations on a single axis, meta-analyzing a mixed set, or reporting a leaderboard that pools both types — use $\\tau^*_C$. The standardization in Gazal et al. (2017) was designed for exactly this purpose: $sd_C = \\sqrt{p(1-p)}$ for a binary annotation (where $p$ is the proportion in the category) and $sd_C = $ empirical standard deviation for a continuous annotation, so the resulting $\\tau^*_C$ is dimensionless and has the same interpretation in both cases — additive change in per-SNP heritability per 1 SD increase in the annotation, normalized by the average per-SNP heritability. $E_C$ does not have this property and must not be compared across types.\n",
+    "For an apple-to-apple comparison **across binary and continuous annotations** \u2014 ranking annotations on a single axis, meta-analyzing a mixed set, or reporting a leaderboard that pools both types \u2014 use $\\tau^*_C$. The standardization in Gazal et al. (2017) was designed for exactly this purpose: $sd_C = \\sqrt{p(1-p)}$ for a binary annotation (where $p$ is the proportion in the category) and $sd_C = $ empirical standard deviation for a continuous annotation, so the resulting $\\tau^*_C$ is dimensionless and has the same interpretation in both cases \u2014 additive change in per-SNP heritability per 1 SD increase in the annotation, normalized by the average per-SNP heritability. $E_C$ does not have this property and must not be compared across types.\n",
     "\n",
     "The pipeline emits both $E_C$ and $\\tau^*_C$ for every annotation, with the binary/continuous flag, so callers can pick the right column for the comparison they are making.\n",
     "\n",
-    "#### Joint analysis  —  (polyfun runs the regression; pecotmr standardizes both modes)\n",
+    "#### Joint analysis  \u2014  (polyfun runs the regression; pecotmr standardizes both modes)\n",
     "\n",
     "For **joint analysis** (multiple annotations fit together), both $\\tau$ and $E$ are conditional on the other annotations in the model. We report joint $\\tau^*_C$ as the independent contribution of annotation $C$ after controlling for the others. The annotation-prep step exposes two independent toggles, `compute_single` and `compute_joint` (both default `True`), so the user can produce the $N$ single-target outputs, the joint output, or both in one invocation. With both defaults the post-processing layer reads all $N+1$ regression outputs per trait and presents single + joint side-by-side. When the joint subset is decided after looking at single-target results (exploratory $\\rightarrow$ conditional workflow), the user runs the annotation-prep step a second time with `compute_single=False` on the curated subset.\n",
     "\n",
-    "### Meta-Analysis across Traits (Random Effects)  —  (pecotmr)\n",
+    "### Meta-Analysis across Traits (Random Effects)  \u2014  (pecotmr)\n",
     "\n",
     "DerSimonian-Laird random-effects meta-analysis of per-annotation estimates across traits, implemented in `pecotmr::meta_sldsc_random` (which delegates the numerics to `rmeta::meta.summaries(..., method = \"random\")`):\n",
     "\n",
@@ -449,7 +449,7 @@
     "to produce per-trait standardized tables and the default random-effects\n",
     "meta across all traits.\n",
     "\n",
-    "Use `--target-categories-label` (same order as `--target-categories`) to give the target annotations friendly names in the output — e.g. `--target-categories ANNOT_1_0 ANNOT_2_0 --target-categories-label quantile_eQTL eQTL` makes the `target` column read `quantile_eQTL` / `eQTL` instead of `ANNOT_1_0` / `ANNOT_2_0` (the original names are kept in `params$target_categories_orig`). Omit it to keep the polyfun `.results` names.\n"
+    "Use `--target-categories-label` (same order as `--target-categories`) to give the target annotations friendly names in the output \u2014 e.g. `--target-categories ANNOT_1_0 ANNOT_2_0 --target-categories-label quantile_eQTL eQTL` makes the `target` column read `quantile_eQTL` / `eQTL` instead of `ANNOT_1_0` / `ANNOT_2_0` (the original names are kept in `params$target_categories_orig`). Omit it to keep the polyfun `.results` names.\n"
    ]
   },
   {
@@ -541,10 +541,10 @@
     "\n",
     "Each workflow writes into its `--cwd`:\n",
     "\n",
-    "- **make_annotation_files_ldscore** — polyfun `.annot.gz` files plus per-annotation LD-score directories (`.l2.ldscore.{gz,parquet}`, `.l2.M`, `.l2.M_5_50`). One single-target directory per annotation, plus (when more than one annotation) a joint directory.\n",
-    "- **get_heritability** — per trait and per target directory, the S-LDSC regression outputs `<trait>.{results,log,part_delete}`. The `.results` `Category` column carries the annotation name with a `_<ref-ld-index>` suffix.\n",
-    "- **postprocess** — a single `<annotation_name>.sldsc_postprocess.rds` containing per-trait tables (Gazal-style tau*, EnrichStat with back-solved jackknife SE) and three DerSimonian–Laird random-effects meta tables (tau*, E, EnrichStat).\n",
-    "- **meta_subset** — a re-meta of the cached `.sldsc_postprocess.rds` over a user-defined trait subset (lightweight; no regression re-run).\n"
+    "- **make_annotation_files_ldscore** \u2014 polyfun `.annot.gz` files plus per-annotation LD-score directories (`.l2.ldscore.{gz,parquet}`, `.l2.M`, `.l2.M_5_50`). One single-target directory per annotation, plus (when more than one annotation) a joint directory.\n",
+    "- **get_heritability** \u2014 per trait and per target directory, the S-LDSC regression outputs `<trait>.{results,log,part_delete}`. The `.results` `Category` column carries the annotation name with a `_<ref-ld-index>` suffix.\n",
+    "- **postprocess** \u2014 a single `<annotation_name>.sldsc_postprocess.rds` containing per-trait tables (Gazal-style tau*, EnrichStat with back-solved jackknife SE) and three DerSimonian\u2013Laird random-effects meta tables (tau*, E, EnrichStat).\n",
+    "- **meta_subset** \u2014 a re-meta of the cached `.sldsc_postprocess.rds` over a user-defined trait subset (lightweight; no regression re-run).\n"
    ]
   },
   {
@@ -1092,7 +1092,7 @@
     "#     index 1 = the baseline file -> \"base_1\",\"Coding_UCSC_1\", ...  (the 97 baseline annots)\n",
     "# So in this pipeline the suffix is only ever 0 (target) or 1 (baseline); it would\n",
     "# continue 0,1,2,... only if you handed `ldsc.py --ref-ld-chr` more than two sources.\n",
-    "# (Why ANNOT_0 vs L2_0: see the [make_annotation_files_ldscore] header — ldsc.py's\n",
+    "# (Why ANNOT_0 vs L2_0: see the [make_annotation_files_ldscore] header \u2014 ldsc.py's\n",
     "#  \"n_annot == 1 -> column name 'L2'\" quirk vs compute_ldscores.py keeping the annot\n",
     "#  column name.)  [postprocess] auto-detects the target Category; if you instead pass\n",
     "# --target-categories, the names must match this column exactly.\n",
@@ -1222,7 +1222,7 @@
    "outputs": [],
    "source": [
     "[postprocess]\n",
-    "# Post-processing of polyfun outputs via pecotmr::sldsc_postprocessing_pipeline.\n",
+    "# Post-processing of polyfun outputs via pecotmr::sldscPostprocessingPipeline.\n",
     "# Reads .results / .log / .part_delete for all traits in `traits_file`, both\n",
     "# single-target and (when present) joint-target runs, computes Gazal-style\n",
     "# tau*, EnrichStat with back-solved jackknife SE, and runs the default\n",
@@ -1275,15 +1275,15 @@
     "        trait_joint_prefix <- setNames(rep(NA_character_, length(traits)), traits)\n",
     "    }\n",
     "\n",
-    "    res <- sldsc_postprocessing_pipeline(\n",
-    "        trait_single_prefixes = trait_single_prefixes,\n",
-    "        trait_joint_prefix    = trait_joint_prefix,\n",
-    "        target_anno_dir       = \"${target_anno_dir}\",\n",
-    "        frqfile_dir           = \"${frqfile_dir}\",\n",
-    "        plink_name            = \"${plink_name}\",\n",
-    "        maf_cutoff            = ${maf_cutoff},\n",
-    "        target_categories     = if (length(target_cats) > 0) target_cats else NULL,\n",
-    "        target_labels         = if (length(target_lab)  > 0) target_lab  else NULL\n",
+    "    res <- sldscPostprocessingPipeline(\n",
+    "        traitSinglePrefixes = trait_single_prefixes,\n",
+    "        traitJointPrefix    = trait_joint_prefix,\n",
+    "        targetAnnoDir       = \"${target_anno_dir}\",\n",
+    "        frqfileDir          = \"${frqfile_dir}\",\n",
+    "        plinkName           = \"${plink_name}\",\n",
+    "        mafCutoff           = ${maf_cutoff},\n",
+    "        targetCategories    = if (length(target_cats) > 0) target_cats else NULL,\n",
+    "        targetLabels        = if (length(target_lab)  > 0) target_lab  else NULL\n",
     "    )\n",
     "\n",
     "    saveRDS(res, \"${_output[0]}\")\n",
@@ -1324,15 +1324,15 @@
     "\n",
     "    subset_per_trait <- res$per_trait[subset_traits]\n",
     "\n",
-    "    # Map wide names (tau_star_single/joint) to bare names meta_sldsc_random expects.\n",
-    "    view_single <- pecotmr:::.sldsc_view_for_meta(subset_per_trait, \"single\")\n",
-    "    view_joint  <- pecotmr:::.sldsc_view_for_meta(subset_per_trait, \"joint\")\n",
+    "    # Map wide names (tau_star_single/joint) to bare names metaSldscRandom expects.\n",
+    "    view_single <- pecotmr:::.sldscViewForMeta(subset_per_trait, \"single\")\n",
+    "    view_joint  <- pecotmr:::.sldscViewForMeta(subset_per_trait, \"joint\")\n",
     "\n",
     "    out <- list(\n",
-    "        tau_star_single = setNames(lapply(target_cats, function(c) meta_sldsc_random(view_single, c, \"tau_star\")),   target_cats),\n",
-    "        tau_star_joint  = setNames(lapply(target_cats, function(c) meta_sldsc_random(view_joint,  c, \"tau_star\")),   target_cats),\n",
-    "        enrichment      = setNames(lapply(target_cats, function(c) meta_sldsc_random(view_single, c, \"enrichment\")), target_cats),\n",
-    "        enrichstat      = setNames(lapply(target_cats, function(c) meta_sldsc_random(view_single, c, \"enrichstat\")), target_cats)\n",
+    "        tau_star_single = setNames(lapply(target_cats, function(c) metaSldscRandom(view_single, c, \"tauStar\")),   target_cats),\n",
+    "        tau_star_joint  = setNames(lapply(target_cats, function(c) metaSldscRandom(view_joint,  c, \"tauStar\")),   target_cats),\n",
+    "        enrichment      = setNames(lapply(target_cats, function(c) metaSldscRandom(view_single, c, \"enrichment\")), target_cats),\n",
+    "        enrichstat      = setNames(lapply(target_cats, function(c) metaSldscRandom(view_single, c, \"enrichstat\")), target_cats)\n",
     "    )\n",
     "\n",
     "    saveRDS(out, \"${_output[0]}\")\n",
@@ -1380,4 +1380,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 4
-}
+}
\ No newline at end of file
diff --git a/code/SoS/mnm_analysis/multivariate_fine_mapping_vignette.ipynb b/code/SoS/mnm_analysis/multivariate_fine_mapping_vignette.ipynb
index 7bfafb9c..bcab898b 100644
--- a/code/SoS/mnm_analysis/multivariate_fine_mapping_vignette.ipynb
+++ b/code/SoS/mnm_analysis/multivariate_fine_mapping_vignette.ipynb
@@ -52,7 +52,7 @@
    "id": "25e3b8e3-4434-487c-b8ad-eb34905540dc",
    "metadata": {},
    "source": [
-    "- `{name}.{chr}_{gene_id}.multicontext_data.rds` — one RDS per gene, the multivariate (mvSuSiE / mr.mash) fine-mapping result that jointly models the effect across all conditions/contexts (e.g. `Ast`, `Exc`, `Inh`, `Mic`).\n\nInspecting the object shows the per-context posterior summaries:\n```\n> str(readRDS(\"/restricted/projectnb/xqtl/xqtl_protocol/toy_xqtl_protocol/output/mnm_old/data/test_mnm.chr11_ENSG00000073921.multicontext_data.rds\"))\nList of 8\n $ residual_Y       : num [1:150, 1:6] 0.1241 0.2981 -0.6054 0.0435 0.0388 ...\n  ..- attr(*, \"dimnames\")=List of 2\n  .. ..$ : chr [1:150] \"sample0\" \"sample1\" \"sample10\" \"sample100\" ...\n  .. ..$ : chr [1:6] \"Ast\" \"Exc\" \"Inh\" \"Mic\" ...\n $ residual_Y_scalar: num [1:6] 1 1 1 1 1 1\n $ dropped_sample   :List of 3\n  ..$ X    :List of 6\n  .. ..$ : chr(0) \n  .. ..$ : chr(0) \n  .. ..$ : chr(0) \n  .. ..$ : chr(0) \n  .. ..$ : chr(0) \n  .. ..$ : chr(0) \n  ..$ Y    :List of 6\n  .. ..$ : chr \"strand\"\n  .. ..$ : chr \"strand\"\n  .. ..$ : chr \"strand\"\n  .. ..$ : chr \"strand\"\n  .. ..$ : chr \"strand\"\n  .. ..$ : chr \"strand\"\n  ..$ covar:List of 6\n  .. ..$ : chr(0) \n  .. ..$ : chr(0) \n  .. ..$ : chr(0) \n  .. ..$ : chr(0) \n  .. ..$ : chr(0) \n  .. ..$ : chr(0) \n $ X                : num [1:150, 1:9970] 0 0 1 1 1 0 0 1 1 1 ...\n  ..- attr(*, \"dimnames\")=List of 2\n  .. ..$ : chr [1:150] \"sample0\" \"sample1\" \"sample10\" \"sample100\" ...\n  .. ..$ : chr [1:9970] \"chr11:84957209:G:C\" \"chr11:84957879:A:G\" \"chr11:84958236:C:A\" \"chr11:84958954:T:C\" ...\n $ maf              : Named num [1:9970] 0.33 0.327 0.103 0.343 0.343 ...\n  ..- attr(*, \"names\")= chr [1:9970] \"chr11:84957209:G:C\" \"chr11:84957879:A:G\" \"chr11:84958236:C:A\" \"chr11:84958954:T:C\" ...\n $ chrom            : chr \"chr11\"\n $ grange           : chr [1:2] \"85957174\" \"86069881\"\n $ X_variance       : Named num [1:9970] 0.441 0.443 0.192 0.458 0.458 ...\n  ..- attr(*, \"names\")= chr [1:9970] \"chr11:84957209:G:C\" \"chr11:84957879:A:G\" \"chr11:84958236:C:A\" \"chr11:84958954:T:C\" ...\n```\n\n* `*.multicontext_bvsr.rds`\n```\n> str(readRDS(\"/restricted/projectnb/xqtl/xqtl_protocol/toy_xqtl_protocol/output/mnm_old/multivariate_fine_mapping/test_mnm.chr11_ENSG00000073921.multicontext_bvsr.rds\"),max.level=2)\nList of 1\n $ ENSG00000073921:List of 10\n  ..$ mrmash_fitted              :List of 14\n  .. ..- attr(*, \"class\")= chr [1:2] \"mr.mash\" \"list\"\n  ..$ reweighted_mixture_prior   :Classes 'MashInitializer', 'R6' <MashInitializer>\n  Public:\n    clone: function (deep = FALSE) \n    compute_prior_inv: function () \n    initialize: function (Ulist, grid, prior_weights = NULL, null_weight = 0, \n    n_component: active binding\n    n_condition: active binding\n    precompute_cov_matrices: function (d, algorithm = c(\"R\", \"cpp\")) \n    precomputed: active binding\n    prior_variance: active binding\n    remove_precomputed: function () \n    scale_prior_variance: function (sigma) \n  Private:\n    inv_mats: NULL\n    U: NULL\n    xU: list \n  ..$ reweighted_mixture_prior_cv:List of 5\n  ..$ mvsusie_fitted             :List of 28\n  .. ..- attr(*, \"class\")= chr \"mvsusie\"\n  ..$ variant_names              : chr [1:6454] \"chr11:84957209:G:C\" \"chr11:84958236:C:A\" \"chr11:84958954:T:C\" \"chr11:84959602:T:C\" ...\n  ..$ analysis_script            : chr \"\\nlibrary(pecotmr)\\n\\nphenotype_files = c(\\\"/restricted/projectnb/xqtl/xqtl_protocol/toy_xqtl_protocol/output/p\"| __truncated__\n  ..$ other_quantities           :List of 1\n  ..$ context_names              : chr [1:6] \"Ast\" \"Exc\" \"Inh\" \"Mic\" ...\n  ..$ total_time_elapsed         : 'proc_time' Named num [1:5] 104.38 0.653 105.828 0 0\n  .. ..- attr(*, \"names\")= chr [1:5] \"user.self\" \"sys.self\" \"elapsed\" \"user.child\" ...\n  ..$ region_info                :List of 3\n```\n\n* `*.multicontext_twas_weights.rds`\n```\n> str(readRDS(\"/restricted/projectnb/xqtl/xqtl_protocol/toy_xqtl_protocol/output/mnm_old/multivariate_twas_weights/test_mnm.chr11_ENSG00000073921.multicontext_twas_weights.rds\"), max.level=3)\nList of 1\n $ ENSG00000073921:List of 6\n  ..$ Ast_ENSG00000073921:List of 6\n  .. ..$ twas_weights      :List of 2\n  .. ..$ twas_predictions  :List of 2\n  .. ..$ variant_names     : chr [1:6454] \"chr11:84957209:G:C\" \"chr11:84958236:C:A\" \"chr11:84958954:T:C\" \"chr11:84959602:T:C\" ...\n  .. ..$ twas_cv_result    :List of 4\n  .. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 1645.11 3.4 1652.73 0.01 0.02\n  .. .. ..- attr(*, \"names\")= chr [1:5] \"user.self\" \"sys.self\" \"elapsed\" \"user.child\" ...\n  .. ..$ region_info       :List of 3\n  ..$ Exc_ENSG00000073921:List of 6\n  .. ..$ twas_weights      :List of 2\n  .. ..$ twas_predictions  :List of 2\n  .. ..$ variant_names     : chr [1:6454] \"chr11:84957209:G:C\" \"chr11:84958236:C:A\" \"chr11:84958954:T:C\" \"chr11:84959602:T:C\" ...\n  .. ..$ twas_cv_result    :List of 4\n  .. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 1645.11 3.4 1652.73 0.01 0.02\n  .. .. ..- attr(*, \"names\")= chr [1:5] \"user.self\" \"sys.self\" \"elapsed\" \"user.child\" ...\n  .. ..$ region_info       :List of 3\n  ..$ Inh_ENSG00000073921:List of 6\n  .. ..$ twas_weights      :List of 2\n  .. ..$ twas_predictions  :List of 2\n  .. ..$ variant_names     : chr [1:6454] \"chr11:84957209:G:C\" \"chr11:84958236:C:A\" \"chr11:84958954:T:C\" \"chr11:84959602:T:C\" ...\n  .. ..$ twas_cv_result    :List of 4\n  .. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 1645.11 3.4 1652.73 0.01 0.02\n  .. .. ..- attr(*, \"names\")= chr [1:5] \"user.self\" \"sys.self\" \"elapsed\" \"user.child\" ...\n  .. ..$ region_info       :List of 3\n  ..$ Mic_ENSG00000073921:List of 6\n  .. ..$ twas_weights      :List of 2\n  .. ..$ twas_predictions  :List of 2\n  .. ..$ variant_names     : chr [1:6454] \"chr11:84957209:G:C\" \"chr11:84958236:C:A\" \"chr11:84958954:T:C\" \"chr11:84959602:T:C\" ...\n  .. ..$ twas_cv_result    :List of 4\n  .. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 1645.11 3.4 1652.73 0.01 0.02\n  .. .. ..- attr(*, \"names\")= chr [1:5] \"user.self\" \"sys.self\" \"elapsed\" \"user.child\" ...\n  .. ..$ region_info       :List of 3\n  ..$ OPC_ENSG00000073921:List of 6\n  .. ..$ twas_weights      :List of 2\n  .. ..$ twas_predictions  :List of 2\n  .. ..$ variant_names     : chr [1:6454] \"chr11:84957209:G:C\" \"chr11:84958236:C:A\" \"chr11:84958954:T:C\" \"chr11:84959602:T:C\" ...\n  .. ..$ twas_cv_result    :List of 4\n  .. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 1645.11 3.4 1652.73 0.01 0.02\n  .. .. ..- attr(*, \"names\")= chr [1:5] \"user.self\" \"sys.self\" \"elapsed\" \"user.child\" ...\n  .. ..$ region_info       :List of 3\n  ..$ Oli_ENSG00000073921:List of 6\n  .. ..$ twas_weights      :List of 2\n  .. ..$ twas_predictions  :List of 2\n  .. ..$ variant_names     : chr [1:6454] \"chr11:84957209:G:C\" \"chr11:84958236:C:A\" \"chr11:84958954:T:C\" \"chr11:84959602:T:C\" ...\n  .. ..$ twas_cv_result    :List of 4\n  .. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 1645.11 3.4 1652.73 0.01 0.02\n  .. .. ..- attr(*, \"names\")= chr [1:5] \"user.self\" \"sys.self\" \"elapsed\" \"user.child\" ...\n  .. ..$ region_info       :List of 3\n```"
+    "- `{name}.{chr}_{gene_id}.multicontext_data.rds` \u2014 one RDS per gene, the multivariate (mvSuSiE / mr.mash) fine-mapping result that jointly models the effect across all conditions/contexts (e.g. `Ast`, `Exc`, `Inh`, `Mic`).\n\nInspecting the object shows the per-context posterior summaries:\n```\n> str(readRDS(\"/restricted/projectnb/xqtl/xqtl_protocol/toy_xqtl_protocol/output/mnm_old/data/test_mnm.chr11_ENSG00000073921.multicontext_data.rds\"))\nList of 8\n $ residual_Y       : num [1:150, 1:6] 0.1241 0.2981 -0.6054 0.0435 0.0388 ...\n  ..- attr(*, \"dimnames\")=List of 2\n  .. ..$ : chr [1:150] \"sample0\" \"sample1\" \"sample10\" \"sample100\" ...\n  .. ..$ : chr [1:6] \"Ast\" \"Exc\" \"Inh\" \"Mic\" ...\n $ residual_Y_scalar: num [1:6] 1 1 1 1 1 1\n $ dropped_sample   :List of 3\n  ..$ X    :List of 6\n  .. ..$ : chr(0) \n  .. ..$ : chr(0) \n  .. ..$ : chr(0) \n  .. ..$ : chr(0) \n  .. ..$ : chr(0) \n  .. ..$ : chr(0) \n  ..$ Y    :List of 6\n  .. ..$ : chr \"strand\"\n  .. ..$ : chr \"strand\"\n  .. ..$ : chr \"strand\"\n  .. ..$ : chr \"strand\"\n  .. ..$ : chr \"strand\"\n  .. ..$ : chr \"strand\"\n  ..$ covar:List of 6\n  .. ..$ : chr(0) \n  .. ..$ : chr(0) \n  .. ..$ : chr(0) \n  .. ..$ : chr(0) \n  .. ..$ : chr(0) \n  .. ..$ : chr(0) \n $ X                : num [1:150, 1:9970] 0 0 1 1 1 0 0 1 1 1 ...\n  ..- attr(*, \"dimnames\")=List of 2\n  .. ..$ : chr [1:150] \"sample0\" \"sample1\" \"sample10\" \"sample100\" ...\n  .. ..$ : chr [1:9970] \"chr11:84957209:G:C\" \"chr11:84957879:A:G\" \"chr11:84958236:C:A\" \"chr11:84958954:T:C\" ...\n $ maf              : Named num [1:9970] 0.33 0.327 0.103 0.343 0.343 ...\n  ..- attr(*, \"names\")= chr [1:9970] \"chr11:84957209:G:C\" \"chr11:84957879:A:G\" \"chr11:84958236:C:A\" \"chr11:84958954:T:C\" ...\n $ chrom            : chr \"chr11\"\n $ grange           : chr [1:2] \"85957174\" \"86069881\"\n $ X_variance       : Named num [1:9970] 0.441 0.443 0.192 0.458 0.458 ...\n  ..- attr(*, \"names\")= chr [1:9970] \"chr11:84957209:G:C\" \"chr11:84957879:A:G\" \"chr11:84958236:C:A\" \"chr11:84958954:T:C\" ...\n```\n\n* `*.multicontext_bvsr.rds`\n```\n> str(readRDS(\"/restricted/projectnb/xqtl/xqtl_protocol/toy_xqtl_protocol/output/mnm_old/multivariate_fine_mapping/test_mnm.chr11_ENSG00000073921.multicontext_bvsr.rds\"),max.level=2)\nList of 1\n $ ENSG00000073921:List of 10\n  ..$ mrmash_fitted              :List of 14\n  .. ..- attr(*, \"class\")= chr [1:2] \"mr.mash\" \"list\"\n  ..$ reweighted_mixture_prior   :Classes 'MashInitializer', 'R6' <MashInitializer>\n  Public:\n    clone: function (deep = FALSE) \n    compute_prior_inv: function () \n    initialize: function (Ulist, grid, prior_weights = NULL, null_weight = 0, \n    n_component: active binding\n    n_condition: active binding\n    precompute_cov_matrices: function (d, algorithm = c(\"R\", \"cpp\")) \n    precomputed: active binding\n    prior_variance: active binding\n    remove_precomputed: function () \n    scale_prior_variance: function (sigma) \n  Private:\n    inv_mats: NULL\n    U: NULL\n    xU: list \n  ..$ reweighted_mixture_prior_cv:List of 5\n  ..$ mvsusie_fitted             :List of 28\n  .. ..- attr(*, \"class\")= chr \"mvsusie\"\n  ..$ variant_names              : chr [1:6454] \"chr11:84957209:G:C\" \"chr11:84958236:C:A\" \"chr11:84958954:T:C\" \"chr11:84959602:T:C\" ...\n  ..$ analysis_script            : chr \"\\nlibrary(pecotmr)\\n\\nphenotype_files = c(\\\"/restricted/projectnb/xqtl/xqtl_protocol/toy_xqtl_protocol/output/p\"| __truncated__\n  ..$ other_quantities           :List of 1\n  ..$ context_names              : chr [1:6] \"Ast\" \"Exc\" \"Inh\" \"Mic\" ...\n  ..$ total_time_elapsed         : 'proc_time' Named num [1:5] 104.38 0.653 105.828 0 0\n  .. ..- attr(*, \"names\")= chr [1:5] \"user.self\" \"sys.self\" \"elapsed\" \"user.child\" ...\n  ..$ region_info                :List of 3\n```\n\n* `*.multicontext_twas_weights.rds`\n```\n> str(readRDS(\"/restricted/projectnb/xqtl/xqtl_protocol/toy_xqtl_protocol/output/mnm_old/multivariate_twas_weights/test_mnm.chr11_ENSG00000073921.multicontext_twas_weights.rds\"), max.level=3)\nList of 1\n $ ENSG00000073921:List of 6\n  ..$ Ast_ENSG00000073921:List of 6\n  .. ..$ twas_weights      :List of 2\n  .. ..$ twas_predictions  :List of 2\n  .. ..$ variant_names     : chr [1:6454] \"chr11:84957209:G:C\" \"chr11:84958236:C:A\" \"chr11:84958954:T:C\" \"chr11:84959602:T:C\" ...\n  .. ..$ twas_cv_result    :List of 4\n  .. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 1645.11 3.4 1652.73 0.01 0.02\n  .. .. ..- attr(*, \"names\")= chr [1:5] \"user.self\" \"sys.self\" \"elapsed\" \"user.child\" ...\n  .. ..$ region_info       :List of 3\n  ..$ Exc_ENSG00000073921:List of 6\n  .. ..$ twas_weights      :List of 2\n  .. ..$ twas_predictions  :List of 2\n  .. ..$ variant_names     : chr [1:6454] \"chr11:84957209:G:C\" \"chr11:84958236:C:A\" \"chr11:84958954:T:C\" \"chr11:84959602:T:C\" ...\n  .. ..$ twas_cv_result    :List of 4\n  .. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 1645.11 3.4 1652.73 0.01 0.02\n  .. .. ..- attr(*, \"names\")= chr [1:5] \"user.self\" \"sys.self\" \"elapsed\" \"user.child\" ...\n  .. ..$ region_info       :List of 3\n  ..$ Inh_ENSG00000073921:List of 6\n  .. ..$ twas_weights      :List of 2\n  .. ..$ twas_predictions  :List of 2\n  .. ..$ variant_names     : chr [1:6454] \"chr11:84957209:G:C\" \"chr11:84958236:C:A\" \"chr11:84958954:T:C\" \"chr11:84959602:T:C\" ...\n  .. ..$ twas_cv_result    :List of 4\n  .. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 1645.11 3.4 1652.73 0.01 0.02\n  .. .. ..- attr(*, \"names\")= chr [1:5] \"user.self\" \"sys.self\" \"elapsed\" \"user.child\" ...\n  .. ..$ region_info       :List of 3\n  ..$ Mic_ENSG00000073921:List of 6\n  .. ..$ twas_weights      :List of 2\n  .. ..$ twas_predictions  :List of 2\n  .. ..$ variant_names     : chr [1:6454] \"chr11:84957209:G:C\" \"chr11:84958236:C:A\" \"chr11:84958954:T:C\" \"chr11:84959602:T:C\" ...\n  .. ..$ twas_cv_result    :List of 4\n  .. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 1645.11 3.4 1652.73 0.01 0.02\n  .. .. ..- attr(*, \"names\")= chr [1:5] \"user.self\" \"sys.self\" \"elapsed\" \"user.child\" ...\n  .. ..$ region_info       :List of 3\n  ..$ OPC_ENSG00000073921:List of 6\n  .. ..$ twas_weights      :List of 2\n  .. ..$ twas_predictions  :List of 2\n  .. ..$ variant_names     : chr [1:6454] \"chr11:84957209:G:C\" \"chr11:84958236:C:A\" \"chr11:84958954:T:C\" \"chr11:84959602:T:C\" ...\n  .. ..$ twas_cv_result    :List of 4\n  .. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 1645.11 3.4 1652.73 0.01 0.02\n  .. .. ..- attr(*, \"names\")= chr [1:5] \"user.self\" \"sys.self\" \"elapsed\" \"user.child\" ...\n  .. ..$ region_info       :List of 3\n  ..$ Oli_ENSG00000073921:List of 6\n  .. ..$ twas_weights      :List of 2\n  .. ..$ twas_predictions  :List of 2\n  .. ..$ variant_names     : chr [1:6454] \"chr11:84957209:G:C\" \"chr11:84958236:C:A\" \"chr11:84958954:T:C\" \"chr11:84959602:T:C\" ...\n  .. ..$ twas_cv_result    :List of 4\n  .. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 1645.11 3.4 1652.73 0.01 0.02\n  .. .. ..- attr(*, \"names\")= chr [1:5] \"user.self\" \"sys.self\" \"elapsed\" \"user.child\" ...\n  .. ..$ region_info       :List of 3\n```"
    ]
   },
   {
@@ -78,27 +78,27 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "sos run pipeline/mnm_regression.ipynb mnm \\\n",
-    "    --name test_mnm --cwd output/mnm \\\n",
-    "    --genoFile output/genotype_by_chrom/wgs.merged.plink_qc.genotype_by_chrom_files.txt \\\n",
-    "    --phenoFile output/phenotype/phenotype_by_chrom_for_cis/bulk_rnaseq.phenotype_by_chrom_files.region_list.txt \\\n",
-    "    output/phenotype/phenotype_by_chrom_for_cis/bulk_rnaseq.phenotype_by_chrom_files.region_list.txt \\\n",
-    "    output/phenotype/phenotype_by_chrom_for_cis/bulk_rnaseq.phenotype_by_chrom_files.region_list.txt \\\n",
-    "    output/phenotype/phenotype_by_chrom_for_cis/bulk_rnaseq.phenotype_by_chrom_files.region_list.txt \\\n",
-    "    output/phenotype/phenotype_by_chrom_for_cis/bulk_rnaseq.phenotype_by_chrom_files.region_list.txt \\\n",
-    "    output/phenotype/phenotype_by_chrom_for_cis/bulk_rnaseq.phenotype_by_chrom_files.region_list.txt \\\n",
-    "    --covFile output/covariate/bulk_rnaseq_tpm_matrix.low_expression_filtered.outlier_removed.tmm.expression.covariates.wgs.merged.plink_qc.plink_qc.prune.pca.Marchenko_PC.gz \\\n",
-    "    output/covariate/bulk_rnaseq_tpm_matrix.low_expression_filtered.outlier_removed.tmm.expression.covariates.wgs.merged.plink_qc.plink_qc.prune.pca.Marchenko_PC.gz \\\n",
-    "    output/covariate/bulk_rnaseq_tpm_matrix.low_expression_filtered.outlier_removed.tmm.expression.covariates.wgs.merged.plink_qc.plink_qc.prune.pca.Marchenko_PC.gz \\\n",
-    "    output/covariate/bulk_rnaseq_tpm_matrix.low_expression_filtered.outlier_removed.tmm.expression.covariates.wgs.merged.plink_qc.plink_qc.prune.pca.Marchenko_PC.gz \\\n",
-    "    output/covariate/bulk_rnaseq_tpm_matrix.low_expression_filtered.outlier_removed.tmm.expression.covariates.wgs.merged.plink_qc.plink_qc.prune.pca.Marchenko_PC.gz \\\n",
-    "    output/covariate/bulk_rnaseq_tpm_matrix.low_expression_filtered.outlier_removed.tmm.expression.covariates.wgs.merged.plink_qc.plink_qc.prune.pca.Marchenko_PC.gz \\\n",
-    "    --customized-association-windows reference_data/TAD/TADB_enhanced_cis.bed \\\n",
-    "    --region-name ENSG00000073921 --save-data --no-skip-twas-weights \\\n",
-    "    --phenotype-names Ast Exc Inh Mic OPC Oli \\\n",
-    "    --mixture_prior output/multivariate_mixture/MWE_ed_bovy.EE.prior.rds \\\n",
-    "    --max_cv_variants 5000 \\\n",
-    "\t--ld_reference_meta_file data/ld_meta_file_with_bim.tsv "
+    "# Step 1: build the QtlDataset RDS (one per study).\n",
+    "sos run pipeline/qtl_dataset.ipynb qtl_dataset_construct \\\n",
+    "    --cwd output/qtl_dataset \\\n",
+    "    --study test_mnm \\\n",
+    "    --genotype-prefix output/genotype_by_chrom/wgs.merged.plink_qc.chr1 \\\n",
+    "    --phenotype-manifest data/test_mnm.pheno_manifest.tsv\n",
+    "\n",
+    "# Step 2: per-region mvSuSiE joint fine-mapping.\n",
+    "sos run pipeline/multivariate_fine_mapping.ipynb multivariate_fine_mapping \\\n",
+    "    --cwd output/mnm \\\n",
+    "    --study test_mnm \\\n",
+    "    --qtl-dataset output/qtl_dataset/test_mnm.qtl_dataset.rds \\\n",
+    "    --regions chr1:1000000-2000000\n",
+    "\n",
+    "# Step 3 (optional): mr.mash + mvSuSiE TWAS weights on the same regions.\n",
+    "sos run pipeline/twas_weights.ipynb twas_weights \\\n",
+    "    --cwd output/mnm \\\n",
+    "    --study test_mnm \\\n",
+    "    --qtl-dataset output/qtl_dataset/test_mnm.qtl_dataset.rds \\\n",
+    "    --regions chr1:1000000-2000000 \\\n",
+    "    --methods mrmash,mvsusie"
    ]
   },
   {
diff --git a/code/SoS/mnm_analysis/multivariate_multigene_fine_mapping_vignette.ipynb b/code/SoS/mnm_analysis/multivariate_multigene_fine_mapping_vignette.ipynb
index f467e1b0..46e0ca3f 100644
--- a/code/SoS/mnm_analysis/multivariate_multigene_fine_mapping_vignette.ipynb
+++ b/code/SoS/mnm_analysis/multivariate_multigene_fine_mapping_vignette.ipynb
@@ -138,7 +138,7 @@
    "id": "c6344399",
    "metadata": {},
    "source": [
-    "- `{region_name}.{chr}_{region_id}.multigene_bvsr.rds` — one RDS per region, the fitted multivariate (mvSuSiE / mr.mash) Bayesian variable-selection model jointly fine-mapping all genes in that region.\n",
+    "- `{region_name}.{chr}_{region_id}.multigene_bvsr.rds` \u2014 one RDS per region, the fitted multivariate (mvSuSiE / mr.mash) Bayesian variable-selection model jointly fine-mapping all genes in that region.\n",
     "\n",
     "Inspecting the object shows the per-region fit (posterior effect estimates, PVE, credible sets and convergence trace):\n",
     "\n",
@@ -193,27 +193,20 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "sos run pipeline/mnm_regression.ipynb mnm_genes \\\n",
-    "    --name ROSMAP_Ast_mega_eQTL \\\n",
-    "    --genoFile data/mnm_genes/ROSMAP_NIA_WGS.leftnorm.bcftools_qc.plink_qc.11.bed \\\n",
-    "    --phenoFile data/mnm_genes/snuc_pseudo_bulk.Ast.mega.normalized.log2cpm.region_list.txt \\\n",
-    "    --covFile data/mnm_genes/snuc_pseudo_bulk.Ast.mega.normalized.log2cpm.rosmap_cov.ROSMAP_NIA_WGS.leftnorm.bcftools_qc.plink_qc.snuc_pseudo_bulk_mega.related.plink_qc.extracted.pca.projected.Marchenko_PC.gz \\\n",
-    "    --customized-association-windows data/mnm_genes/extended_TADB.bed \\\n",
-    "    --phenotype-names Ast_mega_eQTL \\\n",
-    "    --max-cv-variants 5000 \\\n",
-    "    --ld_reference_meta_file data/ld_meta_file_with_bim.tsv \\\n",
-    "    --independent_variant_list data/mnm_genes/ld_pruned_variants.txt.gz \\\n",
-    "    --fine_mapping_meta data/mnm_genes/combined_data_updated.tsv \\\n",
-    "    --phenoIDFile data/mnm_genes/phenoIDFile_extended_TADB.bed \\\n",
-    "    --region-name chr11_77324757_82556425 \\\n",
-    "    --skip-analysis-pip-cutoff 0 \\\n",
-    "    --maf 0.01 \\\n",
-    "    --coverage 0.95 \\\n",
-    "    --pheno_id_map_file data/mnm_genes/pheno_id_map_file.txt \\\n",
-    "    --prior-canonical-matrices \\\n",
-    "    --twas-cv-folds 0 \\\n",
-    "    --trans-analysis \\\n",
-    "    --cwd output/mnm_regression/mnm_genes -s build"
+    "# Step 1: build the QtlDataset RDS for the joint-gene region.\n",
+    "sos run pipeline/qtl_dataset.ipynb qtl_dataset_construct \\\n",
+    "    --cwd output/qtl_dataset \\\n",
+    "    --study ROSMAP_Ast_mega_eQTL \\\n",
+    "    --genotype-prefix data/mnm_genes/ROSMAP_NIA_WGS.leftnorm.bcftools_qc.plink_qc.11 \\\n",
+    "    --phenotype-manifest data/mnm_genes/snuc_pseudo_bulk.Ast.mega.pheno_manifest.tsv\n",
+    "\n",
+    "# Step 2: per-region mvSuSiE joint fine-mapping. Use a region span that\n",
+    "# covers multiple genes (multi-trait joint mvSuSiE).\n",
+    "sos run pipeline/multivariate_fine_mapping.ipynb multivariate_fine_mapping \\\n",
+    "    --cwd output/mnm_genes \\\n",
+    "    --study ROSMAP_Ast_mega_eQTL \\\n",
+    "    --qtl-dataset output/qtl_dataset/ROSMAP_Ast_mega_eQTL.qtl_dataset.rds \\\n",
+    "    --regions chr11:65000000-67000000"
    ]
   },
   {
@@ -286,4 +279,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 5
-}
+}
\ No newline at end of file
diff --git a/code/SoS/mnm_analysis/summary_stats_finemapping_vignette.ipynb b/code/SoS/mnm_analysis/summary_stats_finemapping_vignette.ipynb
index 6741f137..b6441d42 100644
--- a/code/SoS/mnm_analysis/summary_stats_finemapping_vignette.ipynb
+++ b/code/SoS/mnm_analysis/summary_stats_finemapping_vignette.ipynb
@@ -53,7 +53,7 @@
    "id": "6991377f-a6ef-45fe-ae78-20feb19fba9d",
    "metadata": {},
    "source": [
-    "- `{prefix}.sumstats.tsv.gz` — the harmonized, RAISS-imputed GWAS summary statistics (z-scores, effect sizes, allele frequencies and LD scores) used as the fine-mapping input.\n- `{name}.univariate_susie_rss.rds` — the SuSiE-RSS fine-mapping result computed from the summary statistics and the LD reference, one per region.\n\nThe summary-statistics table and fitted object look like:\n```\n> less RSS_QC_RAISS_imputed.chr11_84267999_86714492.AD_Bellenguez_2022.sumstats.tsv.gz | head\nchrom   pos     variant_id      A1      A2      Var.x   raiss_ld_score.x        raiss_R2.x      Var.y   raiss_ld_score.y        raiss_R2.y      pvalue  effect_allele_frequency odds_ratio  ci_lower        ci_upper        n_case  n_control       het_isq het_pvalue      beta    se      variants_id_original    Var     raiss_ld_score  z\n11      84267999        11:84267999:T:C C       T       0.131232630771865       35.033674731767 0.868767369228135                                                                  0.462675463930821\n11      84268181        11:84268181:CTG:C       C       CTG                             -1      Inf             0.6264  0.9965  1.039   0.891   1.211   83196   386481  0       0.4727                      11:84268181:CTG:C      -1      Inf     -0.486555697823303\n11      84268207        11:84268207:T:C C       T                               -1      Inf             0.6347  0.9641  0.99    0.948   1.033   85934   401577  52.6    0.0164     11:84268207:T:C  -1      Inf     0.475113122171946\n11      84268259        11:84268259:C:T T       C                               -1      Inf             0.5809  6e-04   0.875   0.544   1.407   69576   360279  69.8    0.06861    11:84268259:C:T  -1      Inf     -0.551980198019802\n11      84268265        11:84268265:C:A A       C                               -1      Inf             0.1207  0.035   0.966   0.924   1.009   85934   401577  0       0.6535     11:84268265:C:A  -1      Inf     -1.54910714285714\n11      84268309        11:84268309:T:TC        TC      T                               -1      Inf             0.425   0.9995  0.833   0.532   1.305   69576   360279  0       0.5788                      11:84268309:T:TC      -1      Inf     0.79763986013986\n11      84268394        11:84268394:A:T T       A                               -1      Inf             0.8437  0.3875  1.002   0.985   1.018   85934   401577  2.1     0.4233     11:84268394:A:T  -1      Inf     -0.202380952380952\n11      84268618        11:84268618:C:T T       C                               -1      Inf             0.629   0.0104  1.02    0.941   1.106   85502   400497  0       0.497      11:84268618:C:T  -1      Inf     0.483091787439614\n11      84268622        11:84268622:C:T T       C                               -1      Inf             0.6138  0.0356  1.011   0.968   1.056   85934   401577  51.5    0.01969    11:84268622:C:T  -1      Inf     0.504504504504505\n```\n\n* `*.univariate_susie_rss.rds`\n```\n> str(readRDS(\"/restricted/projectnb/xqtl/xqtl_protocol/toy_xqtl_protocol/output/rss_analysis/univariate_rss/RSS_QC_RAISS_imputed.chr11_84267999_86714492.univariate_susie_rss.rds\"))\nList of 1\n $ chr11_84267999_86714492:List of 1\n  ..$ AD_Bellenguez_2022:List of 2\n  .. ..$ susie_rss_RSS_QC_RAISS_imputed:List of 6\n  .. .. ..$ variant_names       : chr [1:10779] \"11:84267999:T:C\" \"11:84268181:CTG:C\" \"11:84268207:T:C\" \"11:84268259:C:T\" ...\n  .. .. ..$ analysis_script     : chr \"library(pecotmr)\\nlibrary(dplyr)\\nlibrary(data.table)\\nskip_region = c(\\\"6:25000000-35000000\\\")\\nstudies = c(\\\"\"| __truncated__\n  .. .. ..$ sumstats            :List of 1\n  .. .. .. ..$ z: num [1:10779] 0.463 -0.487 0.475 -0.552 -1.549 ...\n  .. .. ..$ top_loci            :'data.frame':\t10 obs. of  6 variables:\n  .. .. .. ..$ variant_id      : chr [1:10] \"11:86139201:C:T\" \"11:86141937:G:A\" \"11:86144030:ATCT:A\" \"11:86144034:AC:A\" ...\n  .. .. .. ..$ z               : num [1:10] 12 12 11.6 11.6 11.6 ...\n  .. .. .. ..$ pip             : num [1:10] 0.2735 0.356 0.0339 0.0336 0.0339 ...\n  .. .. .. ..$ cs_coverage_0.95: num [1:10] 2 2 2 0 2 2 2 2 1 1\n  .. .. .. ..$ cs_coverage_0.7 : num [1:10] 2 2 0 0 0 0 0 2 1 1\n  .. .. .. ..$ cs_coverage_0.5 : num [1:10] 2 2 0 0 0 0 0 0 1 0\n  .. .. ..$ susie_result_trimmed:List of 9\n  .. .. .. ..$ pip           : num [1:10779] 1.63e-10 1.72e-10 1.64e-10 1.65e-10 2.45e-10 ...\n  .. .. .. ..$ sets          :List of 5\n  .. .. .. .. ..$ cs                :List of 2\n  .. .. .. .. .. ..$ L1: int [1:2] 7797 7798\n  .. .. .. .. .. ..$ L2: int [1:7] 7725 7735 7742 7744 7754 7760 7770\n  .. .. .. .. ..$ purity            :'data.frame':\t2 obs. of  3 variables:\n  .. .. .. .. .. ..$ min.abs.corr   : num [1:2] 0.994 0.954\n  .. .. .. .. .. ..$ mean.abs.corr  : num [1:2] 0.994 0.982\n  .. .. .. .. .. ..$ median.abs.corr: num [1:2] 0.994 0.99\n  .. .. .. .. ..$ cs_index          : int [1:2] 1 2\n  .. .. .. .. ..$ coverage          : num [1:2] 0.992 0.958\n  .. .. .. .. ..$ requested_coverage: num 0.95\n  .. .. .. ..$ cs_corr       : num [1:2, 1:2] 1 0.64 0.64 1\n  .. .. .. .. ..- attr(*, \"dimnames\")=List of 2\n  .. .. .. .. .. ..$ : chr [1:2] \"L1\" \"L2\"\n  .. .. .. .. .. ..$ : chr [1:2] \"L1\" \"L2\"\n  .. .. .. ..$ sets_secondary:List of 2\n  .. .. .. .. ..$ coverage_0.7:List of 2\n  .. .. .. .. .. ..$ sets   :List of 5\n  .. .. .. .. .. .. ..$ cs                :List of 2\n  .. .. .. .. .. .. .. ..$ L1: int [1:2] 7797 7798\n  .. .. .. .. .. .. .. ..$ L2: int [1:3] 7725 7735 7770\n  .. .. .. .. .. .. ..$ purity            :'data.frame':\t2 obs. of  3 variables:\n  .. .. .. .. .. .. .. ..$ min.abs.corr   : num [1:2] 0.994 0.954\n  .. .. .. .. .. .. .. ..$ mean.abs.corr  : num [1:2] 0.994 0.968\n  .. .. .. .. .. .. .. ..$ median.abs.corr: num [1:2] 0.994 0.96\n  .. .. .. .. .. .. ..$ cs_index          : int [1:2] 1 2\n  .. .. .. .. .. .. ..$ coverage          : num [1:2] 0.992 0.73\n  .. .. .. .. .. .. ..$ requested_coverage: num 0.7\n  .. .. .. .. .. ..$ cs_corr: num [1:2, 1:2] 1 0.64 0.64 1\n  .. .. .. .. .. .. ..- attr(*, \"dimnames\")=List of 2\n  .. .. .. .. .. .. .. ..$ : chr [1:2] \"L1\" \"L2\"\n  .. .. .. .. .. .. .. ..$ : chr [1:2] \"L1\" \"L2\"\n  .. .. .. .. ..$ coverage_0.5:List of 2\n  .. .. .. .. .. ..$ sets   :List of 5\n  .. .. .. .. .. .. ..$ cs                :List of 2\n  .. .. .. .. .. .. .. ..$ L1: int 7797\n  .. .. .. .. .. .. .. ..$ L2: int [1:2] 7725 7735\n  .. .. .. .. .. .. ..$ purity            :'data.frame':\t2 obs. of  3 variables:\n  .. .. .. .. .. .. .. ..$ min.abs.corr   : num [1:2] 1 0.96\n  .. .. .. .. .. .. .. ..$ mean.abs.corr  : num [1:2] 1 0.96\n  .. .. .. .. .. .. .. ..$ median.abs.corr: num [1:2] 1 0.96\n  .. .. .. .. .. .. ..$ cs_index          : int [1:2] 1 2\n  .. .. .. .. .. .. ..$ coverage          : num [1:2] 0.558 0.629\n  .. .. .. .. .. .. ..$ requested_coverage: num 0.5\n  .. .. .. .. .. ..$ cs_corr: num [1:2, 1:2] 1 0.64 0.64 1\n  .. .. .. .. .. .. ..- attr(*, \"dimnames\")=List of 2\n  .. .. .. .. .. .. .. ..$ : chr [1:2] \"L1\" \"L2\"\n  .. .. .. .. .. .. .. ..$ : chr [1:2] \"L1\" \"L2\"\n  .. .. .. ..$ alpha         : num [1:2, 1:10779] 2.29e-16 1.63e-10 2.09e-16 1.72e-10 2.07e-16 ...\n  .. .. .. .. ..- attr(*, \"dimnames\")=List of 2\n  .. .. .. .. .. ..$ : NULL\n  .. .. .. .. .. ..$ : chr [1:10779] \"11:84267999:T:C\" \"11:84268181:CTG:C\" \"11:84268207:T:C\" \"11:84268259:C:T\" ...\n  .. .. .. ..$ lbf_variable  : num [1:2, 1:10779] -1.97 -1.77 -2.06 -1.72 -2.07 ...\n  .. .. .. .. ..- attr(*, \"dimnames\")=List of 2\n  .. .. .. .. .. ..$ : NULL\n  .. .. .. .. .. ..$ : chr [1:10779] \"11:84267999:T:C\" \"11:84268181:CTG:C\" \"11:84268207:T:C\" \"11:84268259:C:T\" ...\n  .. .. .. ..$ V             : num [1:2] 8.98e-05 5.29e-05\n  .. .. .. ..$ niter         : int 13\n  .. .. .. ..$ max_L         : int 5\n  .. .. .. ..- attr(*, \"class\")= chr \"susie\"\n  .. .. ..$ outlier_number      : int 0\n  .. ..$ rss_data_analyzed             :'data.frame':\t10779 obs. of  26 variables:\n  .. .. ..$ chrom                  : chr [1:10779] \"11\" \"11\" \"11\" \"11\" ...\n  .. .. ..$ pos                    : int [1:10779] 84267999 84268181 84268207 84268259 84268265 84268309 84268394 84268618 84268622 84268650 ...\n  .. .. ..$ variant_id             : chr [1:10779] \"11:84267999:T:C\" \"11:84268181:CTG:C\" \"11:84268207:T:C\" \"11:84268259:C:T\" ...\n  .. .. ..$ A1                     : chr [1:10779] \"C\" \"C\" \"C\" \"T\" ...\n  .. .. ..$ A2                     : chr [1:10779] \"T\" \"CTG\" \"T\" \"C\" ...\n  .. .. ..$ Var.x                  : num [1:10779] 0.131 NA NA NA NA ...\n  .. .. ..$ raiss_ld_score.x       : num [1:10779] 35 NA NA NA NA ...\n  .. .. ..$ raiss_R2.x             : num [1:10779] 0.869 NA NA NA NA ...\n  .. .. ..$ Var.y                  : num [1:10779] NA -1 -1 -1 -1 -1 -1 -1 -1 -1 ...\n  .. .. ..$ raiss_ld_score.y       : num [1:10779] NA Inf Inf Inf Inf ...\n  .. .. ..$ raiss_R2.y             : num [1:10779] NA NA NA NA NA NA NA NA NA NA ...\n  .. .. ..$ pvalue                 : num [1:10779] NA 0.626 0.635 0.581 0.121 ...\n  .. .. ..$ effect_allele_frequency: num [1:10779] NA 0.9965 0.9641 0.0006 0.035 ...\n  .. .. ..$ odds_ratio             : num [1:10779] NA 1.039 0.99 0.875 0.966 ...\n  .. .. ..$ ci_lower               : num [1:10779] NA 0.891 0.948 0.544 0.924 0.532 0.985 0.941 0.968 0.982 ...\n  .. .. ..$ ci_upper               : num [1:10779] NA 1.21 1.03 1.41 1.01 ...\n  .. .. ..$ n_case                 : num [1:10779] NA 83196 85934 69576 85934 ...\n  .. .. ..$ n_control              : num [1:10779] NA 386481 401577 360279 401577 ...\n  .. .. ..$ het_isq                : num [1:10779] NA 0 52.6 69.8 0 0 2.1 0 51.5 0 ...\n  .. .. ..$ het_pvalue             : num [1:10779] NA 0.4727 0.0164 0.0686 0.6535 ...\n  .. .. ..$ beta                   : num [1:10779] NA NA NA NA NA NA NA NA NA NA ...\n  .. .. ..$ se                     : num [1:10779] NA NA NA NA NA NA NA NA NA NA ...\n  .. .. ..$ variants_id_original   : chr [1:10779] NA \"11:84268181:CTG:C\" \"11:84268207:T:C\" \"11:84268259:C:T\" ...\n  .. .. ..$ Var                    : num [1:10779] NA -1 -1 -1 -1 -1 -1 -1 -1 -1 ...\n  .. .. ..$ raiss_ld_score         : num [1:10779] NA Inf Inf Inf Inf ...\n  .. .. ..$ z                      : num [1:10779] 0.463 -0.487 0.475 -0.552 -1.549 ..\n  \n```  "
+    "- `{prefix}.sumstats.tsv.gz` \u2014 the harmonized, RAISS-imputed GWAS summary statistics (z-scores, effect sizes, allele frequencies and LD scores) used as the fine-mapping input.\n- `{name}.univariate_susie_rss.rds` \u2014 the SuSiE-RSS fine-mapping result computed from the summary statistics and the LD reference, one per region.\n\nThe summary-statistics table and fitted object look like:\n```\n> less RSS_QC_RAISS_imputed.chr11_84267999_86714492.AD_Bellenguez_2022.sumstats.tsv.gz | head\nchrom   pos     variant_id      A1      A2      Var.x   raiss_ld_score.x        raiss_R2.x      Var.y   raiss_ld_score.y        raiss_R2.y      pvalue  effect_allele_frequency odds_ratio  ci_lower        ci_upper        n_case  n_control       het_isq het_pvalue      beta    se      variants_id_original    Var     raiss_ld_score  z\n11      84267999        11:84267999:T:C C       T       0.131232630771865       35.033674731767 0.868767369228135                                                                  0.462675463930821\n11      84268181        11:84268181:CTG:C       C       CTG                             -1      Inf             0.6264  0.9965  1.039   0.891   1.211   83196   386481  0       0.4727                      11:84268181:CTG:C      -1      Inf     -0.486555697823303\n11      84268207        11:84268207:T:C C       T                               -1      Inf             0.6347  0.9641  0.99    0.948   1.033   85934   401577  52.6    0.0164     11:84268207:T:C  -1      Inf     0.475113122171946\n11      84268259        11:84268259:C:T T       C                               -1      Inf             0.5809  6e-04   0.875   0.544   1.407   69576   360279  69.8    0.06861    11:84268259:C:T  -1      Inf     -0.551980198019802\n11      84268265        11:84268265:C:A A       C                               -1      Inf             0.1207  0.035   0.966   0.924   1.009   85934   401577  0       0.6535     11:84268265:C:A  -1      Inf     -1.54910714285714\n11      84268309        11:84268309:T:TC        TC      T                               -1      Inf             0.425   0.9995  0.833   0.532   1.305   69576   360279  0       0.5788                      11:84268309:T:TC      -1      Inf     0.79763986013986\n11      84268394        11:84268394:A:T T       A                               -1      Inf             0.8437  0.3875  1.002   0.985   1.018   85934   401577  2.1     0.4233     11:84268394:A:T  -1      Inf     -0.202380952380952\n11      84268618        11:84268618:C:T T       C                               -1      Inf             0.629   0.0104  1.02    0.941   1.106   85502   400497  0       0.497      11:84268618:C:T  -1      Inf     0.483091787439614\n11      84268622        11:84268622:C:T T       C                               -1      Inf             0.6138  0.0356  1.011   0.968   1.056   85934   401577  51.5    0.01969    11:84268622:C:T  -1      Inf     0.504504504504505\n```\n\n* `*.univariate_susie_rss.rds`\n```\n> str(readRDS(\"/restricted/projectnb/xqtl/xqtl_protocol/toy_xqtl_protocol/output/rss_analysis/univariate_rss/RSS_QC_RAISS_imputed.chr11_84267999_86714492.univariate_susie_rss.rds\"))\nList of 1\n $ chr11_84267999_86714492:List of 1\n  ..$ AD_Bellenguez_2022:List of 2\n  .. ..$ susie_rss_RSS_QC_RAISS_imputed:List of 6\n  .. .. ..$ variant_names       : chr [1:10779] \"11:84267999:T:C\" \"11:84268181:CTG:C\" \"11:84268207:T:C\" \"11:84268259:C:T\" ...\n  .. .. ..$ analysis_script     : chr \"library(pecotmr)\\nlibrary(dplyr)\\nlibrary(data.table)\\nskip_region = c(\\\"6:25000000-35000000\\\")\\nstudies = c(\\\"\"| __truncated__\n  .. .. ..$ sumstats            :List of 1\n  .. .. .. ..$ z: num [1:10779] 0.463 -0.487 0.475 -0.552 -1.549 ...\n  .. .. ..$ top_loci            :'data.frame':\t10 obs. of  6 variables:\n  .. .. .. ..$ variant_id      : chr [1:10] \"11:86139201:C:T\" \"11:86141937:G:A\" \"11:86144030:ATCT:A\" \"11:86144034:AC:A\" ...\n  .. .. .. ..$ z               : num [1:10] 12 12 11.6 11.6 11.6 ...\n  .. .. .. ..$ pip             : num [1:10] 0.2735 0.356 0.0339 0.0336 0.0339 ...\n  .. .. .. ..$ cs_coverage_0.95: num [1:10] 2 2 2 0 2 2 2 2 1 1\n  .. .. .. ..$ cs_coverage_0.7 : num [1:10] 2 2 0 0 0 0 0 2 1 1\n  .. .. .. ..$ cs_coverage_0.5 : num [1:10] 2 2 0 0 0 0 0 0 1 0\n  .. .. ..$ susie_result_trimmed:List of 9\n  .. .. .. ..$ pip           : num [1:10779] 1.63e-10 1.72e-10 1.64e-10 1.65e-10 2.45e-10 ...\n  .. .. .. ..$ sets          :List of 5\n  .. .. .. .. ..$ cs                :List of 2\n  .. .. .. .. .. ..$ L1: int [1:2] 7797 7798\n  .. .. .. .. .. ..$ L2: int [1:7] 7725 7735 7742 7744 7754 7760 7770\n  .. .. .. .. ..$ purity            :'data.frame':\t2 obs. of  3 variables:\n  .. .. .. .. .. ..$ min.abs.corr   : num [1:2] 0.994 0.954\n  .. .. .. .. .. ..$ mean.abs.corr  : num [1:2] 0.994 0.982\n  .. .. .. .. .. ..$ median.abs.corr: num [1:2] 0.994 0.99\n  .. .. .. .. ..$ cs_index          : int [1:2] 1 2\n  .. .. .. .. ..$ coverage          : num [1:2] 0.992 0.958\n  .. .. .. .. ..$ requested_coverage: num 0.95\n  .. .. .. ..$ cs_corr       : num [1:2, 1:2] 1 0.64 0.64 1\n  .. .. .. .. ..- attr(*, \"dimnames\")=List of 2\n  .. .. .. .. .. ..$ : chr [1:2] \"L1\" \"L2\"\n  .. .. .. .. .. ..$ : chr [1:2] \"L1\" \"L2\"\n  .. .. .. ..$ sets_secondary:List of 2\n  .. .. .. .. ..$ coverage_0.7:List of 2\n  .. .. .. .. .. ..$ sets   :List of 5\n  .. .. .. .. .. .. ..$ cs                :List of 2\n  .. .. .. .. .. .. .. ..$ L1: int [1:2] 7797 7798\n  .. .. .. .. .. .. .. ..$ L2: int [1:3] 7725 7735 7770\n  .. .. .. .. .. .. ..$ purity            :'data.frame':\t2 obs. of  3 variables:\n  .. .. .. .. .. .. .. ..$ min.abs.corr   : num [1:2] 0.994 0.954\n  .. .. .. .. .. .. .. ..$ mean.abs.corr  : num [1:2] 0.994 0.968\n  .. .. .. .. .. .. .. ..$ median.abs.corr: num [1:2] 0.994 0.96\n  .. .. .. .. .. .. ..$ cs_index          : int [1:2] 1 2\n  .. .. .. .. .. .. ..$ coverage          : num [1:2] 0.992 0.73\n  .. .. .. .. .. .. ..$ requested_coverage: num 0.7\n  .. .. .. .. .. ..$ cs_corr: num [1:2, 1:2] 1 0.64 0.64 1\n  .. .. .. .. .. .. ..- attr(*, \"dimnames\")=List of 2\n  .. .. .. .. .. .. .. ..$ : chr [1:2] \"L1\" \"L2\"\n  .. .. .. .. .. .. .. ..$ : chr [1:2] \"L1\" \"L2\"\n  .. .. .. .. ..$ coverage_0.5:List of 2\n  .. .. .. .. .. ..$ sets   :List of 5\n  .. .. .. .. .. .. ..$ cs                :List of 2\n  .. .. .. .. .. .. .. ..$ L1: int 7797\n  .. .. .. .. .. .. .. ..$ L2: int [1:2] 7725 7735\n  .. .. .. .. .. .. ..$ purity            :'data.frame':\t2 obs. of  3 variables:\n  .. .. .. .. .. .. .. ..$ min.abs.corr   : num [1:2] 1 0.96\n  .. .. .. .. .. .. .. ..$ mean.abs.corr  : num [1:2] 1 0.96\n  .. .. .. .. .. .. .. ..$ median.abs.corr: num [1:2] 1 0.96\n  .. .. .. .. .. .. ..$ cs_index          : int [1:2] 1 2\n  .. .. .. .. .. .. ..$ coverage          : num [1:2] 0.558 0.629\n  .. .. .. .. .. .. ..$ requested_coverage: num 0.5\n  .. .. .. .. .. ..$ cs_corr: num [1:2, 1:2] 1 0.64 0.64 1\n  .. .. .. .. .. .. ..- attr(*, \"dimnames\")=List of 2\n  .. .. .. .. .. .. .. ..$ : chr [1:2] \"L1\" \"L2\"\n  .. .. .. .. .. .. .. ..$ : chr [1:2] \"L1\" \"L2\"\n  .. .. .. ..$ alpha         : num [1:2, 1:10779] 2.29e-16 1.63e-10 2.09e-16 1.72e-10 2.07e-16 ...\n  .. .. .. .. ..- attr(*, \"dimnames\")=List of 2\n  .. .. .. .. .. ..$ : NULL\n  .. .. .. .. .. ..$ : chr [1:10779] \"11:84267999:T:C\" \"11:84268181:CTG:C\" \"11:84268207:T:C\" \"11:84268259:C:T\" ...\n  .. .. .. ..$ lbf_variable  : num [1:2, 1:10779] -1.97 -1.77 -2.06 -1.72 -2.07 ...\n  .. .. .. .. ..- attr(*, \"dimnames\")=List of 2\n  .. .. .. .. .. ..$ : NULL\n  .. .. .. .. .. ..$ : chr [1:10779] \"11:84267999:T:C\" \"11:84268181:CTG:C\" \"11:84268207:T:C\" \"11:84268259:C:T\" ...\n  .. .. .. ..$ V             : num [1:2] 8.98e-05 5.29e-05\n  .. .. .. ..$ niter         : int 13\n  .. .. .. ..$ max_L         : int 5\n  .. .. .. ..- attr(*, \"class\")= chr \"susie\"\n  .. .. ..$ outlier_number      : int 0\n  .. ..$ rss_data_analyzed             :'data.frame':\t10779 obs. of  26 variables:\n  .. .. ..$ chrom                  : chr [1:10779] \"11\" \"11\" \"11\" \"11\" ...\n  .. .. ..$ pos                    : int [1:10779] 84267999 84268181 84268207 84268259 84268265 84268309 84268394 84268618 84268622 84268650 ...\n  .. .. ..$ variant_id             : chr [1:10779] \"11:84267999:T:C\" \"11:84268181:CTG:C\" \"11:84268207:T:C\" \"11:84268259:C:T\" ...\n  .. .. ..$ A1                     : chr [1:10779] \"C\" \"C\" \"C\" \"T\" ...\n  .. .. ..$ A2                     : chr [1:10779] \"T\" \"CTG\" \"T\" \"C\" ...\n  .. .. ..$ Var.x                  : num [1:10779] 0.131 NA NA NA NA ...\n  .. .. ..$ raiss_ld_score.x       : num [1:10779] 35 NA NA NA NA ...\n  .. .. ..$ raiss_R2.x             : num [1:10779] 0.869 NA NA NA NA ...\n  .. .. ..$ Var.y                  : num [1:10779] NA -1 -1 -1 -1 -1 -1 -1 -1 -1 ...\n  .. .. ..$ raiss_ld_score.y       : num [1:10779] NA Inf Inf Inf Inf ...\n  .. .. ..$ raiss_R2.y             : num [1:10779] NA NA NA NA NA NA NA NA NA NA ...\n  .. .. ..$ pvalue                 : num [1:10779] NA 0.626 0.635 0.581 0.121 ...\n  .. .. ..$ effect_allele_frequency: num [1:10779] NA 0.9965 0.9641 0.0006 0.035 ...\n  .. .. ..$ odds_ratio             : num [1:10779] NA 1.039 0.99 0.875 0.966 ...\n  .. .. ..$ ci_lower               : num [1:10779] NA 0.891 0.948 0.544 0.924 0.532 0.985 0.941 0.968 0.982 ...\n  .. .. ..$ ci_upper               : num [1:10779] NA 1.21 1.03 1.41 1.01 ...\n  .. .. ..$ n_case                 : num [1:10779] NA 83196 85934 69576 85934 ...\n  .. .. ..$ n_control              : num [1:10779] NA 386481 401577 360279 401577 ...\n  .. .. ..$ het_isq                : num [1:10779] NA 0 52.6 69.8 0 0 2.1 0 51.5 0 ...\n  .. .. ..$ het_pvalue             : num [1:10779] NA 0.4727 0.0164 0.0686 0.6535 ...\n  .. .. ..$ beta                   : num [1:10779] NA NA NA NA NA NA NA NA NA NA ...\n  .. .. ..$ se                     : num [1:10779] NA NA NA NA NA NA NA NA NA NA ...\n  .. .. ..$ variants_id_original   : chr [1:10779] NA \"11:84268181:CTG:C\" \"11:84268207:T:C\" \"11:84268259:C:T\" ...\n  .. .. ..$ Var                    : num [1:10779] NA -1 -1 -1 -1 -1 -1 -1 -1 -1 ...\n  .. .. ..$ raiss_ld_score         : num [1:10779] NA Inf Inf Inf Inf ...\n  .. .. ..$ z                      : num [1:10779] 0.463 -0.487 0.475 -0.552 -1.549 ..\n  \n```  "
    ]
   },
   {
@@ -79,15 +79,15 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "sos run pipeline/rss_analysis.ipynb univariate_rss \\\n",
-    "    --ld-meta-data data/ld_meta_file_with_bim.tsv \\\n",
-    "    --gwas-meta-data data/mnm_regression/gwas_meta_data.txt \\\n",
-    "    --qc_method \"rss_qc\" --impute \\\n",
-    "    --finemapping_method \"susie_rss\" \\\n",
+    "# Build per-(study, LD-block) GwasSumStats, run SuSiE-RSS fine-mapping,\n",
+    "# and render PIP plots end-to-end with the new pecotmr-integration pipeline:\n",
+    "sos run pipeline/gwas_rss_fine_mapping.ipynb \\\n",
+    "    generate_gwas_sumstats+gwas_fine_mapping+gwas_rss_plot \\\n",
     "    --cwd output/rss_analysis \\\n",
-    "    --skip_analysis_pip_cutoff 0 \\\n",
-    "    --skip_regions 6:25000000-35000000 \\\n",
-    "    --region_name 22:49355984-50799822"
+    "    --gwas-meta data/mnm_regression/gwas_meta_data.txt \\\n",
+    "    --regions chr22:49355984-50799822 \\\n",
+    "    --ld-meta data/ld_meta_file_with_bim.tsv \\\n",
+    "    --qc-method slalom --impute"
    ]
   },
   {
diff --git a/code/SoS/mnm_analysis/univariate_fine_mapping_fsusie_vignette.ipynb b/code/SoS/mnm_analysis/univariate_fine_mapping_fsusie_vignette.ipynb
index 33da62df..5a05681e 100644
--- a/code/SoS/mnm_analysis/univariate_fine_mapping_fsusie_vignette.ipynb
+++ b/code/SoS/mnm_analysis/univariate_fine_mapping_fsusie_vignette.ipynb
@@ -73,8 +73,8 @@
    "id": "436a39bb-b182-49c0-9ec9-c4b899c8b684",
    "metadata": {},
    "source": [
-    "- `{name}.chr{N}_{start}_{end}.{n}_marks.dataset.rds` — the per-region functional dataset (the epigenomic marks, residualized phenotype and genotype matrices) prepared for fSuSiE.\n",
-    "- `{name}.chr{N}_..._top_pc_weights.rds` — the fitted fSuSiE result with top-PC TWAS weights, fine-mapping the functional (epigenomic) signal in that region.\n",
+    "- `{name}.chr{N}_{start}_{end}.{n}_marks.dataset.rds` \u2014 the per-region functional dataset (the epigenomic marks, residualized phenotype and genotype matrices) prepared for fSuSiE.\n",
+    "- `{name}.chr{N}_..._top_pc_weights.rds` \u2014 the fitted fSuSiE result with top-PC TWAS weights, fine-mapping the functional (epigenomic) signal in that region.\n",
     "\n",
     "Inspecting either object shows its internal structure:\n",
     "```\n",
@@ -113,7 +113,7 @@
     "  .. .. ..- attr(*, \"names\")= chr [1:15857] \"chr7:139302775:AACACACACAC:AACACACACACAC\" \"chr7:139302775:AACACACACAC:AACACACACACACAC\" \"chr7:139304706:G:GGT\" \"chr7:139305695:G:T\" ...\n",
     "  ..$ grange           : chr [1:2] \"139293693\" \"145380632\"\n",
     "  ..$ Y_coordinates    :List of 1\n",
-    "  .. ..$ :Classes ‘tbl_df’, ‘tbl’ and 'data.frame':\t166 obs. of  3 variables:\n",
+    "  .. ..$ :Classes \u2018tbl_df\u2019, \u2018tbl\u2019 and 'data.frame':\t166 obs. of  3 variables:\n",
     "  .. .. ..$ #chr : chr [1:166] \"chr7\" \"chr7\" \"chr7\" \"chr7\" ...\n",
     "  .. .. ..$ start: num [1:166] 1.39e+08 1.39e+08 1.39e+08 1.39e+08 1.39e+08 ...\n",
     "  .. .. ..$ end  : num [1:166] 1.39e+08 1.39e+08 1.39e+08 1.39e+08 1.39e+08 ...\n",
@@ -146,7 +146,7 @@
     "  .. .. ..- attr(*, \"names\")= chr [1:5] \"user.self\" \"sys.self\" \"elapsed\" \"user.child\" ...\n",
     "  .. ..$ fsusie_result             :List of 34\n",
     "  .. .. ..- attr(*, \"class\")= chr \"susiF\"\n",
-    "  .. ..$ Y_coordinates             :Classes ‘tbl_df’, ‘tbl’ and 'data.frame':\t166 obs. of  3 variables:\n",
+    "  .. ..$ Y_coordinates             :Classes \u2018tbl_df\u2019, \u2018tbl\u2019 and 'data.frame':\t166 obs. of  3 variables:\n",
     "  .. ..$ fsusie_summary            :List of 5\n",
     "  .. ..$ region_info               :List of 3\n",
     "```"
@@ -175,17 +175,19 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "sos run pipeline/mnm_regression.ipynb fsusie \\\n",
-    "    --cwd output/fsusie/ \\\n",
-    "    --name   Mic  \\\n",
-    "    --genoFile data/fsusie/mwe.genotype_by_chrom_files.txt \\\n",
-    "    --phenoFile data/fsusie/mwe.pheno.region_list \\\n",
-    "    --covFile   data/fsusie/mwe.chr7_139293693_145380632.Marchenko_PC.anon.gz \\\n",
-    "    --cis-window 0 --max-cv-variants 5000 \\\n",
-    "    --susie_top_pc 0 --phenotype-names ROSMAP_Mic_snATACQTL --maf 0.01 \\\n",
-    "    --save-data \\\n",
-    "    --numThreads 8 \\\n",
-    "    --post_processing \"none\" --small-sample-correction "
+    "# Step 1: build the QtlDataset RDS for the functional phenotype.\n",
+    "sos run pipeline/qtl_dataset.ipynb qtl_dataset_construct \\\n",
+    "    --cwd output/qtl_dataset \\\n",
+    "    --study Mic \\\n",
+    "    --genotype-prefix data/fsusie/mwe.chr7 \\\n",
+    "    --phenotype-manifest data/fsusie/mwe.pheno_manifest.tsv\n",
+    "\n",
+    "# Step 2: per-region fSuSiE joint fine-mapping over the bins in the LD block.\n",
+    "sos run pipeline/functional_fine_mapping.ipynb functional_fine_mapping \\\n",
+    "    --cwd output/fsusie \\\n",
+    "    --study Mic \\\n",
+    "    --qtl-dataset output/qtl_dataset/Mic.qtl_dataset.rds \\\n",
+    "    --regions chr7:139293693-145380632"
    ]
   },
   {
@@ -256,4 +258,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 5
-}
+}
\ No newline at end of file
diff --git a/code/SoS/mnm_analysis/univariate_fine_mapping_twas_vignette.ipynb b/code/SoS/mnm_analysis/univariate_fine_mapping_twas_vignette.ipynb
index cf0118e0..6072cd50 100644
--- a/code/SoS/mnm_analysis/univariate_fine_mapping_twas_vignette.ipynb
+++ b/code/SoS/mnm_analysis/univariate_fine_mapping_twas_vignette.ipynb
@@ -17,7 +17,7 @@
     "\n",
     "This pipeline performs per-region univariate fine-mapping with SuSiE and computes TWAS\n",
     "weights, via the `susie_twas` workflow of `pipeline/mnm_regression.ipynb`. For each\n",
-    "region the workflow fits SuSiE on the covariate-residualized phenotype × genotype\n",
+    "region the workflow fits SuSiE on the covariate-residualized phenotype \u00d7 genotype\n",
     "matrix and then cross-validates five prediction models (**enet**, **lasso**,\n",
     "**bayes_r**, **mrash**, **susie**) to pick TWAS weights.\n",
     "\n",
@@ -32,10 +32,10 @@
     "`start/end`). One `susie_twas` call fits one SuSiE + TWAS model per region.\n",
     "\n",
     "* **Gene-based phenotypes.** Manifest row = one gene; window = gene's containing TAD (or\n",
-    "  ± cis-window around the TSS).\n",
+    "  \u00b1 cis-window around the TSS).\n",
     "* **Peak-based phenotypes (caQTL, mQTL).** Manifest row = one peak/CpG; window = the\n",
     "  peak's containing [extended-TAD](./../reference_data/generalized_TADB.html). A peak is\n",
-    "  2–5 kb and cannot serve as its own window.\n"
+    "  2\u20135 kb and cannot serve as its own window.\n"
    ]
   },
   {
@@ -53,9 +53,9 @@
     "\n",
     "The pipeline expects two derived files per experiment:\n",
     "\n",
-    "* `pheno_manifest.tsv` — 5 columns (`#chr start end ID path`), one row per region.\n",
+    "* `pheno_manifest.tsv` \u2014 5 columns (`#chr start end ID path`), one row per region.\n",
     "  `path` points to a bgzipped+tabixed phenotype BED restricted to that region.\n",
-    "* `association_windows.bed` — 4 columns (`chr start end ID`), one row per region.\n",
+    "* `association_windows.bed` \u2014 4 columns (`chr start end ID`), one row per region.\n",
     "  `start/end` define the cis search window; `ID` must match a row in the manifest.\n",
     "\n",
     "For gene-based phenotypes these files are produced by the standard phenotype\n",
@@ -99,18 +99,27 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "sos run pipeline/mnm_regression.ipynb susie_twas \\\n",
-    "    --name        test_susie_twas \\\n",
-    "    --cwd         output/mnm_regression/susie_twas \\\n",
-    "    --genoFile    output/genotype_by_chrom/wgs.merged.plink_qc.1.bed \\\n",
-    "    --phenoFile   output/phenotype/phenotype_by_chrom_for_cis/bulk_rnaseq.phenotype_by_chrom_files.region_list.txt \\\n",
-    "    --covFile     output/covariate/bulk_rnaseq_tmp_matrix.low_expression_filtered.outlier_removed.tmm.expression.covariates.wgs.merged.plink_qc.plink_qc.prune.pca.Marchenko_PC.gz \\\n",
-    "    --customized-association-windows reference_data/TAD/TADB_enhanced_cis.bed \\\n",
-    "    --phenotype-names test_pheno \\\n",
-    "    --max-cv-variants 5000 \\\n",
-    "    --ld_reference_meta_file data/ld_meta_file_with_bim.tsv \\\n",
-    "    --region-name ENSG00000049246 ENSG00000054116 ENSG00000116678 \\\n",
-    "    --save-data\n"
+    "# Step 1: build the QtlDataset RDS.\n",
+    "sos run pipeline/qtl_dataset.ipynb qtl_dataset_construct \\\n",
+    "    --cwd output/qtl_dataset \\\n",
+    "    --study test_susie_twas \\\n",
+    "    --genotype-prefix output/genotype_by_chrom/wgs.merged.plink_qc.chr1 \\\n",
+    "    --phenotype-manifest data/test_susie_twas.pheno_manifest.tsv\n",
+    "\n",
+    "# Step 2: per-region univariate SuSiE fine-mapping.\n",
+    "sos run pipeline/fine_mapping.ipynb fine_mapping \\\n",
+    "    --cwd output/fine_mapping \\\n",
+    "    --study test_susie_twas \\\n",
+    "    --qtl-dataset output/qtl_dataset/test_susie_twas.qtl_dataset.rds \\\n",
+    "    --regions chr1:1000000-2000000\n",
+    "\n",
+    "# Step 3: TWAS weights (default per-method panel + SR-TWAS ensemble inline).\n",
+    "sos run pipeline/twas_weights.ipynb twas_weights \\\n",
+    "    --cwd output/twas_weights \\\n",
+    "    --study test_susie_twas \\\n",
+    "    --qtl-dataset output/qtl_dataset/test_susie_twas.qtl_dataset.rds \\\n",
+    "    --regions chr1:1000000-2000000 \\\n",
+    "    --fine-mapping-result output/fine_mapping/test_susie_twas.chr1_1000000_2000000.finemap.rds"
    ]
   },
   {
@@ -126,16 +135,24 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "sos run pipeline/mnm_regression.ipynb susie_twas \\\n",
-    "    --name        example_10peaks_twas \\\n",
-    "    --cwd         output/susie_twas_out \\\n",
-    "    --genoFile    input/example.chr22.bed \\\n",
-    "    --phenoFile   output/inputs_ready/pheno_manifest.tsv \\\n",
-    "    --covFile     input/example_covariates.tsv \\\n",
-    "    --customized-association-windows output/inputs_ready/association_windows.bed \\\n",
-    "    --phenotype-names caQTL \\\n",
-    "    --max-cv-variants 5000 \\\n",
-    "    --save-data\n"
+    "# Smaller example: same chain, different data.\n",
+    "sos run pipeline/qtl_dataset.ipynb qtl_dataset_construct \\\n",
+    "    --cwd output/qtl_dataset \\\n",
+    "    --study example_10peaks_twas \\\n",
+    "    --genotype-prefix input/example.chr22 \\\n",
+    "    --phenotype-manifest output/inputs_ready/pheno_manifest.tsv\n",
+    "\n",
+    "sos run pipeline/fine_mapping.ipynb fine_mapping \\\n",
+    "    --cwd output/fine_mapping \\\n",
+    "    --study example_10peaks_twas \\\n",
+    "    --qtl-dataset output/qtl_dataset/example_10peaks_twas.qtl_dataset.rds \\\n",
+    "    --regions chr22:15000000-17000000\n",
+    "\n",
+    "sos run pipeline/twas_weights.ipynb twas_weights \\\n",
+    "    --cwd output/twas_weights \\\n",
+    "    --study example_10peaks_twas \\\n",
+    "    --qtl-dataset output/qtl_dataset/example_10peaks_twas.qtl_dataset.rds \\\n",
+    "    --regions chr22:15000000-17000000"
    ]
   },
   {
@@ -155,7 +172,7 @@
     "...\n",
     "INFO: susie_twas output:   <cwd>/fine_mapping/<name>.<chrom>_<ID>.univariate_bvsr.rds\n",
     "                           <cwd>/twas_weights/<name>.<chrom>_<ID>.univariate_twas_weights.rds\n",
-    "                           ...  (2 × N_regions items)\n",
+    "                           ...  (2 \u00d7 N_regions items)\n",
     "INFO: Workflow susie_twas is executed successfully.\n",
     "```\n"
    ]
@@ -173,7 +190,8 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "sos run pipeline/mnm_regression.ipynb susie_twas -h\n"
+    "sos run pipeline/fine_mapping.ipynb -h\n",
+    "sos run pipeline/twas_weights.ipynb -h"
    ]
   },
   {
@@ -182,29 +200,29 @@
    "source": [
     "Flags consumed by `susie_twas`:\n",
     "\n",
-    "* `--genoFile` — path to a PLINK1 `.bed` file (pass the `.bed` path, not the prefix).\n",
+    "* `--genoFile` \u2014 path to a PLINK1 `.bed` file (pass the `.bed` path, not the prefix).\n",
     "  FID/IID in the `.fam` must match the phenotype/covariate sample IDs.\n",
-    "* `--phenoFile` — 5-column `#chr start end ID path` manifest. One row per region; `path`\n",
+    "* `--phenoFile` \u2014 5-column `#chr start end ID path` manifest. One row per region; `path`\n",
     "  points at a bgzipped+tabixed phenotype BED restricted to that region. Pass multiple\n",
     "  files for multi-condition analyses.\n",
-    "* `--covFile` — tab-delimited, literal `ID` header in column 1; covariate names in rows,\n",
+    "* `--covFile` \u2014 tab-delimited, literal `ID` header in column 1; covariate names in rows,\n",
     "  sample IDs across the header row.\n",
-    "* `--customized-association-windows` — 4-column `chr start end ID` BED. Required unless\n",
+    "* `--customized-association-windows` \u2014 4-column `chr start end ID` BED. Required unless\n",
     "  `--cis-window` is non-negative.\n",
-    "* `--cis-window` — symmetric cis radius (bp) around each region's `start/end`. Default\n",
+    "* `--cis-window` \u2014 symmetric cis radius (bp) around each region's `start/end`. Default\n",
     "  `-1` (forces the customized-windows path).\n",
-    "* `--phenotype-names` — condition name(s), one per `--phenoFile`.\n",
-    "* `--max-cv-variants` — cap on variants used in cross-validation (default 5,000).\n",
-    "* `--ld_reference_meta_file` — summary-stats / RSS mode only; omit for individual-level.\n",
-    "* `--region-name` — restrict to one or more region IDs for sanity checks.\n",
-    "* `--save-data` — also emit `*.univariate_data.rds` with the residualized X/Y matrices.\n"
+    "* `--phenotype-names` \u2014 condition name(s), one per `--phenoFile`.\n",
+    "* `--max-cv-variants` \u2014 cap on variants used in cross-validation (default 5,000).\n",
+    "* `--ld_reference_meta_file` \u2014 summary-stats / RSS mode only; omit for individual-level.\n",
+    "* `--region-name` \u2014 restrict to one or more region IDs for sanity checks.\n",
+    "* `--save-data` \u2014 also emit `*.univariate_data.rds` with the residualized X/Y matrices.\n"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Output format\n\nAll results are written under `--cwd` (e.g. `output/mnm_regression/susie_twas/twas_weights/`), one set of files per region:\n\n- `{name}.{chr}_{region_id}.univariate_twas_weights.rds` — the fitted SuSiE fine-mapping result plus the trained TWAS weights for that region/gene; this is the main output consumed by downstream TWAS.\n- a companion fine-mapping RDS keyed by region `ID` and phenotype condition (and a third data RDS when `--save-data` is set), holding the per-region SuSiE posterior summaries and credible sets.\n"
+    "## Output format\n\nAll results are written under `--cwd` (e.g. `output/mnm_regression/susie_twas/twas_weights/`), one set of files per region:\n\n- `{name}.{chr}_{region_id}.univariate_twas_weights.rds` \u2014 the fitted SuSiE fine-mapping result plus the trained TWAS weights for that region/gene; this is the main output consumed by downstream TWAS.\n- a companion fine-mapping RDS keyed by region `ID` and phenotype condition (and a third data RDS when `--save-data` is set), holding the per-region SuSiE posterior summaries and credible sets.\n"
    ]
   },
   {
@@ -213,7 +231,7 @@
    "source": [
     "## Anticipated Results\n",
     "\n",
-    "Quick sanity check — inspect cross-validation performance for the first region:"
+    "Quick sanity check \u2014 inspect cross-validation performance for the first region:"
    ]
   },
   {
@@ -222,12 +240,14 @@
    "metadata": {},
    "outputs": [],
    "source": [
+    "# Inspect the new S4 TwasWeights collection via the accessor surface:\n",
     "Rscript -e '\n",
-    "fp <- \"output/mnm_regression/susie_twas/twas_weights/test_susie_twas.chr1_ENSG00000049246.univariate_twas_weights.rds\"\n",
-    "x  <- readRDS(fp)\n",
-    "cat(\"methods:\", names(x[[1]][[1]]$twas_weights), \"\\n\")\n",
-    "print(x[[1]][[1]]$twas_cv_result$performance)\n",
-    "'\n"
+    "suppressPackageStartupMessages(library(pecotmr))\n",
+    "tw <- readRDS(\"output/twas_weights/test_susie_twas.chr1_1000000_2000000.twas_weights.rds\")\n",
+    "cat(\"methods present:\", paste(unique(as.character(tw$method)), collapse = \", \"), \"\\n\")\n",
+    "cat(\"rows:\", nrow(tw), \"\\n\")\n",
+    "print(getR2(tw))   # per-method CV R^2 for this region\n",
+    "'"
    ]
   },
   {
@@ -236,16 +256,16 @@
    "source": [
     "Interpretation:\n",
     "\n",
-    "* `twas_weights$<method>` — use for TWAS (`Z ≈ weights · LD · GWAS z-scores`).\n",
-    "* `twas_cv_result$performance` — pick the method with highest `adj_rsq` before applying.\n",
-    "* `top_loci` in `univariate_bvsr.rds` — lead variants and credible-set coverages.\n",
-    "* `susie_result_trimmed$sets$cs` — indices of variants in each 95% credible set.\n",
+    "* `twas_weights$<method>` \u2014 use for TWAS (`Z \u2248 weights \u00b7 LD \u00b7 GWAS z-scores`).\n",
+    "* `twas_cv_result$performance` \u2014 pick the method with highest `adj_rsq` before applying.\n",
+    "* `top_loci` in `univariate_bvsr.rds` \u2014 lead variants and credible-set coverages.\n",
+    "* `susie_result_trimmed$sets$cs` \u2014 indices of variants in each 95% credible set.\n",
     "\n",
     "### Common pitfalls\n",
     "\n",
     "* **Sample-ID drift.** The phenotype BED's header columns (5+), the bfile FID/IID, and\n",
     "  the covariate TSV header must all be drawn from the same set. Sos does not warn on\n",
-    "  missing samples — it silently drops them and may report zero overlap.\n",
+    "  missing samples \u2014 it silently drops them and may report zero overlap.\n",
     "* **Empty regions.** If a region has no variants in its association window (e.g. the\n",
     "  manifest points at a window with no bfile coverage) sos marks it completed with an\n",
     "  empty output. Verify the expected file count in the `INFO: susie_twas output:` line.\n",
diff --git a/code/SoS/pecotmr_integration/colocboost.ipynb b/code/SoS/pecotmr_integration/colocboost.ipynb
index f0e2e4b0..c66822cf 100644
--- a/code/SoS/pecotmr_integration/colocboost.ipynb
+++ b/code/SoS/pecotmr_integration/colocboost.ipynb
@@ -37,7 +37,27 @@
    },
    "outputs": [],
    "source": [
-    "[colocboost]\nif bool(genes) == bool(regions):\n    raise ValueError(\"Specify exactly one of --genes (trait IDs) or --regions (chr:start-end strings).\")\nif (joint_gwas or separate_gwas) and not gwas_sumstats.is_file():\n    raise ValueError(\"--joint-gwas / --separate-gwas require --gwas-sumstats pointing at a QC'd GwasSumStats RDS.\")\nif not (xqtl_coloc or joint_gwas or separate_gwas):\n    raise ValueError(\"Enable at least one of xqtl_coloc, joint_gwas, separate_gwas.\")\nfanout_items = genes if genes else regions\nfanout_kind  = 'gene' if genes else 'region'\ndef _fanout_safe(s):\n    return s.replace(':', '_').replace('-', '_')\ninput: qtl_dataset, for_each = 'fanout_items'\noutput: f\"{cwd}/{study}.{_fanout_safe(_fanout_items)}.colocboost.rds\"\ntask: trunk_workers = 1, trunk_size = job_size, walltime = walltime, mem = mem, cores = numThreads, tags = f\"{step_name}_{_output:bn}\"\nbash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n    Rscript ${modular_script_dir}/pecotmr_integration/colocboost.R \\\n        --qtl-dataset ${_input} \\\n        ${('--gene-id ' + _fanout_items + ' --cis-window ' + str(cis_window)) if fanout_kind == 'gene' else ('--region ' + _fanout_items)} \\\n        ${'--gwas-sumstats ' + str(gwas_sumstats) if gwas_sumstats.is_file() else ''} \\\n        ${'--no-xqtl-coloc' if not xqtl_coloc else ''} \\\n        ${'--joint-gwas' if joint_gwas else ''} \\\n        ${'--separate-gwas' if separate_gwas else ''} \\\n        --output ${_output}\n"
+    "[colocboost]\n",
+    "if bool(genes) == bool(regions):\n",
+    "    raise ValueError(\"Specify exactly one of --genes (trait IDs) or --regions (chr:start-end strings).\")\n",
+    "if (joint_gwas or separate_gwas) and not gwas_sumstats.is_file():\n",
+    "    raise ValueError(\"--joint-gwas / --separate-gwas require --gwas-sumstats pointing at a QC'd GwasSumStats RDS.\")\n",
+    "if not (xqtl_coloc or joint_gwas or separate_gwas):\n",
+    "    raise ValueError(\"Enable at least one of xqtl_coloc, joint_gwas, separate_gwas.\")\n",
+    "fanout_items = genes if genes else regions\n",
+    "fanout_kind  = 'gene' if genes else 'region'\n",
+    "input: qtl_dataset, for_each = 'fanout_items'\n",
+    "output: f\"{cwd}/{study}.{_fanout_items.replace(':', '_').replace('-', '_')}.colocboost.rds\"\n",
+    "task: trunk_workers = 1, trunk_size = job_size, walltime = walltime, mem = mem, cores = numThreads, tags = f\"{step_name}_{_output:bn}\"\n",
+    "bash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n",
+    "    Rscript ${modular_script_dir}/pecotmr_integration/colocboost.R \\\n",
+    "        --qtl-dataset ${_input} \\\n",
+    "        ${('--gene-id ' + _fanout_items + ' --cis-window ' + str(cis_window)) if fanout_kind == 'gene' else ('--region ' + _fanout_items)} \\\n",
+    "        ${'--gwas-sumstats ' + str(gwas_sumstats) if gwas_sumstats.is_file() else ''} \\\n",
+    "        ${'--no-xqtl-coloc' if not xqtl_coloc else ''} \\\n",
+    "        ${'--joint-gwas' if joint_gwas else ''} \\\n",
+    "        ${'--separate-gwas' if separate_gwas else ''} \\\n",
+    "        --output ${_output}\n"
    ]
   }
  ],
diff --git a/code/SoS/pecotmr_integration/fine_mapping.ipynb b/code/SoS/pecotmr_integration/fine_mapping.ipynb
index 31f36f25..9c25c57d 100644
--- a/code/SoS/pecotmr_integration/fine_mapping.ipynb
+++ b/code/SoS/pecotmr_integration/fine_mapping.ipynb
@@ -5,7 +5,7 @@
    "metadata": {
     "kernel": "SoS"
    },
-   "source": "# SuSiE fine-mapping (QTL or GWAS)\n\n## Description\n\nTwo workflows, both backed by `pecotmr::fineMappingPipeline` (which dispatches on the S4 input class):\n\n- **`fine_mapping`** &mdash; per-gene / per-region fine-mapping over a pre-built `QtlDataset` RDS. Fan-out goes by `--genes` (list of trait IDs, gene mode) **or** `--regions` (list of `chr:start-end` strings, region mode); exactly one must be supplied. Each task loads the same QtlDataset RDS and fine-maps one unit.\n- **`fine_mapping_gwas`** &mdash; per-block fine-mapping over a list of per-block `GwasSumStats` RDS paths (typically produced by `gwas_sumstats_construct.R` / `gwas_sumstats.ipynb`). One task per RDS; each task produces one `GwasFineMappingResult` RDS. Downstream consumers (e.g. `enloc.ipynb`) concatenate the per-block outputs as needed.\n\n## Inputs\n\n### `fine_mapping` (QTL)\n\n- `--qtl-dataset` &mdash; path to the QtlDataset RDS produced by `qtl_dataset.ipynb`.\n- `--genes ID1 ID2 …` **or** `--regions chr:start-end …` &mdash; fan-out targets (mutually exclusive).\n- `--cis-window` &mdash; bp window around each gene's TSS (gene mode only). Default 1,000,000.\n- `--methods` &mdash; comma-separated fine-mapping method tokens. Default `susie`.\n- `--coverage` &mdash; SuSiE credible-set coverage. Default 0.95.\n\n### `fine_mapping_gwas` (GWAS)\n\n- `--gwas-sumstats-list S1.rds S2.rds …` &mdash; per-block GwasSumStats RDS paths to fan out over.\n- `--methods` &mdash; comma-separated fine-mapping method tokens. Default `susie`.\n- `--coverage` &mdash; SuSiE credible-set coverage. Default 0.95.\n\n### Both\n\n- `--cwd` &mdash; output directory. Default `output`.\n- `--study` &mdash; study label used in the output filename.\n- `--modular-script-dir` &mdash; directory containing the per-task worker R scripts. Default `code/script`; override when SoS is invoked from a working directory that doesn't contain them.\n\n## Outputs\n\n- `fine_mapping`: `{cwd}/{study}.{gene|region}.finemap.rds` per fan-out unit (region strings are sanitised: `:` and `-` become `_`).\n- `fine_mapping_gwas`: `{cwd}/{study}.{bn-of-gwas-sumstats-rds}.gwas_finemap.rds` per input RDS.\n"
+   "source": "# SuSiE fine-mapping (QTL or GWAS)\n\n## Description\n\nTwo workflows, both backed by `pecotmr::fineMappingPipeline` (which dispatches on the S4 input class):\n\n- **`fine_mapping`** &mdash; per-gene / per-region fine-mapping over a pre-built `QtlDataset` RDS. Fan-out goes by `--genes` (list of trait IDs, gene mode) **or** `--regions` (list of `chr:start-end` strings, region mode); exactly one must be supplied. Each task loads the same QtlDataset RDS and fine-maps one unit.\n- **`fine_mapping_gwas`** &mdash; per-block fine-mapping over a list of per-block `GwasSumStats` RDS paths (typically produced by `gwas_sumstats_construct.R` / `gwas_sumstats.ipynb`). One task per RDS; each task produces one `GwasFineMappingResult` RDS. Downstream consumers (e.g. `enloc.ipynb`) concatenate the per-block outputs as needed.\n\n## Inputs\n\n### `fine_mapping` (QTL)\n\n- `--qtl-dataset` &mdash; path to the QtlDataset RDS produced by `qtl_dataset.ipynb`.\n- `--genes ID1 ID2 \u2026` **or** `--regions chr:start-end \u2026` &mdash; fan-out targets (mutually exclusive).\n- `--cis-window` &mdash; bp window around each gene's TSS (gene mode only). Default 1,000,000.\n- `--methods` &mdash; comma-separated fine-mapping method tokens. Default `susie`.\n- `--coverage` &mdash; SuSiE credible-set coverage. Default 0.95.\n\n### `fine_mapping_gwas` (GWAS)\n\n- `--gwas-sumstats-list S1.rds S2.rds \u2026` &mdash; per-block GwasSumStats RDS paths to fan out over.\n- `--methods` &mdash; comma-separated fine-mapping method tokens. Default `susie`.\n- `--coverage` &mdash; SuSiE credible-set coverage. Default 0.95.\n\n### Both\n\n- `--cwd` &mdash; output directory. Default `output`.\n- `--study` &mdash; study label used in the output filename.\n- `--modular-script-dir` &mdash; directory containing the per-task worker R scripts. Default `code/script`; override when SoS is invoked from a working directory that doesn't contain them.\n\n## Outputs\n\n- `fine_mapping`: `{cwd}/{study}.{gene|region}.finemap.rds` per fan-out unit (region strings are sanitised: `:` and `-` become `_`).\n- `fine_mapping_gwas`: `{cwd}/{study}.{bn-of-gwas-sumstats-rds}.gwas_finemap.rds` per input RDS.\n"
   },
   {
    "cell_type": "markdown",
@@ -30,7 +30,25 @@
     "kernel": "SoS"
    },
    "outputs": [],
-   "source": "[fine_mapping]\nif not qtl_dataset.is_file():\n    raise ValueError(\"fine_mapping requires --qtl-dataset to point at an existing QtlDataset RDS.\")\nif bool(genes) == bool(regions):\n    raise ValueError(\"Specify exactly one of --genes (trait IDs) or --regions (chr:start-end strings).\")\nfanout_items = genes if genes else regions\nfanout_kind  = 'gene' if genes else 'region'\ndef _fanout_safe(s):\n    return s.replace(':', '_').replace('-', '_')\ninput: qtl_dataset, for_each = 'fanout_items'\noutput: f\"{cwd}/{study}.{_fanout_safe(_fanout_items)}.finemap.rds\"\ntask: trunk_workers = 1, trunk_size = job_size, walltime = walltime, mem = mem, cores = numThreads, tags = f\"{step_name}_{_output:bn}\"\nbash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n    Rscript ${modular_script_dir}/pecotmr_integration/fine_mapping.R \\\n        --qtl-dataset ${_input} \\\n        ${('--gene-id ' + _fanout_items + ' --cis-window ' + str(cis_window)) if fanout_kind == 'gene' else ('--region ' + _fanout_items)} \\\n        --methods ${methods} \\\n        --coverage ${coverage} \\\n        --output ${_output}\n"
+   "source": [
+    "[fine_mapping]\n",
+    "if not qtl_dataset.is_file():\n",
+    "    raise ValueError(\"fine_mapping requires --qtl-dataset to point at an existing QtlDataset RDS.\")\n",
+    "if bool(genes) == bool(regions):\n",
+    "    raise ValueError(\"Specify exactly one of --genes (trait IDs) or --regions (chr:start-end strings).\")\n",
+    "fanout_items = genes if genes else regions\n",
+    "fanout_kind  = 'gene' if genes else 'region'\n",
+    "input: qtl_dataset, for_each = 'fanout_items'\n",
+    "output: f\"{cwd}/{study}.{_fanout_items.replace(':', '_').replace('-', '_')}.finemap.rds\"\n",
+    "task: trunk_workers = 1, trunk_size = job_size, walltime = walltime, mem = mem, cores = numThreads, tags = f\"{step_name}_{_output:bn}\"\n",
+    "bash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n",
+    "    Rscript ${modular_script_dir}/pecotmr_integration/fine_mapping.R \\\n",
+    "        --qtl-dataset ${_input} \\\n",
+    "        ${('--gene-id ' + _fanout_items + ' --cis-window ' + str(cis_window)) if fanout_kind == 'gene' else ('--region ' + _fanout_items)} \\\n",
+    "        --methods ${methods} \\\n",
+    "        --coverage ${coverage} \\\n",
+    "        --output ${_output}\n"
+   ]
   },
   {
    "cell_type": "code",
diff --git a/code/SoS/pecotmr_integration/fine_mapping_postprocessing.ipynb b/code/SoS/pecotmr_integration/fine_mapping_postprocessing.ipynb
new file mode 100644
index 00000000..a628ad5b
--- /dev/null
+++ b/code/SoS/pecotmr_integration/fine_mapping_postprocessing.ipynb
@@ -0,0 +1,248 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Fine-mapping postprocessing\n",
+    "\n",
+    "## Description\n",
+    "\n",
+    "Postprocessing utilities for the S4 `QtlFineMappingResult` / `GwasFineMappingResult` RDSes produced by `fine_mapping.ipynb`, `multivariate_fine_mapping.ipynb`, `functional_fine_mapping.ipynb`, or `gwas_rss_fine_mapping.ipynb`. Replaces the legacy `mnm_analysis/mnm_postprocessing.ipynb` (which consumed the older list-shaped RDS layout).\n",
+    "\n",
+    "All steps consume FMR RDSes through the pecotmr S4 accessor surface (`getPip`, `getCs`, `getTopLoci`, `getMarginalEffects`, `getVariantIds`) so they work uniformly for QTL and GWAS outputs.\n",
+    "\n",
+    "Workflow steps (all independent &mdash; pick what you need):\n",
+    "\n",
+    "- **`[export_top_loci]`** &mdash; concatenate every input RDS's top-loci table into one TSV (one row per variant, with `study / context / trait / region_id / method / source` identifier columns).\n",
+    "- **`[export_credible_sets]`** &mdash; same shape, but the per-row CS membership.\n",
+    "- **`[export_pip]`** &mdash; raw `variant_id / pip` long table across all inputs (filterable with `--signal-cutoff`).\n",
+    "- **`[export_marginals]`** &mdash; per-variant marginal effects (`beta / se / z / p`) across all inputs.\n",
+    "- **`[pip_landscape_plot]`** &mdash; per-RDS PIP-vs-position scatter, one facet per FMR row.\n",
+    "- **`[upset_cs_overlap]`** &mdash; UpSet plot of credible-set variant overlap across one or more RDSes (one set per `(study, context, trait, method)` tuple for QTL; per `(study, method, region_id)` for GWAS).\n",
+    "\n",
+    "## Inputs\n",
+    "\n",
+    "- `--fmr-list <RDS> [<RDS> ...]` &mdash; one or more FineMappingResult RDS files (mix QTL and GWAS freely; identifier columns adapt per class).\n",
+    "- `--name` &mdash; output filename prefix.\n",
+    "- `--signal-cutoff` &mdash; PIP cutoff for `export_top_loci` / `export_pip`. Default 0 (no filter); pass 0.025 to mirror the legacy susie default.\n",
+    "- `--max-sets` &mdash; cap on UpSet set count (default 20).\n",
+    "- `--cwd` / `--modular-script-dir` &mdash; standard SoS infra.\n",
+    "\n",
+    "## Outputs\n",
+    "\n",
+    "Per-step:\n",
+    "\n",
+    "- `{cwd}/{name}.topLoci.tsv.gz`\n",
+    "- `{cwd}/{name}.cs.tsv.gz`\n",
+    "- `{cwd}/{name}.pip.tsv.gz`\n",
+    "- `{cwd}/{name}.marginals.tsv.gz`\n",
+    "- `{cwd}/plots/{name}.{rds-basename}.pip_landscape.png` (one per input RDS)\n",
+    "- `{cwd}/plots/{name}.upset.png`\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Example\n",
+    "\n",
+    "```bash\n",
+    "# Top-loci TSV + UpSet plot across every fine-mapping RDS in a run:\n",
+    "sos run pipeline/fine_mapping_postprocessing.ipynb \\\n",
+    "    export_top_loci+upset_cs_overlap \\\n",
+    "    --cwd output/postprocessing \\\n",
+    "    --modular-script-dir /path/to/code/script \\\n",
+    "    --name myrun \\\n",
+    "    --signal-cutoff 0.025 \\\n",
+    "    --fmr-list output/fine_mapping/*.finemap.rds\n",
+    "\n",
+    "# All four TSV exports + per-RDS PIP plot:\n",
+    "sos run pipeline/fine_mapping_postprocessing.ipynb \\\n",
+    "    export_top_loci+export_credible_sets+export_pip+export_marginals+pip_landscape_plot \\\n",
+    "    --cwd output/postprocessing --name myrun \\\n",
+    "    --fmr-list output/fine_mapping/*.finemap.rds\n",
+    "```\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "kernel": "SoS"
+   },
+   "outputs": [],
+   "source": [
+    "[global]\n",
+    "parameter: cwd = path('output')\n",
+    "parameter: modular_script_dir = path('code/script')\n",
+    "parameter: name = str\n",
+    "# One or more FineMappingResult RDS paths (mix QTL + GWAS freely).\n",
+    "parameter: fmr_list = []\n",
+    "parameter: signal_cutoff = 0.0\n",
+    "parameter: max_sets = 20\n",
+    "parameter: container = ''\n",
+    "parameter: job_size = 1\n",
+    "parameter: walltime = '30m'\n",
+    "parameter: mem = '8G'\n",
+    "parameter: numThreads = 1\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "kernel": "SoS"
+   },
+   "outputs": [],
+   "source": [
+    "[export_top_loci]\n",
+    "stop_if(not fmr_list, '--fmr-list requires at least one RDS path.')\n",
+    "paths = [str(p) for p in fmr_list]\n",
+    "input: paths\n",
+    "output: f\"{cwd}/{name}.topLoci.tsv.gz\"\n",
+    "task: trunk_workers = 1, trunk_size = job_size, walltime = walltime, mem = mem, cores = numThreads, tags = f\"{step_name}_{_output:bn}\"\n",
+    "bash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n",
+    "    Rscript ${modular_script_dir}/pecotmr_integration/fine_mapping_export.R \\\n",
+    "        --input ${_input} \\\n",
+    "        --view topLoci \\\n",
+    "        --signal-cutoff ${signal_cutoff} \\\n",
+    "        --output ${_output}"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "kernel": "SoS"
+   },
+   "outputs": [],
+   "source": [
+    "[export_credible_sets]\n",
+    "stop_if(not fmr_list, '--fmr-list requires at least one RDS path.')\n",
+    "paths = [str(p) for p in fmr_list]\n",
+    "input: paths\n",
+    "output: f\"{cwd}/{name}.cs.tsv.gz\"\n",
+    "task: trunk_workers = 1, trunk_size = job_size, walltime = walltime, mem = mem, cores = numThreads, tags = f\"{step_name}_{_output:bn}\"\n",
+    "bash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n",
+    "    Rscript ${modular_script_dir}/pecotmr_integration/fine_mapping_export.R \\\n",
+    "        --input ${_input} \\\n",
+    "        --view cs \\\n",
+    "        --output ${_output}"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "kernel": "SoS"
+   },
+   "outputs": [],
+   "source": [
+    "[export_pip]\n",
+    "stop_if(not fmr_list, '--fmr-list requires at least one RDS path.')\n",
+    "paths = [str(p) for p in fmr_list]\n",
+    "input: paths\n",
+    "output: f\"{cwd}/{name}.pip.tsv.gz\"\n",
+    "task: trunk_workers = 1, trunk_size = job_size, walltime = walltime, mem = mem, cores = numThreads, tags = f\"{step_name}_{_output:bn}\"\n",
+    "bash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n",
+    "    Rscript ${modular_script_dir}/pecotmr_integration/fine_mapping_export.R \\\n",
+    "        --input ${_input} \\\n",
+    "        --view pip \\\n",
+    "        --signal-cutoff ${signal_cutoff} \\\n",
+    "        --output ${_output}"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "kernel": "SoS"
+   },
+   "outputs": [],
+   "source": [
+    "[export_marginals]\n",
+    "stop_if(not fmr_list, '--fmr-list requires at least one RDS path.')\n",
+    "paths = [str(p) for p in fmr_list]\n",
+    "input: paths\n",
+    "output: f\"{cwd}/{name}.marginals.tsv.gz\"\n",
+    "task: trunk_workers = 1, trunk_size = job_size, walltime = walltime, mem = mem, cores = numThreads, tags = f\"{step_name}_{_output:bn}\"\n",
+    "bash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n",
+    "    Rscript ${modular_script_dir}/pecotmr_integration/fine_mapping_export.R \\\n",
+    "        --input ${_input} \\\n",
+    "        --view marginals \\\n",
+    "        --output ${_output}"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "kernel": "SoS"
+   },
+   "outputs": [],
+   "source": [
+    "[pip_landscape_plot]\n",
+    "stop_if(not fmr_list, '--fmr-list requires at least one RDS path.')\n",
+    "paths = [str(p) for p in fmr_list]\n",
+    "# One PNG per input RDS (per-row fan-out).\n",
+    "input: paths, group_by = 1\n",
+    "output: f\"{cwd}/plots/{name}.{_input:bn}.pip_landscape.png\"\n",
+    "task: trunk_workers = 1, trunk_size = job_size, walltime = walltime, mem = mem, cores = numThreads, tags = f\"{step_name}_{_output:bn}\"\n",
+    "bash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n",
+    "    Rscript ${modular_script_dir}/pecotmr_integration/fine_mapping_pip_plot.R \\\n",
+    "        --input ${_input} \\\n",
+    "        --output ${_output}"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "kernel": "SoS"
+   },
+   "outputs": [],
+   "source": [
+    "[upset_cs_overlap]\n",
+    "stop_if(not fmr_list, '--fmr-list requires at least one RDS path.')\n",
+    "paths = [str(p) for p in fmr_list]\n",
+    "input: paths\n",
+    "output: f\"{cwd}/plots/{name}.upset.png\"\n",
+    "task: trunk_workers = 1, trunk_size = job_size, walltime = walltime, mem = mem, cores = numThreads, tags = f\"{step_name}_{_output:bn}\"\n",
+    "bash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n",
+    "    Rscript ${modular_script_dir}/pecotmr_integration/fine_mapping_upset.R \\\n",
+    "        --input ${_input} \\\n",
+    "        --max-sets ${max_sets} \\\n",
+    "        --output ${_output}"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "SoS",
+   "language": "sos",
+   "name": "sos"
+  },
+  "language_info": {
+   "codemirror_mode": "sos",
+   "file_extension": ".sos",
+   "mimetype": "text/x-sos",
+   "name": "sos",
+   "pygments_lexer": "python",
+   "sos": {
+    "kernels": [
+     [
+      "SoS",
+      "sos",
+      "",
+      ""
+     ]
+    ],
+    "version": "0.22.4"
+   }
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
\ No newline at end of file
diff --git a/code/SoS/pecotmr_integration/functional_fine_mapping.ipynb b/code/SoS/pecotmr_integration/functional_fine_mapping.ipynb
new file mode 100644
index 00000000..31cf4334
--- /dev/null
+++ b/code/SoS/pecotmr_integration/functional_fine_mapping.ipynb
@@ -0,0 +1,141 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Functional fine-mapping (fSuSiE)\n",
+    "\n",
+    "## Description\n",
+    "\n",
+    "Per-region functional fine-mapping via `pecotmr::fineMappingPipeline(qtlDataset, methods = \"fsusie\")`. fSuSiE jointly fine-maps multiple correlated phenotypes (typically epigenomic bins within an LD block) in one region, producing one `QtlFineMappingResult` per fan-out unit. Each task loads the same `QtlDataset` RDS and fits one unit.\n",
+    "\n",
+    "This replaces the `[fsusie]` and `[mvfsusie]` steps of the legacy `mnm_regression.ipynb`. The same `fine_mapping.R` worker that drives univariate SuSiE drives fSuSiE here &mdash; only the `--methods` value differs.\n",
+    "\n",
+    "## Inputs\n",
+    "\n",
+    "- `--qtl-dataset` &mdash; path to the `QtlDataset` RDS produced by `qtl_dataset.ipynb`. Each context's phenotype matrix typically holds multiple correlated bins per region (e.g. epigenomic peaks); fSuSiE fits them jointly.\n",
+    "- `--regions chr:start-end ...` **or** `--genes ID1 ID2 ...` &mdash; fan-out targets (mutually exclusive). Region mode is the standard use.\n",
+    "- `--cis-window` &mdash; bp window (gene mode only). Default 1,000,000.\n",
+    "- `--methods` &mdash; comma-separated method tokens. Default `fsusie`.\n",
+    "- `--coverage` &mdash; credible-set coverage. Default 0.95.\n",
+    "- `--method-args` &mdash; optional JSON object spliced into `fineMappingPipeline()` for fSuSiE-specific knobs (e.g. `prior`, `max_SNP_EM`, `max_scale`, `min_purity`) and other pipeline knobs not exposed as flags.\n",
+    "- `--cwd` &mdash; output directory. Default `output`.\n",
+    "- `--study` &mdash; study label used in the output filename.\n",
+    "- `--modular-script-dir` &mdash; directory holding the per-task R workers. Default `code/script`.\n",
+    "\n",
+    "## Outputs\n",
+    "\n",
+    "- `{cwd}/{study}.{gene|region}.fn_finemap.rds` &mdash; one `QtlFineMappingResult` per fan-out unit. Region strings are sanitised (`:` and `-` become `_`).\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Example\n",
+    "\n",
+    "Region mode (canonical, joint multi-bin):\n",
+    "```bash\n",
+    "sos run pipeline/functional_fine_mapping.ipynb functional_fine_mapping \\\n",
+    "    --cwd output \\\n",
+    "    --modular-script-dir /path/to/xqtl-protocol/code/script \\\n",
+    "    --study TEST_STUDY \\\n",
+    "    --qtl-dataset output/TEST_STUDY.qtl_dataset.rds \\\n",
+    "    --regions chr7:139293693-145380632\n",
+    "```\n",
+    "\n",
+    "With fSuSiE prior + max-iter overrides:\n",
+    "```bash\n",
+    "sos run pipeline/functional_fine_mapping.ipynb functional_fine_mapping \\\n",
+    "    --cwd output --study TEST_STUDY \\\n",
+    "    --qtl-dataset output/TEST_STUDY.qtl_dataset.rds \\\n",
+    "    --regions chr7:139293693-145380632 \\\n",
+    "    --method-args '{\"signalCutoff\":0.01}'\n",
+    "```\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "kernel": "SoS"
+   },
+   "outputs": [],
+   "source": [
+    "[global]\n",
+    "parameter: cwd = path('output')\n",
+    "parameter: study = str\n",
+    "parameter: modular_script_dir = path('code/script')\n",
+    "parameter: qtl_dataset = path('.')\n",
+    "parameter: genes = []\n",
+    "parameter: regions = []\n",
+    "parameter: cis_window = 1000000\n",
+    "parameter: methods = 'fsusie'\n",
+    "parameter: coverage = 0.95\n",
+    "# JSON object spliced into fineMappingPipeline() via do.call. Empty disables.\n",
+    "parameter: method_args = ''\n",
+    "parameter: container = ''\n",
+    "parameter: job_size = 1\n",
+    "parameter: walltime = '1h'\n",
+    "parameter: mem = '16G'\n",
+    "parameter: numThreads = 1"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "kernel": "SoS"
+   },
+   "outputs": [],
+   "source": [
+    "[functional_fine_mapping]\n",
+    "if not qtl_dataset.is_file():\n",
+    "    raise ValueError(\"functional_fine_mapping requires --qtl-dataset to point at an existing QtlDataset RDS.\")\n",
+    "if bool(genes) == bool(regions):\n",
+    "    raise ValueError(\"Specify exactly one of --genes (trait IDs) or --regions (chr:start-end strings).\")\n",
+    "fanout_items = genes if genes else regions\n",
+    "fanout_kind  = 'gene' if genes else 'region'\n",
+    "input: qtl_dataset, for_each = 'fanout_items'\n",
+    "output: f\"{cwd}/{study}.{_fanout_items.replace(':', '_').replace('-', '_')}.fn_finemap.rds\"\n",
+    "task: trunk_workers = 1, trunk_size = job_size, walltime = walltime, mem = mem, cores = numThreads, tags = f\"{step_name}_{_output:bn}\"\n",
+    "bash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n",
+    "    Rscript ${modular_script_dir}/pecotmr_integration/fine_mapping.R \\\n",
+    "        --qtl-dataset ${_input} \\\n",
+    "        ${('--gene-id ' + _fanout_items + ' --cis-window ' + str(cis_window)) if fanout_kind == 'gene' else ('--region ' + _fanout_items)} \\\n",
+    "        --methods ${methods} \\\n",
+    "        --coverage ${coverage} \\\n",
+    "        ${('--method-args ' + repr(method_args)) if method_args else ''} \\\n",
+    "        --output ${_output}"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "SoS",
+   "language": "sos",
+   "name": "sos"
+  },
+  "language_info": {
+   "codemirror_mode": "sos",
+   "file_extension": ".sos",
+   "mimetype": "text/x-sos",
+   "name": "sos",
+   "pygments_lexer": "python",
+   "sos": {
+    "kernels": [
+     [
+      "SoS",
+      "sos",
+      "",
+      ""
+     ]
+    ],
+    "version": "0.22.4"
+   }
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
\ No newline at end of file
diff --git a/code/SoS/pecotmr_integration/gwas_rss_fine_mapping.ipynb b/code/SoS/pecotmr_integration/gwas_rss_fine_mapping.ipynb
new file mode 100644
index 00000000..83f397da
--- /dev/null
+++ b/code/SoS/pecotmr_integration/gwas_rss_fine_mapping.ipynb
@@ -0,0 +1,251 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# GWAS RSS fine-mapping (per study \u00d7 per LD block)\n",
+    "\n",
+    "## Description\n",
+    "\n",
+    "End-to-end SuSiE-RSS fine-mapping over GWAS summary statistics. Four SoS steps:\n",
+    "\n",
+    "1. **`[generate_manifest]`** &mdash; calls `gwas_rss_manifest.R` to resolve `--gwas-meta` + `--gwas-tsv-list` against `--region-list` + `--regions` into a single TSV manifest (one row per study x region job). All cross-product / dedup / sanitisation logic lives in R.\n",
+    "2. **`[generate_gwas_sumstats]`** &mdash; per (study \u00d7 LD block) fan-out. Calls `gwas_sumstats_construct.R` to build one `GwasSumStats` RDS per study \u00d7 block, with `summaryStatsQc()` (optionally including SLALOM/DENTIST QC and RAISS imputation).\n",
+    "3. **`[gwas_fine_mapping]`** &mdash; per RDS, runs `fine_mapping.R --gwas-sumstats` which dispatches to `pecotmr::fineMappingPipeline(gwasSumStats, methods = ...)` (default `susie`; the SuSiE-RSS variant inside pecotmr).\n",
+    "4. **`[gwas_rss_plot]`** (optional, opt-out via `--no-plot`) &mdash; per fine-mapping RDS, renders a PIP plot via `gwas_rss_plot.R`.\n",
+    "\n",
+    "Replaces the legacy `mnm_analysis/mnm_methods/rss_analysis.ipynb` two-step workflow. Reuses `fine_mapping.R` (no new fine-mapping worker required) and the new `--method-args` / `--qc-args` JSON passthroughs for per-method and per-QC tuning.\n",
+    "\n",
+    "## Inputs\n",
+    "\n",
+    "### Per-study inputs (one or both forms; the workflow fans out over their union)\n",
+    "\n",
+    "- `--gwas-meta` &mdash; TSV with columns `study_id`, `path` (relative to the meta file or absolute), and optionally `column_mapping` (path to a per-study YAML; see `gwas_sumstats_construct.R --column-mapping`). One row per study.\n",
+    "- `--gwas-tsv-list S1=path1 S2=path2 ...` &mdash; explicit STUDY=PATH pairs (alternative to / additional to `--gwas-meta`).\n",
+    "\n",
+    "### Region inputs (one or both; their union is the fan-out target)\n",
+    "\n",
+    "- `--region-list` &mdash; BED-like TSV (`#chr`, `start`, `end`, ...) listing LD blocks.\n",
+    "- `--regions chr:start-end ...` &mdash; explicit region strings.\n",
+    "\n",
+    "### LD reference\n",
+    "\n",
+    "- `--ld-meta` &mdash; LD-reference meta TSV (`#chr`, `start`, `end`, `path`). One LD block per row, or one row with `start = end = 0` to mean &ldquo;whole chromosome.&rdquo;\n",
+    "\n",
+    "### QC knobs (forwarded to `gwas_sumstats_construct.R`)\n",
+    "\n",
+    "- `--qc-method` &mdash; `none` (default), `slalom`, or `dentist`.\n",
+    "- `--impute` &mdash; flag to enable RAISS sumstat imputation.\n",
+    "- `--qc-args '{\"key\":value, ...}'` &mdash; JSON passthrough for any other `summaryStatsQc()` kwarg.\n",
+    "\n",
+    "### Fine-mapping knobs (forwarded to `fine_mapping.R --gwas-sumstats`)\n",
+    "\n",
+    "- `--methods` &mdash; comma-separated method tokens. Default `susie` (= SuSiE-RSS internally for `GwasSumStats` input).\n",
+    "- `--coverage` &mdash; SuSiE credible-set coverage. Default 0.95.\n",
+    "- `--method-args '{\"susie\":{\"L\":1}}'` &mdash; per-method kwargs spliced into `fineMappingPipeline()`.\n",
+    "\n",
+    "### Infrastructure\n",
+    "\n",
+    "- `--cwd` &mdash; output directory (default `output`).\n",
+    "- `--modular-script-dir` &mdash; directory holding the per-task R workers (default `code/script`).\n",
+    "- `--genome` &mdash; genome build label (default `GRCh38`).\n",
+    "- `--no-plot` &mdash; skip the PIP plot step.\n",
+    "\n",
+    "## Outputs\n",
+    "\n",
+    "- `{cwd}/sumstats/{study}.{chrN_S_E}.gwas_sumstats.rds` &mdash; one per (study, LD block).\n",
+    "- `{cwd}/fine_mapping/{study}.{chrN_S_E}.gwas_finemap.rds` &mdash; one `GwasFineMappingResult` per (study, LD block).\n",
+    "- `{cwd}/plots/{study}.{chrN_S_E}.pip_plot.png` &mdash; per (study, LD block) PIP plot (when `--no-plot` is unset).\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Example\n",
+    "\n",
+    "Minimal run with explicit inputs:\n",
+    "```bash\n",
+    "sos run pipeline/gwas_rss_fine_mapping.ipynb gwas_rss_fine_mapping \\\n",
+    "    --cwd output --modular-script-dir /path/to/xqtl-protocol/code/script \\\n",
+    "    --gwas-tsv-list MyGWAS=input/twas/protocol_example.twas.gwas_sumstats.chr22.tsv.gz \\\n",
+    "    --regions chr22:10516173-17414263 \\\n",
+    "    --ld-meta input/rss_analysis/protocol_example.ld_meta.tsv\n",
+    "```\n",
+    "\n",
+    "With SLALOM QC + RAISS imputation + per-method susie tuning:\n",
+    "```bash\n",
+    "sos run pipeline/gwas_rss_fine_mapping.ipynb gwas_rss_fine_mapping \\\n",
+    "    --cwd output --modular-script-dir /path/to/code/script \\\n",
+    "    --gwas-meta input/gwas/gwas_meta.tsv \\\n",
+    "    --region-list input/rss_analysis/ld_blocks.bed \\\n",
+    "    --ld-meta input/rss_analysis/ld_meta.tsv \\\n",
+    "    --qc-method slalom --impute --qc-args '{\"mafCutoff\":0.01}' \\\n",
+    "    --methods susie --method-args '{\"susie\":{\"L\":10}}'\n",
+    "```\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "kernel": "SoS"
+   },
+   "outputs": [],
+   "source": [
+    "[global]\n",
+    "parameter: cwd = path('output')\n",
+    "parameter: modular_script_dir = path('code/script')\n",
+    "# --- per-study GWAS inputs (use either or both) ---------------------\n",
+    "parameter: gwas_meta = path('.')\n",
+    "parameter: gwas_tsv_list = []   # list of STUDY=PATH items\n",
+    "# --- region inputs (use either or both) -----------------------------\n",
+    "parameter: region_list = path('.')\n",
+    "parameter: regions = []         # list of chr:start-end strings\n",
+    "# --- LD reference + genome ------------------------------------------\n",
+    "parameter: ld_meta = path\n",
+    "parameter: genome = 'GRCh38'\n",
+    "# --- QC knobs (forwarded to gwas_sumstats_construct.R) --------------\n",
+    "parameter: qc_method = 'none'   # none | slalom | dentist\n",
+    "parameter: impute = False\n",
+    "parameter: qc_args = ''         # JSON object spliced into summaryStatsQc()\n",
+    "# --- fine-mapping knobs (forwarded to fine_mapping.R) ---------------\n",
+    "parameter: methods = 'susie'\n",
+    "parameter: coverage = 0.95\n",
+    "parameter: method_args = ''     # nested per-method JSON object\n",
+    "# --- plot opt-out ---------------------------------------------------\n",
+    "parameter: no_plot = False\n",
+    "# --- output prefix for the manifest file ----------------------------\n",
+    "parameter: manifest_name = 'gwas_rss'\n",
+    "# --- infrastructure -------------------------------------------------\n",
+    "parameter: container = ''\n",
+    "parameter: job_size = 1\n",
+    "parameter: walltime = '2h'\n",
+    "parameter: mem = '16G'\n",
+    "parameter: numThreads = 1"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "kernel": "SoS"
+   },
+   "outputs": [],
+   "source": [
+    "[generate_manifest]\n",
+    "# Resolve --gwas-meta / --gwas-tsv-list x --region-list / --regions into a\n",
+    "# single manifest TSV via gwas_rss_manifest.R. Downstream steps fan out\n",
+    "# over its rows; no Python parsing in this notebook.\n",
+    "input: None\n",
+    "output: f\"{cwd}/{manifest_name}.manifest.tsv\"\n",
+    "task: trunk_workers = 1, trunk_size = 1, walltime = '15m', mem = '2G', cores = 1, tags = f\"{step_name}_{_output:bn}\"\n",
+    "bash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n",
+    "    Rscript ${modular_script_dir}/pecotmr_integration/gwas_rss_manifest.R \\\n",
+    "        ${('--gwas-meta ' + str(gwas_meta)) if gwas_meta.is_file() else ''} \\\n",
+    "        ${('--gwas-tsv-list ' + ' '.join(str(x) for x in gwas_tsv_list)) if gwas_tsv_list else ''} \\\n",
+    "        ${('--region-list ' + str(region_list)) if region_list.is_file() else ''} \\\n",
+    "        ${('--regions ' + ' '.join(str(x) for x in regions)) if regions else ''} \\\n",
+    "        --output ${_output}"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "kernel": "SoS"
+   },
+   "outputs": [],
+   "source": [
+    "[generate_gwas_sumstats]\n",
+    "# Fan out over the manifest's rows: one (study, region) GwasSumStats per\n",
+    "# row. Manifest columns: study_id, gwas_tsv, column_mapping, chr, start,\n",
+    "# end, region_id, gwas_tsv_basename.\n",
+    "import csv\n",
+    "jobs = list(csv.DictReader(open(f\"{cwd}/{manifest_name}.manifest.tsv\"), delimiter='\\t'))\n",
+    "input: for_each = 'jobs'\n",
+    "output: f\"{cwd}/sumstats/{_jobs['study_id']}.{_jobs['region_id']}.gwas_sumstats.rds\"\n",
+    "task: trunk_workers = 1, trunk_size = job_size, walltime = walltime, mem = mem, cores = numThreads, tags = f\"{step_name}_{_output:bn}\"\n",
+    "bash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n",
+    "    Rscript ${modular_script_dir}/pecotmr_integration/gwas_sumstats_construct.R \\\n",
+    "        --study ${_jobs['study_id']} \\\n",
+    "        --gwas-tsv ${_jobs['gwas_tsv']} \\\n",
+    "        --ld-block ${_jobs['chr']}:${_jobs['start']}-${_jobs['end']} \\\n",
+    "        --ld-meta ${ld_meta} \\\n",
+    "        --genome ${genome} \\\n",
+    "        ${('--column-mapping ' + _jobs['column_mapping']) if _jobs['column_mapping'] else ''} \\\n",
+    "        --qc-method ${qc_method} \\\n",
+    "        ${'--impute' if impute else ''} \\\n",
+    "        ${('--qc-args ' + repr(qc_args)) if qc_args else ''} \\\n",
+    "        --output ${_output}"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "kernel": "SoS"
+   },
+   "outputs": [],
+   "source": [
+    "[gwas_fine_mapping]\n",
+    "# Per-RDS fan-out: one fine-mapping task per (study, region) GwasSumStats.\n",
+    "output: f\"{cwd}/fine_mapping/{_input:bnn}.gwas_finemap.rds\", group_by = 1\n",
+    "task: trunk_workers = 1, trunk_size = job_size, walltime = walltime, mem = mem, cores = numThreads, tags = f\"{step_name}_{_output:bn}\"\n",
+    "bash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n",
+    "    Rscript ${modular_script_dir}/pecotmr_integration/fine_mapping.R \\\n",
+    "        --gwas-sumstats ${_input} \\\n",
+    "        --methods ${methods} \\\n",
+    "        --coverage ${coverage} \\\n",
+    "        ${('--method-args ' + repr(method_args)) if method_args else ''} \\\n",
+    "        --output ${_output}"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "kernel": "SoS"
+   },
+   "outputs": [],
+   "source": [
+    "[gwas_rss_plot]\n",
+    "stop_if(no_plot, '--no-plot set; skipping PIP plot step.')\n",
+    "output: f\"{cwd}/plots/{_input:bnn}.pip_plot.png\", group_by = 1\n",
+    "task: trunk_workers = 1, trunk_size = job_size, walltime = '30m', mem = '4G', cores = 1, tags = f\"{step_name}_{_output:bn}\"\n",
+    "bash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n",
+    "    Rscript ${modular_script_dir}/pecotmr_integration/gwas_rss_plot.R \\\n",
+    "        --input ${_input} \\\n",
+    "        --output ${_output}"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "SoS",
+   "language": "sos",
+   "name": "sos"
+  },
+  "language_info": {
+   "codemirror_mode": "sos",
+   "file_extension": ".sos",
+   "mimetype": "text/x-sos",
+   "name": "sos",
+   "pygments_lexer": "python",
+   "sos": {
+    "kernels": [
+     [
+      "SoS",
+      "sos",
+      "",
+      ""
+     ]
+    ],
+    "version": "0.22.4"
+   }
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
\ No newline at end of file
diff --git a/code/SoS/pecotmr_integration/gwas_sumstats.ipynb b/code/SoS/pecotmr_integration/gwas_sumstats.ipynb
index 9b908b01..cc206ac6 100644
--- a/code/SoS/pecotmr_integration/gwas_sumstats.ipynb
+++ b/code/SoS/pecotmr_integration/gwas_sumstats.ipynb
@@ -26,7 +26,18 @@
    },
    "outputs": [],
    "source": [
-    "[global]\nparameter: cwd = path('output')\nparameter: study = str\nparameter: gwas_tsv = path\nparameter: ld_blocks = path\nparameter: ld_meta = path\nparameter: genome = 'hg19'\nparameter: modular_script_dir = path('code/script')\nparameter: container = ''\nparameter: walltime = '30m'\nparameter: mem = '8G'\nparameter: numThreads = 1\n"
+    "[global]\n",
+    "parameter: cwd = path('output')\n",
+    "parameter: study = str\n",
+    "parameter: gwas_tsv = path\n",
+    "parameter: ld_blocks = path\n",
+    "parameter: ld_meta = path\n",
+    "parameter: genome = 'hg19'\n",
+    "parameter: modular_script_dir = path('code/script')\n",
+    "parameter: container = ''\n",
+    "parameter: walltime = '30m'\n",
+    "parameter: mem = '8G'\n",
+    "parameter: numThreads = 1"
    ]
   },
   {
@@ -37,7 +48,42 @@
    },
    "outputs": [],
    "source": [
-    "[gwas_sumstats_construct]\n# Parse LD-blocks BED (header line uses 'chr/chrom', 'start', 'stop/end').\ndef _read_blocks(path):\n    out = []\n    with open(path) as f:\n        hdr = next(f).rstrip('\\n').lstrip('#').split('\\t')\n        for line in f:\n            row = dict(zip(hdr, line.rstrip('\\n').split('\\t')))\n            chrom = row.get('chr', row.get('chrom'))\n            start = int(row.get('start', row.get('Start', 0)))\n            stop  = int(row.get('stop', row.get('end', row.get('End', 0))))\n            out.append(f'{chrom}:{start}-{stop}')\n    return out\nblocks = _read_blocks(ld_blocks)\nif not blocks:\n    raise ValueError(f\"No LD blocks found in {ld_blocks}\")\ndef _safe(s):\n    return s.replace(':', '_').replace('-', '_')\ninput: gwas_tsv, for_each = 'blocks'\noutput: f\"{cwd}/{study}.{_safe(_blocks)}.gwas_sumstats.rds\"\ntask: trunk_workers = 1, trunk_size = 1, walltime = walltime, mem = mem, cores = numThreads, tags = f\"{step_name}_{_output:bn}\"\nbash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n    Rscript ${modular_script_dir}/pecotmr_integration/gwas_sumstats_construct.R \\\n        --study ${study} \\\n        --gwas-tsv ${_input} \\\n        --ld-block ${_blocks} \\\n        --ld-meta ${ld_meta} \\\n        --genome ${genome} \\\n        --output ${_output}\n"
+    "[generate_manifest]\n",
+    "# Parse the LD-blocks BED into a (region, region_id) TSV manifest via\n",
+    "# ld_blocks_to_manifest.R. Downstream fans out over its rows; no Python\n",
+    "# parsing in this notebook.\n",
+    "input: ld_blocks\n",
+    "output: f\"{cwd}/{study}.blocks_manifest.tsv\"\n",
+    "task: trunk_workers = 1, trunk_size = 1, walltime = '15m', mem = '2G', cores = 1, tags = f\"{step_name}_{_output:bn}\"\n",
+    "bash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n",
+    "    Rscript ${modular_script_dir}/pecotmr_integration/ld_blocks_to_manifest.R \\\n",
+    "        --ld-blocks ${_input} \\\n",
+    "        --output ${_output}"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "kernel": "SoS"
+   },
+   "outputs": [],
+   "source": [
+    "[gwas_sumstats_construct]\n",
+    "# Fan out over the manifest's rows: one GwasSumStats per LD block.\n",
+    "import csv\n",
+    "jobs = list(csv.DictReader(open(f\"{cwd}/{study}.blocks_manifest.tsv\"), delimiter='\\t'))\n",
+    "input: gwas_tsv, for_each = 'jobs'\n",
+    "output: f\"{cwd}/{study}.{_jobs['region_id']}.gwas_sumstats.rds\"\n",
+    "task: trunk_workers = 1, trunk_size = 1, walltime = walltime, mem = mem, cores = numThreads, tags = f\"{step_name}_{_output:bn}\"\n",
+    "bash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n",
+    "    Rscript ${modular_script_dir}/pecotmr_integration/gwas_sumstats_construct.R \\\n",
+    "        --study ${study} \\\n",
+    "        --gwas-tsv ${_input} \\\n",
+    "        --ld-block ${_jobs['region']} \\\n",
+    "        --ld-meta ${ld_meta} \\\n",
+    "        --genome ${genome} \\\n",
+    "        --output ${_output}"
    ]
   }
  ],
diff --git a/code/SoS/pecotmr_integration/mash_preprocessing.ipynb b/code/SoS/pecotmr_integration/mash_preprocessing.ipynb
new file mode 100644
index 00000000..d57d0133
--- /dev/null
+++ b/code/SoS/pecotmr_integration/mash_preprocessing.ipynb
@@ -0,0 +1,190 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# MASH preprocessing\n",
+    "\n",
+    "## Description\n",
+    "\n",
+    "Build MASH (Multivariate Adaptive Shrinkage) inputs from multi-context tensorqtl summary statistics and estimate the mixture-component covariance + weights via `pecotmr::mashPipeline()`. Three SoS steps:\n",
+    "\n",
+    "- **`[generate_manifest]`** &mdash; calls `mash_manifest.R` to resolve `--region-file` + `--sum-files` + `--conditions` into a single per-region manifest TSV (one row per region, with chrom-matched per-condition tensorqtl paths already resolved). All region parsing / per-chrom matching lives in R.\n",
+    "- **`[mash_inputs_from_tensorqtl]`** &mdash; per-region fan-out. For each region in `--region-file`, calls `mash_sumstats_construct.R` to read the per-condition tensorqtl files, inner-join on `variant_id`, and write a per-region `list(region_id = list(z = <variants x conditions matrix>))` RDS.\n",
+    "- **`[mash_estimation]`** &mdash; concatenates every per-region RDS, runs `pecotmr::mashRandNullSample` to derive strong / random / null subsets, wraps each partition as a per-context `QtlSumStats`, and calls `pecotmr::mashPipeline()`. The output RDS carries the MASH `U` (covariance matrices) and `w` (mixture weights).\n",
+    "\n",
+    "Replaces the workflow in `multivariate_genome/MASH/mash_preprocessing.ipynb` (which depended on the now-deprecated `load_multitrait_*_sumstat` loaders). The legacy notebook is left in place; new work should use this one.\n",
+    "\n",
+    "## Inputs\n",
+    "\n",
+    "- `--region-file` &mdash; TSV with columns `chr start end gene_id` listing the LD-block regions to process.\n",
+    "- `--sum-files file1 [file2 ...]` &mdash; text files, one per condition; each contains one tensorqtl summary-statistics path per line. The notebook assumes the per-chromosome layout already used by `mash_ran_null_sample` callers (path filename embeds the chromosome, matching the legacy `.CHR.` convention).\n",
+    "- `--conditions c1,c2,...` &mdash; condition labels, one per `--sum-files` entry (same order).\n",
+    "- `--ld-sketch <RDS>` &mdash; `GenotypeHandle` RDS (required by `mashPipeline()`'s QC gate; the synthesised `QtlSumStats` carries it through `summaryStatsQc()`).\n",
+    "- `--n-random / --n-null` &mdash; subset sizes for `mashRandNullSample`. Defaults 4000 / 4000.\n",
+    "- `--exclude-condition c1,c2` &mdash; conditions to drop before MASH. Default none.\n",
+    "- `--alpha` &mdash; alpha forwarded to `mashPipeline()`. Default 0 (standard scale).\n",
+    "- `--seed` &mdash; RNG seed. Default 999.\n",
+    "- `--name` &mdash; output filename prefix.\n",
+    "\n",
+    "## Outputs\n",
+    "\n",
+    "- `{cwd}/{name}_cache/{name}.{gene_id}.mash_input.rds` &mdash; one per region.\n",
+    "- `{cwd}/{name}.mash_result.rds` &mdash; final `mashPipeline()` output (`list(U, w)`).\n",
+    "\n",
+    "## MWE note\n",
+    "\n",
+    "The legacy fixture in `input/finemapping/protocol_example.sumstats_db.rds` was custom-massaged; the new pipeline expects standard per-condition tensorqtl outputs plus a `GenotypeHandle` RDS for the LD reference. Provide both before running."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Example\n",
+    "\n",
+    "```bash\n",
+    "sos run pipeline/mash_preprocessing.ipynb \\\n",
+    "    mash_inputs_from_tensorqtl+mash_estimation \\\n",
+    "    --cwd output/mash --modular-script-dir /path/to/code/script \\\n",
+    "    --name protocol_example_mash \\\n",
+    "    --region-file input/finemapping/protocol_example.region \\\n",
+    "    --sum-files input/mash/condA.txt input/mash/condB.txt \\\n",
+    "    --conditions condA,condB \\\n",
+    "    --ld-sketch input/ld/protocol_example.ld_sketch.rds \\\n",
+    "    --n-random 4000 --n-null 4000 \\\n",
+    "    --alpha 0 --seed 999\n",
+    "```\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "kernel": "SoS"
+   },
+   "outputs": [],
+   "source": [
+    "[global]\n",
+    "parameter: cwd = path('output')\n",
+    "parameter: modular_script_dir = path('code/script')\n",
+    "parameter: name = str\n",
+    "# --- per-region input ----------------------------------------------\n",
+    "parameter: region_file = path\n",
+    "parameter: sum_files = paths     # one txt per condition; each line is one tensorqtl file path\n",
+    "parameter: conditions = ''       # comma-separated condition labels\n",
+    "# --- mash estimation knobs ------------------------------------------\n",
+    "parameter: ld_sketch = path\n",
+    "parameter: n_random = 4000\n",
+    "parameter: n_null   = 4000\n",
+    "parameter: exclude_condition = ''  # comma-separated\n",
+    "parameter: alpha = 0.0\n",
+    "parameter: seed  = 999\n",
+    "# --- infrastructure -------------------------------------------------\n",
+    "parameter: container = ''\n",
+    "parameter: job_size = 1\n",
+    "parameter: walltime = '2h'\n",
+    "parameter: mem = '16G'\n",
+    "parameter: numThreads = 1"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "kernel": "SoS"
+   },
+   "outputs": [],
+   "source": [
+    "[generate_manifest]\n",
+    "# Resolve --region-file x --sum-files x --conditions into one manifest\n",
+    "# TSV via mash_manifest.R. Downstream steps fan out over its rows; no\n",
+    "# Python parsing in this notebook.\n",
+    "input: None\n",
+    "output: f\"{cwd}/{name}.mash_manifest.tsv\"\n",
+    "task: trunk_workers = 1, trunk_size = 1, walltime = '15m', mem = '2G', cores = 1, tags = f\"{step_name}_{_output:bn}\"\n",
+    "bash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n",
+    "    Rscript ${modular_script_dir}/pecotmr_integration/mash_manifest.R \\\n",
+    "        --region-file ${region_file} \\\n",
+    "        --sum-files ${' '.join(str(x) for x in sum_files)} \\\n",
+    "        --conditions ${conditions} \\\n",
+    "        --output ${_output}"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "kernel": "SoS"
+   },
+   "outputs": [],
+   "source": [
+    "[mash_inputs_from_tensorqtl]\n",
+    "# Fan out over the manifest's rows; one per-region task each.\n",
+    "import csv\n",
+    "jobs = list(csv.DictReader(open(f\"{cwd}/{name}.mash_manifest.tsv\"), delimiter='\\t'))\n",
+    "input: for_each = 'jobs'\n",
+    "output: f\"{cwd}/{name}_cache/{name}.{_jobs['gene_id']}.mash_input.rds\"\n",
+    "task: trunk_workers = 1, trunk_size = job_size, walltime = walltime, mem = mem, cores = numThreads, tags = f\"{step_name}_{_output:bn}\"\n",
+    "bash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n",
+    "    Rscript ${modular_script_dir}/pecotmr_integration/mash_sumstats_construct.R \\\n",
+    "        --tensorqtl-paths ${_jobs['tensorqtl_paths']} \\\n",
+    "        --conditions ${conditions} \\\n",
+    "        --region ${_jobs['region']} \\\n",
+    "        --output ${_output}"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "kernel": "SoS"
+   },
+   "outputs": [],
+   "source": [
+    "[mash_estimation]\n",
+    "output: f\"{cwd}/{name}.mash_result.rds\"\n",
+    "task: trunk_workers = 1, trunk_size = job_size, walltime = walltime, mem = mem, cores = numThreads, tags = f\"{step_name}_{_output:bn}\"\n",
+    "bash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n",
+    "    Rscript ${modular_script_dir}/pecotmr_integration/mash.R \\\n",
+    "        --mash-inputs ${_input} \\\n",
+    "        --study ${name} \\\n",
+    "        --ld-sketch ${ld_sketch} \\\n",
+    "        --n-random ${n_random} \\\n",
+    "        --n-null ${n_null} \\\n",
+    "        ${('--exclude-condition ' + exclude_condition) if exclude_condition else ''} \\\n",
+    "        --alpha ${alpha} \\\n",
+    "        --seed ${seed} \\\n",
+    "        --output ${_output}"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "SoS",
+   "language": "sos",
+   "name": "sos"
+  },
+  "language_info": {
+   "codemirror_mode": "sos",
+   "file_extension": ".sos",
+   "mimetype": "text/x-sos",
+   "name": "sos",
+   "pygments_lexer": "python",
+   "sos": {
+    "kernels": [
+     [
+      "SoS",
+      "sos",
+      "",
+      ""
+     ]
+    ],
+    "version": "0.22.4"
+   }
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
\ No newline at end of file
diff --git a/code/SoS/pecotmr_integration/multivariate_fine_mapping.ipynb b/code/SoS/pecotmr_integration/multivariate_fine_mapping.ipynb
new file mode 100644
index 00000000..194261c5
--- /dev/null
+++ b/code/SoS/pecotmr_integration/multivariate_fine_mapping.ipynb
@@ -0,0 +1,143 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Multivariate fine-mapping (mvSuSiE)\n",
+    "\n",
+    "## Description\n",
+    "\n",
+    "Per-region multivariate fine-mapping via `pecotmr::fineMappingPipeline(qtlDataset, methods = \"mvsusie\")`. mvSuSiE jointly fine-maps across the QtlDataset's traits and contexts in one region, producing one `QtlFineMappingResult` per fan-out unit. Each task loads the same `QtlDataset` RDS and fits one unit.\n",
+    "\n",
+    "This replaces the `[mnm]` and `[mnm_genes]` steps of the legacy `mnm_regression.ipynb`. The same `fine_mapping.R` worker that drives univariate SuSiE drives mvSuSiE here \u2014 only the `--methods` value differs.\n",
+    "\n",
+    "For mvSuSiE TWAS weights (mr.mash + mvSuSiE), call `twas_weights.ipynb --methods mrmash,mvsusie` afterwards against the same `QtlDataset`.\n",
+    "\n",
+    "## Inputs\n",
+    "\n",
+    "- `--qtl-dataset` &mdash; path to the `QtlDataset` RDS produced by `qtl_dataset.ipynb`. Multi-context / multi-trait shape is what makes mvSuSiE meaningful.\n",
+    "- `--regions chr:start-end ...` **or** `--genes ID1 ID2 ...` &mdash; fan-out targets (mutually exclusive). Region mode (multi-trait joint) is the standard use; gene mode runs the multi-context mvSuSiE variant for one focal trait.\n",
+    "- `--cis-window` &mdash; bp window (gene mode only). Default 1,000,000.\n",
+    "- `--methods` &mdash; comma-separated method tokens. Default `mvsusie`. Pass e.g. `mvsusie,susie` to also run univariate SuSiE alongside.\n",
+    "- `--coverage` &mdash; SuSiE/mvSuSiE credible-set coverage. Default 0.95.\n",
+    "- `--method-args` &mdash; optional JSON object spliced into `fineMappingPipeline()` for knobs not exposed as flags (priors, max-iter, residual options, etc.).\n",
+    "- `--cwd` &mdash; output directory. Default `output`.\n",
+    "- `--study` &mdash; study label used in the output filename.\n",
+    "- `--modular-script-dir` &mdash; directory holding the per-task R workers. Default `code/script`.\n",
+    "\n",
+    "## Outputs\n",
+    "\n",
+    "- `{cwd}/{study}.{gene|region}.mv_finemap.rds` &mdash; one `QtlFineMappingResult` per fan-out unit. Region strings are sanitised (`:` and `-` become `_`).\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Example\n",
+    "\n",
+    "Region mode (canonical, multi-trait joint):\n",
+    "```bash\n",
+    "sos run pipeline/multivariate_fine_mapping.ipynb multivariate_fine_mapping \\\n",
+    "    --cwd output \\\n",
+    "    --modular-script-dir /path/to/xqtl-protocol/code/script \\\n",
+    "    --study TEST_STUDY \\\n",
+    "    --qtl-dataset output/TEST_STUDY.qtl_dataset.rds \\\n",
+    "    --regions chr12:752578-2752578\n",
+    "```\n",
+    "\n",
+    "With mixture-prior tuning passed through:\n",
+    "```bash\n",
+    "sos run pipeline/multivariate_fine_mapping.ipynb multivariate_fine_mapping \\\n",
+    "    --cwd output --study TEST_STUDY \\\n",
+    "    --qtl-dataset output/TEST_STUDY.qtl_dataset.rds \\\n",
+    "    --regions chr12:752578-2752578 \\\n",
+    "    --method-args '{\"addSusieInf\":false,\"signalCutoff\":0.01}'\n",
+    "```\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "kernel": "SoS"
+   },
+   "outputs": [],
+   "source": [
+    "[global]\n",
+    "parameter: cwd = path('output')\n",
+    "parameter: study = str\n",
+    "parameter: modular_script_dir = path('code/script')\n",
+    "parameter: qtl_dataset = path('.')\n",
+    "parameter: genes = []\n",
+    "parameter: regions = []\n",
+    "parameter: cis_window = 1000000\n",
+    "parameter: methods = 'mvsusie'\n",
+    "parameter: coverage = 0.95\n",
+    "# JSON object spliced into fineMappingPipeline() via do.call. Empty disables.\n",
+    "parameter: method_args = ''\n",
+    "parameter: container = ''\n",
+    "parameter: job_size = 1\n",
+    "parameter: walltime = '1h'\n",
+    "parameter: mem = '16G'\n",
+    "parameter: numThreads = 1"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "kernel": "SoS"
+   },
+   "outputs": [],
+   "source": [
+    "[multivariate_fine_mapping]\n",
+    "if not qtl_dataset.is_file():\n",
+    "    raise ValueError(\"multivariate_fine_mapping requires --qtl-dataset to point at an existing QtlDataset RDS.\")\n",
+    "if bool(genes) == bool(regions):\n",
+    "    raise ValueError(\"Specify exactly one of --genes (trait IDs) or --regions (chr:start-end strings).\")\n",
+    "fanout_items = genes if genes else regions\n",
+    "fanout_kind  = 'gene' if genes else 'region'\n",
+    "input: qtl_dataset, for_each = 'fanout_items'\n",
+    "output: f\"{cwd}/{study}.{_fanout_items.replace(':', '_').replace('-', '_')}.mv_finemap.rds\"\n",
+    "task: trunk_workers = 1, trunk_size = job_size, walltime = walltime, mem = mem, cores = numThreads, tags = f\"{step_name}_{_output:bn}\"\n",
+    "bash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n",
+    "    Rscript ${modular_script_dir}/pecotmr_integration/fine_mapping.R \\\n",
+    "        --qtl-dataset ${_input} \\\n",
+    "        ${('--gene-id ' + _fanout_items + ' --cis-window ' + str(cis_window)) if fanout_kind == 'gene' else ('--region ' + _fanout_items)} \\\n",
+    "        --methods ${methods} \\\n",
+    "        --coverage ${coverage} \\\n",
+    "        ${('--method-args ' + repr(method_args)) if method_args else ''} \\\n",
+    "        --output ${_output}"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "SoS",
+   "language": "sos",
+   "name": "sos"
+  },
+  "language_info": {
+   "codemirror_mode": "sos",
+   "file_extension": ".sos",
+   "mimetype": "text/x-sos",
+   "name": "sos",
+   "pygments_lexer": "python",
+   "sos": {
+    "kernels": [
+     [
+      "SoS",
+      "sos",
+      "",
+      ""
+     ]
+    ],
+    "version": "0.22.4"
+   }
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
\ No newline at end of file
diff --git a/code/SoS/pecotmr_integration/twas.ipynb b/code/SoS/pecotmr_integration/twas.ipynb
index 9e0b0871..34b16efe 100644
--- a/code/SoS/pecotmr_integration/twas.ipynb
+++ b/code/SoS/pecotmr_integration/twas.ipynb
@@ -37,7 +37,44 @@
    },
    "outputs": [],
    "source": [
-    "[twas]\n# Read the per-gene manifest. Header may be canonical (`gene_id`,\n# `twas_weights_rds`, `gwas_sumstats_rds`, `fine_mapping_result_rds`) and\n# may carry a leading `#`. fine_mapping_result_rds is optional per-row\n# (empty cell = skip).\ndef _read_manifest(path):\n    rows = []\n    with open(path) as f:\n        hdr = next(f).rstrip('\\n').lstrip('#').split('\\t')\n        required = {'gene_id', 'twas_weights_rds', 'gwas_sumstats_rds'}\n        missing = required - set(hdr)\n        if missing:\n            raise ValueError(\n                f\"Manifest missing required column(s): {sorted(missing)}\")\n        for line in f:\n            parts = line.rstrip('\\n').split('\\t')\n            row = dict(zip(hdr, parts))\n            if not row.get('gene_id'):\n                continue\n            rows.append(row)\n    if not rows:\n        raise ValueError(f\"Manifest {path} is empty.\")\n    ids = [r['gene_id'] for r in rows]\n    if len(set(ids)) != len(ids):\n        raise ValueError(\n            \"Manifest has duplicate gene_id values; each row must be unique.\")\n    return rows\nrows = _read_manifest(manifest)\ngene_ids   = [r['gene_id'] for r in rows]\ntw_paths   = [r['twas_weights_rds'] for r in rows]\ngss_paths  = [r['gwas_sumstats_rds'] for r in rows]\nfmr_paths  = [r.get('fine_mapping_result_rds', '') or '' for r in rows]\ninput: for_each = ['gene_ids', 'tw_paths', 'gss_paths', 'fmr_paths']\nfmr_arg = f\"--fine-mapping-result {_fmr_paths}\" if _fmr_paths else \"\"\noutput: f\"{cwd}/{study}.{_gene_ids}.twas.rds\"\ntask: trunk_workers = 1, trunk_size = job_size, walltime = walltime, mem = mem, cores = numThreads, tags = f\"{step_name}_{_output:bn}\"\nbash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n    Rscript ${modular_script_dir}/pecotmr_integration/twas.R \\\n        --twas-weights ${_tw_paths} \\\n        --gwas-sumstats ${_gss_paths} \\\n        ${fmr_arg} \\\n        --mr-pip-cutoff ${mr_pip_cutoff} \\\n        --mr-method ${mr_method} \\\n        --output ${_output}\n"
+    "[validate_manifest]\n",
+    "# Validate + canonicalise the user-supplied per-gene TWAS manifest via\n",
+    "# twas_manifest_validate.R. Downstream fans out over its rows; no\n",
+    "# Python parsing in this notebook.\n",
+    "input: manifest\n",
+    "output: f\"{cwd}/{study}.twas_manifest.tsv\"\n",
+    "task: trunk_workers = 1, trunk_size = 1, walltime = '15m', mem = '2G', cores = 1, tags = f\"{step_name}_{_output:bn}\"\n",
+    "bash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n",
+    "    Rscript ${modular_script_dir}/pecotmr_integration/twas_manifest_validate.R \\\n",
+    "        --manifest ${_input} \\\n",
+    "        --output ${_output}"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "kernel": "SoS"
+   },
+   "outputs": [],
+   "source": [
+    "[twas]\n",
+    "# Fan out over the canonical manifest's rows: one task per gene_id.\n",
+    "# Manifest columns: gene_id, twas_weights_rds, gwas_sumstats_rds, and\n",
+    "# optionally fine_mapping_result_rds.\n",
+    "import csv\n",
+    "jobs = list(csv.DictReader(open(f\"{cwd}/{study}.twas_manifest.tsv\"), delimiter='\\t'))\n",
+    "input: for_each = 'jobs'\n",
+    "output: f\"{cwd}/{study}.{_jobs['gene_id']}.twas.rds\"\n",
+    "task: trunk_workers = 1, trunk_size = job_size, walltime = walltime, mem = mem, cores = numThreads, tags = f\"{step_name}_{_output:bn}\"\n",
+    "bash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n",
+    "    Rscript ${modular_script_dir}/pecotmr_integration/twas.R \\\n",
+    "        --twas-weights ${_jobs['twas_weights_rds']} \\\n",
+    "        --gwas-sumstats ${_jobs['gwas_sumstats_rds']} \\\n",
+    "        ${('--fine-mapping-result ' + _jobs.get('fine_mapping_result_rds', '')) if _jobs.get('fine_mapping_result_rds') else ''} \\\n",
+    "        --mr-pip-cutoff ${mr_pip_cutoff} \\\n",
+    "        --mr-method ${mr_method} \\\n",
+    "        --output ${_output}"
    ]
   }
  ],
@@ -77,4 +114,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 4
-}
+}
\ No newline at end of file
diff --git a/code/SoS/pecotmr_integration/twas_weights.ipynb b/code/SoS/pecotmr_integration/twas_weights.ipynb
index c723922c..d9eb4990 100644
--- a/code/SoS/pecotmr_integration/twas_weights.ipynb
+++ b/code/SoS/pecotmr_integration/twas_weights.ipynb
@@ -37,7 +37,20 @@
    },
    "outputs": [],
    "source": [
-    "[twas_weights]\nif bool(genes) == bool(regions):\n    raise ValueError(\"Specify exactly one of --genes (trait IDs) or --regions (chr:start-end strings).\")\nfanout_items = genes if genes else regions\nfanout_kind  = 'gene' if genes else 'region'\ndef _fanout_safe(s):\n    return s.replace(':', '_').replace('-', '_')\ninput: qtl_dataset, for_each = 'fanout_items'\noutput: f\"{cwd}/{study}.{_fanout_safe(_fanout_items)}.twas_weights.rds\"\ntask: trunk_workers = 1, trunk_size = job_size, walltime = walltime, mem = mem, cores = numThreads, tags = f\"{step_name}_{_output:bn}\"\nbash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n    Rscript ${modular_script_dir}/pecotmr_integration/twas_weights.R \\\n        --qtl-dataset ${_input} \\\n        ${('--gene-id ' + _fanout_items + ' --cis-window ' + str(cis_window)) if fanout_kind == 'gene' else ('--region ' + _fanout_items)} \\\n        --fine-mapping-result ${fine_mapping_result if fine_mapping_result.is_file() else '\"\"'} \\\n        --output ${_output}\n"
+    "[twas_weights]\n",
+    "if bool(genes) == bool(regions):\n",
+    "    raise ValueError(\"Specify exactly one of --genes (trait IDs) or --regions (chr:start-end strings).\")\n",
+    "fanout_items = genes if genes else regions\n",
+    "fanout_kind  = 'gene' if genes else 'region'\n",
+    "input: qtl_dataset, for_each = 'fanout_items'\n",
+    "output: f\"{cwd}/{study}.{_fanout_items.replace(':', '_').replace('-', '_')}.twas_weights.rds\"\n",
+    "task: trunk_workers = 1, trunk_size = job_size, walltime = walltime, mem = mem, cores = numThreads, tags = f\"{step_name}_{_output:bn}\"\n",
+    "bash: expand = '${ }', stderr = f\"{_output}.stderr\", stdout = f\"{_output}.stdout\", container = container\n",
+    "    Rscript ${modular_script_dir}/pecotmr_integration/twas_weights.R \\\n",
+    "        --qtl-dataset ${_input} \\\n",
+    "        ${('--gene-id ' + _fanout_items + ' --cis-window ' + str(cis_window)) if fanout_kind == 'gene' else ('--region ' + _fanout_items)} \\\n",
+    "        --fine-mapping-result ${fine_mapping_result if fine_mapping_result.is_file() else '\"\"'} \\\n",
+    "        --output ${_output}\n"
    ]
   }
  ],
diff --git a/code/SoS/reference_data/rss_ld_sketch.ipynb b/code/SoS/reference_data/rss_ld_sketch.ipynb
index ef273f0a..c1ac6c9a 100644
--- a/code/SoS/reference_data/rss_ld_sketch.ipynb
+++ b/code/SoS/reference_data/rss_ld_sketch.ipynb
@@ -149,64 +149,6 @@
     "    --cwd output/rss_ld_sketch"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "id": "2f5835fb",
-   "metadata": {},
-   "source": [
-    "### Step 4. Load the LD sketch data into R\n",
-    "\n",
-    "Load the final per-chromosome pgen output into R with `pecotmr::load_LD_matrix()` via a metadata TSV. The metadata TSV has one row per chromosome with `start=0`, `end=0`, and `path` set to the PLINK file prefix (resolved relative to the TSV directory):\n",
-    "\n",
-    "```\n",
-    "#chrom start end path\n",
-    "22 0 0 protocol_example.chr22\n",
-    "```\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "d8e24892",
-   "metadata": {},
-   "source": [
-    "**Tested output** (ADSP R5 EUR chr22, 10000 pseudo-samples, 2Mb region, 9188 variants):\n",
-    "\n",
-    "```\n",
-    "X dimensions (samples x variants): 10000 9188\n",
-    "R dimensions: 9188 9188\n",
-    "```\n",
-    "\n",
-    "Use `return_genotype = TRUE` for the `susie_rss(z, X = X)` interface, or `FALSE` for the `susie_rss(z, R = R)` interface. For repeated analyses on the same region, compute `R` once and save it with `saveRDS()` for reuse.\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "8a05b576",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "library(pecotmr)\n",
-    "\n",
-    "meta <- \"/path/to/ld_meta_chr22.tsv\"\n",
-    "region <- \"chr22:30000000-32000000\"  # 2Mb region\n",
-    "\n",
-    "# Load genotype matrix X (for susie_rss z+X interface)\n",
-    "system.time({\n",
-    "  res_x <- load_LD_matrix(meta, region, return_genotype = TRUE)\n",
-    "})\n",
-    "cat(\"X dimensions (samples x variants):\", dim(res_x$LD_matrix), \"\\n\")\n",
-    "head(res_x$ref_panel, 3)\n",
-    "res_x$LD_matrix[1:3, 1:3]\n",
-    "\n",
-    "# Load LD correlation matrix R (for susie_rss z+R interface)\n",
-    "system.time({\n",
-    "  res_r <- load_LD_matrix(meta, region, return_genotype = FALSE)\n",
-    "})\n",
-    "cat(\"R dimensions:\", dim(res_r$LD_matrix), \"\\n\")\n",
-    "res_r$LD_matrix[1:3, 1:3]"
-   ]
-  },
   {
    "cell_type": "markdown",
    "id": "32c022be",
diff --git a/code/script/pecotmr_integration/fine_mapping.R b/code/script/pecotmr_integration/fine_mapping.R
index a9e81d0a..5148fa0d 100644
--- a/code/script/pecotmr_integration/fine_mapping.R
+++ b/code/script/pecotmr_integration/fine_mapping.R
@@ -22,6 +22,13 @@
 # Shared:
 #   --methods       Comma-separated method tokens. Default "susie".
 #   --coverage      SuSiE credible-set coverage. Default 0.95.
+#   --method-args   Optional JSON object of per-method kwargs spliced
+#                   into fineMappingPipeline() via its named-list
+#                   methods= argument. Keys are method tokens (must be
+#                   a subset of --methods); values are kwargs lists
+#                   forwarded to the underlying fitter (susieR::susie,
+#                   susieR::susie_rss, mvsusieR::mvsusie, fsusieR::susiF,
+#                   etc.). Example: '{"susie":{"L":1,"refine":false}}'.
 #   --output        Output RDS path.
 
 suppressPackageStartupMessages({
@@ -29,6 +36,7 @@ suppressPackageStartupMessages({
   library(pecotmr)
   library(GenomicRanges)
   library(IRanges)
+  library(jsonlite)
 })
 
 parser <- arg_parser("SuSiE fine-mapping over a pecotmr S4 input (QtlDataset or GwasSumStats)")
@@ -53,10 +61,36 @@ parser <- add_argument(parser, "--methods",
 parser <- add_argument(parser, "--coverage",
                        help = "SuSiE credible-set coverage",
                        type = "numeric", default = 0.95)
+parser <- add_argument(parser, "--method-args",
+                       help = "JSON object {token: {kwarg: value, ...}, ...} for fineMappingPipeline()",
+                       type = "character", default = "")
 parser <- add_argument(parser, "--output",
                        help = "Output RDS path", type = "character")
 argv <- parse_args(parser)
 
+# Parse --method-args into a nested named list of per-method kwargs.
+# Empty / '.' / '{}' all yield NULL (which makes us pass the char-vector
+# form of methods= to fineMappingPipeline).
+parsed_method_args <- if (nzchar(argv$method_args) && argv$method_args != "." &&
+                          argv$method_args != "{}") {
+  parsed <- tryCatch(jsonlite::fromJSON(argv$method_args, simplifyVector = FALSE),
+                     error = function(e) stop(
+                       "--method-args must be a JSON object string (got: ",
+                       argv$method_args, "). Error: ", conditionMessage(e)))
+  if (!is.list(parsed) || is.null(names(parsed)) || any(names(parsed) == ""))
+    stop("--method-args must be a JSON object whose keys are method ",
+         "tokens, e.g. '{\"susie\":{\"L\":1}}'.")
+  nonObject <- vapply(parsed, function(x) !is.list(x), logical(1))
+  if (any(nonObject))
+    stop("--method-args: each value must itself be an object of kwargs ",
+         "(got non-object for: ",
+         paste(names(parsed)[nonObject], collapse = ", "),
+         "). Did you mean '{\"susie\":", argv$method_args, "}'?")
+  parsed
+} else {
+  NULL
+}
+
 parse_region <- function(s) {
   m <- regmatches(s, regexec("^([^:]+):([0-9]+)-([0-9]+)$", s))[[1L]]
   if (length(m) != 4L)
@@ -75,13 +109,33 @@ if (!has_qtl && !has_gwas)
 
 methods <- trimws(strsplit(argv$methods, ",", fixed = TRUE)[[1L]])
 
+# Build the final `methods` argument for fineMappingPipeline. When the
+# user supplied --method-args, validate that every key is one of the
+# --methods tokens (no silent typos) and build the named-list form
+# {token: kwargs} that fineMappingPipeline expects. Every token in
+# --methods is represented in the list (with an empty kwargs list when
+# the user didn't supply one). When --method-args is empty we just pass
+# the character vector, which fineMappingPipeline accepts unchanged.
+methods_arg <- if (is.null(parsed_method_args)) {
+  methods
+} else {
+  unknown <- setdiff(names(parsed_method_args), methods)
+  if (length(unknown) > 0L)
+    stop("--method-args has keys not listed in --methods (got '",
+         paste(unknown, collapse = ", "),
+         "'; --methods = '", paste(methods, collapse = ", "), "').")
+  setNames(lapply(methods, function(tk) {
+    if (tk %in% names(parsed_method_args)) parsed_method_args[[tk]]
+    else list()
+  }), methods)
+}
+
 if (has_gwas) {
   # ----- GWAS mode -------------------------------------------------------
   gss <- readRDS(argv$gwas_sumstats)
-  res <- fineMappingPipeline(
-    gss,
-    methods  = methods,
-    coverage = argv$coverage)
+  res <- fineMappingPipeline(gss,
+                             methods  = methods_arg,
+                             coverage = argv$coverage)
   label <- paste0("GwasSumStats '", basename(argv$gwas_sumstats), "'")
 } else {
   # ----- QTL mode --------------------------------------------------------
@@ -93,19 +147,15 @@ if (has_gwas) {
     stop("QTL mode requires --gene-id (with --cis-window) or --region.")
   qd <- readRDS(argv$qtl_dataset)
   res <- if (has_region) {
-    fineMappingPipeline(
-      qd,
-      methods   = methods,
-      region    = parse_region(argv$region),
-      cisWindow = argv$cis_window,
-      coverage  = argv$coverage)
+    fineMappingPipeline(qd, methods = methods_arg,
+                        region    = parse_region(argv$region),
+                        cisWindow = argv$cis_window,
+                        coverage  = argv$coverage)
   } else {
-    fineMappingPipeline(
-      qd,
-      methods    = methods,
-      traitId    = argv$gene_id,
-      cisWindow  = argv$cis_window,
-      coverage   = argv$coverage)
+    fineMappingPipeline(qd, methods = methods_arg,
+                        traitId   = argv$gene_id,
+                        cisWindow = argv$cis_window,
+                        coverage  = argv$coverage)
   }
   label <- if (has_region) paste0("region '", argv$region, "'")
            else paste0("gene '", argv$gene_id, "'")
diff --git a/code/script/pecotmr_integration/fine_mapping_export.R b/code/script/pecotmr_integration/fine_mapping_export.R
new file mode 100644
index 00000000..0f46eb0b
--- /dev/null
+++ b/code/script/pecotmr_integration/fine_mapping_export.R
@@ -0,0 +1,126 @@
+#!/usr/bin/env Rscript
+# fine_mapping_export.R
+#
+# Export a view (topLoci / credibleSets / pip / marginals) of one or more
+# QtlFineMappingResult / GwasFineMappingResult RDSes as a single concatenated
+# TSV. Adds identifier columns derived from the FMR row (study, context,
+# trait, method for QTL; study, method, region_id for GWAS) so a downstream
+# consumer can split the table back per-tuple.
+#
+# Inputs:
+#   --input <RDS> [<RDS> ...]  One or more FineMappingResult RDS paths.
+#   --view {topLoci|cs|pip|marginals}  Which per-entry view to export.
+#                                        Default "topLoci".
+#   --signal-cutoff  PIP threshold for topLoci/pip exports. Default 0
+#                    (no filter). Pass 0.025 to mirror the legacy
+#                    susie default.
+#   --output <TSV>  Output TSV path (gzipped if path ends in .gz).
+
+suppressPackageStartupMessages({
+  library(argparser)
+  library(pecotmr)
+})
+
+parser <- arg_parser("Export a FineMappingResult view as a TSV")
+parser <- add_argument(parser, "--input",
+                       help = "One or more FineMappingResult RDS paths",
+                       type = "character", nargs = Inf)
+parser <- add_argument(parser, "--view",
+                       help = "Which view to export: topLoci | cs | pip | marginals",
+                       type = "character", default = "topLoci")
+parser <- add_argument(parser, "--signal-cutoff",
+                       help = "PIP cutoff for topLoci/pip exports",
+                       type = "numeric", default = 0)
+parser <- add_argument(parser, "--output",
+                       help = "Output TSV path", type = "character")
+argv <- parse_args(parser)
+
+view <- match.arg(argv$view, c("topLoci", "cs", "pip", "marginals"))
+inputs <- as.character(argv$input)
+if (length(inputs) == 0L)
+  stop("--input requires at least one RDS path.")
+
+# Pull the row's identifier columns regardless of QTL vs GWAS shape; the
+# union of slots produces a NA-padded data frame the downstream rbind can
+# concatenate across mixed inputs.
+.idCols <- function(fmr, i) {
+  list(
+    study     = as.character(fmr$study)[[i]],
+    context   = if ("context"   %in% names(fmr)) as.character(fmr$context)[[i]]   else NA_character_,
+    trait     = if ("trait"     %in% names(fmr)) as.character(fmr$trait)[[i]]     else NA_character_,
+    region_id = if ("region_id" %in% names(fmr)) as.character(fmr$region_id)[[i]] else NA_character_,
+    method    = as.character(fmr$method)[[i]],
+    source    = basename(attr(fmr, ".source", exact = TRUE) %||% ""))
+}
+`%||%` <- function(a, b) if (is.null(a)) b else a
+
+.extract <- function(entry, view, cutoff) {
+  if (view == "topLoci") {
+    df <- as.data.frame(getTopLoci(entry, signalCutoff = cutoff))
+  } else if (view == "cs") {
+    df <- as.data.frame(getCs(entry))
+  } else if (view == "pip") {
+    pip <- as.numeric(getPip(entry))
+    ids <- as.character(getVariantIds(entry))
+    df  <- data.frame(variant_id = ids, pip = pip,
+                       stringsAsFactors = FALSE)
+    if (cutoff > 0) df <- df[df$pip > cutoff, , drop = FALSE]
+  } else {  # marginals
+    df <- as.data.frame(getMarginalEffects(entry))
+  }
+  if (nrow(df) == 0L) return(NULL)
+  df
+}
+
+pieces <- list()
+for (path in inputs) {
+  fmr <- readRDS(path)
+  if (!methods::is(fmr, "FineMappingResultBase")) {
+    warning("Skipping non-FineMappingResult input: ", path)
+    next
+  }
+  attr(fmr, ".source") <- path
+  for (i in seq_len(nrow(fmr))) {
+    entry <- fmr$entry[[i]]
+    inner <- tryCatch(.extract(entry, view, argv$signal_cutoff),
+                      error = function(e) {
+                        message("Entry ", i, " of ", basename(path),
+                                ": ", conditionMessage(e))
+                        NULL
+                      })
+    if (is.null(inner)) next
+    ids <- .idCols(fmr, i)
+    ids$source <- basename(path)
+    for (k in names(ids)) inner[[k]] <- ids[[k]]
+    pieces[[length(pieces) + 1L]] <- inner
+  }
+}
+if (length(pieces) == 0L) {
+  message("No rows produced; writing an empty TSV with the id-column ",
+          "header so downstream consumers don't error on missing files.")
+  out <- data.frame(study = character(0), context = character(0),
+                    trait = character(0), region_id = character(0),
+                    method = character(0), source = character(0),
+                    stringsAsFactors = FALSE)
+} else {
+  # Pad missing columns across pieces so rbind doesn't lose rows.
+  all_cols <- unique(unlist(lapply(pieces, names)))
+  for (k in seq_along(pieces)) {
+    miss <- setdiff(all_cols, names(pieces[[k]]))
+    for (m in miss) pieces[[k]][[m]] <- NA
+    pieces[[k]] <- pieces[[k]][, all_cols, drop = FALSE]
+  }
+  out <- do.call(rbind, pieces)
+}
+
+dir.create(dirname(argv$output), showWarnings = FALSE, recursive = TRUE)
+conn <- if (grepl("\\.gz$", argv$output)) {
+  gzfile(argv$output, "w")
+} else {
+  file(argv$output, "w")
+}
+write.table(out, file = conn, sep = "\t", quote = FALSE,
+            row.names = FALSE, na = "")
+close(conn)
+cat(sprintf("Wrote %s view (%d rows from %d input RDS file(s)) to %s\n",
+            view, nrow(out), length(inputs), argv$output))
diff --git a/code/script/pecotmr_integration/fine_mapping_pip_plot.R b/code/script/pecotmr_integration/fine_mapping_pip_plot.R
new file mode 100644
index 00000000..b1aba5fe
--- /dev/null
+++ b/code/script/pecotmr_integration/fine_mapping_pip_plot.R
@@ -0,0 +1,108 @@
+#!/usr/bin/env Rscript
+# fine_mapping_pip_plot.R
+#
+# Render a per-region PIP landscape plot from a QtlFineMappingResult or
+# GwasFineMappingResult RDS (output of fine_mapping.ipynb /
+# multivariate_fine_mapping.ipynb / functional_fine_mapping.ipynb /
+# gwas_rss_fine_mapping.ipynb). One PNG per input RDS, one facet panel
+# per row in the FMR.
+#
+# Inputs:
+#   --input    Path to a FineMappingResult RDS
+#   --output   Output PNG path
+#   --width    PNG width (inches). Default 9.
+#   --height   Per-panel PNG height (inches). Default 3.
+
+suppressPackageStartupMessages({
+  library(argparser)
+  library(pecotmr)
+  library(ggplot2)
+})
+
+parser <- arg_parser("Render a per-region PIP plot from a FineMappingResult RDS")
+parser <- add_argument(parser, "--input",
+                       help = "Path to a FineMappingResult RDS",
+                       type = "character")
+parser <- add_argument(parser, "--output",
+                       help = "Output PNG path", type = "character")
+parser <- add_argument(parser, "--width",
+                       help = "PNG width in inches",
+                       type = "numeric", default = 9)
+parser <- add_argument(parser, "--height",
+                       help = "Per-panel PNG height in inches",
+                       type = "numeric", default = 3)
+argv <- parse_args(parser)
+
+fmr <- readRDS(argv$input)
+if (!methods::is(fmr, "FineMappingResultBase"))
+  stop("--input must be a FineMappingResultBase subclass (got '",
+       class(fmr)[[1L]], "').")
+
+write_empty <- function(msg) {
+  message(msg, "; writing an empty plot to ", argv$output)
+  dir.create(dirname(argv$output), showWarnings = FALSE, recursive = TRUE)
+  png(argv$output, width = argv$width * 100, height = argv$height * 100)
+  plot.new(); title(main = msg)
+  dev.off()
+}
+
+if (nrow(fmr) == 0L) {
+  write_empty("Empty FineMappingResult")
+  quit(save = "no")
+}
+
+# Build a long tidy table of (panel-id, variant_id, pip) using the S4
+# accessors. The panel label varies by FMR shape: GWAS keeps region_id,
+# QTL uses (context, trait).
+isGwas <- methods::is(fmr, "GwasFineMappingResult")
+panels <- lapply(seq_len(nrow(fmr)), function(i) {
+  entry <- fmr$entry[[i]]
+  pip   <- as.numeric(getPip(entry))
+  ids   <- as.character(getVariantIds(entry))
+  if (length(pip) == 0L || length(ids) == 0L) return(NULL)
+  panel <- if (isGwas) {
+    sprintf("%s | %s | %s",
+            as.character(fmr$study)[[i]],
+            as.character(fmr$method)[[i]],
+            as.character(fmr$region_id)[[i]])
+  } else {
+    sprintf("%s | %s | %s | %s",
+            as.character(fmr$study)[[i]],
+            as.character(fmr$context)[[i]],
+            as.character(fmr$trait)[[i]],
+            as.character(fmr$method)[[i]])
+  }
+  data.frame(panel = panel, variant_id = ids, pip = pip,
+             stringsAsFactors = FALSE)
+})
+panels <- panels[!vapply(panels, is.null, logical(1))]
+if (length(panels) == 0L) {
+  write_empty("No usable PIP vectors on any entry")
+  quit(save = "no")
+}
+df <- do.call(rbind, panels)
+
+# Best-effort variant-id → position decode; falls back to row index.
+pos <- suppressWarnings({
+  m <- regmatches(df$variant_id,
+                  regexec("^[^:_]+[:_]([0-9]+)", df$variant_id))
+  vapply(m, function(x) if (length(x) >= 2L) as.numeric(x[[2L]]) else NA_real_,
+         numeric(1L))
+})
+df$pos <- ifelse(is.na(pos), seq_len(nrow(df)), pos)
+
+g <- ggplot(df, aes(x = pos, y = pip)) +
+  geom_point(alpha = 0.6, size = 0.8) +
+  facet_wrap(~ panel, ncol = 1, scales = "free_x") +
+  scale_y_continuous(limits = c(0, 1)) +
+  labs(x = "Variant position", y = "PIP",
+       title = paste0(class(fmr)[[1L]], ": ", basename(argv$input))) +
+  theme_minimal(base_size = 10)
+
+dir.create(dirname(argv$output), showWarnings = FALSE, recursive = TRUE)
+ggsave(argv$output, g,
+       width  = argv$width,
+       height = argv$height * length(unique(df$panel)),
+       dpi    = 150)
+cat(sprintf("Wrote PIP plot for %d panel(s) (%d variants) to %s\n",
+            length(unique(df$panel)), nrow(df), argv$output))
diff --git a/code/script/pecotmr_integration/fine_mapping_upset.R b/code/script/pecotmr_integration/fine_mapping_upset.R
new file mode 100644
index 00000000..4e7b5dcc
--- /dev/null
+++ b/code/script/pecotmr_integration/fine_mapping_upset.R
@@ -0,0 +1,110 @@
+#!/usr/bin/env Rscript
+# fine_mapping_upset.R
+#
+# Render an UpSet plot of credible-set variant overlap across one or more
+# FineMappingResult RDSes. Each (study, context, trait, method, region_id)
+# tuple becomes one set; the set members are the variant IDs in the union
+# of that tuple's credible sets. Useful for visualizing which signals are
+# shared across studies / contexts / traits.
+#
+# Inputs:
+#   --input <RDS> [<RDS> ...]  One or more FineMappingResult RDS paths.
+#   --output <PNG>             Output PNG path.
+#   --max-sets                 Cap on the number of sets to render (UpSet
+#                              gets noisy past ~10). Default 20. Sets are
+#                              ranked by their CS size (descending) and
+#                              the top --max-sets are kept.
+#   --width / --height         PNG dimensions in inches. Defaults 12 x 6.
+
+suppressPackageStartupMessages({
+  library(argparser)
+  library(pecotmr)
+})
+
+parser <- arg_parser("UpSet plot of credible-set variant overlap across FineMappingResult RDSes")
+parser <- add_argument(parser, "--input",
+                       help = "One or more FineMappingResult RDS paths",
+                       type = "character", nargs = Inf)
+parser <- add_argument(parser, "--output",
+                       help = "Output PNG path", type = "character")
+parser <- add_argument(parser, "--max-sets",
+                       help = "Maximum number of sets to render",
+                       type = "integer", default = 20L)
+parser <- add_argument(parser, "--width",
+                       help = "PNG width (inches)",
+                       type = "numeric", default = 12)
+parser <- add_argument(parser, "--height",
+                       help = "PNG height (inches)",
+                       type = "numeric", default = 6)
+argv <- parse_args(parser)
+
+if (!requireNamespace("UpSetR", quietly = TRUE))
+  stop("fine_mapping_upset.R requires the UpSetR package: ",
+       "install.packages('UpSetR')")
+
+inputs <- as.character(argv$input)
+if (length(inputs) == 0L)
+  stop("--input requires at least one RDS path.")
+
+# Walk every (FMR-row x CS) combination and build a named list of
+# {set-label -> variant_ids} that UpSetR consumes via fromList().
+sets <- list()
+for (path in inputs) {
+  fmr <- readRDS(path)
+  if (!methods::is(fmr, "FineMappingResultBase")) {
+    warning("Skipping non-FineMappingResult input: ", path)
+    next
+  }
+  isGwas <- methods::is(fmr, "GwasFineMappingResult")
+  for (i in seq_len(nrow(fmr))) {
+    entry <- fmr$entry[[i]]
+    cs_df <- tryCatch(as.data.frame(getCs(entry)),
+                      error = function(e) NULL)
+    if (is.null(cs_df) || nrow(cs_df) == 0L) next
+    if (!"variant_id" %in% names(cs_df)) next
+    label <- if (isGwas) {
+      sprintf("%s|%s|%s",
+              as.character(fmr$study)[[i]],
+              as.character(fmr$method)[[i]],
+              as.character(fmr$region_id)[[i]])
+    } else {
+      sprintf("%s|%s|%s|%s",
+              as.character(fmr$study)[[i]],
+              as.character(fmr$context)[[i]],
+              as.character(fmr$trait)[[i]],
+              as.character(fmr$method)[[i]])
+    }
+    sets[[label]] <- unique(as.character(cs_df$variant_id))
+  }
+}
+
+if (length(sets) == 0L) {
+  message("No credible sets found across any input; writing an empty plot.")
+  dir.create(dirname(argv$output), showWarnings = FALSE, recursive = TRUE)
+  png(argv$output, width = argv$width * 100, height = argv$height * 100)
+  plot.new(); title(main = "No credible sets in inputs")
+  dev.off()
+  quit(save = "no")
+}
+
+# Rank sets by size descending, cap at --max-sets to keep the plot readable.
+sizes <- vapply(sets, length, integer(1L))
+keep  <- order(sizes, decreasing = TRUE)[seq_len(min(argv$max_sets, length(sets)))]
+sets  <- sets[keep]
+if (length(sets) < length(sizes))
+  message("Rendering top ", argv$max_sets, " set(s) by size; ",
+          length(sizes) - length(sets), " smaller set(s) dropped.")
+
+binary <- UpSetR::fromList(sets)
+
+dir.create(dirname(argv$output), showWarnings = FALSE, recursive = TRUE)
+png(argv$output, width = argv$width * 100, height = argv$height * 100)
+print(UpSetR::upset(
+  binary,
+  nsets   = length(sets),
+  order.by = "freq",
+  mainbar.y.label = "CS-variant intersection size",
+  sets.x.label    = "CS variants per (study, context, trait, method)"))
+dev.off()
+cat(sprintf("Wrote UpSet plot of %d set(s) to %s\n",
+            length(sets), argv$output))
diff --git a/code/script/pecotmr_integration/gwas_rss_manifest.R b/code/script/pecotmr_integration/gwas_rss_manifest.R
new file mode 100644
index 00000000..03ab7d54
--- /dev/null
+++ b/code/script/pecotmr_integration/gwas_rss_manifest.R
@@ -0,0 +1,177 @@
+#!/usr/bin/env Rscript
+# gwas_rss_manifest.R
+#
+# Resolve `gwas_rss_fine_mapping.ipynb`'s per-study GWAS sources (a
+# --gwas-meta TSV and/or --gwas-tsv-list STUDY=PATH items) together with
+# its region inputs (a --region-list BED-ish file and/or --regions
+# chr:start-end strings) into a single per-job manifest TSV. One row per
+# (study x region) cross-product. Downstream SoS steps fan out over this
+# TSV's rows without needing any custom Python parsing.
+#
+# Inputs (zero or more of each input pair; at least one of each pair):
+#   --gwas-meta <TSV>           Optional. Columns: study_id, path,
+#                                column_mapping (optional). Relative
+#                                paths resolve against the meta file's
+#                                own directory.
+#   --gwas-tsv-list STUDY=PATH  Optional. One or more positional items.
+#                                (zero or more)              The notebook may pass each STUDY=PATH item
+#                                                            as a separate positional arg.
+#   --region-list <BED>         Optional. Whitespace-separated #chr start
+#                                end ...; lines starting with '#' or with
+#                                non-numeric start/end are skipped.
+#   --regions chr:start-end     Optional. One or more positional items.
+#                                (zero or more)
+#   --output <TSV>              Output manifest path.
+#
+# Output TSV columns:
+#   study_id, gwas_tsv, column_mapping, chr, start, end, region_id,
+#   gwas_tsv_basename
+#
+# `region_id` is the SoS-safe sanitised region label (`:` and `-` replaced
+# with `_`); the downstream notebook uses it as the per-task ID.
+# `gwas_tsv_basename` is the basename of the TSV without extension — the
+# notebook uses it in the per-task output filename for traceability.
+
+suppressPackageStartupMessages({
+  library(argparser)
+})
+
+parser <- arg_parser("Resolve GWAS sources x regions into a per-task manifest TSV")
+parser <- add_argument(parser, "--gwas-meta",
+                       help = "Optional per-study meta TSV",
+                       type = "character", default = "")
+parser <- add_argument(parser, "--gwas-tsv-list",
+                       help = "Zero or more STUDY=PATH items",
+                       type = "character", nargs = Inf, default = character(0))
+parser <- add_argument(parser, "--region-list",
+                       help = "Optional BED-like region file",
+                       type = "character", default = "")
+parser <- add_argument(parser, "--regions",
+                       help = "Zero or more chr:start-end items",
+                       type = "character", nargs = Inf, default = character(0))
+parser <- add_argument(parser, "--output",
+                       help = "Output manifest TSV path",
+                       type = "character")
+argv <- parse_args(parser)
+
+# ----- Resolve per-study sources ----------------------------------------
+studies <- data.frame(study_id = character(0),
+                      gwas_tsv = character(0),
+                      column_mapping = character(0),
+                      stringsAsFactors = FALSE)
+seenStudies <- character(0)
+
+if (nzchar(argv$gwas_meta) && argv$gwas_meta != ".") {
+  if (!file.exists(argv$gwas_meta))
+    stop("--gwas-meta file not found: ", argv$gwas_meta)
+  meta <- read.table(argv$gwas_meta, header = TRUE, sep = "\t",
+                     stringsAsFactors = FALSE, check.names = FALSE,
+                     comment.char = "")
+  required <- c("study_id", "path")
+  missing <- setdiff(required, names(meta))
+  if (length(missing) > 0L)
+    stop("--gwas-meta missing required column(s): ",
+         paste(missing, collapse = ", "), " (got: ",
+         paste(names(meta), collapse = ", "), ").")
+  metaDir <- dirname(normalizePath(argv$gwas_meta))
+  hasCm <- "column_mapping" %in% names(meta)
+  for (i in seq_len(nrow(meta))) {
+    sid <- as.character(meta$study_id[[i]])
+    tsv <- as.character(meta$path[[i]])
+    if (!startsWith(tsv, "/")) tsv <- file.path(metaDir, tsv)
+    cm  <- if (hasCm) as.character(meta$column_mapping[[i]]) else ""
+    if (!is.na(cm) && nzchar(cm) && !startsWith(cm, "/"))
+      cm <- file.path(metaDir, cm)
+    if (sid %in% seenStudies)
+      stop("Duplicate study_id in --gwas-meta: ", sid)
+    seenStudies <- c(seenStudies, sid)
+    studies <- rbind(studies,
+      data.frame(study_id = sid, gwas_tsv = tsv,
+                 column_mapping = if (is.na(cm)) "" else cm,
+                 stringsAsFactors = FALSE))
+  }
+}
+
+tsvItems <- as.character(argv$gwas_tsv_list)
+tsvItems <- tsvItems[nzchar(tsvItems)]
+for (item in tsvItems) {
+  if (!grepl("=", item, fixed = TRUE))
+    stop("--gwas-tsv-list expects STUDY=PATH items (got: ", item, ").")
+  parts <- regmatches(item, regexec("^([^=]+)=(.+)$", item))[[1L]]
+  if (length(parts) != 3L)
+    stop("Cannot parse --gwas-tsv-list item: ", item)
+  sid <- parts[[2L]]; tsv <- parts[[3L]]
+  if (sid %in% seenStudies)
+    stop("Study '", sid, "' appears in both --gwas-meta and ",
+         "--gwas-tsv-list.")
+  seenStudies <- c(seenStudies, sid)
+  studies <- rbind(studies,
+    data.frame(study_id = sid, gwas_tsv = tsv,
+               column_mapping = "", stringsAsFactors = FALSE))
+}
+if (nrow(studies) == 0L)
+  stop("No GWAS inputs supplied (give --gwas-meta and/or --gwas-tsv-list).")
+
+# ----- Resolve regions --------------------------------------------------
+regions <- data.frame(chr = character(0), start = integer(0),
+                      end = integer(0), stringsAsFactors = FALSE)
+pushRegion <- function(chr, start, end) {
+  if (!startsWith(chr, "chr")) chr <- paste0("chr", chr)
+  start <- as.integer(start); end <- as.integer(end)
+  if (is.na(start) || is.na(end)) return(invisible(NULL))
+  key <- paste(chr, start, end, sep = "|")
+  prior <- paste(regions$chr, regions$start, regions$end, sep = "|")
+  if (key %in% prior) return(invisible(NULL))
+  regions <<- rbind(regions,
+    data.frame(chr = chr, start = start, end = end,
+               stringsAsFactors = FALSE))
+}
+
+if (nzchar(argv$region_list) && argv$region_list != ".") {
+  if (!file.exists(argv$region_list))
+    stop("--region-list file not found: ", argv$region_list)
+  rl <- readLines(argv$region_list)
+  for (line in rl) {
+    line <- trimws(line)
+    if (!nzchar(line) || startsWith(line, "#")) next
+    parts <- strsplit(line, "\\s+")[[1L]]
+    if (length(parts) < 3L) next
+    if (is.na(suppressWarnings(as.integer(parts[[2L]])))) next  # header
+    pushRegion(parts[[1L]], parts[[2L]], parts[[3L]])
+  }
+}
+
+regionItems <- as.character(argv$regions)
+regionItems <- regionItems[nzchar(regionItems)]
+for (r in regionItems) {
+  m <- regmatches(r, regexec("^([^:]+):([0-9]+)-([0-9]+)$", r))[[1L]]
+  if (length(m) != 4L)
+    stop("--regions item must be chr:start-end (got: ", r, ").")
+  pushRegion(m[[2L]], m[[3L]], m[[4L]])
+}
+if (nrow(regions) == 0L)
+  stop("No regions supplied (give --region-list and/or --regions).")
+
+# ----- Build cross-product manifest -------------------------------------
+manifest <- list()
+for (i in seq_len(nrow(studies))) {
+  for (j in seq_len(nrow(regions))) {
+    chr <- regions$chr[[j]]; s <- regions$start[[j]]; e <- regions$end[[j]]
+    region_id <- gsub("[:\\-]", "_", sprintf("%s_%d_%d", chr, s, e))
+    manifest[[length(manifest) + 1L]] <- data.frame(
+      study_id          = studies$study_id[[i]],
+      gwas_tsv          = studies$gwas_tsv[[i]],
+      column_mapping    = studies$column_mapping[[i]],
+      chr               = chr, start = s, end = e,
+      region_id         = region_id,
+      gwas_tsv_basename = tools::file_path_sans_ext(
+                            tools::file_path_sans_ext(basename(studies$gwas_tsv[[i]]))),
+      stringsAsFactors  = FALSE)
+  }
+}
+out <- do.call(rbind, manifest)
+dir.create(dirname(argv$output), showWarnings = FALSE, recursive = TRUE)
+write.table(out, file = argv$output, sep = "\t", quote = FALSE,
+            row.names = FALSE, na = "")
+cat(sprintf("Wrote manifest with %d row(s) (%d studies x %d regions) to %s\n",
+            nrow(out), nrow(studies), nrow(regions), argv$output))
diff --git a/code/script/pecotmr_integration/gwas_rss_plot.R b/code/script/pecotmr_integration/gwas_rss_plot.R
new file mode 100644
index 00000000..aa1f5367
--- /dev/null
+++ b/code/script/pecotmr_integration/gwas_rss_plot.R
@@ -0,0 +1,99 @@
+#!/usr/bin/env Rscript
+# gwas_rss_plot.R
+#
+# Render a per-region PIP plot from a pecotmr::GwasFineMappingResult RDS
+# (output of fine_mapping.R --gwas-sumstats). One PNG per input RDS;
+# each row in the FMR produces one panel (study × method × region_id).
+# Replaces the legacy pipeline/rss_analysis.ipynb [univariate_plot] step.
+#
+# Inputs:
+#   --input    Path to a GwasFineMappingResult RDS
+#   --output   Output PNG path
+#   --width    PNG width (inches). Default 9.
+#   --height   PNG height (inches). Default 3 per panel.
+
+suppressPackageStartupMessages({
+  library(argparser)
+  library(pecotmr)
+  library(ggplot2)
+})
+
+parser <- arg_parser("Render a PIP plot from a GwasFineMappingResult RDS")
+parser <- add_argument(parser, "--input",
+                       help = "Path to a GwasFineMappingResult RDS",
+                       type = "character")
+parser <- add_argument(parser, "--output",
+                       help = "Output PNG path", type = "character")
+parser <- add_argument(parser, "--width",
+                       help = "PNG width in inches",
+                       type = "numeric", default = 9)
+parser <- add_argument(parser, "--height",
+                       help = "Per-panel PNG height in inches",
+                       type = "numeric", default = 3)
+argv <- parse_args(parser)
+
+fmr <- readRDS(argv$input)
+if (!methods::is(fmr, "GwasFineMappingResult"))
+  stop("--input must be a GwasFineMappingResult (got '",
+       class(fmr)[[1L]], "').")
+if (nrow(fmr) == 0L) {
+  message("Empty GwasFineMappingResult; writing an empty plot to ", argv$output)
+  png(argv$output, width = argv$width * 100, height = argv$height * 100)
+  plot.new(); title(main = "No fine-mapping rows in input")
+  dev.off()
+  quit(save = "no")
+}
+
+# Build a tidy long table of (study, method, region_id, variant_id, pip)
+# straight from the S4 accessors; one panel per row of the FMR.
+panels <- lapply(seq_len(nrow(fmr)), function(i) {
+  entry  <- fmr$entry[[i]]
+  pip    <- as.numeric(getPip(entry))
+  ids    <- as.character(getVariantIds(entry))
+  if (length(pip) == 0L || length(ids) == 0L) return(NULL)
+  data.frame(
+    study      = as.character(fmr$study)[[i]],
+    method     = as.character(fmr$method)[[i]],
+    region_id  = as.character(fmr$region_id)[[i]],
+    variant_id = ids,
+    pip        = pip,
+    stringsAsFactors = FALSE)
+})
+panels <- panels[!vapply(panels, is.null, logical(1))]
+if (length(panels) == 0L) {
+  message("No usable PIP vectors on any entry; writing an empty plot to ",
+          argv$output)
+  png(argv$output, width = argv$width * 100, height = argv$height * 100)
+  plot.new(); title(main = "No PIP vectors in input")
+  dev.off()
+  quit(save = "no")
+}
+df <- do.call(rbind, panels)
+
+# Extract a numeric pos when variant_id looks like "chrN:pos:..." or
+# "chrN_pos_..."; otherwise plot against the row index. This is a
+# best-effort cosmetic decode, not a hard requirement.
+pos <- suppressWarnings({
+  m <- regmatches(df$variant_id,
+                  regexec("^[^:_]+[:_]([0-9]+)", df$variant_id))
+  vapply(m, function(x) if (length(x) >= 2L) as.numeric(x[[2L]]) else NA_real_,
+         numeric(1L))
+})
+df$pos <- ifelse(is.na(pos), seq_len(nrow(df)), pos)
+df$panel <- paste(df$study, df$method, df$region_id, sep = " | ")
+
+g <- ggplot(df, aes(x = pos, y = pip)) +
+  geom_point(alpha = 0.6, size = 0.8) +
+  facet_wrap(~ panel, ncol = 1, scales = "free_x") +
+  scale_y_continuous(limits = c(0, 1)) +
+  labs(x = "Variant position", y = "PIP",
+       title = paste0("GWAS RSS fine-mapping: ", basename(argv$input))) +
+  theme_minimal(base_size = 10)
+
+dir.create(dirname(argv$output), showWarnings = FALSE, recursive = TRUE)
+ggsave(argv$output, g,
+       width  = argv$width,
+       height = argv$height * length(unique(df$panel)),
+       dpi    = 150)
+cat(sprintf("Wrote PIP plot for %d panel(s) (%d total variants) to %s\n",
+            length(unique(df$panel)), nrow(df), argv$output))
diff --git a/code/script/pecotmr_integration/gwas_sumstats_construct.R b/code/script/pecotmr_integration/gwas_sumstats_construct.R
index 04f505c7..2abdd353 100644
--- a/code/script/pecotmr_integration/gwas_sumstats_construct.R
+++ b/code/script/pecotmr_integration/gwas_sumstats_construct.R
@@ -10,13 +10,32 @@
 # Inputs:
 #   --study           Single study identifier
 #   --gwas-tsv        Path to a tabix-indexed (or plain) GWAS TSV
-#                     Standard columns (with aliases):
+#                     Standard columns (with hardcoded aliases when no
+#                     --column-mapping is supplied):
 #                       #chrom|chrom|chr, pos, variant_id|SNP,
 #                       A1, A2, z|Z, n_sample|N
 #                     Optional: effect_allele_frequency, p, beta, se
+#   --column-mapping  Optional YAML file mapping standard column names
+#                     to this study's actual column names. Keys are
+#                     the standard names (chrom, pos, variant_id, A1,
+#                     A2, z, n_sample, and optionally beta, se, p, maf,
+#                     info); values are the column name as it appears
+#                     in the TSV. Required entries: chrom, pos,
+#                     variant_id, A1, A2, z, n_sample. Used in place
+#                     of the hardcoded alias fallback.
 #   --ld-block        Genomic interval as chr:start-end (one LD block)
 #   --ld-meta         LD-meta TSV (#chr, start, end, path)
 #   --genome          Genome build label (e.g. "GRCh38"). Default "GRCh38"
+#   --qc-method       LD-mismatch QC method passed to
+#                     summaryStatsQc(zMismatchQc = ...). One of "none"
+#                     (default), "slalom", "dentist".
+#   --impute          Flag: run RAISS sumstat imputation in
+#                     summaryStatsQc(impute = TRUE) against the LD
+#                     sketch.
+#   --qc-args         Optional JSON object of extra named kwargs
+#                     spliced into summaryStatsQc(). Anything in the
+#                     summaryStatsQc() signature is reachable here
+#                     (e.g. '{"mafCutoff":0.01,"imputeOpts":{"rcond":0.005}}').
 #   --output          Output RDS path
 
 suppressPackageStartupMessages({
@@ -25,6 +44,8 @@ suppressPackageStartupMessages({
   library(GenomicRanges)
   library(IRanges)
   library(S4Vectors)
+  library(jsonlite)
+  library(yaml)
 })
 
 parser <- arg_parser("Build a per-LD-block GwasSumStats RDS")
@@ -43,6 +64,18 @@ parser <- add_argument(parser, "--ld-meta",
 parser <- add_argument(parser, "--genome",
                        help = "Genome build label",
                        type = "character", default = "GRCh38")
+parser <- add_argument(parser, "--column-mapping",
+                       help = "Optional YAML file mapping standard column names to this study's column names",
+                       type = "character", default = "")
+parser <- add_argument(parser, "--qc-method",
+                       help = "LD-mismatch QC: 'none' (default), 'slalom', or 'dentist'",
+                       type = "character", default = "none")
+parser <- add_argument(parser, "--impute",
+                       help = "Enable RAISS sumstat imputation in summaryStatsQc()",
+                       flag = TRUE)
+parser <- add_argument(parser, "--qc-args",
+                       help = "JSON object of extra named kwargs for summaryStatsQc()",
+                       type = "character", default = "")
 parser <- add_argument(parser, "--skip-qc",
                        help = "Skip summaryStatsQc() (debug-only; the output GwasSumStats will have qcInfo = list() and fineMappingPipeline / twasWeightsPipeline will reject it)",
                        flag = TRUE)
@@ -50,6 +83,43 @@ parser <- add_argument(parser, "--output",
                        help = "Output RDS path", type = "character")
 argv <- parse_args(parser)
 
+# ----- Resolve column-name lookup -------------------------------------------
+# Either pull from a user-supplied YAML mapping (per-study explicit names)
+# or fall back to the hardcoded alias lists. The mapping must cover every
+# required standard name; optional names (BETA/SE/P/MAF/INFO) pass through
+# when present, regardless of source.
+column_mapping <- if (nzchar(argv$column_mapping) && argv$column_mapping != ".") {
+  if (!file.exists(argv$column_mapping))
+    stop("--column-mapping file not found: ", argv$column_mapping)
+  yaml::read_yaml(argv$column_mapping)
+} else {
+  NULL
+}
+
+# ----- Parse --qc-args JSON -------------------------------------------------
+qc_extra <- if (nzchar(argv$qc_args) && argv$qc_args != "." &&
+                argv$qc_args != "{}") {
+  parsed <- tryCatch(jsonlite::fromJSON(argv$qc_args, simplifyVector = FALSE),
+                     error = function(e) stop(
+                       "--qc-args must be a JSON object string (got: ",
+                       argv$qc_args, "). Error: ", conditionMessage(e)))
+  if (!is.list(parsed) || is.null(names(parsed)) || any(names(parsed) == ""))
+    stop("--qc-args must be a JSON object with named fields, e.g. ",
+         '\'{"mafCutoff":0.01,"removeStrandAmbiguous":false}\'.')
+  parsed
+} else {
+  list()
+}
+# Reject collisions between explicit flags and --qc-args (zMismatchQc /
+# impute) to avoid silent override surprises.
+clash <- intersect(names(qc_extra), c("zMismatchQc", "impute"))
+if (length(clash) > 0L)
+  stop("--qc-args sets ", paste(clash, collapse = ", "),
+       " which is also controlled by a dedicated flag (--qc-method / ",
+       "--impute). Pass it via the dedicated flag.")
+
+qc_method <- match.arg(argv$qc_method, c("none", "slalom", "dentist"))
+
 parse_region <- function(s) {
   m <- regmatches(s, regexec("^([^:]+):([0-9]+)-([0-9]+)$", s))[[1L]]
   if (length(m) != 4L)
@@ -67,17 +137,33 @@ gwas <- read.table(if (grepl("\\.gz$", argv$gwas_tsv)) gzfile(argv$gwas_tsv)
                    comment.char = "")
 
 pick <- function(opts, where) intersect(opts, names(where))[1L]
-chr_col <- pick(c("#chrom", "chrom", "chr"), gwas)
-pos_col <- pick(c("pos", "position", "BP"), gwas)
-snp_col <- pick(c("variant_id", "SNP", "rsid"), gwas)
-a1_col  <- pick(c("A1", "a1"), gwas)
-a2_col  <- pick(c("A2", "a2"), gwas)
-z_col   <- pick(c("z", "Z"), gwas)
-n_col   <- pick(c("n_sample", "N", "n"), gwas)
+
+# When the user provides a column mapping, use it directly; otherwise
+# fall back to the hardcoded alias lists below.
+resolve_col <- function(std, fallback) {
+  if (!is.null(column_mapping) && !is.null(column_mapping[[std]])) {
+    named <- column_mapping[[std]]
+    if (!(named %in% names(gwas)))
+      stop("--column-mapping['", std, "'] = '", named,
+           "' is not a column in ", argv$gwas_tsv)
+    return(named)
+  }
+  pick(fallback, gwas)
+}
+chr_col <- resolve_col("chrom",      c("#chrom", "chrom", "chr"))
+pos_col <- resolve_col("pos",        c("pos", "position", "BP"))
+snp_col <- resolve_col("variant_id", c("variant_id", "SNP", "rsid"))
+a1_col  <- resolve_col("A1",         c("A1", "a1"))
+a2_col  <- resolve_col("A2",         c("A2", "a2"))
+z_col   <- resolve_col("z",          c("z", "Z"))
+n_col   <- resolve_col("n_sample",   c("n_sample", "N", "n"))
 
 if (any(is.na(c(chr_col, pos_col, snp_col, a1_col, a2_col, z_col, n_col))))
   stop("--gwas-tsv missing one of required columns ",
-       "(chrom/pos/variant_id/A1/A2/z/n_sample) in: ", argv$gwas_tsv)
+       "(chrom/pos/variant_id/A1/A2/z/n_sample) in: ", argv$gwas_tsv,
+       if (!is.null(column_mapping))
+         " — check that every required key is present in --column-mapping"
+       else "")
 
 # Normalise chromosome label to match the LD block's
 chrom_vals <- as.character(gwas[[chr_col]])
@@ -179,7 +265,11 @@ gss_out <- if (argv$skip_qc) {
   message("--skip-qc set; serialising raw GwasSumStats without summaryStatsQc().")
   gss
 } else {
-  summaryStatsQc(gss)
+  qc_call_args <- c(list(gss,
+                          zMismatchQc = qc_method,
+                          impute      = isTRUE(argv$impute)),
+                    qc_extra)
+  do.call(summaryStatsQc, qc_call_args)
 }
 
 dir.create(dirname(argv$output), showWarnings = FALSE, recursive = TRUE)
diff --git a/code/script/pecotmr_integration/ld_blocks_to_manifest.R b/code/script/pecotmr_integration/ld_blocks_to_manifest.R
new file mode 100644
index 00000000..efe5b788
--- /dev/null
+++ b/code/script/pecotmr_integration/ld_blocks_to_manifest.R
@@ -0,0 +1,70 @@
+#!/usr/bin/env Rscript
+# ld_blocks_to_manifest.R
+#
+# Parse a BED-like LD-block file (`#chr`/`chrom` + `start` + `stop`/`end`
+# columns; arbitrary leading-`#` header line) into a 2-column manifest
+# TSV (`region`, `region_id`). The `region` is the canonical
+# `chr:start-end` string; `region_id` is the SoS-safe sanitised form
+# (`:` and `-` replaced with `_`) used as the per-task ID downstream.
+#
+# Inputs:
+#   --ld-blocks <BED>   Path to the LD-blocks BED.
+#   --output <TSV>      Output manifest TSV path.
+
+suppressPackageStartupMessages({
+  library(argparser)
+})
+
+parser <- arg_parser("Convert an LD-blocks BED into a region/region_id manifest")
+parser <- add_argument(parser, "--ld-blocks",
+                       help = "BED-like LD-block file",
+                       type = "character")
+parser <- add_argument(parser, "--output",
+                       help = "Output manifest TSV path",
+                       type = "character")
+argv <- parse_args(parser)
+
+if (!file.exists(argv$ld_blocks))
+  stop("--ld-blocks file not found: ", argv$ld_blocks)
+
+lines <- readLines(argv$ld_blocks)
+if (length(lines) == 0L)
+  stop("Empty --ld-blocks file: ", argv$ld_blocks)
+
+# Header: strip a leading '#' if present, split on whitespace.
+hdr <- strsplit(sub("^#", "", lines[[1L]]), "\\s+", perl = TRUE)[[1L]]
+chrCol  <- intersect(c("chr", "chrom"), hdr)[1L]
+startCol <- intersect(c("start", "Start"), hdr)[1L]
+endCol  <- intersect(c("stop", "end", "End"), hdr)[1L]
+if (is.na(chrCol) || is.na(startCol) || is.na(endCol))
+  stop("--ld-blocks header missing chr / start / stop|end columns; got: ",
+       paste(hdr, collapse = ", "))
+chrIx   <- match(chrCol,  hdr)
+startIx <- match(startCol, hdr)
+endIx   <- match(endCol,   hdr)
+
+rows <- list()
+for (l in lines[-1L]) {
+  l <- trimws(l)
+  if (!nzchar(l) || startsWith(l, "#")) next
+  parts <- strsplit(l, "\\s+", perl = TRUE)[[1L]]
+  if (length(parts) < max(chrIx, startIx, endIx)) next
+  chrom <- parts[[chrIx]]
+  start <- suppressWarnings(as.integer(parts[[startIx]]))
+  end   <- suppressWarnings(as.integer(parts[[endIx]]))
+  if (is.na(start) || is.na(end)) next
+  region <- sprintf("%s:%d-%d", chrom, start, end)
+  region_id <- gsub("[:\\-]", "_", region)
+  rows[[length(rows) + 1L]] <- data.frame(
+    region = region, region_id = region_id,
+    stringsAsFactors = FALSE)
+}
+if (length(rows) == 0L)
+  stop("No data rows parsed from --ld-blocks ", argv$ld_blocks)
+
+out <- do.call(rbind, rows)
+dir.create(dirname(argv$output), showWarnings = FALSE, recursive = TRUE)
+write.table(out, file = argv$output, sep = "\t", quote = FALSE,
+            row.names = FALSE, na = "")
+cat(sprintf("Wrote LD-block manifest with %d row(s) to %s\n",
+            nrow(out), argv$output))
diff --git a/code/script/pecotmr_integration/legacy_mash_input_to_sumstatslist.R b/code/script/pecotmr_integration/legacy_mash_input_to_sumstatslist.R
new file mode 100644
index 00000000..ee03b1b0
--- /dev/null
+++ b/code/script/pecotmr_integration/legacy_mash_input_to_sumstatslist.R
@@ -0,0 +1,149 @@
+#!/usr/bin/env Rscript
+# legacy_mash_input_to_sumstatslist.R
+#
+# Standalone one-shot converter: reshape a pre-baked legacy MASH MWE
+# input file (`protocol_example.mash_input.rds`, the partitioned output
+# of the retired `susie_to_mash` workflow) into the named list of
+# QtlSumStats objects that `pecotmr::mashPipeline()` accepts directly.
+#
+# Not part of the production pipeline. Use this once to materialise a
+# `sumStatsList` RDS from the legacy MWE fixture so you can manually
+# smoke `mashPipeline()` end-to-end, e.g.:
+#
+#   Rscript legacy_mash_input_to_sumstatslist.R \
+#       --legacy-mash-input input/mash_preprocessing/protocol_example.mash_input.rds \
+#       --ld-sketch path/to/ld_sketch.rds \
+#       --study protocol_example \
+#       --output /tmp/sumstatslist.rds
+#
+#   Rscript -e 'suppressPackageStartupMessages(library(pecotmr));
+#               saveRDS(mashPipeline(readRDS("/tmp/sumstatslist.rds"), alpha = 0),
+#                       "/tmp/mash_result.rds")'
+#
+# Input shape (verified against the shipped MWE fixture):
+#   list(
+#     strong.z, strong.b, strong.s,   # matrices [n_variants_strong x n_conditions]
+#     random.z, random.b, random.s,   # matrices [n_variants_random x n_conditions]
+#     null.z,   null.b,   null.s,     # matrices [n_variants_null   x n_conditions]
+#     ZtZ)
+#
+# Output shape (one entry per partition, ready to splice into mashPipeline):
+#   list(
+#     strong = QtlSumStats(...),
+#     random = QtlSumStats(...),
+#     null   = QtlSumStats(...))   # null is optional
+#
+# Each QtlSumStats has one row per condition; the entry GRanges carries
+# the partition's z/beta/se vectors as mcols (with a synthesised
+# variant_id when the legacy matrices have no rownames). qcInfo is
+# pre-stamped non-empty so mashPipeline()'s QC gate accepts the result
+# without re-running summaryStatsQc.
+#
+# Inputs:
+#   --legacy-mash-input <RDS>   The pre-baked mash_input.rds.
+#   --ld-sketch <RDS>           GenotypeHandle RDS (required slot on
+#                                QtlSumStats; never read by mashPipeline,
+#                                only validated).
+#   --study                     Study label for the synthesised
+#                                QtlSumStats. Default "study".
+#   --output <RDS>              Output sumStatsList RDS path.
+
+suppressPackageStartupMessages({
+  library(argparser)
+  library(pecotmr)
+  library(GenomicRanges)
+  library(IRanges)
+  library(S4Vectors)
+})
+
+parser <- arg_parser("Convert legacy pre-baked mash_input.rds to a mashPipeline sumStatsList")
+parser <- add_argument(parser, "--legacy-mash-input",
+                       help = "Pre-baked MASH input RDS (strong/random/null .z/.b/.s matrices)",
+                       type = "character")
+parser <- add_argument(parser, "--ld-sketch",
+                       help = "GenotypeHandle RDS to attach to the QtlSumStats",
+                       type = "character")
+parser <- add_argument(parser, "--study",
+                       help = "Study label", type = "character", default = "study")
+parser <- add_argument(parser, "--output",
+                       help = "Output sumStatsList RDS path", type = "character")
+argv <- parse_args(parser)
+
+if (!file.exists(argv$legacy_mash_input))
+  stop("--legacy-mash-input file not found: ", argv$legacy_mash_input)
+if (!file.exists(argv$ld_sketch))
+  stop("--ld-sketch file not found: ", argv$ld_sketch)
+
+ld_handle <- readRDS(argv$ld_sketch)
+if (!methods::is(ld_handle, "GenotypeHandle"))
+  stop("--ld-sketch must deserialise to a GenotypeHandle (got '",
+       class(ld_handle)[[1L]], "').")
+
+mi <- readRDS(argv$legacy_mash_input)
+required <- c("strong.z", "random.z")
+missing  <- setdiff(required, names(mi))
+if (length(missing) > 0L)
+  stop("Pre-baked MASH input missing required component(s): ",
+       paste(missing, collapse = ", "),
+       " (got: ", paste(names(mi), collapse = ", "), ").")
+
+# Build a single QtlSumStats from one partition's matrices. Each
+# condition becomes one row of the collection; its entry GRanges holds
+# the partition's variants with Z/BETA/SE/N mcols pulled column-wise.
+toQss <- function(partition, role) {
+  zMat <- partition[["z"]]; bMat <- partition[["b"]]; sMat <- partition[["s"]]
+  if (is.null(zMat) || nrow(zMat) == 0L || ncol(zMat) == 0L) return(NULL)
+  vids <- rownames(zMat)
+  if (is.null(vids))
+    vids <- sprintf("var_%s_%05d", role, seq_len(nrow(zMat)))
+  contexts <- colnames(zMat)
+  if (is.null(contexts))
+    stop("Partition '", role, "' has no column names; cannot infer conditions.")
+  entries <- lapply(seq_along(contexts), function(j) {
+    mcols_df <- DataFrame(
+      SNP = vids,
+      A1  = rep("A", length(vids)),
+      A2  = rep("G", length(vids)),
+      Z   = as.numeric(zMat[, j]),
+      N   = rep(1000L, length(vids)))
+    if (!is.null(bMat)) mcols_df$BETA <- as.numeric(bMat[, j])
+    if (!is.null(sMat)) mcols_df$SE   <- as.numeric(sMat[, j])
+    gr <- GRanges(seqnames = rep("chr1", length(vids)),
+                  ranges   = IRanges(start = seq_along(vids), width = 1L))
+    mcols(gr) <- mcols_df
+    gr
+  })
+  QtlSumStats(
+    study    = rep(argv$study, length(contexts)),
+    context  = contexts,
+    trait    = rep("mash", length(contexts)),
+    entry    = entries,
+    genome   = "GRCh38",
+    ldSketch = ld_handle,
+    qcInfo   = list(source = "legacy_mash_input_to_sumstatslist",
+                    role   = role,
+                    entryAudit = vector("list", length(contexts))))
+}
+
+# Pull the three (or two) partitions out of the flat key naming
+# (`<role>.z` / `<role>.b` / `<role>.s`).
+partitionOf <- function(role) {
+  out <- list(z = mi[[paste0(role, ".z")]],
+              b = mi[[paste0(role, ".b")]],
+              s = mi[[paste0(role, ".s")]])
+  if (is.null(out$z)) NULL else out
+}
+
+sumStatsList <- list(
+  strong = toQss(partitionOf("strong"), "strong"),
+  random = toQss(partitionOf("random"), "random"))
+nullPart <- partitionOf("null")
+if (!is.null(nullPart)) sumStatsList$null <- toQss(nullPart, "null")
+
+dir.create(dirname(argv$output), showWarnings = FALSE, recursive = TRUE)
+saveRDS(sumStatsList, argv$output, compress = "xz")
+cat(sprintf(paste0("Wrote sumStatsList for mashPipeline (strong=%d, random=%d, ",
+                   "null=%s rows x %d conditions) to %s\n"),
+            nrow(mi$strong.z), nrow(mi$random.z),
+            if (is.null(mi$null.z)) "none" else as.character(nrow(mi$null.z)),
+            ncol(mi$strong.z), argv$output))
diff --git a/code/script/pecotmr_integration/legacy_sumstats_db_to_mash.R b/code/script/pecotmr_integration/legacy_sumstats_db_to_mash.R
new file mode 100644
index 00000000..69433c81
--- /dev/null
+++ b/code/script/pecotmr_integration/legacy_sumstats_db_to_mash.R
@@ -0,0 +1,173 @@
+#!/usr/bin/env Rscript
+# legacy_sumstats_db_to_mash.R
+#
+# Convert the legacy `protocol_example.sumstats_db.rds` MASH fixture
+# (nested list keyed by condition -> region -> {variant_names, sumstats})
+# into the inputs the new `pecotmr_integration/mash_preprocessing.ipynb`
+# expects:
+#
+#   * per-condition tensorqtl-style TSVs (one per chromosome, gzipped),
+#     filename embeds the chrom so the notebook's `_match_paths_for_chrom`
+#     helper finds them.
+#   * one `<output-dir>/{condition}.sum_list.txt` per condition listing
+#     its per-chrom TSV paths (one line per chrom).
+#   * a `<output-dir>/regions.tsv` file with columns `chr start end gene_id`
+#     (one row per region present in the fixture).
+#   * a `<output-dir>/ld_sketch.rds` GenotypeHandle built from a supplied
+#     PLINK2 prefix.
+#
+# Inputs:
+#   --legacy-rds <RDS>     Path to protocol_example.sumstats_db.rds.
+#   --plink2-prefix <PFX>  PLINK2 pgen/pvar/psam prefix (no extension)
+#                          covering the variants in the fixture; the
+#                          new pipeline needs a GenotypeHandle to feed
+#                          mashPipeline()'s QC gate.
+#   --output-dir <DIR>     Output directory (created if absent).
+
+suppressPackageStartupMessages({
+  library(argparser)
+  library(pecotmr)
+})
+
+parser <- arg_parser("Convert the legacy sumstats_db.rds fixture to new-pipeline MASH inputs")
+parser <- add_argument(parser, "--legacy-rds",
+                       help = "Path to the legacy protocol_example.sumstats_db.rds",
+                       type = "character")
+parser <- add_argument(parser, "--plink2-prefix",
+                       help = "PLINK2 pgen/pvar/psam prefix for the LD sketch",
+                       type = "character")
+parser <- add_argument(parser, "--output-dir",
+                       help = "Output directory",
+                       type = "character")
+argv <- parse_args(parser)
+
+if (!file.exists(argv$legacy_rds))
+  stop("--legacy-rds file not found: ", argv$legacy_rds)
+for (ext in c(".pgen", ".pvar", ".psam")) {
+  if (!file.exists(paste0(argv$plink2_prefix, ext)))
+    stop("PLINK2 file missing: ", paste0(argv$plink2_prefix, ext))
+}
+dir.create(argv$output_dir, showWarnings = FALSE, recursive = TRUE)
+
+db <- readRDS(argv$legacy_rds)
+if (!is.list(db) || length(db) == 0L)
+  stop("Legacy RDS is not a non-empty list (got ", class(db)[[1L]],
+       ", length ", length(db), ").")
+
+conditions <- names(db)
+if (any(nchar(conditions) == 0L))
+  stop("Every condition in the legacy RDS must be named.")
+
+# ----- Pass 1: enumerate regions present across all conditions ----------
+parseRegion <- function(s) {
+  m <- regmatches(s, regexec("^([^:]+):([0-9]+)-([0-9]+)$", s))[[1L]]
+  if (length(m) != 4L)
+    stop("Cannot parse legacy region label '", s,
+         "'; expected 'chr:start-end'.")
+  list(chr = m[[2L]], start = as.integer(m[[3L]]),
+       end = as.integer(m[[4L]]))
+}
+
+regionSet <- list()
+for (cond in conditions) {
+  for (rid in names(db[[cond]])) {
+    if (is.null(regionSet[[rid]])) regionSet[[rid]] <- parseRegion(rid)
+  }
+}
+if (length(regionSet) == 0L)
+  stop("No regions found across any condition in the legacy RDS.")
+
+# Pre-compute per-condition variant_id chromosomes so we can group
+# per-chrom TSV writes. Variant IDs are of the form 'chr22:15528227:A:G'.
+chromOf <- function(vids) {
+  sub("^([^:]+):.*", "\\1", as.character(vids))
+}
+
+# ----- Pass 2: write per-condition per-chrom TSVs -----------------------
+sumListPaths <- character(length(conditions))
+names(sumListPaths) <- conditions
+for (cond in conditions) {
+  # Concatenate every region of this condition into a single data frame
+  # (variant_id, z), then split by chromosome and write one TSV per chrom.
+  perRegion <- db[[cond]]
+  pieces <- list()
+  for (rid in names(perRegion)) {
+    rd <- perRegion[[rid]]
+    if (is.null(rd$variant_names) || is.null(rd$sumstats) ||
+        is.null(rd$sumstats$z)) {
+      warning("Skipping region '", rid, "' in condition '", cond,
+              "': missing variant_names or sumstats$z.")
+      next
+    }
+    vids <- as.character(rd$variant_names)
+    z    <- as.numeric(rd$sumstats$z)
+    if (length(vids) != length(z))
+      stop("Length mismatch in condition '", cond,
+           "', region '", rid, "': ", length(vids), " variants vs ",
+           length(z), " z-scores.")
+    pieces[[length(pieces) + 1L]] <- data.frame(
+      variant_id = vids, z = z, stringsAsFactors = FALSE)
+  }
+  if (length(pieces) == 0L) {
+    warning("Condition '", cond, "' produced no rows; skipping.")
+    sumListPaths[[cond]] <- ""
+    next
+  }
+  df <- do.call(rbind, pieces)
+  df <- df[!duplicated(df$variant_id), , drop = FALSE]
+
+  chroms <- chromOf(df$variant_id)
+  uchroms <- sort(unique(chroms))
+  perCondPaths <- character(length(uchroms))
+  names(perCondPaths) <- uchroms
+  for (chr in uchroms) {
+    sub <- df[chroms == chr, , drop = FALSE]
+    # Embed the bare chromosome number in the filename (matching the
+    # notebook's `.{chr_number}.` matching convention).
+    chrNum <- sub("^chr", "", chr)
+    outPath <- file.path(argv$output_dir,
+                         sprintf("%s.%s.tsv.gz", cond, chrNum))
+    gz <- gzfile(outPath, "w")
+    write.table(sub, file = gz, sep = "\t", quote = FALSE,
+                row.names = FALSE, na = "")
+    close(gz)
+    perCondPaths[[chr]] <- outPath
+  }
+  sumListPath <- file.path(argv$output_dir,
+                           sprintf("%s.sum_list.txt", cond))
+  writeLines(perCondPaths, sumListPath)
+  sumListPaths[[cond]] <- sumListPath
+  cat(sprintf("Wrote %d TSV(s) for condition '%s' (%d total variants)\n",
+              length(uchroms), cond, nrow(df)))
+}
+
+# ----- Region TSV -------------------------------------------------------
+regionsTsv <- file.path(argv$output_dir, "regions.tsv")
+regionRows <- do.call(rbind, lapply(names(regionSet), function(rid) {
+  r <- regionSet[[rid]]
+  # Synthesise a stable gene_id from the region label so the notebook can
+  # use it as a per-region filename token.
+  geneId <- gsub("[:\\-]", "_", rid)
+  data.frame(chr = r$chr, start = r$start, end = r$end,
+             gene_id = geneId, stringsAsFactors = FALSE)
+}))
+write.table(regionRows, file = regionsTsv, sep = "\t",
+            quote = FALSE, row.names = FALSE)
+cat(sprintf("Wrote regions TSV (%d rows) to %s\n",
+            nrow(regionRows), regionsTsv))
+
+# ----- LD-sketch GenotypeHandle ----------------------------------------
+ldHandle <- GenotypeHandle(plink2Prefix = argv$plink2_prefix)
+ldOut <- file.path(argv$output_dir, "ld_sketch.rds")
+saveRDS(ldHandle, ldOut)
+cat(sprintf("Wrote GenotypeHandle (path '%s') to %s\n",
+            argv$plink2_prefix, ldOut))
+
+# Print a tiny manifest summary so the user knows what to pass downstream.
+cat("\n--- Conversion summary ---\n")
+cat("output dir:        ", argv$output_dir, "\n", sep = "")
+cat("conditions:        ", paste(conditions, collapse = ", "), "\n", sep = "")
+cat("--conditions arg:  ", paste(conditions, collapse = ","), "\n", sep = "")
+cat("--sum-files args:  ", paste(unname(sumListPaths), collapse = " "), "\n", sep = "")
+cat("--region-file arg: ", regionsTsv, "\n", sep = "")
+cat("--ld-sketch arg:   ", ldOut, "\n", sep = "")
diff --git a/code/script/pecotmr_integration/mash.R b/code/script/pecotmr_integration/mash.R
new file mode 100644
index 00000000..975466cc
--- /dev/null
+++ b/code/script/pecotmr_integration/mash.R
@@ -0,0 +1,241 @@
+#!/usr/bin/env Rscript
+# mash.R
+#
+# Estimate the MASH mixture-component covariance + weights via
+# pecotmr::mashPipeline(). Consumes the per-region RDSes produced by
+# mash_sumstats_construct.R (one per LD-block region), partitions into
+# strong / random / null subsets via pecotmr::mashRandNullSample, wraps
+# each partition as a per-context QtlSumStats collection (with a
+# pass-through QC record), and calls mashPipeline.
+#
+# Inputs:
+#   --mash-inputs f1.rds [f2.rds ...]   Per-region RDSes (each is
+#                                        list(region_id = list(z, region))).
+#   --study                              Study label for the synthesised
+#                                        QtlSumStats. Default "study".
+#   --ld-sketch <RDS>                    GenotypeHandle RDS to embed in
+#                                        the synthesised QtlSumStats
+#                                        (required by mashPipeline).
+#   --n-random / --n-null                Random / null subset sizes
+#                                        passed to mashRandNullSample.
+#                                        Defaults 4000 / 4000.
+#   --exclude-condition c1,c2,...        Conditions to drop from the
+#                                        per-context QtlSumStats before
+#                                        running MASH. Default none.
+#   --alpha                              alpha argument to mashPipeline().
+#                                        Default 0 (standard scale).
+#   --seed                               RNG seed. Default 999.
+#   --vhat-rds <RDS>                     Optional pre-computed residual
+#                                        correlation matrix (vhat). When
+#                                        supplied, replaces
+#                                        estimate_null_correlation_simple()
+#                                        inside mashPipeline; the wrapper
+#                                        readRDS()s the file and passes
+#                                        the matrix as an in-memory R
+#                                        object (no I/O inside pecotmr).
+#   --prior-rds <RDS>                    Optional pre-computed prior
+#                                        covariance matrices. The RDS is
+#                                        expected to contain either a
+#                                        named list of square matrices
+#                                        directly, OR a list with a `$U`
+#                                        slot holding that list (the
+#                                        legacy `protocol_example.EE.prior.rds`
+#                                        shape). The wrapper extracts and
+#                                        forwards the U list to
+#                                        mashPipeline's `priorCovariances`
+#                                        argument.
+#   --output <RDS>                       MASH result RDS (U + w + meta).
+
+suppressPackageStartupMessages({
+  library(argparser)
+  library(pecotmr)
+  library(GenomicRanges)
+  library(IRanges)
+  library(S4Vectors)
+})
+
+parser <- arg_parser("Run pecotmr::mashPipeline on per-region MASH inputs")
+parser <- add_argument(parser, "--mash-inputs",
+                       help = "Per-region MASH input RDSes",
+                       type = "character", nargs = Inf)
+parser <- add_argument(parser, "--study",
+                       help = "Study label for the synthesised QtlSumStats",
+                       type = "character", default = "study")
+parser <- add_argument(parser, "--ld-sketch",
+                       help = "GenotypeHandle RDS (required by mashPipeline's QC gate)",
+                       type = "character", default = "")
+parser <- add_argument(parser, "--n-random",
+                       help = "Random subset size for mashRandNullSample",
+                       type = "integer", default = 4000L)
+parser <- add_argument(parser, "--n-null",
+                       help = "Null subset size for mashRandNullSample",
+                       type = "integer", default = 4000L)
+parser <- add_argument(parser, "--exclude-condition",
+                       help = "Comma-separated conditions to drop",
+                       type = "character", default = "")
+parser <- add_argument(parser, "--alpha",
+                       help = "alpha argument to mashPipeline()",
+                       type = "numeric", default = 0)
+parser <- add_argument(parser, "--seed",
+                       help = "RNG seed", type = "integer", default = 999L)
+parser <- add_argument(parser, "--vhat-rds",
+                       help = "Optional pre-computed residual correlation matrix RDS",
+                       type = "character", default = "")
+parser <- add_argument(parser, "--prior-rds",
+                       help = "Optional pre-computed prior-covariance RDS (matrix list or list with $U)",
+                       type = "character", default = "")
+parser <- add_argument(parser, "--output",
+                       help = "Output MASH-result RDS path",
+                       type = "character")
+argv <- parse_args(parser)
+
+# Load the optional pre-computed mash artefacts at the I/O boundary so
+# the pecotmr-side call stays purely in-memory.
+vhat_obj <- NULL
+if (nzchar(argv$vhat_rds)) {
+  if (!file.exists(argv$vhat_rds))
+    stop("--vhat-rds file not found: ", argv$vhat_rds)
+  vhat_obj <- readRDS(argv$vhat_rds)
+  if (!is.matrix(vhat_obj) || !is.numeric(vhat_obj))
+    stop("--vhat-rds must deserialise to a numeric matrix (got '",
+         class(vhat_obj)[[1L]], "').")
+  if (nrow(vhat_obj) != ncol(vhat_obj))
+    stop("--vhat-rds matrix must be square; got ",
+         nrow(vhat_obj), " x ", ncol(vhat_obj), ".")
+}
+prior_U <- NULL
+if (nzchar(argv$prior_rds)) {
+  if (!file.exists(argv$prior_rds))
+    stop("--prior-rds file not found: ", argv$prior_rds)
+  raw <- readRDS(argv$prior_rds)
+  # Accept either a named list of matrices OR a list with $U.
+  prior_U <- if (is.list(raw) && !is.null(raw$U)) raw$U else raw
+  if (!is.list(prior_U) || length(prior_U) == 0L ||
+      is.null(names(prior_U)) || any(names(prior_U) == ""))
+    stop("--prior-rds must hold (or have a $U slot containing) a non-empty ",
+         "named list of square matrices.")
+}
+
+inputs <- as.character(argv$mash_inputs)
+if (length(inputs) == 0L)
+  stop("--mash-inputs requires at least one per-region RDS.")
+if (!nzchar(argv$ld_sketch) || !file.exists(argv$ld_sketch))
+  stop("--ld-sketch is required and must point at a GenotypeHandle RDS ",
+       "(mashPipeline gates on QC'd sumstats, which carry the ldSketch).")
+ld_handle <- readRDS(argv$ld_sketch)
+if (!methods::is(ld_handle, "GenotypeHandle"))
+  stop("--ld-sketch must deserialise to a GenotypeHandle (got '",
+       class(ld_handle)[[1L]], "').")
+
+exclude <- if (nzchar(argv$exclude_condition)) {
+  trimws(strsplit(argv$exclude_condition, ",", fixed = TRUE)[[1L]])
+} else character(0)
+
+# Load and concatenate per-region inputs. Each entry is
+# list(region_id = list(z = matrix, region = chr:start-end))
+dat <- list()
+for (path in inputs) {
+  x <- readRDS(path)
+  if (!is.list(x) || length(x) == 0L)
+    stop(path, " is not a non-empty list (expected per-region MASH input).")
+  for (rid in names(x)) {
+    if (rid %in% names(dat))
+      stop("Duplicate region_id '", rid, "' across --mash-inputs (",
+           path, "); regions must be unique.")
+    dat[[rid]] <- x[[rid]]
+  }
+}
+
+# Build a single concatenated `dat` for mashRandNullSample. We require
+# every region to share the same condition set (so column-binding the
+# z-matrices stays meaningful for MASH).
+conditions <- colnames(dat[[1L]]$z)
+for (rid in names(dat)) {
+  if (!identical(colnames(dat[[rid]]$z), conditions))
+    stop("Region '", rid,
+         "' has a different condition set than the first region '",
+         names(dat)[[1L]],
+         "'; mashRandNullSample requires a common set of conditions.")
+}
+zStack <- do.call(rbind, lapply(dat, function(x) x$z))
+cat(sprintf("Stacked %d region(s) -> %d variants x %d conditions\n",
+            length(dat), nrow(zStack), ncol(zStack)))
+
+# Partition into strong / random / null. Strong is the input itself
+# (max(|z|) >= threshold variants are downstream-filtered by mash);
+# random / null are the subsets mashRandNullSample emits.
+partition <- mashRandNullSample(
+  list(z = zStack),
+  nRandom          = argv$n_random,
+  nNull            = argv$n_null,
+  excludeCondition = exclude,
+  seed             = argv$seed)
+randomZ <- partition$random$z
+nullZ   <- partition$null$z
+if (is.null(randomZ) || nrow(randomZ) == 0L)
+  stop("mashRandNullSample returned an empty random subset; ",
+       "increase --n-random or check the input z-matrix.")
+strongZ <- zStack
+# Drop excluded conditions if mashRandNullSample applied them.
+if (length(exclude) > 0L) {
+  keepCols <- setdiff(colnames(strongZ), exclude)
+  strongZ <- strongZ[, keepCols, drop = FALSE]
+}
+
+# Wrap each partition as a QtlSumStats collection: one row per
+# (study, context, trait) with a single GRanges entry of the partition's
+# variants. Use synthetic positions when variant IDs don't parse as
+# "chr:pos:..." (mashPipeline only reads Z out of mcols).
+.toQss <- function(zMat, role) {
+  vids <- rownames(zMat)
+  if (is.null(vids)) vids <- paste0("var", seq_len(nrow(zMat)))
+  # Try chr:pos:... decode; otherwise synthesise chr1 positions.
+  m <- regmatches(vids, regexec("^([^:_]+)[:_]([0-9]+)", vids))
+  chrom <- vapply(m, function(x) if (length(x) >= 2L) x[[2L]] else "chr1",
+                  character(1L))
+  pos   <- suppressWarnings(vapply(m,
+            function(x) if (length(x) >= 3L) as.integer(x[[3L]]) else NA_integer_,
+            integer(1L)))
+  pos[is.na(pos)] <- seq_along(pos)[is.na(pos)]
+  # One per-context entry.
+  entries <- lapply(seq_len(ncol(zMat)), function(j) {
+    gr <- GRanges(seqnames = chrom,
+                  ranges   = IRanges(start = pos, width = 1L))
+    mcols(gr) <- DataFrame(
+      SNP = vids, A1 = rep("A", length(vids)), A2 = rep("G", length(vids)),
+      Z   = as.numeric(zMat[, j]),
+      N   = rep(1000L, length(vids)))
+    gr
+  })
+  ctxs <- colnames(zMat)
+  qss <- QtlSumStats(
+    study    = rep(argv$study, ncol(zMat)),
+    context  = ctxs,
+    trait    = rep("mash", ncol(zMat)),
+    entry    = entries,
+    genome   = "GRCh38",
+    ldSketch = ld_handle,
+    qcInfo   = list(role = role,
+                    entryAudit = vector("list", ncol(zMat))))
+  qss
+}
+
+sumStatsList <- list(strong = .toQss(strongZ, "strong"),
+                     random = .toQss(randomZ, "random"))
+if (!is.null(nullZ) && nrow(nullZ) > 0L)
+  sumStatsList$null <- .toQss(nullZ, "null")
+
+cat(sprintf("Built sumStatsList: strong=%d, random=%d, null=%s\n",
+            nrow(strongZ), nrow(randomZ),
+            if (is.null(nullZ)) "(none)" else as.character(nrow(nullZ))))
+
+res <- mashPipeline(
+  sumStatsList        = sumStatsList,
+  alpha               = argv$alpha,
+  residualCorrelation = vhat_obj,
+  priorCovariances    = prior_U,
+  setSeed             = argv$seed)
+
+dir.create(dirname(argv$output), showWarnings = FALSE, recursive = TRUE)
+saveRDS(res, argv$output, compress = "xz")
+cat(sprintf("Wrote MASH result (U + w) to %s\n", argv$output))
diff --git a/code/script/pecotmr_integration/mash_manifest.R b/code/script/pecotmr_integration/mash_manifest.R
new file mode 100644
index 00000000..6850b28d
--- /dev/null
+++ b/code/script/pecotmr_integration/mash_manifest.R
@@ -0,0 +1,122 @@
+#!/usr/bin/env Rscript
+# mash_manifest.R
+#
+# Resolve `mash_preprocessing.ipynb`'s --region-file + --sum-files + --conditions
+# into a single per-region manifest TSV. Each row carries the chrom-matched
+# per-condition tensorqtl paths so the downstream `[mash_inputs_from_tensorqtl]`
+# step can fan out one task per region without any Python parsing.
+#
+# Inputs:
+#   --region-file <TSV>          Region list with columns `chr start end gene_id`.
+#                                Header rows (start/end not numeric) are skipped.
+#   --sum-files <txt> [<txt>...] One text file per condition; each line lists
+#                                one tensorqtl summary-stats path. The file
+#                                names embed the chromosome (legacy `.CHR.`
+#                                convention) so we can pick the right path
+#                                per region.
+#   --conditions c1,c2,...       Condition labels matching --sum-files order.
+#   --output <TSV>               Output manifest TSV path.
+#
+# Output TSV columns:
+#   gene_id, chr, start, end, region, tensorqtl_paths
+# where `tensorqtl_paths` is a space-separated list (one per condition, same
+# order as --conditions) of the per-condition tensorqtl files for this region.
+
+suppressPackageStartupMessages({
+  library(argparser)
+})
+
+parser <- arg_parser("Build a per-region manifest TSV for mash_preprocessing.ipynb")
+parser <- add_argument(parser, "--region-file",
+                       help = "Region list TSV (chr/start/end/gene_id)",
+                       type = "character")
+parser <- add_argument(parser, "--sum-files",
+                       help = "One text file per condition",
+                       type = "character", nargs = Inf)
+parser <- add_argument(parser, "--conditions",
+                       help = "Comma-separated condition labels (one per --sum-files)",
+                       type = "character")
+parser <- add_argument(parser, "--output",
+                       help = "Output manifest TSV path",
+                       type = "character")
+argv <- parse_args(parser)
+
+if (!file.exists(argv$region_file))
+  stop("--region-file not found: ", argv$region_file)
+sumFiles <- as.character(argv$sum_files)
+conditions <- trimws(strsplit(argv$conditions, ",", fixed = TRUE)[[1L]])
+if (length(sumFiles) != length(conditions))
+  stop("--sum-files length (", length(sumFiles), ") must match --conditions ",
+       "length (", length(conditions), ").")
+for (sf in sumFiles) if (!file.exists(sf))
+  stop("--sum-files entry not found: ", sf)
+
+# Parse the region file (skip headers + non-numeric rows).
+parseRegions <- function(path) {
+  lines <- readLines(path)
+  out <- list()
+  for (l in lines) {
+    l <- trimws(l)
+    if (!nzchar(l) || startsWith(l, "#")) next
+    parts <- strsplit(l, "\\s+")[[1L]]
+    if (length(parts) < 4L) next
+    sNum <- suppressWarnings(as.integer(parts[[2L]]))
+    eNum <- suppressWarnings(as.integer(parts[[3L]]))
+    if (is.na(sNum) || is.na(eNum)) next
+    chr <- if (startsWith(parts[[1L]], "chr")) parts[[1L]] else paste0("chr", parts[[1L]])
+    out[[length(out) + 1L]] <- data.frame(
+      chr = chr, start = sNum, end = eNum,
+      gene_id = parts[[4L]], stringsAsFactors = FALSE)
+  }
+  if (length(out) == 0L)
+    stop("No data rows parsed from --region-file ", path)
+  do.call(rbind, out)
+}
+regions <- parseRegions(argv$region_file)
+
+# For each sum_files entry, read every line once and keep a per-chrom lookup
+# (legacy `.CHR.` naming convention).
+matchPerChrom <- function(sumFile) {
+  lines <- readLines(sumFile)
+  perChrom <- list()
+  for (line in lines) {
+    line <- trimws(line)
+    if (!nzchar(line)) next
+    m <- regmatches(line, regexec("\\.([0-9XYM]+)\\.", line))[[1L]]
+    if (length(m) < 2L) next
+    chrNum <- m[[2L]]
+    perChrom[[chrNum]] <- c(perChrom[[chrNum]], line)
+  }
+  perChrom
+}
+perCondLookup <- lapply(sumFiles, matchPerChrom)
+
+rows <- list()
+for (i in seq_len(nrow(regions))) {
+  chrNum <- sub("^chr", "", regions$chr[[i]])
+  paths <- character(length(conditions))
+  for (j in seq_along(conditions)) {
+    hits <- perCondLookup[[j]][[chrNum]]
+    if (is.null(hits) || length(hits) == 0L)
+      stop("No path matching .", chrNum, ". in ", sumFiles[[j]],
+           " (needed for condition '", conditions[[j]],
+           "', region ", regions$chr[[i]], ":", regions$start[[i]],
+           "-", regions$end[[i]], ").")
+    paths[[j]] <- hits[[1L]]
+  }
+  rows[[length(rows) + 1L]] <- data.frame(
+    gene_id         = regions$gene_id[[i]],
+    chr             = regions$chr[[i]],
+    start           = regions$start[[i]],
+    end             = regions$end[[i]],
+    region          = sprintf("%s:%d-%d", regions$chr[[i]],
+                               regions$start[[i]], regions$end[[i]]),
+    tensorqtl_paths = paste(paths, collapse = " "),
+    stringsAsFactors = FALSE)
+}
+out <- do.call(rbind, rows)
+dir.create(dirname(argv$output), showWarnings = FALSE, recursive = TRUE)
+write.table(out, file = argv$output, sep = "\t", quote = FALSE,
+            row.names = FALSE, na = "")
+cat(sprintf("Wrote MASH manifest with %d row(s) (%d regions x %d conditions) to %s\n",
+            nrow(out), nrow(regions), length(conditions), argv$output))
diff --git a/code/script/pecotmr_integration/mash_sumstats_construct.R b/code/script/pecotmr_integration/mash_sumstats_construct.R
new file mode 100644
index 00000000..f241f655
--- /dev/null
+++ b/code/script/pecotmr_integration/mash_sumstats_construct.R
@@ -0,0 +1,101 @@
+#!/usr/bin/env Rscript
+# mash_sumstats_construct.R
+#
+# Per-region MASH input builder. Reads one or more multi-context tensorqtl
+# summary-statistics files (one per condition / trait combination) for a
+# single LD-block region and writes a per-region RDS in the legacy "dat"
+# shape that mash.R consumes:
+#
+#   list(<region_id> = list(z = <n_variants x n_conditions matrix>))
+#
+# The downstream mash.R worker concatenates per-region RDSes, partitions
+# into strong / random / null subsets via pecotmr::mashRandNullSample,
+# wraps each partition as a QtlSumStats, runs summaryStatsQc, and calls
+# pecotmr::mashPipeline.
+#
+# Inputs:
+#   --tensorqtl-paths file1 [file2 ...]   Per-condition tensorqtl
+#                                          summary-stats files for THIS
+#                                          region. Each file must carry
+#                                          a `variant_id` column and
+#                                          either `z` or `tstat`.
+#   --conditions c1,c2,...                Condition labels, same order
+#                                          and length as --tensorqtl-paths.
+#   --region chr:start-end                Genomic interval label (used
+#                                          as the region_id key in the
+#                                          output RDS).
+#   --output <RDS>                        Output path.
+
+suppressPackageStartupMessages({
+  library(argparser)
+})
+
+parser <- arg_parser("Build a per-region MASH input RDS from per-condition tensorqtl outputs")
+parser <- add_argument(parser, "--tensorqtl-paths",
+                       help = "Per-condition tensorqtl sumstat files",
+                       type = "character", nargs = Inf)
+parser <- add_argument(parser, "--conditions",
+                       help = "Comma-separated condition labels (one per path)",
+                       type = "character")
+parser <- add_argument(parser, "--region",
+                       help = "Genomic interval as chr:start-end (used as region_id)",
+                       type = "character")
+parser <- add_argument(parser, "--output",
+                       help = "Output RDS path", type = "character")
+argv <- parse_args(parser)
+
+paths      <- as.character(argv$tensorqtl_paths)
+conditions <- trimws(strsplit(argv$conditions, ",", fixed = TRUE)[[1L]])
+if (length(paths) == 0L)
+  stop("--tensorqtl-paths requires at least one path.")
+if (length(paths) != length(conditions))
+  stop("--tensorqtl-paths length (", length(paths),
+       ") must match --conditions length (", length(conditions), ").")
+
+# Read one file → return data.frame(variant_id, z)
+read_one <- function(path) {
+  open_fn <- if (grepl("\\.gz$", path)) function(p) gzfile(p)
+             else function(p) p
+  df <- read.table(open_fn(path), header = TRUE, sep = "\t",
+                   stringsAsFactors = FALSE, check.names = FALSE,
+                   comment.char = "")
+  pick <- function(opts) intersect(opts, names(df))[1L]
+  vid <- pick(c("variant_id", "SNP", "snp", "rsid"))
+  zcol <- pick(c("z", "Z"))
+  tcol <- pick(c("tstat", "tstatistic", "tstat_val"))
+  if (is.na(vid))
+    stop(path, " is missing a variant_id column (looked for variant_id / SNP).")
+  if (is.na(zcol) && is.na(tcol))
+    stop(path, " has neither z nor tstat column.")
+  z <- if (!is.na(zcol)) as.numeric(df[[zcol]]) else as.numeric(df[[tcol]])
+  data.frame(variant_id = as.character(df[[vid]]),
+             z = z, stringsAsFactors = FALSE)
+}
+
+# Read every condition, then inner-join on variant_id so the z-matrix is
+# rectangular (variants × conditions). This mirrors what the legacy
+# `load_multitrait_*_sumstat` helpers did before they were retired.
+perCondition <- lapply(paths, read_one)
+common <- Reduce(intersect, lapply(perCondition, `[[`, "variant_id"))
+if (length(common) == 0L)
+  stop("No variant overlaps across the supplied --tensorqtl-paths for region ",
+       argv$region, ".")
+zMat <- matrix(NA_real_, nrow = length(common), ncol = length(conditions),
+               dimnames = list(common, conditions))
+for (i in seq_along(perCondition)) {
+  d <- perCondition[[i]]
+  idx <- match(common, d$variant_id)
+  zMat[, i] <- d$z[idx]
+}
+zMat <- zMat[stats::complete.cases(zMat), , drop = FALSE]
+if (nrow(zMat) == 0L)
+  stop("No variants left after dropping rows with NA in any condition for region ",
+       argv$region, ".")
+
+out <- list()
+out[[argv$region]] <- list(z = zMat, region = argv$region)
+
+dir.create(dirname(argv$output), showWarnings = FALSE, recursive = TRUE)
+saveRDS(out, argv$output, compress = "xz")
+cat(sprintf("Wrote MASH input for region %s: %d variants x %d conditions to %s\n",
+            argv$region, nrow(zMat), ncol(zMat), argv$output))
diff --git a/code/script/pecotmr_integration/twas_manifest_validate.R b/code/script/pecotmr_integration/twas_manifest_validate.R
new file mode 100644
index 00000000..db77d54f
--- /dev/null
+++ b/code/script/pecotmr_integration/twas_manifest_validate.R
@@ -0,0 +1,81 @@
+#!/usr/bin/env Rscript
+# twas_manifest_validate.R
+#
+# Validate the per-gene TWAS manifest used by twas.ipynb and emit a
+# canonicalised copy (header has no leading '#'; required columns
+# present; gene_id values unique; optional fine_mapping_result_rds
+# column normalised). Downstream SoS steps fan out over the canonical
+# copy via csv.DictReader without needing any Python parsing logic.
+#
+# Inputs:
+#   --manifest <TSV>  Per-gene manifest with at least the columns
+#                      `gene_id`, `twas_weights_rds`, `gwas_sumstats_rds`
+#                      (and optionally `fine_mapping_result_rds`).
+#                      A leading '#' on the header line is tolerated
+#                      and stripped.
+#   --output <TSV>    Canonical manifest TSV path.
+
+suppressPackageStartupMessages({
+  library(argparser)
+})
+
+parser <- arg_parser("Validate + canonicalise the per-gene TWAS manifest")
+parser <- add_argument(parser, "--manifest",
+                       help = "Per-gene TWAS manifest TSV",
+                       type = "character")
+parser <- add_argument(parser, "--output",
+                       help = "Canonical manifest TSV path",
+                       type = "character")
+argv <- parse_args(parser)
+
+if (!file.exists(argv$manifest))
+  stop("--manifest file not found: ", argv$manifest)
+
+lines <- readLines(argv$manifest)
+if (length(lines) == 0L)
+  stop("Empty manifest: ", argv$manifest)
+
+# Tolerate (and strip) a leading '#' on the header line.
+hdr <- strsplit(sub("^#", "", lines[[1L]]), "\t", fixed = TRUE)[[1L]]
+required <- c("gene_id", "twas_weights_rds", "gwas_sumstats_rds")
+missing <- setdiff(required, hdr)
+if (length(missing) > 0L)
+  stop("Manifest missing required column(s): ",
+       paste(missing, collapse = ", "),
+       " (header was: ", paste(hdr, collapse = ", "), ").")
+hasFmr <- "fine_mapping_result_rds" %in% hdr
+
+rows <- list()
+for (line in lines[-1L]) {
+  if (!nzchar(line)) next
+  parts <- strsplit(line, "\t", fixed = TRUE)[[1L]]
+  # Pad short rows with empty strings so column alignment stays sane.
+  if (length(parts) < length(hdr))
+    parts <- c(parts, rep("", length(hdr) - length(parts)))
+  row <- setNames(as.list(parts[seq_along(hdr)]), hdr)
+  if (!nzchar(row$gene_id)) next
+  rows[[length(rows) + 1L]] <- row
+}
+if (length(rows) == 0L)
+  stop("Manifest has no data rows: ", argv$manifest)
+
+geneIds <- vapply(rows, `[[`, character(1), "gene_id")
+if (anyDuplicated(geneIds))
+  stop("Manifest has duplicate gene_id values: ",
+       paste(unique(geneIds[duplicated(geneIds)]), collapse = ", "))
+
+outCols <- c("gene_id", "twas_weights_rds", "gwas_sumstats_rds",
+             if (hasFmr) "fine_mapping_result_rds" else NULL)
+outDf <- do.call(rbind, lapply(rows, function(r) {
+  as.data.frame(setNames(lapply(outCols, function(k) {
+    v <- r[[k]]; if (is.null(v)) "" else as.character(v)
+  }), outCols), stringsAsFactors = FALSE)
+}))
+
+dir.create(dirname(argv$output), showWarnings = FALSE, recursive = TRUE)
+write.table(outDf, file = argv$output, sep = "\t", quote = FALSE,
+            row.names = FALSE, na = "")
+cat(sprintf("Canonicalised manifest (%d rows, %s) to %s\n",
+            nrow(outDf),
+            if (hasFmr) "with fine_mapping_result_rds"
+            else "no fine_mapping_result_rds column", argv$output))
diff --git a/code/script/pecotmr_integration/twas_weights.R b/code/script/pecotmr_integration/twas_weights.R
index 4fe72878..618d1b61 100644
--- a/code/script/pecotmr_integration/twas_weights.R
+++ b/code/script/pecotmr_integration/twas_weights.R
@@ -15,10 +15,21 @@
 #   --gene-id               (gene mode) trait identifier
 #   --cis-window            (gene mode) cis-window in bp
 #   --region                (region mode) chr:start-end string
+#   --methods               Comma-separated method tokens (default
+#                           "default"; "default" expands inside pecotmr).
+#                           Pass e.g. "mrmash,mvsusie" for multivariate
+#                           TWAS weights.
 #   --fine-mapping-result   Optional pre-fit FineMappingResult RDS;
 #                           SuSiE-family methods reuse the cached fits
 #                           via the fineMappingResult cache. Pass "" or
 #                           "." to skip.
+#   --method-args           Optional JSON object of per-method kwargs
+#                           spliced into twasWeightsPipeline() via its
+#                           named-list methods= argument. Keys are
+#                           method tokens (must be a subset of --methods);
+#                           values are kwargs lists forwarded to the
+#                           underlying per-method learner. Example:
+#                           '{"lasso":{"nfolds":10}}'.
 #   --output                Output RDS path (one TwasWeights)
 
 suppressPackageStartupMessages({
@@ -26,6 +37,7 @@ suppressPackageStartupMessages({
   library(pecotmr)
   library(GenomicRanges)
   library(IRanges)
+  library(jsonlite)
 })
 
 parser <- arg_parser("Per-gene or per-region default-preset TWAS weights over a pre-built QtlDataset")
@@ -41,13 +53,64 @@ parser <- add_argument(parser, "--cis-window",
 parser <- add_argument(parser, "--region",
                        help = "Genomic region as chr:start-end (region mode); mutually exclusive with --gene-id",
                        type = "character", default = "")
+parser <- add_argument(parser, "--methods",
+                       help = "Comma-separated TWAS method tokens (default 'default')",
+                       type = "character", default = "default")
 parser <- add_argument(parser, "--fine-mapping-result",
                        help = "Optional pre-fit FineMappingResult RDS",
                        type = "character", default = "")
+parser <- add_argument(parser, "--method-args",
+                       help = "JSON object {token: {kwarg: value, ...}, ...} for twasWeightsPipeline()",
+                       type = "character", default = "")
 parser <- add_argument(parser, "--output",
                        help = "Output RDS path", type = "character")
 argv <- parse_args(parser)
 
+# Parse --method-args into a nested named list of per-method kwargs.
+parsed_method_args <- if (nzchar(argv$method_args) && argv$method_args != "." &&
+                          argv$method_args != "{}") {
+  parsed <- tryCatch(jsonlite::fromJSON(argv$method_args, simplifyVector = FALSE),
+                     error = function(e) stop(
+                       "--method-args must be a JSON object string (got: ",
+                       argv$method_args, "). Error: ", conditionMessage(e)))
+  if (!is.list(parsed) || is.null(names(parsed)) || any(names(parsed) == ""))
+    stop("--method-args must be a JSON object whose keys are method ",
+         "tokens, e.g. '{\"lasso\":{\"nfolds\":10}}'.")
+  nonObject <- vapply(parsed, function(x) !is.list(x), logical(1))
+  if (any(nonObject))
+    stop("--method-args: each value must itself be an object of kwargs ",
+         "(got non-object for: ",
+         paste(names(parsed)[nonObject], collapse = ", "), ").")
+  parsed
+} else {
+  NULL
+}
+
+# Normalize --methods into the form twasWeightsPipeline expects: a
+# character vector of method tokens, or "default" (the preset). When
+# --method-args is supplied, build the named-list form {token: kwargs}
+# that twasWeightsPipeline accepts directly. The "default" preset can't
+# carry per-method kwargs (it expands inside pecotmr to a fixed token
+# set we can't see here) — explicit --methods is required in that case.
+methods <- trimws(strsplit(argv$methods, ",", fixed = TRUE)[[1L]])
+methods_arg <- if (is.null(parsed_method_args)) {
+  if (length(methods) == 1L && methods == "default") "default" else methods
+} else {
+  if (length(methods) == 1L && methods == "default")
+    stop("--method-args is only supported with explicit --methods (got ",
+         "--methods 'default'); list the method tokens you want to tune ",
+         "explicitly, e.g. --methods lasso,enet --method-args '{\"lasso\":{\"nfolds\":10}}'.")
+  unknown <- setdiff(names(parsed_method_args), methods)
+  if (length(unknown) > 0L)
+    stop("--method-args has keys not listed in --methods (got '",
+         paste(unknown, collapse = ", "),
+         "'; --methods = '", paste(methods, collapse = ", "), "').")
+  setNames(lapply(methods, function(tk) {
+    if (tk %in% names(parsed_method_args)) parsed_method_args[[tk]]
+    else list()
+  }), methods)
+}
+
 parse_region <- function(s) {
   m <- regmatches(s, regexec("^([^:]+):([0-9]+)-([0-9]+)$", s))[[1L]]
   if (length(m) != 4L)
@@ -74,19 +137,15 @@ fmr <- if (nzchar(fmr_path) && fmr_path != "." && file.exists(fmr_path)) {
 }
 
 res <- if (has_region) {
-  twasWeightsPipeline(
-    qd,
-    methods           = NULL,
-    region            = parse_region(argv$region),
-    cisWindow         = argv$cis_window,
-    fineMappingResult = fmr)
+  twasWeightsPipeline(qd, methods = methods_arg,
+                      region    = parse_region(argv$region),
+                      cisWindow = argv$cis_window,
+                      fineMappingResult = fmr)
 } else {
-  twasWeightsPipeline(
-    qd,
-    methods           = NULL,
-    traitId           = argv$gene_id,
-    cisWindow         = argv$cis_window,
-    fineMappingResult = fmr)
+  twasWeightsPipeline(qd, methods = methods_arg,
+                      traitId   = argv$gene_id,
+                      cisWindow = argv$cis_window,
+                      fineMappingResult = fmr)
 }
 
 dir.create(dirname(argv$output), showWarnings = FALSE, recursive = TRUE)