SNMCT-seq Pipeline (Junxi Version)

This repository contains a fully modular, Slurm-compatible SNMCT-seq processing pipeline customized for the Jin Lab / Luo Lab environment at UCSD–Scripps.
All steps are configured through a single `your_path_setups.config` file and launched by a master script, `run_pipeline.sh`.

📁 Directory Structure

Your pipeline directory should look like:

LuoLab_Pipeline_Custom_junxi/
│
├── run_pipeline.sh
├── your_path_setups.config
├── your_Scripts/
│   ├── step1_prepare_genome_for_bismark.sub
│   ├── step1_prepare_genome_for_star.sub
│   ├── step2_demultiplex.sub
│   ├── step3_trimming.sub
│   ├── step4_dna_alignment.sub
│   ├── step4_rna_alignment.sub
│   ├── step5_combine_summary.sub
│   ├── step6_gRNA_assignment.sub
│   └── step7_pseudobulk_merge.sub
│
└── metadata/
    ├── plate_S01.xlsx
    └── plate_S02.xlsx

⚙️ Configuration File (`your_path_setups.config`)

This file defines all input/output directories, reference files, modules, and metadata locations.

Example:

# project folders
DIR_PROJ=/mnt/jin/group/junxi/snmctseq_cassie/snmct_seq_mbd2output

# raw FASTQs
FASTQ_ROOT=/mnt/jin/group/cassie/Cassie/251205_Novaseq/CP_fastq_files

# reference files
REF_DIR=/mnt/jin/group/reference/mouse_gencode_vM38
REF_FASTA=${REF_DIR}/GRCm39.primary_assembly.genome.fa
REF_GTF=${REF_DIR}/gencode.vM38.primary_assembly.annotation.gtf

RUN_GENOME_PREP=false

# STAR index
STAR_INDEX=${REF_DIR}/STAR149

# pipeline scripts
PIPELINE_DIR=/gpfs/home/junxif/xin_lab/LuoLab_Pipeline_Custom_junxi

# metadata folder
METADATA_DIR=${PIPELINE_DIR}/metadata
RATIO_CUTOFF=2.0
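
A step script could load this file and fail fast when a required variable is missing. The following is a minimal sketch, assuming the config is plain `KEY=value` shell syntax that can be sourced. The `load_config` helper and the choice of required variables are illustrative and are not taken from the pipeline scripts.

```shell
# Hypothetical helper: source the config, then verify required variables are set.
load_config() {
    local config_file="$1" var value missing=0
    if [ ! -f "${config_file}" ]; then
        echo "ERROR: config not found: ${config_file}" >&2
        return 1
    fi
    # Assumption: the config is plain KEY=value shell syntax.
    . "${config_file}"

    # Variable names taken from the example config above.
    for var in DIR_PROJ FASTQ_ROOT REF_FASTA REF_GTF STAR_INDEX PIPELINE_DIR METADATA_DIR; do
        eval "value=\${${var}:-}"
        if [ -z "${value}" ]; then
            echo "ERROR: ${var} is not set in ${config_file}" >&2
            missing=1
        fi
    done
    return "${missing}"
}

# Example call at the top of a step script:
# load_config /path/to/your_path_setups.config || exit 1
```

Checking variables right after sourcing means a typo in the config surfaces before any job starts, rather than as an empty path deep inside an alignment step.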

🧬 Metadata Format (gRNA Assignment)

Each plate must have one Excel file in `metadata/`, named to match `plate_S*.xlsx`, for example:

plate_S01.xlsx
plate_S02.xlsx

Format:

WELL	Dnmt1_g1	Dnmt1_g2	Mbd2_g1	Safe_g1	Safe_g2
A1	0	513	6	0	0
A10	0	7	4	0	0

The pipeline will automatically:

detect plate names
load all metadata files
merge them
label wells as D1, ST, or Ambiguous
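
The labelling rule itself is not described here, so the sketch below shows only one plausible reading of `RATIO_CUTOFF`. It assumes the counts have been exported from Excel to a tab-separated file, that `D1` corresponds to the summed `Dnmt1_*` columns and `ST` to the summed `Safe_*` columns, and that other columns (such as `Mbd2_g1`) are ignored. All of these are assumptions, and `label_wells` is a hypothetical helper, not the code in `step6_gRNA_assignment.sub`.

```shell
# Hypothetical helper: label each well from a tab-separated count table.
# Usage: label_wells <counts.tsv> <ratio_cutoff>
label_wells() {
    awk -F'\t' -v cutoff="$2" '
        NR == 1 {
            # Record which columns belong to each guide family (by header prefix).
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^Dnmt1_/) d1_cols[i] = 1
                if ($i ~ /^Safe_/)  st_cols[i] = 1
            }
            next
        }
        {
            d1 = 0; st = 0
            for (i in d1_cols) d1 += $i
            for (i in st_cols) st += $i
            # Assumed rule: a family wins if it has reads and at least
            # cutoff times the reads of the other family.
            label = "Ambiguous"
            if (d1 > 0 && d1 >= cutoff * st) label = "D1"
            else if (st > 0 && st >= cutoff * d1) label = "ST"
            print $1 "\t" label
        }
    ' "$1"
}

# Example: label_wells plate_S01.tsv 2.0
```

Under this assumed rule, a well with no reads in either family is always `Ambiguous`. A real implementation would likely also want a minimum read count.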

🚀 Running the Pipeline

Use:

sbatch run_pipeline.sh

The pipeline:

Optionally prepares genome indices (if RUN_GENOME_PREP=true)
Demultiplexes FASTQs
Trims reads
Aligns DNA (Bismark)
Aligns RNA (STAR)
Generates combined QC summary
Assigns gRNAs
Produces pseudobulk BAMs per condition (D1 vs ST)

You will see outputs in:

${DIR_PROJ}/demultiplexed_fastq
${DIR_PROJ}/trimmed_fastq
${DIR_PROJ}/bismark_alignment
${DIR_PROJ}/star_alignment
${DIR_PROJ}/combined_summary
${DIR_PROJ}/gRNA_assignments
${DIR_PROJ}/pseudobulk_bams
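
After a run, a quick check can confirm that each of these directories exists. This is a minimal sketch; `check_outputs` is a hypothetical helper and is not part of the pipeline.

```shell
# Hypothetical helper: print any expected output directory that is missing.
# Returns non-zero if at least one directory is absent.
check_outputs() {
    local proj="$1" missing=0 sub
    for sub in demultiplexed_fastq trimmed_fastq bismark_alignment \
               star_alignment combined_summary gRNA_assignments pseudobulk_bams; do
        if [ ! -d "${proj}/${sub}" ]; then
            echo "missing: ${proj}/${sub}"
            missing=1
        fi
    done
    return "${missing}"
}

# Example: check_outputs "${DIR_PROJ}"
```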

📊 Logging

All logs are written to:

your_job_logs/

One log per Slurm step
One master log from the run_pipeline.sh job

❗ Important Notes

1. Do NOT hardcode paths inside scripts.

Everything must come from the config file.

2. Do NOT use `#SBATCH --chdir=`

All scripts rely on absolute paths and explicitly cd into the correct working directory.
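
As an illustration of this convention, a step script could begin with a preamble like the one below. `enter_workdir` is a hypothetical helper written for this example; the actual scripts may do this differently.

```shell
# Hypothetical step preamble: create the step's working directory and cd
# into it, refusing relative paths.
enter_workdir() {
    local dir="$1"
    case "${dir}" in
        /*) ;;  # absolute path, OK
        *)  echo "ERROR: working directory must be absolute: ${dir}" >&2
            return 1 ;;
    esac
    mkdir -p "${dir}" && cd "${dir}"
}

# Example call inside a step script (DIR_PROJ comes from the config file):
# enter_workdir "${DIR_PROJ}/trimmed_fastq" || exit 1
```

Because the directory comes from the config and is entered explicitly, the script behaves the same regardless of where `sbatch` was invoked.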

3. The pipeline supports:

Single plate
Two plates
Any number of plates matching plate_S*.xlsx
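
Plate detection of this kind can be done with a simple glob over the metadata folder. The sketch below assumes the plate name is the file name without its `.xlsx` extension; `list_plates` is a hypothetical helper.

```shell
# Hypothetical helper: print one plate name per line (e.g. plate_S01)
# for every metadata file matching plate_S*.xlsx.
list_plates() {
    local metadata_dir="$1" f
    for f in "${metadata_dir}"/plate_S*.xlsx; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "${f}" ] || continue
        basename "${f}" .xlsx
    done
}

# Example: list_plates "${METADATA_DIR}"
```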

4. Dependencies ensure:

If any step fails → all downstream steps auto-cancel
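
One common way to get this behaviour in Slurm is to chain jobs with `--dependency=afterok` and `--kill-on-invalid-dep=yes`. The sketch below shows the pattern; `submit_chain` is a hypothetical function, and `run_pipeline.sh` may implement its chaining differently.

```shell
# Hypothetical launcher fragment: submit scripts in order, each one
# waiting for the previous job to finish successfully.
submit_chain() {
    local prev="" script jid
    for script in "$@"; do
        if [ -z "${prev}" ]; then
            jid=$(sbatch --parsable "${script}") || return 1
        else
            # afterok: start only if the previous job exited with code 0.
            # --kill-on-invalid-dep=yes: cancel this job if that can never happen.
            jid=$(sbatch --parsable --dependency="afterok:${prev}" \
                         --kill-on-invalid-dep=yes "${script}") || return 1
        fi
        # --parsable prints "jobid" or "jobid;cluster"; keep only the job ID.
        prev="${jid%%;*}"
        echo "${script} -> job ${prev}"
    done
}

# Example (script names from the directory listing above):
# submit_chain your_Scripts/step2_demultiplex.sub your_Scripts/step3_trimming.sub
```

Without `--kill-on-invalid-dep=yes`, a job whose dependency failed typically stays pending with reason `DependencyNeverSatisfied` until it is cancelled by hand.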

✅ Summary

This pipeline is:

Fully modular
Automatically plate-aware
Driven by dynamic metadata
Safe on Slurm clusters
End-to-end for SNMCT-seq (DNA + RNA)

🧪 Contact

Pipeline author: Junxi Feng
For issues: Ask ChatGPT 😉
