Merged
…e count the minor allele
Pull request overview
This PR modularizes genotype I/O, adds experimental PLINK binary support, and introduces packaging/distribution improvements (Docker image and crates.io workflow), along with crate-style documentation comments.
Changes:
- Extracts VCF/BCF logic into a dedicated `xcf` module and centralizes shared I/O utilities in `io`, updating the CLI to work with a more general `INPUT` argument and optional PLINK mode.
- Adds a new `plink` module to read BED/BIM/FAM files into `GenoData`, plus README updates describing PLINK usage and behavior.
- Introduces a multi-stage Docker build for `xpclrs`, bumps the crate version to `1.0.0`, and adds GitHub Actions workflows for Docker image publishing and (intended) crates.io publishing.
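For readers unfamiliar with the BED layout that the new `plink` module has to parse, here is a minimal sketch of decoding one variant's packed genotypes. It assumes the standard PLINK 1 variant-major format (three magic bytes, then two bits per sample, four samples per byte, lowest-order bits first). The names `decode_variant`, `bytes_per_variant`, and `has_bed_magic` are hypothetical and are not taken from this PR; the actual mapping into `GenoData` may differ.

```rust
/// Magic bytes that open a PLINK 1 .bed file stored in variant-major mode.
const BED_MAGIC: [u8; 3] = [0x6c, 0x1b, 0x01];

/// Returns true if `header` starts with the variant-major BED magic bytes.
fn has_bed_magic(header: &[u8]) -> bool {
    header.starts_with(&BED_MAGIC)
}

/// Number of bytes used to store one variant for `n_samples` samples
/// (four samples per byte, last byte padded).
fn bytes_per_variant(n_samples: usize) -> usize {
    (n_samples + 3) / 4
}

/// Decode one variant's packed bytes into per-sample counts of the first
/// (A1) allele listed in the .bim file. `None` marks a missing genotype.
/// Hypothetical helper for illustration only, not the PR's implementation.
fn decode_variant(bytes: &[u8], n_samples: usize) -> Vec<Option<u8>> {
    let mut counts = Vec::with_capacity(n_samples);
    for i in 0..n_samples {
        // Sample i lives in byte i / 4, at bit offset 2 * (i % 4).
        let code = (bytes[i / 4] >> (2 * (i % 4))) & 0b11;
        counts.push(match code {
            0b00 => Some(2), // homozygous for A1
            0b10 => Some(1), // heterozygous
            0b11 => Some(0), // homozygous for A2
            _ => None,       // 0b01: missing genotype
        });
    }
    counts
}
```

Whether A1 or A2 should be counted depends on which allele the downstream XP-CLR code treats as the reference for allele frequencies, so the count direction above is an assumption to check against the module.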
Reviewed changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| `src/xcf/mod.rs` | New module that encapsulates indexed and streamed VCF/BCF reading into `GenoData`, with options for phasing, recombination rate, and multi-threading. |
| `src/plink/mod.rs` | New PLINK BED/BIM/FAM reader that maps binary genotypes to `GenoData`, applies sample/position filters, and mirrors the XP-CLR filtering logic across two populations. |
| `src/io/mod.rs` | Refactors shared I/O utilities (e.g., `GenoData`, `gt2gcount`, sample list helpers) and provides high-level `process_xcf` / `process_plink`, plus documented `read_file`, `write_table`, and `to_table`. |
| `src/methods/mod.rs` | Adds crate-style docs and slightly refactors the `compute_complikelihood` invocation while keeping the XP-CLR likelihood and windowing logic functionally unchanged. |
| `src/main.rs` | Updates the CLI to use a generic `INPUT` arg, wires in `--plink` to choose between `process_xcf` and `process_plink`, and propagates new options (`start` as `Option<u64>`, `n_threads`) into I/O. |
| `src/lib.rs` | Exposes the new `plink` and `xcf` modules as part of the public crate API. |
| `README.md` | Documents PLINK binary support, clarifies how PLINK sample IDs are constructed (`FID_IID`), and explains how genetic distances are derived for PLINK input. |
| `Dockerfile` | Adds a multi-stage Docker build: compile `xpclrs` in a build stage, then copy the release binary into a minimal Ubuntu runtime image. |
| `Cargo.toml` / `Cargo.lock` | Bumps version to `1.0.0` and adds a short crate description to prepare for crates.io publishing. |
| `.github/workflows/docker.yml` | New CI workflow that builds and pushes multi-arch Docker images to Docker Hub on pushes to `main` and on releases. |
| `.github/workflows/crates.yml` | New workflow intended to publish to crates.io on releases, logging in with a registry token and invoking `cargo publish` (currently only in dry-run mode). |
This PR does the following: