Utilize PHRED-scores for error removal, add 3D json compatibility #40

Merged
evolp merged 44 commits into master from dev
May 4, 2026

evolp commented Mar 20, 2026

Utilize PHRED-scores for error removal

  • the quality scores of the sequencing reads can be stored in a Reads instance
  • the scores are binned into four categories, following Illumina's document on NovaSeq quality scores
  • the binned scores can be used to assess the validity of each k-mer during graph construction
    • option 1: only include k-mers that have at least one occurrence in which every base meets a specified minimum quality
    • option 2: only include edges from k-mer occurrences in which every base meets a specified minimum quality, unless this would disconnect the k-mer on one or both sides
  • the quality scores can also be saved in the graph nodes by using the SummaryData implementations IDMapEMQualityData, SumMapEMQualityData, or TagsCountsPEMQualityData; this makes two error removal algorithms available on the graph
    • DebruijnGraph::remove_lq_ladders_tips removes any ladders (bubbles) and tips that are not supported by good-quality k-mers
    • DebruijnGraph::remove_lq_splits is more radical: it identifies nodes at which the path splits into a good-quality and a bad-quality path, or at which a good-quality and a bad-quality path merge, and removes the connection to the bad-quality path
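
To make the binning and "option 1" more concrete, here is a minimal standalone sketch. It is not the crate's implementation: the names `QualityBin`, `phred33_to_q`, `bin_quality`, `kmer_passes`, and `valid_kmer_starts` are hypothetical, the bin boundaries are illustrative assumptions chosen so that the four values NovaSeq typically reports (2, 12, 23, 37) fall into separate bins, and the quality string is assumed to be Phred+33 encoded as in standard FASTQ.

```rust
/// Four coarse quality categories (hypothetical type, not part of the crate).
/// Variant order matters: the derived `Ord` makes `Lowest < Low < Medium < High`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum QualityBin {
    Lowest,
    Low,
    Medium,
    High,
}

/// Decode one Phred+33 ASCII quality character into its Q score.
pub fn phred33_to_q(c: u8) -> u8 {
    c.saturating_sub(33)
}

/// Map a Q score to a bin. The boundaries are assumptions for illustration only.
pub fn bin_quality(q: u8) -> QualityBin {
    match q {
        0..=2 => QualityBin::Lowest,
        3..=14 => QualityBin::Low,
        15..=29 => QualityBin::Medium,
        _ => QualityBin::High,
    }
}

/// True if every base of this k-mer occurrence is at least `min` quality.
pub fn kmer_passes(quals: &[u8], min: QualityBin) -> bool {
    quals.iter().all(|&c| bin_quality(phred33_to_q(c)) >= min)
}

/// Start positions of all k-mer occurrences in a read whose bases all pass.
/// Under "option 1", a k-mer would be kept if any of its occurrences passes.
pub fn valid_kmer_starts(quals: &[u8], k: usize, min: QualityBin) -> Vec<usize> {
    if k == 0 || quals.len() < k {
        return Vec::new();
    }
    quals
        .windows(k)
        .enumerate()
        .filter(|(_, w)| kmer_passes(w, min))
        .map(|(i, _)| i)
        .collect()
}
```

For the quality string `IIII#III` with k = 3 and a minimum of `QualityBin::High`, `valid_kmer_starts` returns positions 0, 1, and 5: every window that overlaps the low-quality `#` base is rejected.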

3D JSON compatibility

  • a DebruijnGraph can now be saved to a JSON file compatible with https://github.com/vasturiano/3d-force-graph
  • default formatting functions for nodes and edges, which include all node information, are available: Node::node_json_default and Node::edge_json_default
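
For orientation, 3d-force-graph expects graph data as an object with a `nodes` array and a `links` array, where by default each node has an `id` and each link has a `source` and a `target`. The sketch below builds such a document with the standard library only. It does not show the output of `DebruijnGraph::to_json_3d` or of the default formatters; the function `to_force_graph_json`, its tuple-based inputs, and the use of the `name` field for the node sequence are assumptions made for illustration.

```rust
/// Minimal JSON string escaping for quotes, backslashes, and control characters.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build a `{"nodes": [...], "links": [...]}` document (hypothetical helper).
/// `nodes` holds (node id, sequence) pairs; `edges` holds (source id, target id) pairs.
pub fn to_force_graph_json(nodes: &[(usize, &str)], edges: &[(usize, usize)]) -> String {
    let nodes_json: Vec<String> = nodes
        .iter()
        .map(|(id, seq)| format!("{{\"id\":{},\"name\":\"{}\"}}", id, escape_json(seq)))
        .collect();
    let links_json: Vec<String> = edges
        .iter()
        .map(|(source, target)| format!("{{\"source\":{},\"target\":{}}}", source, target))
        .collect();
    format!(
        "{{\"nodes\":[{}],\"links\":[{}]}}",
        nodes_json.join(","),
        links_json.join(",")
    )
}
```

Calling `to_force_graph_json(&[(0, "ACG"), (1, "CGT")], &[(0, 1)])` yields a two-node graph with a single link from node 0 to node 1. In a real project a serializer such as serde_json would be the more robust choice; the string formatting here only keeps the example dependency-free.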

evolp added 30 commits February 19, 2026 19:15
Feat/to 3d json

- add `DebruijnGraph::to_json_3d` and node and edge default formatting methods
- remove necessity for serialized test graphs in tests
evolp requested review from sjanssen2 and tensulin March 20, 2026 12:33
evolp linked an issue Mar 20, 2026 that may be closed by this pull request
Comment thread src/summarizer.rs Outdated
evolp merged commit 32f2ffd into master May 4, 2026
6 checks passed

Development

Successfully merging this pull request may close these issues.

Move test graph generation to this crate