[feat] itk_combine to combine multiple label sources #61
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,57 @@ | ||
| # itk_combine | ||
| | ||
| Combine multiple label folders by intersecting filenames and merging labels according to ordered mapping rules. This tool is useful when you have multiple specialized segmentations for the same cases and want to create a unified label map. | ||
| | ||
| ## Usage | ||
| | ||
| ```bash | ||
| itk_combine --source <name>=<folder> --map <mapping_rule> <dest_folder> [options] | ||
| ``` | ||
| | ||
| ## Parameters | ||
| | ||
| - `--source`: Specify a label source in the format `name=/path/to/folder`. Can be specified multiple times for different sources. | ||
| - `--map`: Specify a mapping rule in the format `<source_name>:<source_labels>-><target_label>`. | ||
| - `<source_name>` must match one of the names defined in `--source`. | ||
| - `<source_labels>` can be a single integer or a comma-separated list of integers. | ||
| - Multiple `--map` rules are allowed. **Priority is determined by order**: the first rule that matches a voxel determines its value in the output. | ||
| - `dest_folder`: Destination folder for the combined label files. | ||
| - `--mp`: Enable multiprocessing. | ||
| - `--workers`: Number of worker processes (defaults to half of CPU cores). | ||
| | ||
| ## Mapping Priority and Logic | ||
| | ||
| 1. **Intersection**: Only files that exist in **all** specified source folders (with the same base name) will be processed. | ||
| | ||
| 2. **Validation**: For each file, the tool ensures that the image size and spacing are identical across all sources. If a mismatch is found, the process will fail. | ||
| | ||
| 3. **Merging**: | ||
| | ||
| - The output label map is initialized to 0 (Background). | ||
| - Rules are applied sequentially in the order they appear on the command line. | ||
| - Once a voxel is assigned a non-zero value, it will not be overwritten by subsequent rules. This allows for clear priority management between overlapping sources. | ||
| | ||
| ## Example | ||
| | ||
| Suppose you have: | ||
| | ||
| - Source A (`organs`): Organ segmentations (1: Liver, 2: Spleen) | ||
| - Source B (`tumors`): Tumor segmentations (1: Liver Tumor) | ||
| | ||
| To combine them into a single map where Background=0, Liver=1, Spleen=2, and Liver Tumor=3 (with tumor taking priority over the organ label): | ||
| | ||
| ```bash | ||
| itk_combine \ | ||
| --source organs=/path/to/organs \ | ||
| --source tumors=/path/to/tumors \ | ||
| --map tumors:1->3 \ | ||
| --map organs:1->1 \ | ||
| --map organs:2->2 \ | ||
| /path/to/combined_output \ | ||
| --mp | ||
| ``` | ||
| | ||
| ## Output | ||
| | ||
| - Combined label maps (normalized to `.mha` format and `uint8` data type). | ||
| - `meta.json`: Standard ITKIT metadata file containing size, spacing, origin, and unique classes for each combined file. |
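
The merging behaviour described under "Mapping Priority and Logic" can be illustrated with a minimal NumPy sketch. This is not the PR's implementation: `combine_labels`, its argument layout, and the toy arrays are assumptions made for illustration. It only encodes the documented rules: the output starts at 0, rules run in order, and a voxel that already holds a non-zero value is never overwritten.

```python
import numpy as np


def combine_labels(arrays, rules):
    """Merge label arrays with first-match-wins priority.

    arrays: dict mapping a source name to an integer label array (hypothetical layout).
    rules:  ordered list of (source_name, source_labels, target_label) tuples.
    """
    shapes = {arr.shape for arr in arrays.values()}
    if len(shapes) != 1:
        raise ValueError(f"Source arrays differ in shape: {sorted(shapes)}")

    # Output starts as background (0), stored as uint8 like the documented output.
    combined = np.zeros(shapes.pop(), dtype=np.uint8)

    for source_name, source_labels, target_label in rules:
        # Voxels in this source that carry one of the rule's labels.
        matches = np.isin(arrays[source_name], source_labels)
        # Only voxels still at background may be written.
        still_background = combined == 0
        combined[matches & still_background] = target_label
    return combined


# Toy data: organs (1: liver, 2: spleen) and tumors (1: tumor).
organs = np.array([[1, 1, 2], [0, 1, 2]])
tumors = np.array([[0, 1, 0], [0, 0, 1]])
rules = [("tumors", (1,), 3), ("organs", (1,), 1), ("organs", (2,), 2)]
combined = combine_labels({"organs": organs, "tumors": tumors}, rules)
print(combined)
```

In this toy case, voxels where a tumor overlaps an organ end up as 3 because the `tumors` rule is listed first; listing it after the `organs` rules would leave those voxels with the organ label.
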
| Original file line number | Diff line number | Diff line change | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,214 @@ | ||||||||||||||
| import argparse | ||||||||||||||
| import os | ||||||||||||||
| from dataclasses import dataclass | ||||||||||||||
| from pathlib import Path | ||||||||||||||
| | ||||||||||||||
| import numpy as np | ||||||||||||||
| import SimpleITK as sitk | ||||||||||||||
| | ||||||||||||||
| from itkit.process.base_processor import BaseITKProcessor | ||||||||||||||
| from itkit.process.metadata_models import SeriesMetadata | ||||||||||||||
| | ||||||||||||||
| | ||||||||||||||
| @dataclass(frozen=True) | ||||||||||||||
| class SourceSpec: | ||||||||||||||
| name: str | ||||||||||||||
| folder: Path | ||||||||||||||
| | ||||||||||||||
| | ||||||||||||||
| @dataclass(frozen=True) | ||||||||||||||
| class MappingRule: | ||||||||||||||
| source_name: str | ||||||||||||||
| source_labels: tuple[int, ...] | ||||||||||||||
| target_label: int | ||||||||||||||
| | ||||||||||||||
| | ||||||||||||||
| def _parse_sources(source_args: list[str]) -> list[SourceSpec]: | ||||||||||||||
| sources: list[SourceSpec] = [] | ||||||||||||||
| seen_names: set[str] = set() | ||||||||||||||
| for item in source_args: | ||||||||||||||
| if "=" not in item: | ||||||||||||||
| raise ValueError(f"Invalid source format: {item}. Expected name=/path/to/labels") | ||||||||||||||
| name, folder = item.split("=", 1) | ||||||||||||||
| name = name.strip() | ||||||||||||||
| if not name: | ||||||||||||||
| raise ValueError(f"Invalid source name in: {item}") | ||||||||||||||
| if name in seen_names: | ||||||||||||||
| raise ValueError(f"Duplicate source name: {name}") | ||||||||||||||
| folder_path = Path(folder).expanduser().resolve() | ||||||||||||||
| if not folder_path.exists() or not folder_path.is_dir(): | ||||||||||||||
| raise ValueError(f"Source folder not found: {folder_path}") | ||||||||||||||
| sources.append(SourceSpec(name=name, folder=folder_path)) | ||||||||||||||
| seen_names.add(name) | ||||||||||||||
| return sources | ||||||||||||||
| | ||||||||||||||
| | ||||||||||||||
| def _parse_mapping_rule(rule: str) -> MappingRule: | ||||||||||||||
| if "->" not in rule or ":" not in rule: | ||||||||||||||
| raise ValueError(f"Invalid mapping rule: {rule}. Expected <source>:<src_labels>-><target>") | ||||||||||||||
| left, target_str = rule.split("->", 1) | ||||||||||||||
| source_name, labels_str = left.split(":", 1) | ||||||||||||||
| source_name = source_name.strip() | ||||||||||||||
| labels_str = labels_str.strip() | ||||||||||||||
| target_str = target_str.strip() | ||||||||||||||
| if not source_name or not labels_str or not target_str: | ||||||||||||||
| raise ValueError(f"Invalid mapping rule: {rule}. Expected <source>:<src_labels>-><target>") | ||||||||||||||
| | ||||||||||||||
| try: | ||||||||||||||
| target_label = int(target_str) | ||||||||||||||
| except ValueError as exc: | ||||||||||||||
| raise ValueError(f"Invalid target label in rule: {rule}") from exc | ||||||||||||||
| | ||||||||||||||
| label_parts = [p.strip() for p in labels_str.split(",") if p.strip()] | ||||||||||||||
| if not label_parts: | ||||||||||||||
| raise ValueError(f"No source labels specified in rule: {rule}") | ||||||||||||||
| | ||||||||||||||
| source_labels: list[int] = [] | ||||||||||||||
| for part in label_parts: | ||||||||||||||
| try: | ||||||||||||||
| source_labels.append(int(part)) | ||||||||||||||
| except ValueError as exc: | ||||||||||||||
| raise ValueError(f"Invalid source label '{part}' in rule: {rule}") from exc | ||||||||||||||
| | ||||||||||||||
| return MappingRule(source_name=source_name, source_labels=tuple(source_labels), target_label=target_label) | ||||||||||||||
| | ||||||||||||||
| | ||||||||||||||
| class CombineProcessor(BaseITKProcessor): | ||||||||||||||
| def __init__( | ||||||||||||||
| self, | ||||||||||||||
| sources: list[SourceSpec], | ||||||||||||||
| dest_folder: Path, | ||||||||||||||
| mapping_rules: list[MappingRule], | ||||||||||||||
| mp: bool = False, | ||||||||||||||
| workers: int | None = None, | ||||||||||||||
| ): | ||||||||||||||
| super().__init__(task_description="Combining labels", mp=mp, workers=workers) | ||||||||||||||
| self.sources = sources | ||||||||||||||
| self.dest_folder = dest_folder | ||||||||||||||
| self.mapping_rules = mapping_rules | ||||||||||||||
| self.source_index = {src.name: idx for idx, src in enumerate(self.sources)} | ||||||||||||||
| | ||||||||||||||
| def get_items_to_process(self) -> list[tuple[str, list[str]]]: | ||||||||||||||
| source_files: dict[str, dict[str, str]] = {} | ||||||||||||||
| for src in self.sources: | ||||||||||||||
| files = {p.name: str(p) for p in src.folder.glob("*.mha")} | ||||||||||||||

Suggested change:

| - files = {p.name: str(p) for p in src.folder.glob("*.mha")} | |
| + files: dict[str, str] = {} | |
| + for ext in self.SUPPORTED_EXTENSIONS: | |
| + pattern = f"*{ext}" if ext.startswith(".") else f"*.{ext}" | |
| + for p in src.folder.glob(pattern): | |
| + files[p.name] = str(p) | |
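
A self-contained sketch of what this suggestion is getting at, combined with the filename intersection the README describes: collect files for every supported extension, then keep only the names present in all sources. The extension tuple and the helper names `collect_files` and `common_names` are hypothetical; the actual contents of `SUPPORTED_EXTENSIONS` are not shown in this diff.

```python
from pathlib import Path

# Assumed values for illustration only; the real SUPPORTED_EXTENSIONS is not shown here.
SUPPORTED_EXTENSIONS = (".mha", ".nii.gz")


def collect_files(folder, extensions=SUPPORTED_EXTENSIONS):
    """Map file name -> full path for every file matching a supported extension."""
    files = {}
    for ext in extensions:
        # Accept extensions written with or without the leading dot.
        pattern = f"*{ext}" if ext.startswith(".") else f"*.{ext}"
        for path in sorted(Path(folder).glob(pattern)):
            files[path.name] = str(path)
    return files


def common_names(folders):
    """Return the sorted file names that exist in every folder."""
    per_folder = [set(collect_files(folder)) for folder in folders]
    if not per_folder:
        return []
    return sorted(set.intersection(*per_folder))
```

Keying on the full file name, as the diff does with `p.name`, means `case1.mha` in one source and `case1.nii.gz` in another would not be treated as the same case.
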
Copilot AI, Jan 17, 2026
The check for empty rules is redundant because the `--map` argument is marked as `required=True` in the argument parser. This validation can never be reached from the command line, since argparse fails earlier if no `--map` arguments are provided. Consider removing the check.
| if not rules: | |
| raise ValueError("At least one mapping rule is required.") |
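
The claim can be checked with a small sketch. It assumes `--map` is declared with `action="append"` and `required=True`; the PR's actual parser definition is not shown in this excerpt, and `build_parser`, `parse_or_none`, and the `maps` destination name are hypothetical.

```python
import argparse


def build_parser():
    """Build a stand-in parser with a required, repeatable --map option (assumed setup)."""
    parser = argparse.ArgumentParser(prog="itk_combine_sketch")
    parser.add_argument("--map", dest="maps", action="append", required=True)
    return parser


def parse_or_none(argv):
    """Return the list of --map values, or None if argparse rejects the arguments."""
    try:
        return build_parser().parse_args(argv).maps
    except SystemExit:
        # argparse reports the missing required option and exits before any
        # later "if not rules" check could run.
        return None
```

Under these assumptions the parsed list always has at least one entry, so an emptiness check after parsing only matters if the parsing helper is also called from code that bypasses argparse.
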
The error message text is inconsistent with the format described in the documentation and help text. The message says "Expected name=/path/to/labels", but the parser splits on "=" to get a name and a folder. Consider changing the message to "Expected name=/path/to/folder" so it matches the parameter description used elsewhere.