Add optional alternative prediction bigWig and combine-aggregation for benchmark #3

Merged
XiaotingChen merged 2 commits into peak_based_benchmark from
codex/modify-benchmark-function-for-alternative-bw-file
Mar 23, 2026

@XiaotingChen
Owner

Motivation

  • Enable benchmarking to accept a second prediction bigWig and combine per-bin values from both prediction files before computing downstream metrics, so users can supply an alternate prediction track and choose how to aggregate overlapping signals.

Description

  • Added CLI flags --alternative_prediction and --prediction_combine_operation (mean or max) to the benchmark subcommand in maxatac/utilities/parser.py and updated the run_benchmarking entrypoint to log and pass these options to downstream code.
  • Implemented import_bigwig_stats_array and combine_prediction_arrays, and extended import_prediction_array_fn, in maxatac/utilities/genome_tools.py. The stats are loaded with NaNs preserved, and the primary and alternative arrays are combined with per-bin mean or max semantics, falling back to whichever value is available when only one track has data for a bin.
  • Updated calculate_R2_pearson_spearman and ChromosomeAUPRC in maxatac/utilities/benchmarking_tools.py to accept an optional alternative prediction stream and the combine operation and to use the new import_prediction_array_fn so the combined vector is used for all downstream metrics.
  • Updated maxatac/analyses/benchmark.py to pass the new arguments into both quantitative and binary benchmarking flows and added documentation in docs/readme/benchmark.md with an example and behavior description.
  • Added focused tests in tests/test_benchmark.py that verify combine_prediction_arrays behavior for mean and max and a parser smoke test for the new CLI options.
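
The per-bin combination described above can be sketched as follows. This is a minimal illustration, not the code merged in this PR: the function name combine_prediction_arrays_sketch, its signature, and the choice to leave a bin as NaN when both tracks are missing are assumptions made for the example.

```python
import numpy as np


def combine_prediction_arrays_sketch(primary, alternative, operation="mean"):
    """Combine two per-bin prediction vectors (hypothetical helper).

    Bins present in only one track fall back to that track's value.
    Bins missing (NaN) in both tracks stay NaN, which is an assumption.
    """
    primary = np.asarray(primary, dtype=float)
    alternative = np.asarray(alternative, dtype=float)
    if primary.shape != alternative.shape:
        raise ValueError("prediction arrays must have the same shape")

    if operation == "max":
        # np.fmax ignores a NaN operand when the other operand is a number.
        return np.fmax(primary, alternative)

    if operation == "mean":
        both_present = ~np.isnan(primary) & ~np.isnan(alternative)
        # Use whichever value exists when only one track has data.
        fallback = np.where(np.isnan(primary), alternative, primary)
        return np.where(both_present, (primary + alternative) / 2.0, fallback)

    raise ValueError(f"unsupported combine operation: {operation!r}")


nan = float("nan")
combined_mean = combine_prediction_arrays_sketch([1.0, nan, 3.0, nan],
                                                 [3.0, 2.0, nan, nan],
                                                 operation="mean")
combined_max = combine_prediction_arrays_sketch([1.0, nan, 3.0, nan],
                                                [3.0, 2.0, nan, nan],
                                                operation="max")
```

In this sketch the first three bins resolve to a number under either operation, and only the last bin, which is missing in both inputs, remains NaN.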

Testing

  • Ran pytest -q tests/test_benchmark.py, which reported 2 passed and 1 skipped, with warnings.
  • Compiled the updated modules with python -m compileall maxatac/analyses/benchmark.py maxatac/utilities/benchmarking_tools.py maxatac/utilities/genome_tools.py maxatac/utilities/parser.py tests/test_benchmark.py, which completed successfully.
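
For readers unfamiliar with a parser smoke test, the following is a rough sketch of the idea using argparse from the standard library. It does not reproduce maxatac/utilities/parser.py: the build_parser_sketch function, the --prediction flag, and the default of mean for the combine operation are assumptions. Only the two new flag names and the mean/max choices come from the description above.

```python
import argparse


def build_parser_sketch():
    """Build a stand-in parser with a benchmark subcommand (hypothetical)."""
    parser = argparse.ArgumentParser(prog="maxatac")
    subparsers = parser.add_subparsers(dest="command", required=True)
    benchmark = subparsers.add_parser("benchmark")
    # Assumed name for the primary prediction flag.
    benchmark.add_argument("--prediction", required=True)
    # Flags named in the PR description.
    benchmark.add_argument("--alternative_prediction", default=None)
    benchmark.add_argument("--prediction_combine_operation",
                           choices=("mean", "max"),
                           default="mean")  # default is an assumption
    return parser


def test_parser_accepts_alternative_prediction():
    """Smoke test: the new options parse and land on the namespace."""
    args = build_parser_sketch().parse_args([
        "benchmark",
        "--prediction", "primary.bw",
        "--alternative_prediction", "alternative.bw",
        "--prediction_combine_operation", "max",
    ])
    assert args.alternative_prediction == "alternative.bw"
    assert args.prediction_combine_operation == "max"
```

A test of this shape only checks that the options are wired into the parser; it says nothing about how the values are used downstream.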

Codex Task

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7d56e244d2

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@XiaotingChen
Owner Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a7d84e9dcf

Comment on lines +344 to +348
alternative_prediction_array = import_bigwig_stats_array(alternative_prediction_stream,
                                                         chromosome,
                                                         chromosome_length,
                                                         agg_function,
                                                         bin_count)

P2: Handle absent chromosomes in the alternative bigWig

If the optional --alternative_prediction file is missing one of the benchmarked chromosomes (for example because it uses 1/chr1 naming or omits chrM), this new code still calls stats() on that chromosome and pyBigWig raises, aborting the whole benchmark. That breaks the advertised fallback behavior of using the primary prediction when only one track has data, so the alternative stream needs a chromosome-presence check before querying it.
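
One way to add such a check is sketched below. It is an illustration under assumptions, not the fix that was applied: the helper name import_alternative_array_sketch and its signature are hypothetical, and the _FakeBigWig class is only a stand-in so the example runs without a real file. The chroms() and stats() calls mirror the pyBigWig methods of the same names. The sketch does not handle an alternative file whose chromosome is shorter than chromosome_length.

```python
import numpy as np


def import_alternative_array_sketch(bigwig, chromosome, chromosome_length,
                                    agg_function="max", bin_count=None):
    """Return per-bin stats, or None if the chromosome is absent (hypothetical)."""
    # chroms() returns a dict of chromosome name -> length.
    if bigwig is None or chromosome not in bigwig.chroms():
        return None  # caller falls back to the primary prediction alone
    n_bins = bin_count if bin_count is not None else chromosome_length
    values = bigwig.stats(chromosome, 0, chromosome_length,
                          type=agg_function, nBins=n_bins, exact=True)
    # stats() reports bins without data as None; keep them as NaN.
    return np.array([np.nan if v is None else v for v in values], dtype=float)


class _FakeBigWig:
    """Stand-in for an open bigWig handle, used only for this example."""

    def chroms(self):
        return {"chr1": 4}

    def stats(self, chrom, start, end, type="mean", nBins=1, exact=False):
        return [1.0, None]


missing = import_alternative_array_sketch(_FakeBigWig(), "chrM", 10, "max", 2)
present = import_alternative_array_sketch(_FakeBigWig(), "chr1", 4, "max", 2)
```

Here the lookup for chrM returns None instead of raising, so the benchmark could continue with the primary track for that chromosome.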

Owner Author

@XiaotingChen Mar 23, 2026

#4
