Conversation

@tonywu1999 (Contributor) commented Jan 13, 2026

Motivation and Context

I was encountering an error at the line

standard = input[PEPTIDE == peptide_name, ]

If a reference peptide has multiple precursors, this code fails. The for loop surrounding this line is also inefficient and can be vectorized.
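To illustrate the issue, here is a minimal R/data.table sketch (toy column values; the exact failure mode depends on the real MSstats data layout). With several precursors per reference peptide, a per-peptide filter yields multiple rows per run, while a single grouped aggregation handles any number of precursors and removes the loop entirely:

```r
library(data.table)

# Toy input: one reference peptide ("PEP1") observed as two precursors
input = data.table(
    PEPTIDE   = c("PEP1_2", "PEP1_3", "PEP1_2", "PEP1_3"),
    RUN       = c("R1", "R1", "R2", "R2"),
    ABUNDANCE = c(10, 12, 20, 22)
)

# Vectorized replacement for the per-standard loop: one grouped mean
# per run, regardless of how many precursors the peptide has
means = input[, .(mean_abundance = mean(ABUNDANCE, na.rm = TRUE)), by = RUN]
```

A `by = RUN` aggregation like this is also substantially faster than looping over peptide names and subsetting repeatedly.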

Changes

  • Enable a reference peptide to have multiple precursors
  • Vectorize for loop for aggregating data for reference peptides

Testing

  • Added unit tests verifying basic global reference normalization functionality
  • Unit tests also ensure that a reference peptide can have multiple precursors.

Checklist Before Requesting a Review

  • I have read the MSstats contributing guidelines
  • My changes generate no new warnings
  • Any dependent changes have been merged and published in downstream modules
  • I have run the devtools::document() command after my changes and committed the added files

Summary by CodeRabbit

  • New Features

    • Normalization now accepts an explicit standards parameter for data-driven standard handling.
  • Bug Fixes

    • Improved validation with descriptive error messages for missing standards.
    • More robust, consolidated calculation of standard means to ensure consistent normalization.
  • Tests

    • Added tests covering global standards normalization scenarios to ensure correctness.


@coderabbitai bot commented Jan 13, 2026


📝 Walkthrough

Walkthrough

Refactors global-standards normalization to a data-driven aggregation: joins input with peptide dictionary, assigns a new standard column, validates presence of requested standards (errors if missing), and computes mean abundances via grouped aggregation. Adds tests for normalization behavior.

Changes

Cohort / File(s) Summary
Core normalization logic
R/utils_normalize.R
Replaced iterative per-standard loop with a data-driven pipeline: merge input with peptides_dict, assign standard (PeptideSequence or PROTEIN), validate requested standards (error on missing), compute mean_abundance by RUN and standard, reshape and merge results. Also updated MSstatsNormalize signature to accept standards = NULL.
Test suite
inst/tinytest/test_utils_normalize.R
Added new tests and helpers: builds a peptide dictionary and synthetic input, plus two tests validating normalization outcomes for differing group intensities and for alternating intensities within fractions (invoking MSstats:::.normalizeGlobalStandards).

Sequence Diagram(s)

sequenceDiagram
  participant Caller as "MSstatsNormalize"
  participant NG as ".normalizeGlobalStandards"
  participant Dict as "peptides_dict"
  participant Data as "input data"
  participant Agg as "Aggregation (mean_abundance)"

  Caller->>NG: call with input, peptides_dict, standards
  NG->>Data: merge with Dict to form input_with_peptides
  NG->>Dict: lookup standards -> identify matches/missing
  alt missing standards found
    NG-->>Caller: throw error (list missing standards)
  else all standards present
    NG->>Agg: group by RUN & standard, compute mean_abundance
    Agg-->>NG: mean_abundance table (wide -> long reshape)
    NG-->>Caller: return normalized data (merged results)
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes


🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)

  • Title check — ✅ Passed: The title accurately summarizes the main change: enabling multiple precursors for reference peptides in global standards normalization.
  • Docstring Coverage — ✅ Passed: No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
  • Description check — ✅ Passed: The pull request description covers motivation, changes, testing, and completes the checklist, aligning with the template structure and requirements.




@github-actions

Failed to generate code suggestions for PR

@tonywu1999 tonywu1999 marked this pull request as ready for review January 29, 2026 20:08
@coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
R/utils_normalize.R (1)

236-240: Duplicate log messages in both branches.

Both the single-fraction and multi-fraction cases emit identical messages. Compare with .normalizeQuantile (lines 124-128), which correctly differentiates between them.

📝 Proposed fix
     if (data.table::uniqueN(input$FRACTION) == 1L) {
         msg = "Normalization : normalization with global standards protein - okay"
     } else {
-        msg = "Normalization : normalization with global standards protein - okay"
+        msg = "Normalization : normalization with global standards protein per fraction - okay"
     }
🧹 Nitpick comments (1)
R/utils_normalize.R (1)

197-203: Dead code: means_by_standard initialization is overwritten.

Line 197 initializes means_by_standard = unique(input[, list(RUN)]), but this value is never used—it's completely replaced at lines 215-221 by the aggregation result. This leftover from the previous implementation should be removed.

🧹 Remove dead code
     proteins = as.character(unique(input$PROTEIN))
-    means_by_standard = unique(input[, list(RUN)])
     input_with_peptides <- merge(input, peptides_dict, by = "PEPTIDE", all.x = TRUE)

@coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
R/utils_normalize.R (1)

235-239: Duplicate log messages in conditional branches.

Both branches produce the identical message, making the condition ineffective. If different messages were intended for single vs. multiple fractions, update accordingly.

🔧 Suggested fix (example differentiation)
     if (data.table::uniqueN(input$FRACTION) == 1L) {
         msg = "Normalization : normalization with global standards protein - okay"
     } else {
-        msg = "Normalization : normalization with global standards protein - okay"
+        msg = "Normalization : normalization with global standards protein per fraction - okay"
     }
🧹 Nitpick comments (1)
R/utils_normalize.R (1)

196-196: Unused variable proteins.

The variable proteins is assigned but never referenced in this function. Since the commit mentions "remove dead code," consider removing this line.

🔧 Suggested fix
-    proteins = as.character(unique(input$PROTEIN))

@tonywu1999 tonywu1999 merged commit de817e1 into devel Jan 29, 2026
2 checks passed
@tonywu1999 tonywu1999 deleted the fix-standards branch January 29, 2026 20:58
