fix(gain): use weighted savings rate in per-command stats by FlorianBruniaux · Pull Request #891 · rtk-ai/rtk

FlorianBruniaux · 2026-03-28T00:36:44Z

Summary

Replace AVG(savings_pct) with SUM(saved_tokens)*100.0/SUM(input_tokens) in the per-command SQL query — the old unweighted mean diluted high-volume commands (grep showed 14.8% instead of the real 98.6%)
Rename Avg% column header to Rate (more accurate)
Add doc warning on CommandStats type alias against using AVG() on savings_pct
Add regression test with intentionally skewed data proving weighted != unweighted

Root cause

Each invocation stores its own savings_pct. AVG() treated a 10-token call identically to a 10M-token call. Small calls at 0% savings (passthroughs, short greps) massively diluted the average — grep had 184M tokens saved out of 193M input, yet showed 14.8% instead of 95%.

Credit

Hat tip to Mathieu Grenier who spotted the inconsistency on LinkedIn: grep at 184M saved on 193M input showing 14.8% while global efficiency was 98%. Good instinct, real bug.

Test plan

cargo test --all passes (1135 tests)
rtk gain shows Rate >= 90% for grep/read (was 14-43%)
Global efficiency % unchanged (already correct, uses a different calculation)
Regression test fix: improve command robustness and flag support #14 asserts rate > 90% on skewed data (10% small call + 95% large call)

Add checkout/merge/rebase to git pattern, run/publish to cargo pattern, and build/exec to pnpm pattern in PATTERNS regex. Mark each as RtkStatus::Passthrough in subcmd_status so discover correctly classifies them as "already handled via passthrough" instead of "TOP UNHANDLED". Also update test_registry_covers_all_git/cargo_subcommands to include the new subcommands, and add 11 focused tests for classify + rewrite. Impact: 1277 commands (git checkout 493, git merge 154, git rebase 137, pnpm build 171, pnpm exec 173, cargo run 78, cargo publish 71) will appear in "MISSED SAVINGS" instead of "TOP UNHANDLED COMMANDS". Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Florian BRUNIAUX <florian@bruniaux.com>

Adds a one-liner listing all supported ecosystems (ruff, pytest, pip, golangci-lint, etc.) in the Architecture section to satisfy the pre-push documentation validation script. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Florian BRUNIAUX <florian@bruniaux.com>

AVG(savings_pct) produced a simple unweighted mean that diluted the reported rate for high-volume commands — grep showed 14.8% while the true weighted rate was 98.6%, a 84-point gap. Replace AVG(savings_pct) with SUM(saved_tokens)*100.0/SUM(input_tokens) in the per-command SQL query so the rate reflects actual byte savings rather than an average of per-invocation percentages. Also: - Rename "Avg%" column header to "Rate" (more accurate label) - Add doc warning on CommandStats type alias against using AVG() - Add regression test proving weighted != unweighted on skewed data Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Florian BRUNIAUX <florian@bruniaux.com>

Copilot

Pull request overview

This PR corrects rtk gain per-command savings to use a weighted savings rate (based on total tokens) rather than an unweighted average of per-invocation percentages, improving accuracy for high-volume commands.

Changes:

Update the per-command SQL aggregation to compute SUM(saved_tokens) * 100 / SUM(input_tokens) and rename the UI column header from Avg% to Rate.
Add documentation warning on the CommandStats tuple semantics and add a regression test proving weighted vs unweighted behavior.
Expand discover command classification coverage for additional git/cargo/pnpm subcommands (passthrough routing) and document supported ecosystems.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
src/core/tracking.rs	Switch per-command savings from `AVG(savings_pct)` to a weighted rate; document semantics; add regression test.
src/analytics/gain.rs	Update per-command table header from `Avg%` to `Rate` to reflect weighted rate semantics.
src/discover/rules.rs	Extend discover patterns/rules to recognize more git/cargo/pnpm subcommands and mark them passthrough.
src/discover/registry.rs	Update/extend tests ensuring registry covers the expanded subcommand set and passthrough classification/rewrites.
CLAUDE.md	Add a one-line list of supported ecosystems.

Copilot · 2026-03-28T00:41:20Z

src/core/tracking.rs

+/// # Warning
+/// The 4th field is a **weighted** savings rate: `SUM(saved_tokens) * 100.0 / SUM(input_tokens)`.
+/// Do NOT aggregate this column with `AVG()` — that would produce an unweighted mean that
+/// under-weights high-volume commands. Always use `SUM(saved) / SUM(input)` instead.


The warning mixes “percentage” and “ratio” formulas: earlier it defines the field as SUM(saved_tokens) * 100.0 / SUM(input_tokens), but the last sentence says to use SUM(saved) / SUM(input) (missing the * 100.0 and column names). Please make the recommended aggregation formula consistent and explicitly in percent form to avoid copy/paste mistakes.

Suggested change

/// under-weights high-volume commands. Always use `SUM(saved) / SUM(input)` instead.

/// under-weights high-volume commands. Always recompute it as

/// `SUM(saved_tokens) * 100.0 / SUM(input_tokens)` instead.

Copilot · 2026-03-28T00:41:21Z

src/core/tracking.rs

+
+        let (_name, _count, _saved, rate, _time) = entry;
+
+        // Weighted rate = (10 + 95_000) * 100.0 / (100 + 100_000) ≈ 94.99%


The regression test comment’s expected weighted value looks off: (10 + 95_000) * 100 / (100 + 100_000) is ~94.9%, not ~94.99%. Consider correcting the comment (or compute the expected rate and assert within a small epsilon) so the test documents the math accurately.

Suggested change

// Weighted rate = (10 + 95_000) * 100.0 / (100 + 100_000) ≈ 94.99%

// Weighted rate = (10 + 95_000) * 100.0 / (100 + 100_000) ≈ 94.9%

Exact final value is 94.9150849150849150 so yeah much more near 94.9 than 94.99

Copilot · 2026-03-28T00:41:21Z

CLAUDE.md

+Supported ecosystems: git/gh/gt, cargo, go/golangci-lint, npm/pnpm/npx, ruff/pytest/pip/mypy, rspec/rubocop/rake, dotnet, playwright/vitest/jest, docker/kubectl/aws.
+


PR description is centered on gain’s weighted savings calculation, but this change also updates general repository guidance (supported ecosystems list). Please either mention this documentation update in the PR description or move it to a separate doc-only PR to keep the change set focused.

Copilot · 2026-03-28T00:41:21Z

src/discover/rules.rs

+    r"^git\s+(?:-[Cc]\s+\S+\s+)*(status|log|diff|show|add|commit|push|pull|branch|fetch|stash|worktree|checkout|merge|rebase)",
    r"^gh\s+(pr|issue|run|repo|api|release)",
-    r"^cargo\s+(build|test|clippy|check|fmt|install)",
-    r"^pnpm\s+(list|ls|outdated|install)",
+    r"^cargo\s+(build|test|clippy|check|fmt|install|run|publish)",
+    r"^pnpm\s+(list|ls|outdated|install|build|exec)",


PR description focuses on weighted savings rate in gain, but this hunk also expands Discover’s supported command patterns/statuses (git checkout/merge/rebase, cargo run/publish, pnpm build/exec). Please update the PR description to include this scope change (or split into a separate PR) so reviewers can evaluate the intent/risk appropriately.

KuSh

LGTM!

KuSh · 2026-03-28T09:18:00Z

src/core/tracking.rs

+/// Type alias for command statistics tuple: (command, count, saved_tokens, weighted_savings_rate, avg_time_ms)
+///
+/// # Warning
+/// The 4th field is a **weighted** savings rate: `SUM(saved_tokens) * 100.0 / SUM(input_tokens)`.


Suggested change

/// The 4th field is a **weighted** savings rate: `SUM(saved_tokens) * 100.0 / SUM(input_tokens)`.

/// The 4th field is a **weighted** savings rate: `SUM(saved_tokens) / SUM(input_tokens) * 100.0 .

Seems more idiomatic to express a percentage

KuSh · 2026-03-28T09:28:52Z

src/core/tracking.rs

+
+        let (_name, _count, _saved, rate, _time) = entry;
+
+        // Weighted rate = (10 + 95_000) * 100.0 / (100 + 100_000) ≈ 94.99%


Exact final value is 94.9150849150849150 so yeah much more near 94.9 than 94.99

KuSh · 2026-03-28T09:31:22Z

src/discover/rules.rs

 // Patterns ordered to match RULES indices exactly.
 pub const PATTERNS: &[&str] = &[
-    r"^git\s+(?:-[Cc]\s+\S+\s+)*(status|log|diff|show|add|commit|push|pull|branch|fetch|stash|worktree)",
+    r"^git\s+(?:-[Cc]\s+\S+\s+)*(status|log|diff|show|add|commit|push|pull|branch|fetch|stash|worktree|checkout|merge|rebase)",


Using alphabetically sorted list would ease maintenance

Signed-off-by: Florian BRUNIAUX <florian@bruniaux.com>

FlorianBruniaux and others added 3 commits March 28, 2026 01:06

Copilot AI review requested due to automatic review settings March 28, 2026 00:36

Copilot started reviewing on behalf of FlorianBruniaux March 28, 2026 00:37 View session

Copilot AI reviewed Mar 28, 2026

View reviewed changes

KuSh approved these changes Mar 28, 2026

View reviewed changes

docs(changelog): add entry for gain weighted savings rate fix

5de6ab2

Signed-off-by: Florian BRUNIAUX <florian@bruniaux.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(gain): use weighted savings rate in per-command stats#891

fix(gain): use weighted savings rate in per-command stats#891
FlorianBruniaux wants to merge 4 commits intodevelopfrom
fix/gain-weighted-avg

FlorianBruniaux commented Mar 28, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Mar 28, 2026

Uh oh!

Copilot AI Mar 28, 2026

Uh oh!

KuSh Mar 28, 2026

Uh oh!

Copilot AI Mar 28, 2026

Uh oh!

Copilot AI Mar 28, 2026

Uh oh!

KuSh left a comment

Uh oh!

KuSh Mar 28, 2026

Uh oh!

KuSh Mar 28, 2026

Uh oh!

KuSh Mar 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

	/// under-weights high-volume commands. Always use `SUM(saved) / SUM(input)` instead.
	/// under-weights high-volume commands. Always recompute it as
	/// `SUM(saved_tokens) * 100.0 / SUM(input_tokens)` instead.


		let (_name, _count, _saved, rate, _time) = entry;

		// Weighted rate = (10 + 95_000) * 100.0 / (100 + 100_000) ≈ 94.99%

		Supported ecosystems: git/gh/gt, cargo, go/golangci-lint, npm/pnpm/npx, ruff/pytest/pip/mypy, rspec/rubocop/rake, dotnet, playwright/vitest/jest, docker/kubectl/aws.

	/// The 4th field is a weighted savings rate: `SUM(saved_tokens) * 100.0 / SUM(input_tokens)`.
	/// The 4th field is a weighted savings rate: `SUM(saved_tokens) / SUM(input_tokens) * 100.0 .

Conversation

FlorianBruniaux commented Mar 28, 2026

Summary

Root cause

Credit

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

Uh oh!

KuSh Mar 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

Uh oh!

KuSh left a comment

Choose a reason for hiding this comment

Uh oh!

KuSh Mar 28, 2026

Choose a reason for hiding this comment

Uh oh!

KuSh Mar 28, 2026

Choose a reason for hiding this comment

Uh oh!

KuSh Mar 28, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants