Skip to content

[codex] fix forked session double-counting and optimize loading#910

Open
jackcpku wants to merge 2 commits intoryoppippi:mainfrom
jackcpku:codex/optimize-codex-session-loading
Open

[codex] fix forked session double-counting and optimize loading#910
jackcpku wants to merge 2 commits intoryoppippi:mainfrom
jackcpku:codex/optimize-codex-session-loading

Conversation

@jackcpku
Copy link
Copy Markdown
Contributor

@jackcpku jackcpku commented Mar 27, 2026

Why this PR matters

  • Fixes a correctness bug: forked Codex sessions can replay an ancestor token_count prefix, and the previous loader counted that replay again. That inflated usage totals for fork-heavy workflows such as subagents.
  • Makes large Codex histories much more practical to analyze by switching session loading to a streaming path with earlier filtering and online aggregation.

Headline impact

Area Before After
Forked session accounting replayed ancestor usage could be counted again replayed ancestor prefixes are ignored
Large-history reliability daily could appear hung and use multi-GB RSS queries complete reliably with much lower peak memory
Narrow date filters full-history scan first, filter later file overlap is checked before parsing

Benchmark impact

Summary: on the same one-day subset benchmark, this PR cuts peak RSS by about 55% at the cost of higher wall time. The main win is correctness for forked sessions plus avoiding multi-GB behavior on larger histories.

Scenario Before After Improvement
Exact benchmark requested: CODEX_HOME=/Users/jack/tmp/codex-home-one-day node dist/index.js daily --offline --since 20260327 --until 20260327 >/dev/null baseline 1.82s real, 1243414528 max RSS optimized 3.04s real, 554549248 max RSS about 55% lower peak RSS, but wall time is slower on this tiny subset
Real local history: daily --offline --since 20260327 --until 20260327 effectively hung / very slow in earlier reproduction 9.75s real, 695648256 max RSS command now completes reliably with much lower memory
Synthetic large-file prototype comparison current package: 5.02s, 5.42 GB peak optimized prototype: 1.15s, 152 MB peak about 4.4x faster, about 36x less memory
Narrow date-range scan potential on this dataset full scan: about 46G one-day directory: about 224 MB about 220x less data to read when filters are narrow

Root cause

The previous implementation loaded ~/.codex/sessions/**/*.jsonl with readFile, split the whole file into lines, validated/parsing each line in the hot path, pushed every event into arrays, and only then applied reporting filters and aggregation.

It also ignored session_meta.payload.forked_from_id, so when a child session replayed ancestor token_count history the loader treated that replay as fresh usage and double-counted it.

What changed

  • switched apps/codex/src/data-loader.ts from readFile + split + per-line schema parsing to streaming line-by-line parsing
  • added file-level date overlap filtering before reading session files
  • added callback-based loading so commands can aggregate online without materializing all events
  • updated daily, monthly, and session commands to build summaries directly during load instead of collecting and sorting every event first
  • kept the existing report builders, but refactored them so they can also finalize from pre-aggregated summaries
  • deduplicated replayed token_count prefixes in forked sessions by following forked_from_id ancestry during load
  • added coverage for streaming callback mode, date-range file skipping, and forked-session replay deduplication

Validation

Checks run:

  • pnpm --filter @ccusage/codex test -- src/data-loader.ts
  • pnpm --filter @ccusage/codex lint
  • pnpm --filter @ccusage/codex typecheck
  • pnpm --filter @ccusage/codex build

CLI validation:

  • baseline benchmark in a separate worktree: cd /Users/jack/tmp/ccusage_repo-baseline/apps/codex && CODEX_HOME=/Users/jack/tmp/codex-home-one-day /usr/bin/time -l node dist/index.js daily --offline --since 20260327 --until 20260327 >/dev/null
  • node dist/index.js daily --offline --since 20260327 --until 20260327 on the real local Codex history completed in 9.75s real with 695648256 max RSS
  • CODEX_HOME=/Users/jack/tmp/codex-home-one-day node dist/index.js daily --offline --since 20260327 --until 20260327 completed in 3.04s real with 554549248 max RSS
  • verified session output on the real local history for 2026-03-27
  • verified monthly output on a one-day Codex subset home
  • added regression coverage that deduplicates replayed fork prefixes and still works when the ancestor file falls outside the requested date range

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 27, 2026

📝 Walkthrough

Walkthrough

Replaces bulk JSONL event loading with streaming, on-the-fly accumulation into summary maps and separates aggregation (accumulators) from row construction across daily, monthly, and session reports; commands now stream events via callbacks and report builders consume summaries.

Changes

Cohort / File(s) Summary
Data Loading Streaming
apps/codex/src/data-loader.ts
Switched from full-file JSONL reads to streaming line-by-line parsing (createReadStream + readline). Added date-range file skipping, per-event filtering, fork/session deduplication, and LoadOptions: onEvent, collectEvents, sortEvents.
Daily Report Refactor
apps/codex/src/daily-report.ts
Added DailySummaries type, accumulateDailyUsage() accumulator, and buildDailyReportRows(); buildDailyReport() now filters by range, accumulates into summaries, and delegates row creation.
Monthly Report Refactor
apps/codex/src/monthly-report.ts
Added MonthlySummaries type, accumulateMonthlyUsage() accumulator, and buildMonthlyReportRows(); buildMonthlyReport() now delegates accumulation and row building.
Session Report Refactor
apps/codex/src/session-report.ts
Added SessionSummaries type, accumulateSessionUsage() accumulator, and buildSessionReportRows(); buildSessionReport() refactored to use new helpers and date filtering.
Command Integration
apps/codex/src/commands/daily.ts, apps/codex/src/commands/monthly.ts, apps/codex/src/commands/session.ts
Commands now call loadTokenUsageEvents with onEvent and collectEvents: false to build summaries incrementally. Empty-data checks changed from events.length === 0 to summaries.size === 0. Report row builders now accept summaries.

Sequence Diagram(s)

sequenceDiagram
  rect rgba(200,220,255,0.5)
    participant CLI as CLI Command
  end
  rect rgba(200,255,200,0.5)
    participant Loader as Data Loader (stream)
  end
  rect rgba(255,235,200,0.5)
    participant Acc as Accumulator (summaries)
  end
  rect rgba(255,200,220,0.5)
    participant Report as Report Rows Builder
  end
  rect rgba(240,240,240,0.5)
    participant Pricing as PricingSource
  end

  CLI->>Loader: request events (since, until, timezone, onEvent)
  Loader->>Loader: stream parse JSONL lines
  Loader->>Acc: onEvent(event)
  Acc-->>Loader: acknowledge
  CLI->>Acc: after load -> check summaries.size
  CLI->>Report: build...Rows(summaries, options)
  Report->>Pricing: fetch pricing for unique models
  Pricing-->>Report: pricing data
  Report-->>CLI: rows (formatted report)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • ryoppippi

Poem

🐰
I hopped through lines both long and thin,
Streaming tokens in a tidy spin.
Summaries grew as I nibbled each row,
Reports now sprout where once logs would flow.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 6.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly describes the main change: optimizing session loading and fixing forked session double-counting issues, which aligns with the PR's core objectives and implementation changes across the codebase.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@jackcpku jackcpku marked this pull request as ready for review March 27, 2026 03:47
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
apps/codex/src/data-loader.ts (1)

241-245: Use Result.try() for JSON parsing.

The file already imports Result from @praha/byethrow. Replace the try-catch with Result.try() to match the codebase convention for handling throw-prone operations:

const parsed = Result.try(() => JSON.parse(trimmed));
if (Result.isFailure(parsed)) {
  return undefined;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/codex/src/data-loader.ts` around lines 241 - 245, Replace the manual
try/catch around JSON.parse with the Result.try pattern: call Result.try(() =>
JSON.parse(trimmed)), check Result.isFailure on the returned result and return
undefined on failure, and otherwise replace the previous parsed variable with
the successful result's value (e.g., parsed = result.value) so downstream code
still uses parsed as the parsed object; use the symbols Result.try,
Result.isFailure, JSON.parse, trimmed, and parsed to locate and update the code.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/codex/src/data-loader.ts`:
- Around line 280-295: The bug is that when a payload provides only
last_token_usage, state.previousTotals isn't updated, so later
falls-back-to-total_token_usage subtraction (subtractRawUsage) uses an outdated
baseline and double-counts; to fix, after computing raw from lastUsage (when
lastUsage != null) update state.previousTotals to reflect the same baseline used
(set state.previousTotals = totalUsage if totalUsage exists or to a value
derived from lastUsage/last_token_usage equivalent), ensuring functions
normalizeRawUsage, subtractRawUsage, and the variables lastUsage, totalUsage,
raw, and state.previousTotals remain consistent so subsequent total-token deltas
subtract from the correct baseline.
- Around line 444-469: The current try/catch around the for-await loop swallows
exceptions thrown by options.onEvent; fix this by separating streaming/parsing
errors from accumulator errors: inside the loop (function parseTokenUsageEvent,
isEventWithinRange, lines, events, sessionId, state), only parse and push events
while handling streaming I/O in the try/catch, and remove/avoid awaiting
options.onEvent there; after the loop finishes (or in a separate step), iterate
the collected events and call await options.onEvent(event) so any error from
options.onEvent bubbles up instead of being caught as a file-read error; ensure
lines.close() still runs in finally.

---

Nitpick comments:
In `@apps/codex/src/data-loader.ts`:
- Around line 241-245: Replace the manual try/catch around JSON.parse with the
Result.try pattern: call Result.try(() => JSON.parse(trimmed)), check
Result.isFailure on the returned result and return undefined on failure, and
otherwise replace the previous parsed variable with the successful result's
value (e.g., parsed = result.value) so downstream code still uses parsed as the
parsed object; use the symbols Result.try, Result.isFailure, JSON.parse,
trimmed, and parsed to locate and update the code.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7e167500-791c-47a8-8836-bea0ddf2ded9

📥 Commits

Reviewing files that changed from the base of the PR and between fcb5ab6 and ac81dac.

📒 Files selected for processing (7)
  • apps/codex/src/commands/daily.ts
  • apps/codex/src/commands/monthly.ts
  • apps/codex/src/commands/session.ts
  • apps/codex/src/daily-report.ts
  • apps/codex/src/data-loader.ts
  • apps/codex/src/monthly-report.ts
  • apps/codex/src/session-report.ts

@jackcpku jackcpku changed the title [codex] optimize codex session loading [codex] fix forked session double-counting and optimize loading Mar 27, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant