perf(job): two-pass listing and fast queue stats#230

Merged
retr0h merged 3 commits into main from perf/two-pass-job-listing
Mar 7, 2026

@retr0h retr0h commented Mar 7, 2026

Summary

  • Two-pass job listing: Pass 1 calls kv.Keys() once (1 read regardless of queue size) and derives job status from key name patterns in memory, with no kv.Get() calls. Pass 2 fetches full details only for the requested page (for example, limit=10 costs about 10 reads). Cost is proportional to page size, not queue size.
  • Remove GetQueueStats: Replaced with fast GetQueueSummary (key-name-only). Status counts are now included in the ListJobs response (free from Pass 1), so client job list makes a single API call instead of two. Removes OperationCounts which required reading every job payload.
  • Pagination limits: Enforce max page size of 100 (limit=0 and limit>100 return 400). Add input validation for agent hostname params.
  • TUI fix: Use alt screen for client job status to prevent debug log corruption during refresh.
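The two-pass idea in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the key layout `status.<jobID>.<state>`, the state names, their ordering, and the helpers `statusFromKeys`, `listPage`, and `fetch` are all assumptions made for the example. `fetch` stands in for a per-job kv.Get().

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// rank orders the hypothetical states; the highest-ranked state seen wins.
var rank = map[string]int{"submitted": 1, "processing": 2, "completed": 3, "failed": 4}

// statusFromKeys derives one status per job from key names alone.
// Assumed (hypothetical) key layout: "status.<jobID>.<state>".
func statusFromKeys(keys []string) map[string]string {
	out := make(map[string]string)
	for _, k := range keys {
		parts := strings.Split(k, ".")
		if len(parts) != 3 || parts[0] != "status" {
			continue // not a status key
		}
		id, state := parts[1], parts[2]
		if rank[state] > rank[out[id]] {
			out[id] = state
		}
	}
	return out
}

// listPage runs both passes. offset and limit are assumed to be
// validated (non-negative) by the caller.
func listPage(keys []string, want string, offset, limit int, fetch func(id string) string) ([]string, int) {
	// Pass 1: filter by status using key names only, no payload reads.
	var ids []string
	for id, st := range statusFromKeys(keys) {
		if want == "" || st == want {
			ids = append(ids, id)
		}
	}
	sort.Strings(ids) // stable order so pages are deterministic
	total := len(ids)
	if offset > total {
		offset = total
	}
	end := offset + limit
	if end > total {
		end = total
	}
	// Pass 2: fetch full details only for the requested page.
	var page []string
	for _, id := range ids[offset:end] {
		page = append(page, fetch(id))
	}
	return page, total
}

func main() {
	keys := []string{
		"status.a.submitted", "status.a.failed",
		"status.b.submitted",
		"status.c.submitted", "status.c.failed",
	}
	reads := 0
	fetch := func(id string) string { reads++; return "job " + id }
	page, total := listPage(keys, "failed", 0, 1, fetch)
	fmt.Println(page, total, reads) // prints "[job a] 2 1"
}
```

In this sketch the number of `fetch` calls depends only on the page size, while the total count of matching jobs comes from Pass 1 at no extra read cost.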

Test plan

  • go build ./... compiles
  • just go::unit — all tests pass
  • just go::vet — lint clean
  • go run main.go client job list — fast (~1.4s with 500 jobs, was 68s+)
  • go run main.go client job list --status failed — fast filtered listing
  • go run main.go client job status — TUI renders cleanly with alt screen
  • go run main.go client job status --json — returns stats in ~1.4s

🤖 Generated with Claude Code

retr0h and others added 2 commits March 7, 2026 10:50
Replace per-job kv.Get() calls during filtered listing with a key-name-
only status derivation pass, then fetch full details only for the
paginated page. Extract computeStatusFromKeyNames pure function shared
by both ListJobs and GetQueueSummary. Enforce max page size (1-100) at
the OpenAPI, handler, and client levels.
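A minimal sketch of the page-size rule described above (1 to 100, with anything else rejected as a 400), assuming a plain integer limit. The names `maxPageSize`, `validateLimit`, and `statusFor` are hypothetical and are not taken from this repository.

```go
package main

import (
	"fmt"
	"net/http"
)

// maxPageSize mirrors the limit stated in the PR; the constant name is hypothetical.
const maxPageSize = 100

// validateLimit rejects limit=0, negative values, and limit>100.
func validateLimit(limit int) error {
	if limit < 1 || limit > maxPageSize {
		return fmt.Errorf("limit must be between 1 and %d, got %d", maxPageSize, limit)
	}
	return nil
}

// statusFor maps the validation result to the HTTP status a handler would return.
func statusFor(limit int) int {
	if err := validateLimit(limit); err != nil {
		return http.StatusBadRequest
	}
	return http.StatusOK
}

func main() {
	for _, l := range []int{0, 10, 100, 101} {
		fmt.Println(l, statusFor(l))
	}
}
```

Repeating the same bound in the OpenAPI schema, the handler, and the client means a bad value is caught even if one of the layers is bypassed.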

Also add hostname validation to agent domain handlers (get, drain,
undrain) matching the node domain pattern, restore defense-in-depth
validation in the file upload handler, and add corresponding tests.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

Replace the slow GetQueueStats (which read every job payload) with
GetQueueSummary (key-name-only derivation) for the job status endpoint.
Move status counts into the ListJobs response so the CLI makes a single
API call instead of two. Remove OperationCounts entirely as it required
reading every job payload. Fix job status TUI rendering with alt screen.
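As a rough illustration of how status counts can come from key names alone, the sketch below tallies one status per job without reading any payload. The key layout `status.<jobID>.<state>`, the state names, and the function `summarize` are assumptions for the example, not the actual GetQueueSummary implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// rank orders the hypothetical states; the highest-ranked state seen wins.
var rank = map[string]int{"submitted": 1, "processing": 2, "completed": 3, "failed": 4}

// summarize counts jobs per status using key names only.
// Assumed (hypothetical) key layout: "status.<jobID>.<state>".
func summarize(keys []string) map[string]int {
	latest := make(map[string]string) // jobID -> highest-ranked state
	for _, k := range keys {
		parts := strings.Split(k, ".")
		if len(parts) != 3 || parts[0] != "status" {
			continue // ignore keys that are not status markers
		}
		id, state := parts[1], parts[2]
		if rank[state] > rank[latest[id]] {
			latest[id] = state
		}
	}
	counts := make(map[string]int)
	for _, st := range latest {
		counts[st]++
	}
	return counts
}

func main() {
	counts := summarize([]string{
		"status.a.submitted", "status.a.failed",
		"status.b.submitted",
		"jobs.a", // payload key, skipped without being read
	})
	fmt.Println(counts["failed"], counts["submitted"]) // prints "1 1"
}
```

Anything that needs the job payload, such as per-operation counts, cannot be derived this way, which is consistent with OperationCounts being removed.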

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

github-actions bot commented Mar 7, 2026

Thank you for contributing to this project! 😊🕹️

codecov bot commented Mar 7, 2026

Codecov Report

❌ Patch coverage is 97.03704% with 4 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| internal/api/file/file_upload.go | 0.00% | 1 Missing and 1 partial ⚠️ |
| internal/api/job/job_list.go | 71.42% | 1 Missing and 1 partial ⚠️ |

❌ Your patch status has failed because the patch coverage (97.03%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #230      +/-   ##
==========================================
- Coverage   99.98%   99.92%   -0.07%     
==========================================
  Files         182      183       +1     
  Lines        6537     6523      -14     
==========================================
- Hits         6536     6518      -18     
- Misses          1        3       +2     
- Partials        0        2       +2     

| Files with missing lines | Coverage Δ |
| --- | --- |
| internal/api/agent/agent_drain.go | 100.00% <100.00%> (ø) |
| internal/api/agent/agent_get.go | 100.00% <100.00%> (ø) |
| internal/api/agent/agent_undrain.go | 100.00% <100.00%> (ø) |
| internal/api/agent/validate.go | 100.00% <100.00%> (ø) |
| internal/api/job/job_status.go | 100.00% <100.00%> (ø) |
| internal/job/client/jobs.go | 100.00% <100.00%> (ø) |
| internal/api/file/file_upload.go | 97.56% <0.00%> (-2.44%) ⬇️ |
| internal/api/job/job_list.go | 95.83% <71.42%> (-4.17%) ⬇️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update db7def9...bbbc3a3. Read the comment docs.

Add architecture documentation for the two-pass job listing approach,
pagination limits, and the known scalability constraint around
kv.Keys() returning all keys. Fix stale GetQueueStats references.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

retr0h commented Mar 7, 2026

Intentionally letting coverage drop

@retr0h retr0h merged commit d9a9135 into main Mar 7, 2026
7 of 9 checks passed
@retr0h retr0h deleted the perf/two-pass-job-listing branch March 7, 2026 19:45