feat: console observability + real batch processing #36
Open
been-there-done-that wants to merge 21 commits into
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Remove chromium/LO status fields from TickerPayload constructor
- Add placeholder p50_ms/p55_ms (0.0) to TickerPayload constructor
- Add conversions_total/error_rate/bytes_mb/idle_secs placeholders to both EnginePayload constructors
- Add queue_wait_p95_ms/queue_processing placeholders to ConcurrencyPayload
- Add ts_series/chromium_conv_series/libreoffice_conv_series/queue_wait_p95_series placeholders to ThroughputPayload

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rectly

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…n sampler

Add global_histogram_pct, engine_conv_total, engine_bytes_total helpers. Sampler now populates all MetricsSample fields with real values.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Extract ts/conv/queue_wait series from history ring buffer
- TickerPayload now uses real p50_ms/p55_ms from history
- EnginePayload reads conv stats, error rate, bytes, idle time from Prometheus
- ConcurrencyPayload reads queue_wait_p95_ms and queue_processing live
- ThroughputPayload carries all new time series
- Add idle_secs() to PdfBackend trait (default 0) + ChromiumBackend impl

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…er data

Replaces placeholder vec![] with live batch data: progress_pct, elapsed, status badges, item counts, output mode. Sorted running-first, capped at 10.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
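The "sorted running-first, capped at 10" rule could look roughly like the sketch below. `BatchStatus`, `BatchRow`, and `top_batches` are hypothetical names for illustration, not the types used in this PR; the only assumption about ordering is that non-running rows keep their original relative order.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BatchStatus {
    Queued,
    Running,
    Done,
}

#[derive(Debug, Clone)]
struct BatchRow {
    id: u32, // hypothetical identifier
    status: BatchStatus,
}

/// Puts running batches first and keeps at most `cap` rows.
fn top_batches(mut rows: Vec<BatchRow>, cap: usize) -> Vec<BatchRow> {
    // `false` sorts before `true`, so Running rows (key = false) come first.
    // `sort_by_key` is stable, so ties keep their original relative order.
    rows.sort_by_key(|r| r.status != BatchStatus::Running);
    rows.truncate(cap);
    rows
}
```

A stable sort on a boolean key avoids having to define a full ordering over every status.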
Add p50_ms/p55_ms to TickerPayload, conv/error/bytes/idle to EnginePayload, queue stats to ConcurrencyPayload, full time-series fields to ThroughputPayload, item counts and output_mode to BatchPayload.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tyStrip removed

- Ticker: add P50/P55 KPIs, drop Chromium/LibreOffice status blocks
- RoutesTable: sticky header, scrollable body, flex layout for height-fill
- StackedBarSeries: new SVG component with hover tooltip for dual engine series
- EngineConvChart: Chromium + LibreOffice stacked conv/sec chart
- QueueWaitChart: queue wait p95 bar chart with tone thresholds
- Engines: add conv/err%/bytes/idle sub-stat grid per engine
- Concurrency: add queue wait p95 and processing job count row
- Batches: richer row layout (mode badge, item counts, failed items highlight)
- +page.svelte: ThroughputStrip + 2×2 chart grid above routes; ActivityStrip removed

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Dockerfile.dev / docker-compose.yml
- Fix bench/Cargo.toml missing from image; add stub ui/build for rust-embed
- Seed cargo-target and cargo-registry volumes with pdfbro ownership
- Mount ./bench and ./ui/build in both dev services
banner.rs / main.rs
- Add EngineStatus enum (Ready | Lazy | Unavailable | Disabled)
- Await eager engine starts in parallel before printing banner so status is accurate
- Lazy engines show [--] lazy; disabled engines omit the row entirely
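
A minimal sketch of how the four statuses might map to banner rows. The `[OK] ready` and `[--] lazy` markers come from this PR; the `[!!] unavailable` marker, the column width, and the `banner_row` helper are assumptions made for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EngineStatus {
    Ready,
    Lazy,
    Unavailable,
    Disabled,
}

/// Returns the banner row for one engine, or `None` when the row is omitted.
fn banner_row(name: &str, status: EngineStatus) -> Option<String> {
    let tag = match status {
        EngineStatus::Ready => "[OK] ready",
        EngineStatus::Lazy => "[--] lazy",
        // Assumed marker: the PR text does not show the unavailable form.
        EngineStatus::Unavailable => "[!!] unavailable",
        // Disabled engines produce no row at all.
        EngineStatus::Disabled => return None,
    };
    // Left-align the engine name in a 12-character column (assumed width).
    Some(format!("{name:<12} {tag}"))
}
```

Returning `Option<String>` lets the caller skip disabled engines with a plain `filter_map` over the engine list.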
batch_worker.rs / batch.rs / batch_state.rs
- Replace placeholder process_single_item stub with real engine calls
(chromium url/html/markdown/screenshot, libreoffice)
- Persist uploaded files to {storage}/inputs/{batch_id}/ before spawning worker
- Build real ZIP output with zip crate; real merged PDF with engine::merge
- Record Prometheus conversion metrics per item (engine conv chart now fills)
- Use global semaphore (state.sem) instead of per-batch semaphore so
concurrency_active spikes correctly during batch load
backend.rs
- Add is_alive() sync method to PdfBackend trait backed by atomic is_running()
so the console sampler never blocks on the engine mutex during heavy load
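
The idea of a synchronous liveness probe backed by an atomic flag can be sketched as follows. The trait and method names mirror the ones mentioned above, but the `running` field and the way it is updated are assumptions; the real backend will carry more state.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

trait PdfBackend {
    /// Cheap, non-blocking liveness probe; does not take the engine mutex.
    fn is_alive(&self) -> bool;
}

struct ChromiumBackend {
    // Hypothetical field: flipped by whatever supervises the browser process.
    running: Arc<AtomicBool>,
}

impl ChromiumBackend {
    fn is_running(&self) -> bool {
        // A relaxed load is enough for a status flag that guards no other data.
        self.running.load(Ordering::Relaxed)
    }
}

impl PdfBackend for ChromiumBackend {
    fn is_alive(&self) -> bool {
        self.is_running()
    }
}
```

Because the probe is a single atomic load, the sampler can call it on every tick regardless of how long a render holds the engine lock.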
console_store.rs
- Replace single error_pct with server_error_pct (5xx only) and
rate_limit_pct (429 only); 4xx client errors excluded from both
- Add prev_rate_limit_total for independent delta tracking
- Sampler uses is_alive() instead of healthy().await — eliminates sampler
stall when Chromium is busy rendering batch items
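
The split between 5xx and 429 percentages with independent delta tracking could be computed along these lines. Only `prev_rate_limit_total`, `server_error_pct`, and `rate_limit_pct` are names from this PR; the struct, the other fields, and `tick` are hypothetical, and the counters are assumed to be cumulative totals read once per sampler tick.

```rust
#[derive(Default)]
struct ErrorRates {
    prev_requests_total: u64,     // hypothetical
    prev_server_error_total: u64, // hypothetical
    prev_rate_limit_total: u64,
}

impl ErrorRates {
    /// Takes cumulative counters and returns
    /// `(server_error_pct, rate_limit_pct)` for the interval since the last call.
    fn tick(&mut self, requests: u64, server_errors: u64, rate_limited: u64) -> (f64, f64) {
        // saturating_sub guards against counter resets (e.g. process restart).
        let d_req = requests.saturating_sub(self.prev_requests_total);
        let d_5xx = server_errors.saturating_sub(self.prev_server_error_total);
        let d_429 = rate_limited.saturating_sub(self.prev_rate_limit_total);

        self.prev_requests_total = requests;
        self.prev_server_error_total = server_errors;
        self.prev_rate_limit_total = rate_limited;

        if d_req == 0 {
            return (0.0, 0.0);
        }
        // Other 4xx responses count toward d_req but toward neither numerator.
        (
            100.0 * d_5xx as f64 / d_req as f64,
            100.0 * d_429 as f64 / d_req as f64,
        )
    }
}
```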
ui/src/lib/types.ts + Ticker.svelte
- TickerPayload: error_pct → server_error_pct + rate_limit_pct
- Ticker: ERRORS block → 5XX block + 429 block with independent thresholds
Batches.svelte
- Header shows "N queued · N running · N done" summary counts
- Scrollable list (max-height 240px) shows only active jobs
- Completed jobs drop off the list; done count increments in header
scripts/load_test.sh
- New load test script: 5 parallel batches + 10 concurrent URL renders per wave
plus deliberate 4xx calls to verify they don't pollute error panels
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Memory: re-read cgroup memory.current each sampler tick instead of using the stale startup snapshot (fixes 637MB Docker vs 0.18GB UI gap)
- CPU: prefer cgroup v2 cpu.stat delta for container-accurate %; fall back to sysinfo only when cgroup v2 is unavailable
- Concurrency: clamp pill pct to 100%, use BUSY/WARN/OK labels, remove misleading ERR when active > max (it is load, not failure)
- Batch worker: record queue wait time into pdfbro_queue_wait_seconds histogram and track queue_processing gauge per item

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
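
Reading the cgroup v2 files on each tick might look like the sketch below. The function names are hypothetical and the path assumes the unified hierarchy is mounted at `/sys/fs/cgroup` inside the container; the actual sampler may locate the files differently.

```rust
/// Extracts `usage_usec` from the text of a cgroup v2 `cpu.stat` file,
/// which holds one `key value` pair per line.
fn parse_usage_usec(cpu_stat: &str) -> Option<u64> {
    cpu_stat.lines().find_map(|line| {
        let mut parts = line.split_whitespace();
        match (parts.next(), parts.next()) {
            (Some("usage_usec"), Some(value)) => value.parse().ok(),
            _ => None,
        }
    })
}

/// Parses the contents of `memory.current`: a single byte count.
fn parse_memory_current(text: &str) -> Option<u64> {
    text.trim().parse().ok()
}

/// Re-reads current memory usage on every call, so no startup snapshot
/// is cached. The path is an assumption about the container's mount layout.
fn read_cgroup_memory_bytes() -> Option<u64> {
    std::fs::read_to_string("/sys/fs/cgroup/memory.current")
        .ok()
        .and_then(|s| parse_memory_current(&s))
}
```

Returning `None` when a file is missing or malformed gives the caller a natural place to fall back to sysinfo.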
…in header

- Remove EngineConvChart and QueueWaitChart (queue wait p95 already shown as a number in the Concurrency panel)
- New CpuChart.svelte and MemChart.svelte replace them in Row 2 of the left column, each as a standalone bar-series card
- Add live conv/sec badge to each engine's header in Engines.svelte, fed from the last value of throughput chromium/libreoffice conv series
- Remove Resources card from right rail; Batches accepts flex:1 style to fill the freed vertical space

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Install bun in Dockerfile.dev dev-base stage
- On container start: bun install + initial vite build (synchronous, so the server always starts with the latest UI)
- vite build --watch runs in background; UI auto-rebuilds on every save to ui/src without restarting the Rust server
- Mount full ./ui into the container instead of just ./ui/build; ui-node-modules named volume keeps Linux binaries off the macOS host
- Add build:watch script to ui/package.json

Batches panel:
- Remove fixed 240px scroll cap; panel now grows to fill available space
- Empty state: clipboard SVG + "Queue is empty" message with hint
- Routes grid gets align-items:start so columns don't stretch to equal height

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Rename engine stat 'conv' → 'total conv' (lifetime count, not rate)
- Rename header badge 'c/s' → 'conv/s' (rate, unambiguous)
- Add 'time before slot acquired' sub-label under wait p95 so the metric's meaning is self-evident without documentation

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…d POST

Key each route on "METHOD route" to preserve the method extracted from the pdfbro_http_requests_total and pdfbro_http_request_duration_seconds Prometheus labels. /health, /version, /favicon.ico etc. now show GET.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
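
Keying on "METHOD route" can be sketched as below. `route_key` and `aggregate` are hypothetical helpers, the input tuples stand in for already-parsed Prometheus label sets, and the `/batch` route in the usage note is made up for illustration.

```rust
use std::collections::BTreeMap;

/// Builds the composite key, e.g. "GET /health".
fn route_key(method: &str, route: &str) -> String {
    format!("{} {}", method.to_ascii_uppercase(), route)
}

/// Sums request counts per (method, route) pair instead of per route alone,
/// so the same path under different methods stays in separate rows.
fn aggregate(samples: &[(&str, &str, u64)]) -> BTreeMap<String, u64> {
    let mut out = BTreeMap::new();
    for (method, route, count) in samples {
        *out.entry(route_key(method, route)).or_insert(0) += count;
    }
    out
}
```

With this keying, `("GET", "/batch", 1)` and `("POST", "/batch", 2)` produce two rows rather than one merged row whose method is ambiguous.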
…g restarts

Without -w flags cargo-watch watches all of /app, so every file vite writes to ui/build/ during build:watch triggers a Rust recompile. Scope it to crates/, Cargo.toml, and Cargo.lock only.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Layout:
- Outer page div is now a flex column so the main grid can flex:1 into the remaining viewport height
- Grid uses align-items:stretch so both columns grow to the same height
- Routes (left) and Batches (right) both fill to the bottom

CPU %:
- When no cgroup CPU limit: show delta_usec/5s * 100 = % of 1 core, same formula as docker stats (was dividing by num_host_cpus → ~1%)
- When a CPU quota is set: normalise by limit_cores → % of container quota
- Update CPU chart sub-label to '% of 1 core · cgroup cpu.stat'

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
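
The CPU formula above can be written out as a small function. This is a sketch under assumptions: `cpu_pct` is a hypothetical name, `delta_usec` is the difference between two `usage_usec` readings, and `limit_cores` is `None` when no quota is set.

```rust
/// Converts a CPU-time delta into a percentage.
///
/// Without a limit the result is "% of 1 core" (it can exceed 100 on
/// multi-core load); with a limit it is "% of the container quota".
fn cpu_pct(delta_usec: u64, interval_secs: f64, limit_cores: Option<f64>) -> f64 {
    if interval_secs <= 0.0 {
        return 0.0;
    }
    // CPU-seconds consumed per wall-clock second = average cores in use.
    let cores_used = (delta_usec as f64 / 1_000_000.0) / interval_secs;
    match limit_cores {
        Some(limit) if limit > 0.0 => cores_used / limit * 100.0,
        _ => cores_used * 100.0,
    }
}
```

For example, 2.5 s of CPU time over a 5 s tick is 50% of one core, or 25% of a 2-core quota.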
Adds FAST=1 build arg to Dockerfile.test that runs `cargo test --lib` (unit tests only, no BDD/integration, no Chrome/LO required, ~60s).

Also fixes grep pattern: reverts overly-broad ^error: addition that was incorrectly catching the known LibreOffice atexit teardown noise as a test failure. Only compiler errors (^error\[) and libtest verdict lines are checked.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
- … 5XX (server errors) and 429 (rate limits); 4xx client errors excluded from both
- … lazy/ready/unavailable accurately
- … is_alive() (atomic) instead of healthy().await (engine mutex); prevents 5s stall during heavy batch load

Test plan
- make dev starts cleanly with both engines showing [OK] ready in banner
- ./scripts/load_test.sh runs 3 waves; Engine Conversions chart fills, Concurrency panel spikes, Batches panel scrolls and counts correctly
- … 5XX or 429 ticker blocks
- … done count in header increments

🤖 Generated with Claude Code