Merged
97 changes: 97 additions & 0 deletions .github/workflows/bench.yml
@@ -0,0 +1,97 @@
name: Bench Gate

# Compares hot-path benchmarks between the PR head and its merge
# base, so performance regressions in the sync engine and DB write
# paths fail the PR instead of shipping. The gated regression classes
# have all shipped before:
#
#   - discovery/skip work scaling with archive size instead of new
#     data (#912, the providerSourceUnchangedInDB gap)
#   - O(session history) work per incremental append (#954)
#   - bulk ingest throughput (#411)
#   - per-row query-shape regressions in usage aggregation (#309)
#
# Both sides run `make bench-gate` — the Makefile is the single
# source of truth for the gated package list, sample count, and
# iteration count — and cmd/benchgate compares the outputs. It gates
# on allocs/op and B/op (deterministic on a given machine, tight
# thresholds) and on ns/op with a loose 2x threshold that only
# catches algorithmic blowups; both sides run on the same runner
# within one job, so the comparison is apples to apples.
#
# Benchmarks that only exist on one side are reported but never fail
# the gate: a benchmark added by a PR has no baseline and is reported
# without gating, then gates automatically once merged. Because each
# side benchmarks its own Makefile's package list, a PR that adds a
# package to the gate cannot break the base run. A partially failing
# base run degrades to a partial baseline (whatever benchmarks it
# produced still gate) rather than silently disabling the whole gate.

on:
  pull_request:
    # Docs/frontend-only PRs cannot change the gated Go paths. If
    # this check is ever made required on branch protection, pair it
    # with a no-op sibling workflow on the inverse paths.
    paths:
      - "**.go"
      - "go.mod"
      - "go.sum"
      - "Makefile"
      - ".github/workflows/bench.yml"

concurrency:
  group: bench-${{ github.head_ref || github.ref }}
  cancel-in-progress: true

permissions:
  contents: read

jobs:
  bench-gate:
    name: Benchmark Gate
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
        with:
          persist-credentials: false
          fetch-depth: 0

      - uses: actions/setup-go@4a3601121dd01d1626a1e23e37211e3254c1c06c # v6.4.0
        with:
          go-version-file: go.mod

      - name: Run benchmarks (PR head)
        run: |
          set -euo pipefail
          make -s bench-gate | tee /tmp/bench-new.txt

      - name: Run benchmarks (merge base)
        env:
          BASE_REF: ${{ github.base_ref }}
        # A failing merge-base run keeps whatever benchmark output it
        # produced: go test emits results per package, so one broken
        # package (or a base predating the bench-gate target) leaves
        # a partial or empty baseline and benchgate gates only what
        # exists on both sides. The warning makes the degraded run
        # visible instead of a silently green vacuous pass.
        #
        # The sample count and fixed iteration count are evaluated
        # from the PR head's Makefile and passed into the base run:
        # two benchmarks grow their fixture per iteration, so a PR
        # that changes BENCH_GATE_COUNT/TIME must not compare against
        # a baseline measured with the old values. The package list
        # intentionally stays per-side, so growing the gate cannot
        # break the base run.
        run: |
          set -euo pipefail
          eval "$(make -s bench-gate-config)"
          base=$(git merge-base HEAD "origin/$BASE_REF")
          git worktree add /tmp/bench-base "$base"
          if ! make -s -C /tmp/bench-base bench-gate \
            BENCH_GATE_COUNT="$BENCH_GATE_COUNT" \
            BENCH_GATE_TIME="$BENCH_GATE_TIME" > /tmp/bench-old.txt; then
            echo "::warning title=Bench Gate::merge-base benchmark run exited non-zero; gating against its partial output"
          fi

      - name: Compare against merge base
        run: go run ./cmd/benchgate -old /tmp/bench-old.txt -new /tmp/bench-new.txt
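
The comparison step delegates to `cmd/benchgate`, whose source is not part of this diff. As a rough illustration of the loose 2x ns/op rule described in the workflow's header comment, the sketch below takes the median ns/op over a few made-up samples per side and flags the candidate only when it is more than twice the baseline. The benchmark name `BenchmarkExample`, the sample values, and the choice of a median are assumptions for illustration only; the real tool also applies a significance test and gates allocs/op and B/op, all of which this sketch omits.

```shell
#!/bin/sh
set -eu

old_file=$(mktemp)
new_file=$(mktemp)

# Made-up samples in the standard `go test -bench -benchmem` line format.
# BenchmarkExample is a hypothetical name, not a benchmark from this repo.
cat > "$old_file" <<'EOF'
BenchmarkExample-4   20   1000 ns/op   64 B/op   2 allocs/op
BenchmarkExample-4   20   1100 ns/op   64 B/op   2 allocs/op
BenchmarkExample-4   20    900 ns/op   64 B/op   2 allocs/op
EOF
cat > "$new_file" <<'EOF'
BenchmarkExample-4   20   2500 ns/op   64 B/op   2 allocs/op
BenchmarkExample-4   20   2400 ns/op   64 B/op   2 allocs/op
BenchmarkExample-4   20   2600 ns/op   64 B/op   2 allocs/op
EOF

# Print the median ns/op over all sample lines in the given file.
median_ns() {
    awk '{ for (i = 2; i <= NF; i++) if ($i == "ns/op") print $(i - 1) }' "$1" |
        sort -n |
        awk '{ v[NR] = $1 }
             END { if (NR % 2) print v[(NR + 1) / 2]
                   else print (v[NR / 2] + v[NR / 2 + 1]) / 2 }'
}

old_ns=$(median_ns "$old_file")
new_ns=$(median_ns "$new_file")

# Flag only when the candidate is more than 2x slower than the baseline.
verdict=$(awk -v o="$old_ns" -v n="$new_ns" \
    'BEGIN { if (n > 2 * o) print "FAIL"; else print "ok" }')

echo "old=$old_ns new=$new_ns verdict=$verdict"
rm -f "$old_file" "$new_file"
```

With these samples the medians are 1000 and 2500 ns/op, so the candidate exceeds the 2x bound and the sketch reports `FAIL`. A threshold this loose ignores ordinary runner noise and reacts only to order-of-magnitude changes, which matches the stated goal of catching algorithmic blowups.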
32 changes: 31 additions & 1 deletion Makefile
@@ -20,7 +20,7 @@ AIR_BIN := $(shell if command -v air >/dev/null 2>&1; then command -v air; \
elif [ -x "$(GOPATH_FIRST)/bin/air" ]; then printf "%s" "$(GOPATH_FIRST)/bin/air"; \
fi)

.PHONY: build build-release install frontend frontend-dev dev check-air air-install desktop-dev desktop-build desktop-macos-app desktop-macos-dmg desktop-windows-installer desktop-linux-appimage desktop-app docs-install docs-build docs-serve docs-check docs-screenshots docs-assets-branch docs-generated-assets-branch docs-deploy-staging docs-deploy test test-short bench-backends test-postgres test-postgres-ci test-s3 postgres-up postgres-down test-ssh test-ssh-ci ssh-up ssh-down e2e e2e-duckdb vet lint lint-ci lint-golangci lint-golangci-ci nilaway nilaway-golangci-build lint-tools tidy clean release release-darwin-arm64 release-darwin-amd64 release-linux-amd64 install-hooks ensure-embed-dir pricing-snapshot dev-snapshot help
.PHONY: build build-release install frontend frontend-dev dev check-air air-install desktop-dev desktop-build desktop-macos-app desktop-macos-dmg desktop-windows-installer desktop-linux-appimage desktop-app docs-install docs-build docs-serve docs-check docs-screenshots docs-assets-branch docs-generated-assets-branch docs-deploy-staging docs-deploy test test-short bench-backends bench-gate bench-gate-config test-postgres test-postgres-ci test-s3 postgres-up postgres-down test-ssh test-ssh-ci ssh-up ssh-down e2e e2e-duckdb vet lint lint-ci lint-golangci lint-golangci-ci nilaway nilaway-golangci-build lint-tools tidy clean release release-darwin-arm64 release-darwin-amd64 release-linux-amd64 install-hooks ensure-embed-dir pricing-snapshot dev-snapshot help

# Ensure go:embed has at least one file (no-op if frontend is built)
ensure-embed-dir:
@@ -266,6 +266,35 @@ bench-backends: pricing-snapshot ensure-embed-dir
	AGENTSVIEW_BENCH_MESSAGES_PER_SESSION=$(BENCH_BACKENDS_MESSAGES_PER_SESSION) \
	CGO_ENABLED=1 go test -tags "fts5,benchdb" ./internal/backendbench $(BENCH_BACKENDS_FLAGS)

# Hot-path benchmark gate. Runs every benchmark in the gated packages
# (sync engine warm/cold/append, message write paths, usage
# aggregation, secret scanning). This target is the single source of
# truth for the gate configuration: CI's bench.yml runs it on both
# the PR head and the merge base, then compares the outputs with
# `go run ./cmd/benchgate -old old.txt -new new.txt`. Run it before
# and after touching a sync or DB hot path.
BENCH_GATE_PACKAGES ?= ./internal/sync ./internal/db ./internal/secrets
# Count must stay >= 5: benchgate's time gate needs at least 5
# candidate samples for its significance test.
BENCH_GATE_COUNT ?= 6
# Fixed iterations, not a duration: some gated benchmarks grow their
# fixture as they iterate, so baseline and candidate must run the
# same iteration count to measure identical workloads.
BENCH_GATE_TIME ?= 20x
bench-gate: pricing-snapshot ensure-embed-dir
	CGO_ENABLED=1 go test -tags "fts5" -run '^$$' \
		-bench . -benchmem \
		-count $(BENCH_GATE_COUNT) -benchtime $(BENCH_GATE_TIME) \
		-timeout 25m $(BENCH_GATE_PACKAGES)

# Prints the gate's sample/iteration configuration in shell-evalable
# form. CI evaluates this on the PR head and passes the values into
# the merge-base `make bench-gate` invocation, so both sides measure
# identical workloads even when a PR changes the defaults above (the
# package list intentionally stays per-side).
bench-gate-config:
	@echo "BENCH_GATE_COUNT=$(BENCH_GATE_COUNT) BENCH_GATE_TIME=$(BENCH_GATE_TIME)"

# Start test PostgreSQL container
postgres-up:
	docker compose -f docker-compose.test.yml up -d --wait
@@ -484,6 +513,7 @@ help:
	@echo "  test           - Run all tests"
	@echo "  test-short     - Run fast tests only"
	@echo "  bench-backends - Benchmark SQLite, DuckDB, and PostgreSQL stores"
	@echo "  bench-gate     - Run the hot-path benchmarks CI gates PRs on"
	@echo "  test-postgres  - Run PostgreSQL integration tests"
	@echo "  test-s3        - Run S3 discovery integration tests (Docker)"
	@echo "  postgres-up    - Start test PostgreSQL container"
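
The `bench-gate-config` target above prints its settings as shell assignments so that the workflow's merge-base step can `eval` them and still gate against partial output when the base run fails. The sketch below reproduces that control flow with stand-in functions in place of the two `make` invocations. The function names `bench_gate_config` and `run_base_bench` and the `BenchmarkExample` result line are hypothetical; only the assignment format, the default values, and the warning text come from the diff.

```shell
#!/bin/sh
set -eu

# Hypothetical stand-in for `make -s bench-gate-config` on the PR head.
# It prints the defaults from the Makefile in shell-evalable form.
bench_gate_config() {
    echo "BENCH_GATE_COUNT=6 BENCH_GATE_TIME=20x"
}

# Hypothetical stand-in for the merge-base `make bench-gate`: it emits one
# result line and then fails, like a base run with one broken package.
run_base_bench() {
    echo "BenchmarkExample-4 20 1000 ns/op 64 B/op 2 allocs/op"
    return 1
}

old_file=$(mktemp)

# Import the PR head's sample count and iteration count into this shell.
eval "$(bench_gate_config)"

# A failing base run becomes a visible warning, not a failed step; whatever
# output it produced before failing is kept as the baseline.
warning=""
if ! run_base_bench > "$old_file"; then
    warning="::warning title=Bench Gate::merge-base benchmark run exited non-zero; gating against its partial output"
    echo "$warning"
fi

baseline_lines=$(wc -l < "$old_file" | tr -d ' ')
echo "count=$BENCH_GATE_COUNT time=$BENCH_GATE_TIME baseline_lines=$baseline_lines"
rm -f "$old_file"
```

Because the failing command sits in an `if !` condition, `set -e` does not abort the script, and the single result line written before the failure remains in the baseline file for the comparison step.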