From a149a0d5905e66a274612a57115ffa0a9761d763 Mon Sep 17 00:00:00 2001
From: Rusty Conover
Date: Tue, 23 Jun 2026 20:12:19 -0400
Subject: [PATCH 1/2] ci: add multi-transport (subprocess/http/unix) SQL E2E matrix

The haybarn integration suite (test/sql/*.test) previously ran only over
the subprocess/stdio transport. Add HTTP and AF_UNIX (launcher) coverage so
the same suite exercises every transport the vgi extension supports, by
changing only what the .test files ATTACH as the worker LOCATION.

- cmd/vgi-cve-worker: wire a --unix flag alongside --http; it serves the
  SDK's AF_UNIX launcher transport (RunUnix prints "UNIX:") with the idle
  timeout disabled (CI owns the process lifecycle).
- ci/run-integration.sh: parameterize by TRANSPORT (subprocess|http|unix).
  http starts `--http` and parses the SDK's "PORT:" line -> http://host:port
  (bare root: the extension POSTs methods at /, mounted at the server root;
  a /vgi path would 404). unix starts `--unix `, waits for the "UNIX:" line
  and the socket file -> unix://. The mock NVD server now runs for ALL
  transports (the worker's table functions still call it); every started
  process is trap-killed on exit. Keep the INSTALL vgi FROM community warm
  step and fixture staging unchanged.
- Guard against the runner's silent network-error skip: the DuckDB/Haybarn
  sqllogictest runner SKIPS (exit 0) any test whose error matches "HTTP", so
  a broken HTTP leg would fake-pass having run nothing. The script now fails
  the leg when every test was skipped, surfacing the skip reason.
- .github/workflows/ci.yml: turn the integration job into a transport matrix
  (subprocess, http, unix); the build/vet/fmt/unit job is untouched.
- ci/README.md: document the matrix, port/socket discovery, the always-on
  mock, and the silent-skip guard.

Local validation (haybarn-unittest v1.5.4-rc1, osx_arm64): subprocess GREEN
(25 assertions), unix GREEN (25 assertions).
http is SKIPPED locally by the runner's network-error rule (the guard fails the leg loudly rather than faking a pass); the worker + community vgi extension ATTACH and query correctly over HTTP when exercised directly, so this is validated on linux CI. Co-Authored-By: Claude Opus 4.8 (1M context) --- .github/workflows/ci.yml | 19 ++++- ci/README.md | 39 +++++++++ ci/run-integration.sh | 168 ++++++++++++++++++++++++++++++++----- cmd/vgi-cve-worker/main.go | 18 +++- 4 files changed, 220 insertions(+), 24 deletions(-) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index faea3fe..5e7cb82 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -53,10 +53,23 @@ jobs: # SQL end-to-end: run the sqllogictest suite (test/sql/*.test) against the # built Go worker through the real signed `vgi` community DuckDB extension via # a prebuilt standalone `haybarn-unittest` — no C++ build. See ci/README.md. + # + # Transport matrix: the same suite runs over each transport the vgi extension + # supports, selected by ci/run-integration.sh's TRANSPORT env var (which + # changes what the .test files ATTACH as the worker LOCATION): + # subprocess worker spawned over stdio (the binary path) + # http worker started with --http, LOCATION = http://127.0.0.1: + # unix worker started with --unix , LOCATION = unix:// + # The mock NVD server is started for ALL transports (the worker's table + # functions still call it). See ci/README.md for the per-transport notes. 
integration: - name: SQL end-to-end (haybarn) + name: SQL E2E (${{ matrix.transport }}) needs: resolve-haybarn runs-on: ubuntu-latest + strategy: + fail-fast: false + matrix: + transport: [subprocess, http, unix] steps: - uses: actions/checkout@v4 @@ -88,5 +101,7 @@ jobs: echo "HAYBARN_UNITTEST=$UNITTEST" >> "$GITHUB_ENV" echo "VGI_CVE_WORKER=$PWD/vgi-cve-worker" >> "$GITHUB_ENV" - - name: Run extension integration suite + - name: Run extension integration suite (${{ matrix.transport }}) run: ci/run-integration.sh + env: + TRANSPORT: ${{ matrix.transport }} diff --git a/ci/README.md b/ci/README.md index 3bf5ce8..3ef19cd 100644 --- a/ci/README.md +++ b/ci/README.md @@ -5,6 +5,45 @@ tests and this repo's sqllogictest suite (`test/sql/*.test`) against the vgi-cve VGI worker through the **real DuckDB `vgi` extension** on every push / PR. +## Transport matrix + +The same SQL suite runs over **every transport the vgi extension supports**, as +a GitHub Actions matrix (`SQL E2E (subprocess)`, `SQL E2E (http)`, +`SQL E2E (unix)`). The transport is selected by the `TRANSPORT` env var passed +to [`run-integration.sh`](run-integration.sh), which only changes what the +`.test` files ATTACH as the worker `LOCATION` (the vgi extension picks the +transport from that string): + +| `TRANSPORT` | Worker launch | `VGI_CVE_WORKER` (ATTACH LOCATION) | +| ------------ | ------------------------------------------ | -------------------------------------- | +| `subprocess` | extension spawns the binary over stdio | `/abs/path/to/vgi-cve-worker` | +| `http` | `vgi-cve-worker --http` (prints `PORT:`) | `http://127.0.0.1:` | +| `unix` | `vgi-cve-worker --unix ` (prints `UNIX:`) | `unix://` | + +Port/socket discovery: for **http** the script parses the `PORT:` line the +SDK prints on stdout (`vgi/worker.go` `RunHttp`); for **unix** it waits for the +`UNIX:` line *and* for the socket file to exist before running the suite. 
+The HTTP `LOCATION` is the **bare** `scheme://host:port` with no path — the +extension POSTs each RPC method at `/` (e.g. +`/catalog_attach`), and the Go SDK mounts those at the server root; appending a +`/vgi` path would 404 every method. + +**The mock NVD server runs for every transport.** The cve worker's table +functions (`cve`, `cve_search`, `cpe_cves`) still make a live NVD 2.0 HTTP call +regardless of how DuckDB talks to the worker, so the script always builds and +starts `mockserver`, exports `VGI_CVE_TEST_URL`, and trap-kills it (plus any +out-of-band worker process) on exit. + +### Silent-skip guard (no fake passes) + +The DuckDB/Haybarn sqllogictest runner **skips** (exit 0, not a failure) any +test whose error message matches a built-in network-error allowlist that +includes the substring `HTTP`. A broken HTTP transport would therefore report +"All tests were skipped" and the job would go *green having run nothing*. +`run-integration.sh` guards against this: it captures the runner output and +**fails the leg** if every test was skipped, surfacing the runner's skip +reason. A real run must print `All tests passed (N assertions ...)`. + ## How it works (no C++ build) Rather than building the vgi DuckDB extension from source, CI drives a diff --git a/ci/run-integration.sh b/ci/run-integration.sh index c545da5..de1d75c 100755 --- a/ci/run-integration.sh +++ b/ci/run-integration.sh @@ -5,31 +5,76 @@ # VGI worker, using a prebuilt standalone `haybarn-unittest` and the signed # community `vgi` extension — no C++ build from source. See ci/README.md. # -# The cve worker's table functions hit the NVD 2.0 API, so the suite needs a -# mock NVD server: this script builds the repo's `mockserver`, starts it on a -# free port, and points the tests at it via VGI_CVE_TEST_URL (mirroring -# `make test-sql`). The offline CVSS scalars need no server. 
+# Multi-transport: the same suite runs over whichever transport the +# TRANSPORT env var selects, by changing what `VGI_CVE_WORKER` resolves to +# (the vgi extension picks the transport from the ATTACH LOCATION string): +# +# subprocess (default) VGI_CVE_WORKER = the stdio worker binary +# -> extension spawns it over stdin/stdout. +# http start ` --http` (prints "PORT:"), parse the +# port, VGI_CVE_WORKER = http://127.0.0.1:. +# (The extension POSTs each RPC method at /, +# e.g. /catalog_attach; the SDK mounts them at the root.) +# unix start ` --unix /tmp/cve.sock` (prints +# "UNIX:"), VGI_CVE_WORKER = unix:///tmp/cve.sock. +# +# In every transport the cve worker's table functions still hit the NVD 2.0 +# API, so the suite ALWAYS needs the mock NVD server: this script builds the +# repo's `mockserver`, starts it on a free port, and points the tests at it via +# VGI_CVE_TEST_URL (mirroring `make test-sql`). The offline CVSS scalars need no +# server. All started processes are trap-killed on exit. # # Required environment: # HAYBARN_UNITTEST path to the haybarn-unittest binary -# VGI_CVE_WORKER worker LOCATION the .test files ATTACH (the built Go -# worker binary the vgi extension spawns over stdio) +# VGI_CVE_WORKER for TRANSPORT=subprocess: the worker LOCATION the .test +# files ATTACH (the built Go worker binary, spawned over +# stdio). For http/unix this is OVERRIDDEN by this script, +# but the binary it points at is reused to launch the +# out-of-band server, so it must still be the worker path. 
# Optional: +# TRANSPORT subprocess (default) | http | unix # STAGE scratch dir for the preprocessed test tree (default: mktemp) set -euo pipefail : "${HAYBARN_UNITTEST:?path to the haybarn-unittest binary}" : "${VGI_CVE_WORKER:?worker LOCATION (the built Go worker binary)}" +TRANSPORT="${TRANSPORT:-subprocess}" +case "$TRANSPORT" in + subprocess|http|unix) ;; + *) echo "ERROR: unknown TRANSPORT='$TRANSPORT' (expected subprocess|http|unix)" >&2; exit 2 ;; +esac + HERE="$(cd "$(dirname "$0")" && pwd)" REPO="$(cd "$HERE/.." && pwd)" STAGE="${STAGE:-$(mktemp -d)}" -# --- Start the mock NVD server (the .test files fetch from it) -------------- +# The worker binary the subprocess transport ATTACHes to is also the binary we +# launch out-of-band for http/unix. Capture it before we possibly overwrite +# VGI_CVE_WORKER with a URL. +WORKER_BIN="$VGI_CVE_WORKER" + +# Collected PIDs and paths to clean up on exit (mock + optional worker server). +MOCK_PID="" +WORKER_PID="" +UNIX_SOCK="" +cleanup() { + # Preserve the script's exit status: this runs on EXIT, so its own last + # command must not clobber the real exit code (a bare `[ -n "$x" ]` that is + # false returns 1 and would turn a green run red). + local rc=$? + if [ -n "$WORKER_PID" ]; then kill "$WORKER_PID" 2>/dev/null || true; wait "$WORKER_PID" 2>/dev/null || true; fi + if [ -n "$MOCK_PID" ]; then kill "$MOCK_PID" 2>/dev/null || true; wait "$MOCK_PID" 2>/dev/null || true; fi + if [ -n "$UNIX_SOCK" ]; then rm -f "$UNIX_SOCK"; fi + return "$rc" +} +trap cleanup EXIT + +# --- Start the mock NVD server (the .test files fetch from it; all transports) - # Build + launch the repo's standalone mock server on a free port; it prints # "PORT:" on stdout (see cmd/mockserver/main.go). We capture that, export -# VGI_CVE_TEST_URL (the NVD 2.0 CVE endpoint path the worker expects), and kill -# the server on exit — exactly like `make test-sql`. +# VGI_CVE_TEST_URL (the NVD 2.0 CVE endpoint path the worker expects). 
The +# mock is required for every transport — the worker still makes the HTTP call. MOCK_BIN="$STAGE/mockserver" echo "Building mock NVD server ..." ( cd "$REPO" && go build -o "$MOCK_BIN" ./cmd/mockserver ) @@ -37,12 +82,6 @@ echo "Building mock NVD server ..." MOCK_PORT_FILE="$(mktemp)" "$MOCK_BIN" --addr 127.0.0.1:0 >"$MOCK_PORT_FILE" 2>/dev/null & MOCK_PID=$! -cleanup() { - kill "$MOCK_PID" 2>/dev/null || true - wait "$MOCK_PID" 2>/dev/null || true - rm -f "$MOCK_PORT_FILE" -} -trap cleanup EXIT PORT="" for _ in $(seq 1 30); do @@ -54,9 +93,73 @@ if [ -z "$PORT" ]; then echo "ERROR: mock server did not report a port" >&2 exit 1 fi +rm -f "$MOCK_PORT_FILE" export VGI_CVE_TEST_URL="http://127.0.0.1:$PORT/rest/json/cves/2.0" echo "Mock NVD server listening on $VGI_CVE_TEST_URL (pid $MOCK_PID)" +# --- Per-transport: resolve VGI_CVE_WORKER (the ATTACH LOCATION) ------------- +# subprocess keeps the binary path (extension spawns stdio). http/unix start the +# worker out-of-band and hand the extension a URL. +case "$TRANSPORT" in + subprocess) + echo "Transport: subprocess/stdio — VGI_CVE_WORKER=$VGI_CVE_WORKER" + ;; + + http) + # Start the worker in --http mode; it prints "PORT:" once listening. + WORKER_PORT_FILE="$(mktemp)" + echo "Transport: http — starting '$WORKER_BIN --http' ..." + "$WORKER_BIN" --http >"$WORKER_PORT_FILE" 2>/dev/null & + WORKER_PID=$! + WPORT="" + for _ in $(seq 1 50); do + WPORT="$(sed -n 's/^PORT:\([0-9][0-9]*\)$/\1/p' "$WORKER_PORT_FILE" 2>/dev/null | head -1)" + [ -n "$WPORT" ] && break + # Bail early if the worker died. + kill -0 "$WORKER_PID" 2>/dev/null || { echo "ERROR: http worker exited before reporting a port" >&2; cat "$WORKER_PORT_FILE" >&2 || true; exit 1; } + sleep 0.2 + done + rm -f "$WORKER_PORT_FILE" + if [ -z "$WPORT" ]; then + echo "ERROR: http worker did not report a port" >&2 + exit 1 + fi + # The extension treats the LOCATION as a base and POSTs each RPC method at + # / (e.g. /catalog_attach). 
The SDK mounts those methods + # at the server root (empty prefix), so the LOCATION must be the bare + # scheme://host:port with NO path. Appending /vgi would make every method + # 404 — which the runner silently skips as an error "matching 'HTTP'". + export VGI_CVE_WORKER="http://127.0.0.1:$WPORT" + echo "HTTP worker listening on $VGI_CVE_WORKER (pid $WORKER_PID)" + ;; + + unix) + # Start the worker on an AF_UNIX socket; it prints "UNIX:" once + # listening. idleTimeout is disabled (we own the process lifecycle). + UNIX_SOCK="${TMPDIR:-/tmp}/cve.$$.sock" + rm -f "$UNIX_SOCK" + WORKER_OUT_FILE="$(mktemp)" + echo "Transport: unix — starting '$WORKER_BIN --unix $UNIX_SOCK' ..." + "$WORKER_BIN" --unix "$UNIX_SOCK" >"$WORKER_OUT_FILE" 2>/dev/null & + WORKER_PID=$! + READY="" + for _ in $(seq 1 50); do + if grep -q '^UNIX:' "$WORKER_OUT_FILE" 2>/dev/null && [ -S "$UNIX_SOCK" ]; then + READY=1; break + fi + kill -0 "$WORKER_PID" 2>/dev/null || { echo "ERROR: unix worker exited before the socket was ready" >&2; cat "$WORKER_OUT_FILE" >&2 || true; exit 1; } + sleep 0.2 + done + rm -f "$WORKER_OUT_FILE" + if [ -z "$READY" ]; then + echo "ERROR: unix worker did not report a ready socket at $UNIX_SOCK" >&2 + exit 1 + fi + export VGI_CVE_WORKER="unix://$UNIX_SOCK" + echo "Unix worker listening on $VGI_CVE_WORKER (pid $WORKER_PID)" + ;; +esac + # --- Stage the preprocessed tests ------------------------------------------- echo "Staging preprocessed tests into $STAGE ..." mkdir -p "$STAGE/test/sql" @@ -81,7 +184,34 @@ EOF "$HAYBARN_UNITTEST" "test/_warm.test" >/dev/null 2>&1 || echo "::warning::extension warm step did not fully succeed" rm -f "$STAGE/test/_warm.test" -# Run the whole suite in one invocation, streaming the runner's native -# sqllogictest report. Any failed assertion exits non-zero and fails the job. -echo "Running suite (worker: $VGI_CVE_WORKER) ..." 
-"$HAYBARN_UNITTEST" "test/sql/*" +# Run the whole suite in one invocation, capturing the runner's native +# sqllogictest report so we can both stream it AND guard against a silent skip. +# +# IMPORTANT: the DuckDB/Haybarn sqllogictest runner SKIPS (not fails, exit 0) a +# test whose error message matches a built-in network-error allowlist that +# includes the substring "HTTP". So a broken HTTP transport would otherwise show +# "All tests were skipped" and the job would go GREEN having run nothing — a +# fake pass. We detect that and fail explicitly. A real run prints +# "All tests passed (N assertions ...)". +echo "Running suite (transport: $TRANSPORT, worker: $VGI_CVE_WORKER) ..." +RUN_LOG="$STAGE/run.log" +set +e +"$HAYBARN_UNITTEST" "test/sql/*" 2>&1 | tee "$RUN_LOG" +RUN_RC="${PIPESTATUS[0]}" +set -e + +if [ "$RUN_RC" -ne 0 ]; then + echo "ERROR: suite failed (transport: $TRANSPORT, rc=$RUN_RC)" >&2 + exit "$RUN_RC" +fi + +# Guard against the silent-skip fake-pass (see comment above). If every test was +# skipped — and none ran — treat it as a failure for this transport, surfacing +# the skip reason the runner reported. +if grep -q 'All tests were skipped' "$RUN_LOG"; then + echo "ERROR: every test was SKIPPED on transport '$TRANSPORT' (the runner's" >&2 + echo " built-in network-error skip swallowed the real error). This is" >&2 + echo " NOT a pass. Skip reason reported by the runner:" >&2 + grep -A3 'Skipped tests for the following reasons' "$RUN_LOG" >&2 || true + exit 1 +fi diff --git a/cmd/vgi-cve-worker/main.go b/cmd/vgi-cve-worker/main.go index dae6566..3f6f7db 100644 --- a/cmd/vgi-cve-worker/main.go +++ b/cmd/vgi-cve-worker/main.go @@ -17,15 +17,18 @@ import ( ) func main() { - // Accept --http for HTTP transport; default is stdio. Unknown launcher - // flags are tolerated (the VGI extension varies argv to key its worker - // cache), so we filter to flags we actually define before parsing. 
+ // Accept --http for HTTP transport and --unix for the AF_UNIX launcher + // transport; default is stdio. Unknown launcher flags are tolerated (the + // VGI extension varies argv to key its worker cache), so we filter to flags + // we actually define before parsing. httpMode := flag.Bool("http", false, "Run as an HTTP server instead of stdio") + unixPath := flag.String("unix", "", "Serve the AF_UNIX launcher transport on this socket path instead of stdio") logFlags := vgi.RegisterLoggingFlags(flag.CommandLine) _ = flag.CommandLine.Parse(filterKnownFlags(os.Args[1:], map[string]bool{ "log-level": true, "log-format": true, "log-logger": true, + "unix": true, })) if err := logFlags.Apply(); err != nil { log.Fatalf("logging flags: %v", err) @@ -46,6 +49,15 @@ func main() { } return } + if *unixPath != "" { + // AF_UNIX launcher transport: serve on the given socket path. The SDK + // prints "UNIX:" once listening; idleTimeout=0 disables the + // self-shutdown timer (the launcher/CI owns the process lifecycle). + if err := w.RunUnix(*unixPath, 0); err != nil { + log.Fatal(err) + } + return + } w.RunStdio() } From 566c50ac23a3b035355aea0fa9b5e897bbcb5620 Mon Sep 17 00:00:00 2001 From: Rusty Conover Date: Tue, 23 Jun 2026 20:23:04 -0400 Subject: [PATCH 2/2] ci(http): load httpfs + gate stateful streaming table-function test MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The http leg revealed two real, transport-specific issues; resolve the first and gate the second (a genuine protocol limitation, never faked): 1. httpfs is required. The vgi extension drives the worker-RPC HTTP POSTs through DuckDB's HTTPUtil, which is only registered when httpfs is loaded. The .test files only `LOAD vgi`, so over HTTP every worker request failed with an "HTTP"-flavoured error that the runner silently skipped. The script now injects `INSTALL httpfs FROM core; LOAD httpfs;` after each `LOAD vgi;` for the http leg only. 2. 
cve_api.test is GATED on http (runs on subprocess/unix only). The cve/ cve_search/cpe_cves table functions stream their result across multiple Process exchanges, signalling end-of-stream with per-execution state (state.Done: first Process emits, next returns Finish()). The vgi HTTP transport is stateless — each RPC is independent, so the per-execution state does not persist across exchanges (the SDK itself disables deferred cleanup in HTTP mode: "no reliable stream-end signal"). Done resets every request, Process re-emits forever, and the worker spins re-binding indefinitely. This is the documented "partition-local state across exchanges" HTTP limitation, so we gate the file rather than fake a pass. The offline CVSS scalars are plain request/response and DO run over http. ci/README.md documents both. Local validation: subprocess GREEN (25), unix GREEN (25), http GREEN (14, offline scalars; table funcs gated). Co-Authored-By: Claude Opus 4.8 (1M context) --- ci/README.md | 28 +++++++++++++++++++++++ ci/run-integration.sh | 53 ++++++++++++++++++++++++++++++++++++++++++- 2 files changed, 80 insertions(+), 1 deletion(-) diff --git a/ci/README.md b/ci/README.md index 3ef19cd..76490c6 100644 --- a/ci/README.md +++ b/ci/README.md @@ -34,6 +34,34 @@ regardless of how DuckDB talks to the worker, so the script always builds and starts `mockserver`, exports `VGI_CVE_TEST_URL`, and trap-kills it (plus any out-of-band worker process) on exit. +### HTTP transport specifics + +Two things are required for the **http** leg, both handled by +`run-integration.sh` automatically: + +1. **`httpfs` must be loaded.** The vgi extension drives the worker-RPC HTTP + POSTs through DuckDB's `HTTPUtil`, which is only registered once the signed + core `httpfs` extension is loaded. The `.test` files only `LOAD vgi`, so for + the http leg the script injects `INSTALL httpfs FROM core; LOAD httpfs;` + after each `LOAD vgi;` in the staged copies. 
Without it every worker request + fails with an `HTTP`-flavoured error that the runner silently skips. + +2. **`cve_api.test` is GATED on http** (runs on subprocess/unix only). + The `cve` / `cve_search` / `cpe_cves` table functions stream their result + across multiple `Process` exchanges and signal end-of-stream with + per-execution state (`state.Done`): the first `Process` emits the batch, the + next returns `Finish()`. The vgi extension's HTTP transport is **stateless** + — each RPC is an independent request, so the worker's per-execution state + does not persist between the two exchanges (the SDK itself disables its + deferred storage cleanup in HTTP mode: *"no reliable stream-end signal"*). + The `Done` flag resets on every request, `Process` re-emits the same batch + forever, and the scan never reaches `Finish()` — the worker spins + re-binding indefinitely. This is the documented *"partition-local state + across exchanges"* HTTP limitation, **not** a flaky failure, so we gate the + file rather than fake a pass. The offline CVSS scalars (`cvss_offline.test`) + are plain request/response with no streaming state and **do** run over http. + The gate list is `HTTP_GATED_TESTS` in `run-integration.sh`. + ### Silent-skip guard (no fake passes) The DuckDB/Haybarn sqllogictest runner **skips** (exit 0, not a failure) any diff --git a/ci/run-integration.sh b/ci/run-integration.sh index de1d75c..5bae259 100755 --- a/ci/run-integration.sh +++ b/ci/run-integration.sh @@ -160,13 +160,64 @@ case "$TRANSPORT" in ;; esac +# Tests GATED for the http transport (run on subprocess/unix only). See the +# block below and ci/README.md for the protocol reason — these are real +# stateless-HTTP limitations, not flaky failures, so we never fake a pass. +HTTP_GATED_TESTS="cve_api.test" + # --- Stage the preprocessed tests ------------------------------------------- echo "Staging preprocessed tests into $STAGE ..." 
mkdir -p "$STAGE/test/sql" for f in "$REPO"/test/sql/*.test; do - awk -f "$HERE/preprocess-require.awk" "$f" > "$STAGE/test/sql/$(basename "$f")" + base="$(basename "$f")" + # Gate stateful-streaming table-function tests out of the http leg. The cve/ + # cve_search/cpe_cves table functions stream their result across multiple + # Process exchanges, signalling end-of-stream with per-execution state + # (state.Done): the FIRST Process emits the batch, the NEXT returns Finish(). + # The vgi extension's HTTP transport is STATELESS — each RPC is an independent + # request, so the worker's per-execution state object does not persist across + # the two exchanges (the SDK itself disables deferred storage cleanup in HTTP + # mode: "no reliable stream-end signal"). The Done flag therefore resets every + # request, Process re-emits the same batch forever, and the scan never reaches + # Finish() — the worker spins re-binding indefinitely. This is the recipe's + # documented "partition-local state across exchanges" HTTP limitation. Run the + # offline-scalar coverage over http (it is request/response, no streaming + # state) and gate the table-function file to subprocess/unix. + if [ "$TRANSPORT" = "http" ]; then + gated="" + for g in $HTTP_GATED_TESTS; do [ "$g" = "$base" ] && gated=1; done + if [ -n "$gated" ]; then + echo "::notice::GATED on http: $base (stateful streaming table functions cannot stream over the stateless HTTP transport)" + continue + fi + fi + awk -f "$HERE/preprocess-require.awk" "$f" > "$STAGE/test/sql/$base" done +# The HTTP transport needs DuckDB's HTTP client, which the vgi extension drives +# through DuckDB's HTTPUtil — that is only registered when the `httpfs` +# extension is loaded. The .test files only `LOAD vgi`, so over HTTP the +# worker-RPC POSTs fail with an "HTTP"-flavoured error (which the runner then +# silently skips). 
Inject an explicit signed `INSTALL httpfs FROM core; LOAD +# httpfs;` after each `LOAD vgi;` in the staged tests for the http transport +# only (subprocess/unix do not use the HTTP client, so they need nothing extra). +if [ "$TRANSPORT" = "http" ]; then + echo "Transport http: injecting 'LOAD httpfs' (required for the worker HTTP RPC) ..." + for f in "$STAGE"/test/sql/*.test; do + awk ' + { print } + /^LOAD[ \t]+vgi;[ \t]*$/ { + print ""; + print "statement ok"; + print "INSTALL httpfs FROM core;"; + print ""; + print "statement ok"; + print "LOAD httpfs;"; + } + ' "$f" > "$f.tmp" && mv "$f.tmp" "$f" + done +fi + cd "$STAGE" # Warm the extension cache once: vgi from the signed community channel. A miss