A locally-runnable emulator of Google Cloud BigQuery, intended for local development and integration testing of applications that target the BigQuery REST API.
Status: preview (v0.x). The Go REST gateway implements projects / datasets / tables / tabledata / jobs / queries end-to-end against the C++ engine, plus wired stubs for the surfaces client libraries probe at startup (models, routines, row-access policies, migration, data transfer, discovery).

The C++ engine links GoogleSQL directly and runs a local execution coordinator behind a single Engine interface: a resolved-AST router dispatches each query shape to the strategy that fits. The strategies are DuckDB-native SQL for fast analytical work, DuckDB UDFs / rewrites for the BigQuery functions we can polyfill, a local semantic executor for exact BigQuery evaluation, and storage/catalog handlers for DDL and metadata operations. Results land back as REST f/v JSON or, on the internal gRPC Storage Read API, as native Arrow batches.

DDL (CREATE TABLE, CREATE TABLE AS SELECT, DROP TABLE, …) and DML (INSERT, UPDATE, DELETE, MERGE, pipe INSERT, deep-STRUCT UPDATE, THEN RETURN on INSERT/UPDATE/DELETE) execute locally. GoogleSQL does not define THEN RETURN on MERGE. See ROADMAP.md for the capability-area narrative and the documentation site (or docs/README.md) for the full guide index.
This emulator is modeled directly on Google's
cloud-spanner-emulator:
+-------------------------------+        +-----------------------------------+
| gateway_main (Go)             |  gRPC  | emulator_main (C++)               |
|                               | <----> |                                   |
| - Implements BigQuery REST    |        | - Links GoogleSQL directly        |
|   (projects/datasets/tables/  |        | - Local execution coordinator:    |
|   jobs/queries/insertAll)     |        |   - DuckDB fast path              |
| - Spawns engine as subproc    |        |   - DuckDB UDFs / rewrites        |
|                               |        |   - Local semantic executor       |
|                               |        |   - Catalog / control ops         |
|                               |        | - DuckDB-backed persistent store  |
+-------------------------------+        +-----------------------------------+
- The engine is C++ so it can link GoogleSQL directly. SQL parsing, name resolution, and type inference come from upstream; execution dispatches through a local route classifier that picks the right local strategy for each resolved-AST shape.
- DuckDB is the fast analytical path, not the whole engine. Shapes that lower cleanly run there; shapes that need exact BigQuery semantics run on a local semantic executor; DDL and metadata ops go through the catalog/storage layer directly.
- The REST gateway is Go — REST routes, jobs lifecycle, datasets/tables/projects model, streaming inserts, error envelope, discovery doc.
- The Go gateway spawns the C++ engine as a subprocess on startup and
  shuts it down cleanly on exit, identical to how
  gateway_main spawns emulator_main in the Spanner emulator.
- The whole stack runs local-only: query work never forwards to a real BigQuery project.
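
The subprocess lifecycle described in the list above (spawn on startup, clean shutdown on exit) can be sketched in a few lines. This is a Python illustration of the pattern only; the real gateway is Go, and the helper name `run_with_engine`, the `serve` callback, and the five-second grace period are assumptions made for this example.

```python
import subprocess
import sys


def run_with_engine(engine_cmd, serve, grace_seconds=5.0):
    """Start engine_cmd as a child process, run serve(engine), then stop the child.

    Hypothetical helper: it mirrors the spawn/shutdown pattern, not the
    gateway's actual Go code.
    """
    engine = subprocess.Popen(engine_cmd)
    try:
        return serve(engine)
    finally:
        # Ask the child to exit, and force-kill only if it ignores the request.
        if engine.poll() is None:
            engine.terminate()
            try:
                engine.wait(timeout=grace_seconds)
            except subprocess.TimeoutExpired:
                engine.kill()
                engine.wait()


if __name__ == "__main__":
    # A sleeping Python process stands in for emulator_main here.
    stand_in = [sys.executable, "-c", "import time; time.sleep(60)"]
    run_with_engine(stand_in, lambda engine: print("engine pid:", engine.pid))
```

Putting the shutdown in a `finally` block means the child is stopped even when the serving code raises, which is the property the "shuts it down cleanly on exit" bullet describes.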
See ROADMAP.md for the design rationale and
docs/ENGINE_POLICY.md for the route catalog.
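
As a rough illustration of the routing idea, the sketch below picks one of the four strategies from a statement kind and the functions a query calls. It is a toy in Python, not the engine's C++ router: the statement kinds, the function sets, and the `classify` helper are hypothetical, and the real classifier works on GoogleSQL resolved ASTs using the route catalog in docs/ENGINE_POLICY.md.

```python
from enum import Enum


class Route(Enum):
    DUCKDB_NATIVE = "DuckDB-native SQL"
    DUCKDB_POLYFILL = "DuckDB UDFs / rewrites"
    SEMANTIC_EXECUTOR = "local semantic executor"
    CATALOG = "catalog / storage handlers"


# Hypothetical, illustrative sets; NOT the emulator's real route catalog.
CATALOG_STATEMENTS = frozenset({"CREATE_TABLE", "DROP_TABLE"})
NATIVE_FUNCTIONS = frozenset({"SUM", "COUNT"})
POLYFILLED_FUNCTIONS = frozenset({"SAFE_DIVIDE"})


def classify(statement_kind, functions):
    """Pick a strategy for one statement (toy stand-in for a resolved-AST router)."""
    called = frozenset(functions)
    if statement_kind in CATALOG_STATEMENTS:
        # DDL and metadata operations bypass query execution entirely.
        return Route.CATALOG
    if called - NATIVE_FUNCTIONS - POLYFILLED_FUNCTIONS:
        # Anything neither native nor polyfilled needs exact local evaluation.
        return Route.SEMANTIC_EXECUTOR
    if called & POLYFILLED_FUNCTIONS:
        return Route.DUCKDB_POLYFILL
    return Route.DUCKDB_NATIVE
```

The ordering of the checks is the design point: the fast path is chosen only after the more specific cases have been ruled out, so an unsupported shape is never sent to DuckDB by default.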
The fastest path is the published Docker image — no Bazel build, no
GoogleSQL checkout. Architecture: release archives and the default
Docker image ship a linux/amd64 engine binary; Apple Silicon and
Graviton hosts should use the published Docker image (or wait for the
in-flight arm64 release lane — see docs/RELEASES.md).
docker run --rm -p 9050:9050 ghcr.io/vantaboard/bigquery-emulator:latest
# In another shell:
curl -fsS http://localhost:9050/healthz
curl -fsS -X POST http://localhost:9050/bigquery/v2/projects/test/queries \
-H 'Content-Type: application/json' \
-d '{"query":"SELECT 1 AS n","useLegacySql":false}'

To build and run locally:
mise install # or: task tools:install
task emulator:build-engine:bazel # stages bin/emulator_main + bin/libduckdb.so
task emulator:run-full # gateway on :9050, engine on :9060

By default, the gateway looks for emulator_main next to
bigquery-emulator-gateway. Pass --engine_binary=<path> to override, or
--engine_binary="" to skip the engine subprocess entirely.
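
The three cases for --engine_binary (flag absent, explicit path, empty string) can be summarized as a small decision function. This is a Python sketch of the documented behavior, not the gateway's Go code; `resolve_engine_binary` is a hypothetical name, and modelling "flag not passed" as `None` is an assumption of this example.

```python
import os
from typing import Optional

ENGINE_NAME = "emulator_main"


def resolve_engine_binary(gateway_path: str,
                          engine_flag: Optional[str] = None) -> Optional[str]:
    """Return the engine path to spawn, or None to skip the subprocess.

    engine_flag models --engine_binary; None means the flag was not passed
    (an assumption of this sketch).
    """
    if engine_flag is None:
        # Default: look for the engine next to the gateway binary.
        return os.path.join(os.path.dirname(gateway_path), ENGINE_NAME)
    if engine_flag == "":
        # An explicit empty value skips the engine subprocess entirely.
        return None
    return engine_flag
```

For example, `resolve_engine_binary("/opt/bq/bigquery-emulator-gateway", "")` returns `None`, which corresponds to running the gateway without an engine.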
Next steps: Docker · Client libraries · Development setup · Releases
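
For scripted smoke tests, the curl query from the quick start can also be issued with the Python standard library. The sketch below is a minimal example that assumes the emulator is listening on localhost:9050 as in the commands above; the function names are hypothetical, and official client libraries are covered in docs/CLIENTS.md.

```python
import json
import urllib.request


def build_query_request(base_url, project, sql):
    """Build the POST request for /bigquery/v2/projects/<project>/queries."""
    url = f"{base_url.rstrip('/')}/bigquery/v2/projects/{project}/queries"
    body = json.dumps({"query": sql, "useLegacySql": False}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(base_url, project, sql, timeout=10.0):
    """Send the query and return the decoded JSON response body."""
    request = build_query_request(base_url, project, sql)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the emulator running, `run_query("http://localhost:9050", "test", "SELECT 1 AS n")` sends the same request as the curl command. Building the request separately from sending it keeps the URL and payload easy to check without a live server.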
The bench/ harness compares query latency and correctness
across three backends: this emulator (vantaboard), the
goccy/bigquery-emulator
Docker image (0.8.1), and committed BigQuery golden baselines. Run
locally with task bench:run; see bench/README.md
for case format, baseline capture, phase timing, and profiling.
Charts below are regenerated by task bench:charts (and CI on main via
.github/workflows/bench.yml) from
bench/results.json. Live copies also publish to
gh-pages bench/.
Published guides: vantaboard.github.io/bigquery-emulator
| Topic | Guide |
|---|---|
| Full index (GitHub) | docs/README.md |
| REST API surface | docs/REST_API.md |
| Engine execution policy | docs/ENGINE_POLICY.md |
| Seeding & CLI flags | docs/SEEDING.md |
| Development & building | docs/DEVELOPMENT.md |
| Docker | docs/DOCKER.md |
| Client libraries | docs/CLIENTS.md |
| Releases & install | docs/RELEASES.md |
| Capability roadmap | ROADMAP.md |
MIT. See LICENSE.