BigQuery Emulator

A locally-runnable emulator of Google Cloud BigQuery, intended for local development and integration testing of applications that target the BigQuery REST API.

Status: preview (v0.x).

The Go REST gateway implements projects / datasets / tables / tabledata / jobs / queries end-to-end against the C++ engine, plus wired stubs for the surfaces client libraries probe at startup (models, routines, row-access policies, migration, data transfer, discovery).

The C++ engine links GoogleSQL directly and runs a local execution coordinator behind a single Engine interface. A resolved-AST router dispatches each query shape to the strategy that fits: DuckDB-native SQL for fast analytical work, DuckDB UDFs / rewrites for the BigQuery functions we can polyfill, a local semantic executor for exact BigQuery evaluation, and storage/catalog handlers for DDL and metadata operations. Results land back as REST f/v JSON or, on the internal gRPC Storage Read API, as native Arrow batches.

DDL (CREATE TABLE, CREATE TABLE AS SELECT, DROP TABLE, …) and DML (INSERT, UPDATE, DELETE, MERGE, pipe INSERT, deep-STRUCT UPDATE, THEN RETURN on INSERT/UPDATE/DELETE) execute locally. GoogleSQL does not define THEN RETURN on MERGE.

See ROADMAP.md for the capability-area narrative and the documentation site (or docs/README.md) for the full guide index.

Architecture

This emulator is modeled directly on Google's cloud-spanner-emulator:

+-------------------------------+        +-----------------------------------+
|  gateway_main (Go)            |  gRPC  |  emulator_main (C++)              |
|                               | <----> |                                   |
|  - Implements BigQuery REST   |        |  - Links GoogleSQL directly       |
|    (projects/datasets/tables/ |        |  - Local execution coordinator:   |
|     jobs/queries/insertAll)   |        |      - DuckDB fast path           |
|  - Spawns engine as subproc   |        |      - DuckDB UDFs / rewrites     |
|                               |        |      - Local semantic executor    |
|                               |        |      - Catalog / control ops      |
|                               |        |  - DuckDB-backed persistent store |
+-------------------------------+        +-----------------------------------+

The engine is C++ so it can link GoogleSQL directly. SQL parsing, name resolution, and type inference come from upstream; execution dispatches through a local route classifier that picks the right local strategy for each resolved-AST shape.
DuckDB is the fast analytical path, not the whole engine. Shapes that lower cleanly run there; shapes that need exact BigQuery semantics run on a local semantic executor; DDL and metadata ops go through the catalog/storage layer directly.
The REST gateway is written in Go and handles the REST routes, jobs lifecycle, datasets/tables/projects model, streaming inserts, error envelope, and discovery doc.
The Go gateway spawns the C++ engine as a subprocess on startup and shuts it down cleanly on exit, the same way gateway_main spawns emulator_main in the Spanner emulator.
The whole stack runs local-only — query work never forwards to a real BigQuery project.
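
The spawn-and-clean-shutdown pattern described above can be sketched in a few lines. This is only an illustration in Python, not the gateway's code (the gateway is Go and its implementation is not shown here). The names `spawned_engine` and `demo`, the 5-second timeout, and the terminate-then-kill escalation are all assumptions made for the example.

```python
import subprocess
import sys
from contextlib import contextmanager


@contextmanager
def spawned_engine(argv, shutdown_timeout_s=5.0):
    """Start a child process and make sure it is stopped when the block exits.

    Hypothetical helper: the timeout and the terminate-then-kill order are
    assumptions, not documented behavior of the emulator's gateway.
    """
    proc = subprocess.Popen(argv)
    try:
        yield proc
    finally:
        if proc.poll() is None:          # child is still running
            proc.terminate()             # ask it to stop (SIGTERM on POSIX)
            try:
                proc.wait(timeout=shutdown_timeout_s)
            except subprocess.TimeoutExpired:
                proc.kill()              # escalate if the child did not exit
                proc.wait()


def demo():
    # Stand-in child: a Python process that sleeps. A real caller would pass
    # the engine's command line instead.
    child = [sys.executable, "-c", "import time; time.sleep(60)"]
    with spawned_engine(child) as proc:
        running = proc.poll() is None
    return running, proc.returncode
```

Tying the child's lifetime to a scoped block means the engine is reaped even when the parent exits through an exception.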

See ROADMAP.md for the design rationale and docs/ENGINE_POLICY.md for the route catalog.
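
As a rough illustration of the routing idea, the toy dispatcher below maps a query "shape" to one of the four strategies named above. Everything in it is hypothetical: the `QueryShape` flags, the `route` function, and the precedence of the checks are assumptions made for this sketch. The real classifier works on GoogleSQL resolved-AST nodes, and docs/ENGINE_POLICY.md is the authoritative route catalog.

```python
from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):
    CATALOG = "storage/catalog handler"
    DUCKDB_NATIVE = "DuckDB-native SQL"
    DUCKDB_POLYFILL = "DuckDB UDFs / rewrites"
    SEMANTIC_EXECUTOR = "local semantic executor"


@dataclass(frozen=True)
class QueryShape:
    """Hypothetical summary of a resolved query; not the engine's real type."""
    is_ddl_or_metadata: bool = False
    lowers_cleanly_to_duckdb: bool = False
    has_polyfillable_functions: bool = False


def route(shape: QueryShape) -> Strategy:
    # Assumed precedence: control operations first, then the fast path,
    # then polyfills, and the exact-semantics executor as the fallback.
    if shape.is_ddl_or_metadata:
        return Strategy.CATALOG
    if shape.lowers_cleanly_to_duckdb:
        return Strategy.DUCKDB_NATIVE
    if shape.has_polyfillable_functions:
        return Strategy.DUCKDB_POLYFILL
    return Strategy.SEMANTIC_EXECUTOR
```

Falling back to the semantic executor reflects the text's framing that DuckDB is the fast path rather than the whole engine.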

Quickstart

The fastest path is the published Docker image: no Bazel build, no GoogleSQL checkout.

Architecture: release archives and the default Docker image ship a linux/amd64 engine binary; Apple Silicon and Graviton hosts should use the published Docker image (or wait for the in-flight arm64 release lane; see docs/RELEASES.md).

docker run --rm -p 9050:9050 ghcr.io/vantaboard/bigquery-emulator:latest

# In another shell:
curl -fsS http://localhost:9050/healthz
curl -fsS -X POST http://localhost:9050/bigquery/v2/projects/test/queries \
    -H 'Content-Type: application/json' \
    -d '{"query":"SELECT 1 AS n","useLegacySql":false}'
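
The same query can be issued from a script. The sketch below uses only the Python standard library, posts to the same endpoint as the curl command, and flattens the BigQuery REST f/v row encoding into dictionaries keyed by column name. The helper names (`build_query_request`, `rows_to_dicts`, `run_query`) are made up for this example, and it assumes the response carries `schema.fields` and `rows` in the usual BigQuery REST layout.

```python
import json
import urllib.request


def build_query_request(base_url, project_id, sql):
    """Build the POST request for the jobs.query-style endpoint."""
    url = f"{base_url}/bigquery/v2/projects/{project_id}/queries"
    body = json.dumps({"query": sql, "useLegacySql": False}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def rows_to_dicts(response):
    """Turn f/v rows ({"f": [{"v": ...}, ...]}) into {column: value} dicts.

    Values are passed through as they appear in the JSON; no type
    conversion is attempted here.
    """
    fields = response.get("schema", {}).get("fields", [])
    names = [field["name"] for field in fields]
    return [
        {name: cell["v"] for name, cell in zip(names, row["f"])}
        for row in response.get("rows", [])
    ]


def run_query(base_url, project_id, sql):
    """Send the query and return decoded rows (requires a running emulator)."""
    request = build_query_request(base_url, project_id, sql)
    with urllib.request.urlopen(request) as resp:
        return rows_to_dicts(json.load(resp))
```

With the container from the previous step running, `run_query("http://localhost:9050", "test", "SELECT 1 AS n")` would return one dictionary per result row.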

To build and run locally:

mise install                              # or: task tools:install
task emulator:build-engine:bazel          # stages bin/emulator_main + bin/libduckdb.so
task emulator:run-full                    # gateway on :9050, engine on :9060

By default, the gateway looks for emulator_main next to bigquery-emulator-gateway. Pass --engine_binary=<path> to override, or --engine_binary="" to skip the engine subprocess entirely.
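
The three cases above (flag absent, flag set to a path, flag set to the empty string) can be written out as a small decision function. This is a sketch of the described behavior only, not the gateway's code: `resolve_engine_binary` is a hypothetical name, and representing "flag not passed" as `None` is an assumption of the example.

```python
from pathlib import Path
from typing import Optional


def resolve_engine_binary(
    gateway_path: str, engine_binary_flag: Optional[str] = None
) -> Optional[Path]:
    """Return the engine path to spawn, or None to skip the engine.

    engine_binary_flag is None when --engine_binary was not passed at all.
    """
    if engine_binary_flag is None:
        # Default: look for emulator_main next to the gateway binary.
        return Path(gateway_path).parent / "emulator_main"
    if engine_binary_flag == "":
        # Explicit empty value: run without an engine subprocess.
        return None
    # Explicit override.
    return Path(engine_binary_flag)
```

Keeping "not passed" distinct from "passed as empty" is what lets the empty string act as an opt-out rather than falling back to the default.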

Next steps: Docker · Client libraries · Development setup · Releases

Benchmarks

The bench/ harness compares query latency and correctness across three backends: this emulator (vantaboard), the goccy/bigquery-emulator Docker image (0.8.1), and committed BigQuery golden baselines. Run locally with task bench:run; see bench/README.md for case format, baseline capture, phase timing, and profiling.

The charts below are regenerated from bench/results.json by task bench:charts (and by CI on main via .github/workflows/bench.yml). Live copies are also published to gh-pages under bench/.

Latency comparison (p50, ms)

Engine phase breakdown (vantaboard)

Documentation

Published guides: vantaboard.github.io/bigquery-emulator

| Topic | Guide |
| --- | --- |
| Full index (GitHub) | `docs/README.md` |
| REST API surface | `docs/REST_API.md` |
| Engine execution policy | `docs/ENGINE_POLICY.md` |
| Seeding & CLI flags | `docs/SEEDING.md` |
| Development & building | `docs/DEVELOPMENT.md` |
| Docker | `docs/DOCKER.md` |
| Client libraries | `docs/CLIENTS.md` |
| Releases & install | `docs/RELEASES.md` |
| Capability roadmap | `ROADMAP.md` |

License

MIT. See LICENSE.
