feat(filters): add TOML filters for bq query and bq show with CSV compression by fkztw · Pull Request #896 · rtk-ai/rtk

fkztw · 2026-03-28T09:40:11Z

Description

This PR addresses token bloat from BigQuery commands bq query and bq show by introducing TOML filters designed to mitigate noise and aggressively compress output schema and large results.

Inspired by the TOON (Token-Oriented Object Notation) project format architecture, this filter drops the inherently expensive ASCII table layouts and structural paddings typical of BigQuery CLI outputs.
By transforming the output into a more streamlined, CSV-like footprint during the proxy ingestion phase, we achieved the following optimizations:

Aggressive Noise Stripping: Excludes ubiquitous gcloud update warnings, BQ job submission statuses, and purely decorative ASCII borders (+---+---+).
TOON-Style Tabular Conversion: Dynamically evaluates standard CLI tables and replaces internal | paddings with dense comma-separated syntax. This successfully drops raw token per-line consumption by up to 40-80% depending on row width.
Increased Safeguard Windows: Due to significantly lower token burn per row, this modification raises the truncation limits max_lines safely from standard bounds up to 100 lines. This empowers LLM reasoning by providing 2.5x more rows of context for the same historical compute bandwidth.
Schema & Structured Log Resilience: Hardened tests using realistic, anonymized LTA/GA4 offline datasets confirm that multi-line JSON structures generated via REPEATED RECORDS smoothly translate effectively down into valid sparse rows.

Testing & Verification

Inline unit tests were expanded to comprehensively cover various complex schemas (JSON multi-line payload representations and massive clustered dataset partition listings). Real-world anonymized queries gathered from our local engineering team have validated our aggressive savings metric assumptions.

- Filters out noise like gcloud update warnings and job progress status - Implements max_lines=40 and truncate_lines_at=120 to guard against large payloads - Registers these filters in discover/rules.rs to track savings - Adjusts test suite counts to account for the new filters

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(filters): add TOML filters for bq query and bq show with CSV compression#896

feat(filters): add TOML filters for bq query and bq show with CSV compression#896
fkztw wants to merge 1 commit intortk-ai:developfrom
fkztw:feat/bq-toml-filter

fkztw commented Mar 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

fkztw commented Mar 28, 2026

Description

Testing & Verification

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant