Analytics Hub: a governed reference architecture for hosting LLM-generated dashboards in Snowflake

A thin Streamlit-in-Snowflake (SiS) shell that hosts pre-built HTML dashboards authored anywhere (Claude, v0, a notebook, by hand) and adds the four things those artifacts don't ship with on their own: role-based access, refresh on real data, audit logging, and a chat sidebar that only runs verified SQL.

The render core is ~260 lines of Python (the Cortex Analyst chat sidebar adds ~600 more). Each page is three files in a folder (four with an optional semantic view for the chat sidebar).

This repo is a personal POC, not a deployed product. It demonstrates a pattern; the demo data is synthetic; the brand ("Lumen AI") is fictional.

What's inside

apps/
  pipeline_health/     hero B2B SaaS pipeline & revenue board (with inline account drill-in)
  rep_performance/     rep scorecard: attainment & bookings trends, YTD/monthly/quarterly (RBAC demo)
shared/                render core (manifest loader, page runner) + chat sidebar
scripts/setup_demo.sql one-paste-and-run synthetic data seed
fixtures/*.json        pre-generated data so `make local` works offline

Each apps/<name>/ folder has the same four files:

File	Role
`manifest.yml`	title, sidebar order, iframe height, ordered column list, `allowed_roles`, `chip_count`
`page.html`	self-contained HTML/CSS/JS with a literal `__DATA__` placeholder
`query.sql`	Snowflake SQL whose result populates `__DATA__`
`semantic_view.sql` (optional)	`CREATE OR REPLACE SEMANTIC VIEW` for the Cortex Analyst chat sidebar

home.py discovers everything under apps/, registers each page via st.navigation(), runs the SQL, injects the JSON result into the HTML's __DATA__ token, and renders via streamlit.components.v1.html.

Try it offline (60 seconds, no Snowflake)

git clone <this repo>
cd analytics-hub

# any Python 3.11 environment with these will work; conda is below
conda env create -f environment.yml
conda activate analytics_hub

make local

A browser opens. Two pages render against pre-generated synthetic fixtures in fixtures/*.json. No Snowflake account required. The chat sidebar is disabled in this mode (it needs Cortex Analyst).

To simulate a different role and see the nav list change:

LOCAL_DEV_ROLE=LUMEN_SALES_LEADER make local   # won't see Rep Performance
LOCAL_DEV_ROLE=LUMEN_ANALYST      make local   # sees everything

Run the full Snowflake stack

This path lights up Cortex Analyst (the chat sidebar) and proves out the end-to-end deploy. Free 30-day Snowflake Enterprise trial works fine.

Prerequisites

Snowflake Enterprise account. Sign up at https://signup.snowflake.com. Pick AWS us-west-2 or AWS us-east-1; those regions have Cortex Analyst availability.
Snowflake CLI (≥ 3.4): pip install snowflake-cli.
Python 3.11 with conda or uv. The environment.yml is the SiS runtime contract, the same dep set the deployed app will resolve.

Steps

Seed the demo data. Open a Snowflake worksheet, paste scripts/setup_demo.sql, run it. Takes about 30 seconds. It creates:
- LUMEN_DEMO database + ANALYTICS_HUB schema
- ANALYTICS_HUB_WH warehouse (XSMALL, auto-suspend 10 min)
- 4 roles (LUMEN_SALES_LEADER, LUMEN_ANALYST, LUMEN_ENGINEER, LUMEN_CHAT_RUNNER)
- Seed dimension + fact tables (~4,300 deterministic rows)
- Two mart tables (one per dashboard)

Set up a Snow CLI connection.

# ~/.snowflake/connections.toml
[connections.LUMEN]
account       = "<your account locator>"
user          = "<your user>"
authenticator = "externalbrowser"
role          = "LUMEN_ENGINEER"
warehouse     = "ANALYTICS_HUB_WH"
database      = "LUMEN_DEMO"
schema        = "ANALYTICS_HUB"

Bootstrap the audit log + semantic views, then deploy.

snow sql -f scripts/setup_audit_log.sql -c LUMEN
make deploy ENV=dev
make url    ENV=dev    # prints the SiS app URL

Open the URL. Switch the active role via the Snowsight dropdown to see the nav list change. Click a chip in the chat sidebar to confirm a verified Cortex Analyst query executes against the semantic view.

Authoring a new page

The artifact you'd add to a PR is just the three (or four) files. The simplest path is to ask an LLM to produce them; see docs/AUTHORING_PROMPT.md for a prompt that gets you a starting layout. An HTML page built in a chat tool rarely arrives in this shape: expect to strip CDN <script> tags, point the JS at the injected RAW array instead of baked-in data, expand any SELECT *, and add the __DATA__ token. That reshaping is mechanical: AUTHORING_PROMPT.md's Prompt B does most of it, and scripts/validate_apps.py catches anything you miss before the PR. From there:

make scaffold name=my_page         # creates an empty skeleton
# drop the LLM's HTML/SQL/manifest into apps/my_page/
make sample   name=my_page         # runs the SQL, writes fixtures/my_page.json
make preview  name=my_page         # opens the page in a browser; instant feedback
make semantic ENV=dev              # deploys the CREATE SEMANTIC VIEW DDL
make local                         # full Streamlit shell w/ chat sidebar

Open a PR. CI lints, validates the file contract, dry-runs every query.sql against Snowflake (SELECT * FROM (...) LIMIT 0), deploys to a per-PR isolated Streamlit object, and comments the preview URL on the PR. The isolated app is dropped automatically when the PR closes.

Architecture notes (read these before extending)

`manifest.yml` is a contract

The column list in manifest.yml is the contract between query.sql and page.html. The runner reads rows back from Snowflake and hands the page a 2D array whose column order matches the manifest. The HTML never sees a column it didn't declare; the SQL never returns one without an owner.

scripts/validate_apps.py enforces this in CI: it rejects SELECT *, rejects <alias>.*, and demands the literal __DATA__ token in the HTML.

RBAC at the data layer, not the app layer

manifest.allowed_roles is a UX gate: it controls which pages a user sees in the nav. The actual security boundary is GRANT SELECT on the underlying mart table. If you forget the GRANT, the page just renders empty when a permitted role tries to view it; if you forget allowed_roles, the page is visible but the query fails. Both are ugly; only one leaks data.

One semantic view per app

Each app has its own CREATE OR REPLACE SEMANTIC VIEW DDL. Per-app means each dashboard owner can evolve their semantics without coordinating with every other dashboard owner; the cost is duplicated dimensions across views. That's the right trade-off for independent change velocity.

The ai_verified_queries block inside the DDL doubles as the suggestion chips on the chat sidebar. Those queries are reviewed in PR, executed exactly as written, and their result is the trusted answer. Free-form chat questions still work, but the chips are guaranteed correct, and they are the path of least resistance for stakeholders.

`<% … %>` templating in `semantic_view.sql`

Three placeholders are bound at deploy time:

<% database %>: LUMEN_DEMO
<% schema %>: ANALYTICS_HUB
<% running_role %>: LUMEN_CHAT_RUNNER (the role Cortex Analyst runs queries as)

This lets one DDL file deploy to dev, prod, or any per-PR pipeline.

`ensure_ascii=True` is load-bearing

shared/page_runner.py JSON-encodes the result set with ensure_ascii=True specifically so that any </script> substring inside a string column is escaped as <\/script>, not embedded literally. Without it, a single unlucky account name like "foo</script><script>alert(1)</script>" would break out of the embedded HTML.

Audit log

Every chat turn writes one row to CHAT_AUDIT_LOG: (ts, user, role, app_name, prompt, generated_sql, row_count, latency_ms, error). Bootstrap with scripts/setup_audit_log.sql. Logging failures are caught and swallowed, so the chat path never blocks on audit.

-- weekly chat usage
SELECT app_name, COUNT(*) AS turns, AVG(latency_ms) AS avg_ms,
       COUNTIF(error IS NOT NULL) / COUNT(*) AS error_rate
FROM LUMEN_DEMO.ANALYTICS_HUB.CHAT_AUDIT_LOG
WHERE ts >= DATEADD(day, -7, CURRENT_DATE)
GROUP BY 1;

CI: per-PR isolation, manual prod gate

.github/workflows/ci.yml:

Job	When	What
`lint`	every push / PR	yamllint + sqlfluff
`validate`	every push / PR	manifest schema, `__DATA__` token, no `SELECT *`, pytest
`deploy_dev`	PR open / sync	dry-runs every `query.sql`, deploys per-PR Streamlit object suffixed `_PR_<number>`, comments URL on PR
`smoke`	after `deploy_dev`	confirms the per-PR object is reachable
`deploy_prod`	push to `main`	deploys to canonical name; gated on `environment: production` reviewer approval

.github/workflows/cleanup.yml drops the per-PR Streamlit object when the PR closes.

CI credentials

CI authenticates as a Snowflake service user (key-pair, no password) that holds LUMEN_ENGINEER. setup_demo.sql grants that role the CREATE STREAMLIT / SEMANTIC VIEW / STAGE it needs to deploy. One-time setup:

# generate a key pair (private key -> GitHub secret, public key -> Snowflake)
openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out ci_key.p8 -nocrypt
openssl rsa -in ci_key.p8 -pubout -out ci_key.pub

USE ROLE ACCOUNTADMIN;
CREATE USER IF NOT EXISTS SVC_ANALYTICS_HUB_CI
  TYPE = SERVICE DEFAULT_ROLE = LUMEN_ENGINEER DEFAULT_WAREHOUSE = ANALYTICS_HUB_WH;
ALTER USER SVC_ANALYTICS_HUB_CI SET RSA_PUBLIC_KEY='<body of ci_key.pub, no header/footer>';
GRANT ROLE LUMEN_ENGINEER TO USER SVC_ANALYTICS_HUB_CI;

Then add the GitHub secrets (Settings → Secrets and variables → Actions):

Secret	Scope	Value
`SNOWFLAKE_ACCOUNT`	repository	your account locator
`SNOWFLAKE_USER`	repository	`SVC_ANALYTICS_HUB_CI`
`SNOWFLAKE_PRIVATE_KEY_DEV`	environment `dev`	full `ci_key.p8` contents
`SNOWFLAKE_PRIVATE_KEY_PROD`	environment `production`	same key (or a separate prod key)

Create the dev and production environments and add required reviewers to production — that approval is the manual prod gate.

Make targets

Target	What it does
`make local`	`LOCAL_DEV=1 streamlit run home.py`: runs against fixtures, no Snowflake
`make scaffold name=foo`	Create `apps/foo/{manifest.yml, page.html, query.sql}` from templates
`make sample name=foo`	Run `apps/foo/query.sql` via Snow CLI; write `fixtures/foo.json`
`make preview name=foo`	Open `apps/foo/page.html` in a browser with `fixtures/foo.json` injected
`make semantic ENV=dev\|prod`	Deploy each app's `CREATE SEMANTIC VIEW`
`make deploy ENV=dev\|prod`	Refresh semantic views, then `snow streamlit deploy …`
`make url ENV=dev\|prod`	Print the deployed SiS app URL
`make test`	`pytest`
`make clean`	Remove generated fixtures

How this fits next to BI tools

Modern BI suites (Looker with Gemini, Tableau Pulse, Power BI Copilot, ThoughtSpot Sage, Hex Magic) are converging on chat-with-your-data. If your team already lives inside one of those tools and authors there, you do not need this. The use case this reference architecture addresses is narrower:

You have artifacts authored outside a BI tool (in Claude, Cursor, a notebook, by hand) and you want to host them with governance and chat without rebuilding them in the BI tool.
You want the data path, RBAC, audit log, and semantic layer to live in your warehouse, not duplicated in a BI tool's modeling layer.
You don't want to pay per-seat for a BI tool just to host a few stakeholder dashboards.

This is complementary to a BI tool, not a replacement.

License

MIT. See LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
.github		.github
apps		apps
docs		docs
fixtures		fixtures
scripts		scripts
shared		shared
tests		tests
.gitignore		.gitignore
.sqlfluff		.sqlfluff
.sqlfluffignore		.sqlfluffignore
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
environment.yml		environment.yml
home.py		home.py
pyproject.toml		pyproject.toml
snowflake.yml		snowflake.yml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Analytics Hub: a governed reference architecture for hosting LLM-generated dashboards in Snowflake

What's inside

Try it offline (60 seconds, no Snowflake)

Run the full Snowflake stack

Prerequisites

Steps

Authoring a new page

Architecture notes (read these before extending)

`manifest.yml` is a contract

RBAC at the data layer, not the app layer

One semantic view per app

`<% … %>` templating in `semantic_view.sql`

`ensure_ascii=True` is load-bearing

Audit log

CI: per-PR isolation, manual prod gate

CI credentials

Make targets

How this fits next to BI tools

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Analytics Hub: a governed reference architecture for hosting LLM-generated dashboards in Snowflake

What's inside

Try it offline (60 seconds, no Snowflake)

Run the full Snowflake stack

Prerequisites

Steps

Authoring a new page

Architecture notes (read these before extending)

manifest.yml is a contract

RBAC at the data layer, not the app layer

One semantic view per app

<% … %> templating in semantic_view.sql

ensure_ascii=True is load-bearing

Audit log

CI: per-PR isolation, manual prod gate

CI credentials

Make targets

How this fits next to BI tools

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

`manifest.yml` is a contract

`<% … %>` templating in `semantic_view.sql`

`ensure_ascii=True` is load-bearing

Packages