Baywire

The live wire for Tampa Bay.

A unified guide to live music, festivals, food, and family fun across the Tampa Bay area: Tampa, St. Petersburg, Clearwater, Brandon, Bradenton, Safety Harbor, and Dunedin, plus an Other catch-all for edge cases. Listings are aggregated daily from multiple sources, deduplicated, and ranked for readability. Curated Places (beaches, venues, food, and similar) ship on /places via a separate discovery pipeline. The site lives at baywire.app.

Baywire runs 18 event adapters (src/lib/scrapers/index.ts). The daily matrix includes only sources that are enabled in the database and resolve through the adapter registry (scripts/ci/scrape-matrix.ts). Each job tries structured data first (JSON-LD, Tribe REST, Ticketmaster Discovery API, iCal) and falls back to OpenAI extraction when needed; venue rows are upserted into Places as events are processed. Taxonomy (config/taxonomy.json) drives tags, vibes, discovery verticals, and editorial re-classification — see ARCHITECTURE.md for the full ingestion design.
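
As a rough illustration of the structured-first strategy described above, the sketch below tries a JSON-LD extractor and calls a fallback only when no structured events are found. The `RawEvent` shape, the function names, and the regex-based parsing are assumptions made for this example. The real adapters live in src/lib/scrapers and may work differently (for instance, with cheerio and an asynchronous OpenAI call).

```typescript
// Hypothetical event shape for this sketch; not the project's actual type.
interface RawEvent {
  title: string;
  startsAt: string;
  url: string | null;
}

// An extractor returns events, or null when it finds nothing usable.
type Extractor = (payload: string) => RawEvent[] | null;

// Collect top-level schema.org Event objects from JSON-LD script blocks.
// Nested structures such as @graph are deliberately ignored here.
function extractJsonLd(html: string): RawEvent[] | null {
  const pattern =
    /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const events: RawEvent[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    let parsed: unknown;
    try {
      parsed = JSON.parse(match[1] ?? "");
    } catch {
      continue; // Malformed JSON-LD block: skip it.
    }
    const nodes: unknown[] = Array.isArray(parsed) ? parsed : [parsed];
    for (const node of nodes) {
      if (typeof node !== "object" || node === null) continue;
      const record = node as Record<string, unknown>;
      const name = record["name"];
      const startDate = record["startDate"];
      const url = record["url"];
      if (
        record["@type"] === "Event" &&
        typeof name === "string" &&
        typeof startDate === "string"
      ) {
        events.push({
          title: name,
          startsAt: startDate,
          url: typeof url === "string" ? url : null,
        });
      }
    }
  }
  return events.length > 0 ? events : null;
}

// Try each structured extractor in order; use the fallback only if all fail.
// The fallback is synchronous here to keep the sketch self-contained.
function extractStructuredFirst(
  payload: string,
  structured: Extractor[],
  llmFallback: Extractor,
): { events: RawEvent[]; usedLlm: boolean } {
  for (const extractor of structured) {
    const events = extractor(payload);
    if (events !== null) return { events, usedLlm: false };
  }
  return { events: llmFallback(payload) ?? [], usedLlm: true };
}
```

The `usedLlm` flag is included only so a caller could track how often the fallback is needed.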

Vercel hosts the read-only Next.js app — scrapes, discovery, and backfill run on GitHub Actions, not on Vercel.

  taxonomy.json ──publish──▶ DB taxonomy + backfill_jobs
                                    │
  GHA scrape (daily) ───────────────┼──▶ events / canonical_events
  GHA discover (weekly) ────────────┼──▶ places
  GHA backfill (every 6h) ──────────┘    (classify, sanitize, new profiles)
                                    │
                                    ▼
                         Prisma Postgres + Accelerate
                                    │
                                    ▼
                         Vercel Next.js (read-only)

Stack

Next.js 16 (App Router, React 19, RSC by default) + Serwist (@serwist/turbopack) for the PWA shell
proxy.ts — anonymous guest profile cookie bootstrap
Tailwind CSS v4 + custom coastal palette
Prisma ORM + Prisma Postgres + Prisma Accelerate; URL in prisma.config.ts for Prisma 7 CLI
OpenAI gpt-4.1-mini with Zod-typed structured outputs (OPENAI_BASE_URL for compatible proxies)
Google Places API (New) + Vercel Blob for place discovery imagery
Stytch for SMS sign-in (optional locally)
cheerio, p-limit, Playwright for browser/WAF adapters
GitHub Actions — scrape matrix, places discovery, taxonomy backfill, cleanup

Sources

| Slug | Site | Extraction path | Notes |
| --- | --- | --- | --- |
| `eventbrite` | eventbrite.com | JSON-LD | Geo-search across metro cities, 2 pages each |
| `ticketmaster` | ticketmaster.com/discover/tampa | Discovery API | DMA 635 (Tampa-St. Pete-Sarasota) |
| `visit_tampa_bay` | visittampabay.com/events | JSON-LD | Official tourism |
| `visit_st_pete_clearwater` | visitstpeteclearwater.com | JSON-LD | `/events` + `/events-festivals` |
| `tampa_gov` | tampa.gov/calendar | JSON-LD + ICS | City calendar |
| `ilovetheburg` | ilovetheburg.com | Tribe REST API | St. Pete blog |
| `thats_so_tampa` | thatssotampa.com | Tribe REST API | Tampa-side blog |
| `tampa_bay_times` | tampabay.com/things-to-do | HTML + LLM | Editorial weekend picks |
| `tampa_bay_markets` | tampabaymarkets.com | HTML + LLM | Farmers' markets |
| `safety_harbor` | cityofsafetyharbor.com | RSS + LLM | CivicPlus feed |
| `side_splitters` | sidesplitterscomedy.com | HTML + LLM | Comedy club |
| `dont_tell_comedy` | donttellcomedy.com | HTML + LLM | Pop-up comedy |
| `funny_bone_tampa` | tampa.funnybone.com | HTML + LLM | DataDome; optional cookie secret in CI |
| `straz_center` | strazcenter.org | HTML + LLM | Playwright / Incapsula |
| `tampa_theatre` | tampatheatre.org | HTML + LLM | Live events + detail pages |

Browser-powered (Playwright) sources: dunedin_gov, unation, feverup, straz_center, funny_bone_tampa, and visit_tampa_bay (listing page only); see ARCHITECTURE.md. dunedin_gov, unation, and feverup have no row in the table above.

Local setup

npm install
cp .env.example .env.local
# Required: DATABASE_URL, OPENAI_API_KEY
# Optional: TICKETMASTER_API_KEY, GOOGLE_MAPS_API_KEY, BLOB_READ_WRITE_TOKEN, STYTCH_*, CRON_SECRET

npm run db:migrate:dev    # or db:push for a quick schema sync
npm run ingestion:taxonomy-publish   # seed taxonomy tables from config/taxonomy.json

npm run ingestion:scrape            # full scrape (or: npm run ingestion:scrape -- eventbrite)
# Optional: INLINE_CLASSIFY=1 for synchronous editorial during scrape

npm run dev

Open http://localhost:3000.

Prisma Postgres

Create a database at console.prisma.io.
Set DATABASE_URL to the prisma+postgres://accelerate... URL.
npm run db:migrate:dev (or db:push in early dev).
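
A small pre-flight check can catch a missing or mis-typed variable before the first migration. The sketch below is an assumption about how such a check could look, not a script that ships with the repo: the `checkEnv` name is hypothetical, and the only facts taken from this README are the two required variables and the `prisma+postgres://` scheme.

```typescript
// Variables this README lists as required for local setup.
const REQUIRED_VARS = ["DATABASE_URL", "OPENAI_API_KEY"];

// Returns a list of human-readable problems; an empty list means the
// environment looks usable. Hypothetical helper, not part of the repo.
function checkEnv(env: Record<string, string | undefined>): string[] {
  const problems: string[] = [];
  for (const name of REQUIRED_VARS) {
    if (!env[name]) problems.push(`${name} is not set`);
  }
  const databaseUrl = env["DATABASE_URL"];
  if (databaseUrl && databaseUrl.indexOf("prisma+postgres://") !== 0) {
    problems.push("DATABASE_URL does not use the prisma+postgres:// scheme");
  }
  return problems;
}
```

In a Node script this would typically be called as `checkEnv(process.env)` after loading .env.local, exiting early if the returned list is not empty.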

Useful scripts

| Command | What it does |
| --- | --- |
| `npm run dev` | Next.js dev server |
| `npm run build` | Production build (`postinstall` runs `prisma generate`) |
| `npm run typecheck` | `tsc --noEmit` |
| `npm run lint` | ESLint |
| `npm run db:migrate:dev` | Create/apply a dev migration |
| `npm run db:migrate` | Apply migrations (production) |
| `npm run db:studio` | Prisma Studio |
| `npm run ingestion:scrape [-- <slug>]` | Event scrape (one source or all enabled) |
| `npm run ingestion:discover` | Google Places discovery (`--help`) |
| `npm run ingestion:refresh` | Re-verify existing discovery places |
| `npm run ingestion:backfill` | Drain backfill queue (`--limit`, `--kind`) |
| `npm run ingestion:taxonomy-publish` | Publish `config/taxonomy.json` → DB + enqueue diff jobs |
| `npm run ingestion:cleanup` | Delete stale events / places (`--skip-places`) |
| `npm run ingestion:matrix` | Emit GHA scrape matrix JSON (CI only) |
| `npm run ops:blob-purge` | Purge Vercel Blob prefix (`--execute` to delete) |

Taxonomy (quick reference)

Edit config/taxonomy.json — terms, aliases, discovery profiles, prompt bundles, rankingGuides. Bump version (and promptRevision when prompts change), then:

npm run ingestion:taxonomy-publish
npm run ingestion:backfill -- --limit 500

Async classify (default): scrape/discover enqueue classify_* jobs; backfill runs on a schedule.
INLINE_CLASSIFY=1: run editorial inline during scrape/discover/refresh (local debugging).
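
The two modes above amount to a single branch at the point where a scraped item needs editorial classification. The sketch below shows that branch under stated assumptions: the `ClassifyJob` shape, the `classify_event` kind used in the usage note, and the in-memory queue are hypothetical stand-ins, since this README only says that jobs are named `classify_*` and land in a backfill queue.

```typescript
// Hypothetical job shape; the real queue rows live in the database.
interface ClassifyJob {
  kind: string;
  targetId: string;
}

// INLINE_CLASSIFY=1 switches to synchronous editorial; anything else queues.
function shouldClassifyInline(env: Record<string, string | undefined>): boolean {
  return env["INLINE_CLASSIFY"] === "1";
}

// Either classify immediately or enqueue a job for the scheduled backfill.
function scheduleClassification(
  targetId: string,
  kind: string,
  env: Record<string, string | undefined>,
  queue: ClassifyJob[],
  classifyNow: (targetId: string) => void,
): "inline" | "queued" {
  if (shouldClassifyInline(env)) {
    classifyNow(targetId);
    return "inline";
  }
  queue.push({ kind, targetId });
  return "queued";
}
```

For example, `scheduleClassification("evt_1", "classify_event", process.env, queue, classify)` would push a job by default and call `classify` directly when `INLINE_CLASSIFY=1` is set.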

Details: ARCHITECTURE.md — Taxonomy and Classification fingerprints.

Deployment

Production splits Vercel (HTTP) from GitHub Actions (scheduled writes).

Vercel

Import repo; set DATABASE_URL (and optional CRON_SECRET, Stytch, Blob, Google keys).
No Vercel crons for scrapes — vercel.json is empty.

GitHub Actions

See .github/workflows/README.md for the full index.

| Workflow | Schedule (UTC) | Command |
| --- | --- | --- |
| ingestion-scrape.yml | Daily 12:00 | `ingestion:scrape` (matrix) |
| ingestion-discover.yml | Sun 08:00 | `ingestion:discover` |
| ingestion-backfill.yml | Every 6h | `ingestion:backfill` |
| ingestion-taxonomy-publish.yml | Push `config/taxonomy.json` → `main` | `ingestion:taxonomy-publish` |
| ingestion-cleanup.yml | Sun 09:00 | `ingestion:cleanup` |
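
The scrape workflow fans out as a matrix, and this README states that the matrix only includes sources that are enabled in the database and resolve through the adapter registry. The sketch below illustrates that filter. The `SourceRow` shape and the `{ include: [{ source }] }` output are assumptions for the example; the real output of scripts/ci/scrape-matrix.ts may be shaped differently.

```typescript
// Hypothetical view of a source row; real column names may differ.
interface SourceRow {
  slug: string;
  enabled: boolean;
}

// GitHub Actions accepts a matrix object with an `include` list of entries.
interface ScrapeMatrix {
  include: { source: string }[];
}

// Keep only sources that are enabled AND known to the adapter registry.
function buildScrapeMatrix(
  rows: SourceRow[],
  registeredSlugs: ReadonlySet<string>,
): ScrapeMatrix {
  const include = rows
    .filter((row) => row.enabled && registeredSlugs.has(row.slug))
    .map((row) => ({ source: row.slug }))
    .sort((a, b) => a.source.localeCompare(b.source));
  return { include };
}
```

Printing `JSON.stringify(buildScrapeMatrix(rows, registry))` as a job output would let a later job consume it with `fromJSON` in `strategy.matrix`. Sorting is only there to keep job order stable between runs.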

Scrape secrets: DATABASE_URL, OPENAI_API_KEY; optional OPENAI_BASE_URL, OPENAI_EXTRACT_MODEL, TICKETMASTER_API_KEY, FUNNYBONE_SCRAPE_COOKIE.

Discover secrets: add GOOGLE_MAPS_API_KEY, BLOB_READ_WRITE_TOKEN.

Backfill secrets: DATABASE_URL, OPENAI_API_KEY (for classify jobs).

Manual scrape trigger

Run the scrape workflow via workflow_dispatch, optionally passing a source slug.
POST /api/cron/scrape with Authorization: Bearer $CRON_SECRET; the endpoint returns 202 and continues the scrape in the background (via after).
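
For the HTTP trigger, a minimal sketch of building the request might look like the following. The `buildScrapeTrigger` helper is hypothetical. Only the path, the method, and the bearer header come from this README, and no request body or query parameters are assumed.

```typescript
// Shape accepted by fetch(url, init); kept as plain data so it is easy to test.
interface ScrapeTrigger {
  url: string;
  init: { method: "POST"; headers: Record<string, string> };
}

// Build the POST request for the cron endpoint. Hypothetical helper.
function buildScrapeTrigger(baseUrl: string, cronSecret: string): ScrapeTrigger {
  if (cronSecret.length === 0) {
    throw new Error("CRON_SECRET must not be empty");
  }
  // Drop trailing slashes so the path is not doubled.
  const origin = baseUrl.replace(/\/+$/, "");
  return {
    url: `${origin}/api/cron/scrape`,
    init: {
      method: "POST",
      headers: { Authorization: `Bearer ${cronSecret}` },
    },
  };
}

// Usage (network call, not executed here):
//   const { url, init } = buildScrapeTrigger("https://baywire.app", secret);
//   const response = await fetch(url, init);
//   // A 202 status means the scrape was accepted and runs in the background.
```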

Project layout

ARCHITECTURE.md           Ingestion, taxonomy, pipelines
config/taxonomy.json      Taxonomy draft (publish to DB)
proxy.ts                  Guest profile cookie bootstrap
src/
  app/                    Next.js routes, UI, metrics, cron API
  ingestion/              Taxonomy, backfill queue, pipeline entrypoints, adapters
    taxonomy/             Snapshot, publish, diff, sanitize, validate
    queue/                enqueue + process backfill jobs
    pipelines/            events/scrape, places/discover|refresh, maintenance
    adapters/             resolveAdapter (tribe, jsonld, custom, …)
    kernel/               classification fingerprint
  lib/
    pipeline/             Scrape, canonicalize, editorial orchestration
    scrapers/             Per-source adapters
    extract/              OpenAI extraction + editorial
    db/                   Prisma client + queries (+ queriesTaxonomy)
    places/               Google Places + discovery helpers
prisma/schema.prisma      baywire schema
.github/workflows/        Scheduled ingestion — see .github/workflows/README.md
.github/actions/          Composite steps (install, playwright-chromium)
scripts/                  CLI entrypoints — see scripts/README.md
  ingestion/              scrape, discover, refresh, backfill, taxonomy-publish
  maintenance/            cleanup-expired
  ci/                     scrape-matrix (GHA)
  ops/                    blob utilities
  _lib/                   shared runCli + arg helpers

Cost & rate posture

Per-host pacing ~1 req / 1.1s; extraction concurrency 4.
Structured-first adapters avoid LLM calls when JSON-LD/ICS/API succeeds.
Content-hash skips re-extraction when upstream payload unchanged.
Classification fingerprint skips editorial only when content + taxonomy version + prompt revision match — bumping taxonomy version triggers re-classify via backfill (not a full re-scrape).
Reduced HTML capped at 16k chars before gpt-4.1-mini.
Read helpers use Accelerate cacheStrategy where noted in query modules.
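
The two skip rules above (content hash and classification fingerprint) can be sketched as follows. This is an illustration under assumptions, not the project's kernel code: SHA-256, the JSON-array encoding of the fingerprint inputs, and all function names are choices made for this example.

```typescript
import { createHash } from "node:crypto";

// Hash algorithm is an assumption; the README only says "content-hash".
function sha256Hex(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Re-extract only when the upstream payload changed since the stored hash.
function needsExtraction(storedContentHash: string | null, payload: string): boolean {
  return storedContentHash !== sha256Hex(payload);
}

// The three inputs the README names for the classification fingerprint.
interface FingerprintInput {
  contentHash: string;
  taxonomyVersion: string;
  promptRevision: string;
}

// JSON-encode the parts so that field boundaries cannot be confused.
function classificationFingerprint(input: FingerprintInput): string {
  return sha256Hex(
    JSON.stringify([input.contentHash, input.taxonomyVersion, input.promptRevision]),
  );
}

// Editorial is skipped only when all three inputs match the stored fingerprint.
function needsClassification(
  storedFingerprint: string | null,
  input: FingerprintInput,
): boolean {
  return storedFingerprint !== classificationFingerprint(input);
}
```

With this split, bumping the taxonomy version changes the fingerprint but not the content hash, so an item is re-classified without being re-extracted, which matches the backfill behaviour described above.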

Attribution & ToS

This project respects each source's robots.txt and only fetches public listing pages. Event cards link to originals; the footer lists enabled sources from the database. For removal requests, disable the adapter or open an issue.
