The live wire for Tampa Bay.
A unified guide to live music, festivals, food, and family fun across the Tampa Bay area — Tampa, St. Petersburg, Clearwater, Brandon, Bradenton, Safety Harbor, Dunedin, and an Other catch-all for edge cases. Listings are aggregated daily from multiple sources, deduplicated, and ranked for readability. Curated Places (beaches, venues, food, and similar) ship on /places via a separate discovery pipeline. Lives at baywire.app.
Baywire runs 18 event adapters (src/lib/scrapers/index.ts). The daily matrix includes only sources that are enabled in the database and resolve through the adapter registry (scripts/ci/scrape-matrix.ts). Each job tries structured data first (JSON-LD, Tribe REST, Ticketmaster Discovery API, iCal) and falls back to OpenAI extraction when needed; venue rows are upserted into Places as events are processed. Taxonomy (config/taxonomy.json) drives tags, vibes, discovery verticals, and editorial re-classification — see ARCHITECTURE.md for the full ingestion design.
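
The structured-first flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's adapter code: `StructuredEvent`, `ExtractionResult`, and `extractStructuredFirst` are hypothetical names, and only the JSON-LD path is shown. When no usable `Event` node is found, the result simply signals that the caller should fall back to LLM extraction.

```typescript
// Hypothetical shapes; the project's real event type is not shown in this README.
interface StructuredEvent {
  name: string;
  startDate: string;
  url?: string;
}

interface ExtractionResult {
  path: "jsonld" | "llm-fallback";
  events: StructuredEvent[];
}

const LD_JSON_SCRIPT =
  /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;

/** Flatten a parsed JSON-LD payload (object, array, or @graph) into plain nodes. */
function flattenNodes(payload: unknown): Record<string, unknown>[] {
  if (Array.isArray(payload)) return payload.flatMap(flattenNodes);
  if (payload === null || typeof payload !== "object") return [];
  const node = payload as Record<string, unknown>;
  const graph = node["@graph"];
  return Array.isArray(graph) ? [node, ...graph.flatMap(flattenNodes)] : [node];
}

/** True for schema.org types such as "Event", "MusicEvent", or "ComedyEvent". */
function isEventType(type: unknown): boolean {
  const types = Array.isArray(type) ? type : [type];
  return types.some((t) => typeof t === "string" && t.endsWith("Event"));
}

function extractStructuredFirst(html: string): ExtractionResult {
  const events: StructuredEvent[] = [];
  for (const match of html.matchAll(LD_JSON_SCRIPT)) {
    let payload: unknown;
    try {
      payload = JSON.parse(match[1] ?? "");
    } catch {
      continue; // Malformed block: ignore it and keep scanning.
    }
    for (const node of flattenNodes(payload)) {
      if (!isEventType(node["@type"])) continue;
      const { name, startDate, url } = node;
      if (typeof name !== "string" || typeof startDate !== "string") continue;
      events.push({ name, startDate, ...(typeof url === "string" ? { url } : {}) });
    }
  }
  // No structured events found: the caller would hand the page to LLM extraction.
  return events.length > 0
    ? { path: "jsonld", events }
    : { path: "llm-fallback", events: [] };
}
```

Returning the chosen path alongside the events keeps the fallback decision explicit, so a caller can count how often each source actually needs an LLM call.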
Vercel hosts the read-only Next.js app — scrapes, discovery, and backfill run on GitHub Actions, not on Vercel.

    taxonomy.json ──publish──▶ DB taxonomy + backfill_jobs
                                      │
    GHA scrape (daily) ───────────────┼──▶ events / canonical_events
    GHA discover (weekly) ────────────┼──▶ places
    GHA backfill (every 6h) ──────────┘    (classify, sanitize, new profiles)
                                      │
                                      ▼
                         Prisma Postgres + Accelerate
                                      │
                                      ▼
                          Vercel Next.js (read-only)

- Next.js 16 (App Router, React 19, RSC by default) + Serwist (`@serwist/turbopack`) for the PWA shell
- `proxy.ts`: anonymous guest profile cookie bootstrap
- Tailwind CSS v4 + custom coastal palette
- Prisma ORM + Prisma Postgres + Prisma Accelerate; URL in `prisma.config.ts` for Prisma 7 CLI
- OpenAI `gpt-4.1-mini` with Zod-typed structured outputs (`OPENAI_BASE_URL` for compatible proxies)
- Google Places API (New) + Vercel Blob for place discovery imagery
- Stytch for SMS sign-in (optional locally)
- `cheerio`, `p-limit`, Playwright for browser/WAF adapters
- GitHub Actions: scrape matrix, places discovery, taxonomy backfill, cleanup

| Slug | Site | Path | Notes |
|---|---|---|---|
| `eventbrite` | eventbrite.com | JSON-LD | Geo-search across metro cities, 2 pages each |
| `ticketmaster` | ticketmaster.com/discover/tampa | Discovery API | DMA 635 (Tampa-St. Pete-Sarasota) |
| `visit_tampa_bay` | visittampabay.com/events | JSON-LD | Official tourism |
| `visit_st_pete_clearwater` | visitstpeteclearwater.com | JSON-LD | /events + /events-festivals |
| `tampa_gov` | tampa.gov/calendar | JSON-LD + ICS | City calendar |
| `ilovetheburg` | ilovetheburg.com | Tribe REST API | St. Pete blog |
| `thats_so_tampa` | thatssotampa.com | Tribe REST API | Tampa-side blog |
| `tampa_bay_times` | tampabay.com/things-to-do | HTML + LLM | Editorial weekend picks |
| `tampa_bay_markets` | tampabaymarkets.com | HTML + LLM | Farmers' markets |
| `safety_harbor` | cityofsafetyharbor.com | RSS + LLM | CivicPlus feed |
| `side_splitters` | sidesplitterscomedy.com | HTML + LLM | Comedy club |
| `dont_tell_comedy` | donttellcomedy.com | HTML + LLM | Pop-up comedy |
| `funny_bone_tampa` | tampa.funnybone.com | HTML + LLM | DataDome; optional cookie secret in CI |
| `straz_center` | strazcenter.org | HTML + LLM | Playwright / Incapsula |
| `tampa_theatre` | tampatheatre.org | HTML + LLM | Live events + detail pages |

Browser-powered sources: dunedin_gov, unation, feverup, straz_center, funny_bone_tampa, visit_tampa_bay (listing) — see ARCHITECTURE.md.
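
As an illustration of how a daily matrix could be derived from source slugs like these, the sketch below keeps only sources that are both enabled and known to an adapter registry. It is an assumption about the shape of the logic, not the contents of `scripts/ci/scrape-matrix.ts`: `SourceRow`, `buildScrapeMatrix`, and the `source` matrix key are hypothetical. The `include` wrapper is the form GitHub Actions accepts for `strategy.matrix` when fed through `fromJSON`.

```typescript
// Hypothetical row shape for a source record stored in the database.
interface SourceRow {
  slug: string;
  enabled: boolean;
}

// Shape accepted by GitHub Actions `strategy.matrix` via `fromJSON(...)`.
interface ScrapeMatrix {
  include: { source: string }[];
}

/**
 * Keep only sources that are enabled AND resolve to a registered adapter,
 * sorted by slug so the emitted JSON is stable between runs.
 */
function buildScrapeMatrix(
  rows: SourceRow[],
  registeredSlugs: ReadonlySet<string>,
): ScrapeMatrix {
  const include = rows
    .filter((row) => row.enabled && registeredSlugs.has(row.slug))
    .map((row) => ({ source: row.slug }))
    .sort((a, b) => a.source.localeCompare(b.source));
  return { include };
}
```

Passing the result through `JSON.stringify` yields a single line that a workflow step could write to its job outputs; a disabled row or a slug with no adapter never produces a job.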

    npm install
    cp .env.example .env.local
    # Required: DATABASE_URL, OPENAI_API_KEY
    # Optional: TICKETMASTER_API_KEY, GOOGLE_MAPS_API_KEY, BLOB_READ_WRITE_TOKEN, STYTCH_*, CRON_SECRET
    npm run db:migrate:dev               # or db:push for a quick schema sync
    npm run ingestion:taxonomy-publish   # seed taxonomy tables from config/taxonomy.json
    npm run ingestion:scrape             # full scrape (or: npm run ingestion:scrape -- eventbrite)
    # Optional: INLINE_CLASSIFY=1 for synchronous editorial during scrape
    npm run dev

Open http://localhost:3000.

- Create a database at console.prisma.io.
- Set `DATABASE_URL` to the `prisma+postgres://accelerate...` URL.
- Run `npm run db:migrate:dev` (or `db:push` in early dev).

| Command | What it does |
|---|---|
| `npm run dev` | Next.js dev server |
| `npm run build` | Production build (postinstall runs `prisma generate`) |
| `npm run typecheck` | `tsc --noEmit` |
| `npm run lint` | ESLint |
| `npm run db:migrate:dev` | Create/apply a dev migration |
| `npm run db:migrate` | Apply migrations (production) |
| `npm run db:studio` | Prisma Studio |
| `npm run ingestion:scrape [-- <slug>]` | Event scrape (one source or all enabled) |
| `npm run ingestion:discover` | Google Places discovery (`--help`) |
| `npm run ingestion:refresh` | Re-verify existing discovery places |
| `npm run ingestion:backfill` | Drain backfill queue (`--limit`, `--kind`) |
| `npm run ingestion:taxonomy-publish` | Publish `config/taxonomy.json` → DB + enqueue diff jobs |
| `npm run ingestion:cleanup` | Delete stale events / places (`--skip-places`) |
| `npm run ingestion:matrix` | Emit GHA scrape matrix JSON (CI only) |
| `npm run ops:blob-purge` | Purge Vercel Blob prefix (`--execute` to delete) |

Edit config/taxonomy.json — terms, aliases, discovery profiles, prompt bundles, rankingGuides. Bump version (and promptRevision when prompts change), then:

    npm run ingestion:taxonomy-publish
    npm run ingestion:backfill -- --limit 500

- Async classify (default): scrape/discover enqueue `classify_*` jobs; backfill runs on a schedule.
- `INLINE_CLASSIFY=1`: run editorial inline during scrape/discover/refresh (local debugging).

Details: ARCHITECTURE.md — Taxonomy and Classification fingerprints.
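
To make the effect of a version bump concrete, here is a small sketch of a classification fingerprint. It assumes the fingerprint combines a content hash with the taxonomy `version` and `promptRevision`, as this README describes; the names `FingerprintInput`, `classificationFingerprint`, and `needsReclassify` are hypothetical, both version fields are treated as strings for simplicity, and the plain string join stands in for whatever encoding the real kernel uses.

```typescript
// Hypothetical input shape; field types are assumptions made for this sketch.
interface FingerprintInput {
  contentHash: string; // hash of the scraped event or place content
  taxonomyVersion: string; // `version` from config/taxonomy.json
  promptRevision: string; // `promptRevision` from config/taxonomy.json
}

/** Combine the three inputs into one comparable string (stand-in encoding). */
function classificationFingerprint(input: FingerprintInput): string {
  return [input.contentHash, input.taxonomyVersion, input.promptRevision].join("|");
}

/**
 * Editorial classification can be skipped only when the stored fingerprint
 * matches the current one, i.e. content, taxonomy version, and prompt
 * revision are all unchanged.
 */
function needsReclassify(stored: string | null, current: FingerprintInput): boolean {
  return stored !== classificationFingerprint(current);
}
```

Under this scheme a record whose content is unchanged still gets re-classified after a taxonomy or prompt bump, because the fingerprint no longer matches, while the scrape itself does not need to run again.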
Production splits Vercel (HTTP) from GitHub Actions (scheduled writes).
- Import repo; set `DATABASE_URL` (and optional `CRON_SECRET`, Stytch, Blob, Google keys).
- No Vercel crons for scrapes: `vercel.json` is empty.

See .github/workflows/README.md for the full index.
| Workflow | Schedule (UTC) | Command |
|---|---|---|
| ingestion-scrape.yml | Daily 12:00 | ingestion:scrape (matrix) |
| ingestion-discover.yml | Sun 08:00 | ingestion:discover |
| ingestion-backfill.yml | Every 6h | ingestion:backfill |
| ingestion-taxonomy-publish.yml | Push config/taxonomy.json → main | ingestion:taxonomy-publish |
| ingestion-cleanup.yml | Sun 09:00 | ingestion:cleanup |
Scrape secrets: DATABASE_URL, OPENAI_API_KEY; optional OPENAI_BASE_URL, OPENAI_EXTRACT_MODEL, TICKETMASTER_API_KEY, FUNNYBONE_SCRAPE_COOKIE.
Discover secrets: add GOOGLE_MAPS_API_KEY, BLOB_READ_WRITE_TOKEN.
Backfill secrets: DATABASE_URL, OPENAI_API_KEY (for classify jobs).
- `workflow_dispatch` on scrape workflow with optional `source` slug.
- `POST /api/cron/scrape` with `Authorization: Bearer $CRON_SECRET` (202 + background `after`).

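
The bearer check behind a manual `POST /api/cron/scrape` might look something like the sketch below. It is only an illustration under stated assumptions: `cronScrapeStatus` and `safeEqual` are hypothetical helpers, the real route handler is not shown in this README, and rejecting every request when `CRON_SECRET` is unset is a guess at a safe default rather than documented behavior.

```typescript
/** Compare two strings without exiting early on the first differing character. */
function safeEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}

/**
 * Decide the HTTP status for a manual scrape trigger.
 * 202: accepted, the scrape would continue in the background.
 * 401: missing or wrong bearer token, or no secret configured (assumption).
 */
function cronScrapeStatus(
  authorization: string | null,
  cronSecret: string | undefined,
): 202 | 401 {
  if (!cronSecret) return 401;
  const prefix = "Bearer ";
  if (authorization === null || !authorization.startsWith(prefix)) return 401;
  return safeEqual(authorization.slice(prefix.length), cronSecret) ? 202 : 401;
}
```

In a Next.js route handler, a 202 result would presumably be followed by scheduling the scrape with `after()` from `next/server`, so the response returns before the work finishes.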

    ARCHITECTURE.md          Ingestion, taxonomy, pipelines
    config/taxonomy.json     Taxonomy draft (publish to DB)
    proxy.ts                 Guest profile cookie bootstrap
    src/
      app/                   Next.js routes, UI, metrics, cron API
      ingestion/             Taxonomy, backfill queue, pipeline entrypoints, adapters
        taxonomy/            Snapshot, publish, diff, sanitize, validate
        queue/               enqueue + process backfill jobs
        pipelines/           events/scrape, places/discover|refresh, maintenance
        adapters/            resolveAdapter (tribe, jsonld, custom, …)
        kernel/              classification fingerprint
      lib/
        pipeline/            Scrape, canonicalize, editorial orchestration
        scrapers/            Per-source adapters
        extract/             OpenAI extraction + editorial
        db/                  Prisma client + queries (+ queriesTaxonomy)
        places/              Google Places + discovery helpers
    prisma/schema.prisma     baywire schema
    .github/workflows/       Scheduled ingestion; see .github/workflows/README.md
    .github/actions/         Composite steps (install, playwright-chromium)
    scripts/                 CLI entrypoints; see scripts/README.md
      ingestion/             scrape, discover, refresh, backfill, taxonomy-publish
      maintenance/           cleanup-expired
      ci/                    scrape-matrix (GHA)
      ops/                   blob utilities
      _lib/                  shared runCli + arg helpers

- Per-host pacing ~1 req / 1.1s; extraction concurrency 4.
- Structured-first adapters avoid LLM calls when JSON-LD/ICS/API succeeds.
- Content-hash skips re-extraction when upstream payload unchanged.
- Classification fingerprint skips editorial only when content + taxonomy version + prompt revision match — bumping taxonomy version triggers re-classify via backfill (not a full re-scrape).
- Reduced HTML capped at 16k chars before `gpt-4.1-mini`.
- Read helpers use Accelerate `cacheStrategy` where noted in query modules.

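
The per-host pacing figure in the list above (about one request every 1.1 s) can be illustrated with a small scheduler. This is a sketch under assumptions, not the project's fetch layer: `HostPacer` and `reserve` are hypothetical names, and the caller is expected to sleep for the returned number of milliseconds before issuing the request.

```typescript
/** Hands out request slots per host, spaced at least `minGapMs` apart. */
class HostPacer {
  private readonly nextAllowedAt = new Map<string, number>();
  private readonly minGapMs: number;

  constructor(minGapMs = 1100) {
    this.minGapMs = minGapMs;
  }

  /**
   * Reserve the next slot for `host` and return how many milliseconds the
   * caller should wait before sending the request. `nowMs` is passed in so
   * the scheduling logic stays deterministic and easy to test.
   */
  reserve(host: string, nowMs: number): number {
    const slot = Math.max(nowMs, this.nextAllowedAt.get(host) ?? 0);
    this.nextAllowedAt.set(host, slot + this.minGapMs);
    return slot - nowMs;
  }
}
```

Because slots are reserved up front, several concurrent tasks hitting the same host queue up behind each other while requests to different hosts proceed immediately. The extraction concurrency of 4 is a separate limit; with `p-limit` it would typically be expressed as `pLimit(4)`.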
This project respects each source's robots.txt and only fetches public listing pages. Event cards link to originals; the footer lists enabled sources from the database. For removal requests, disable the adapter or open an issue.