Crate Web v0.1.0: Full workspace with AI research, OpenUI, and team features #2
tmoody1973 wants to merge 510 commits into
End-to-end tour generation flow per v1-scope.md. Turns the helpers from chunks 1-3a into a working pipeline: prompt → classify → embed → wiki memory → Perplexity → verify citations → arc + moderate + redact → persist. Real-time phase streaming via tourStatus table.
New files:
- convex/recommend/voyageEmbed.ts: Voyage-3 embedding wrapper. 1024-dim enforcement, named errors (Timeout, RateLimit, Unavailable, Dimension, Empty), fetch-based (no SDK dep). Retries on 5xx/rate-limit, does NOT retry on dimension mismatch (config problem).
- convex/recommend/slug.ts: artist-name slug + random-hash composition per v1-scope.md Key Decision #2. Preserves lowercase artist names (billy woods), strips punctuation (clipping., $uicideboy$), falls back to "tour" prefix for unslug-able names. 4-char hash default; caller retries with 8-char on collision.
- convex/recommend/mutations.ts: state-transition mutations for the tour lifecycle. createInitialTour, writeTourStatus, setPromptEmbedding, setIntentClassification, markVague, finalizeTour, markFailed, logTourEvent. Plus public queries getTourBySlug (zero-login /r/[slug] read) and getTourStatus (useQuery subscription for phase streaming).
- convex/recommend/wikiMemory.ts: interface-only for now. Returns empty keep/pass arrays until the keep/pass/save UI ships (chunk 6) and populates wikiPages.tourHistory. Main action already wires this correctly; schema extension + backfill lands with the UI.
- convex/recommend/index.ts: the core action pair.
* generateTour (public): auth + rate-limit + createInitialTour + schedule runGenerationFlow. Returns { tourId, slug } fast so the client can navigate to the loading page.
* runGenerationFlow (internal): orchestrates all 8 phases with tourStatus writes, 45s timeout race per Issue 2.5, phase-duration instrumentation, PII-hashed tourEvents log.
Every helper failure is caught and falls back per the Section 2 error map:
classifier → default mood_theme
voyage → skip cache (proceed with empty embedding)
arc → deterministic code-based fallback
moderation → fail-closed (stay private, cron retries)
redaction → truncate-to-50-chars fallback
Parallel execution of arc + moderation + redaction in Phase 6 saves ~1.2s p95 per the eng review Section 7 budget.
Vague short-circuit: if classifier returns intent_type=vague, tour is marked pending and the action returns. UI chunk (5-6) shows the 4-chip clarifying card.
Tests: 194/194 green (+41 new):
- voyageEmbed.test.ts: 10 tests covering happy path, dim mismatch, empty vector, empty text, rate-limit, 5xx retry, dimension-mismatch no-retry
- slug.test.ts: 17 tests covering artist name edge cases (billy woods, clipping., $uicideboy$, MGMT, !!!), hash properties, collision probability
- recommend-mutations.test.ts: 14 tests via convex-test covering the full lifecycle: createInitialTour → writeTourStatus → finalizeTour (approved + flagged paths) → getTourBySlug (public + private + unknown) → getTourStatus (latest row selection) → logTourEvent + markFailed + markVague + setPromptEmbedding + setIntentClassification
TypeScript clean. Convex codegen succeeds.
NOT in this chunk:
- Vercel /api/recommend/generate proxy route (chunk 4)
- Client phase-streaming hook + loading UI component (chunk 4)
- Public pages /recommend, /r, /r/[slug] (chunk 5)
- Tour artifact, chips, keep/pass/save buttons (chunk 6)
- YouTube play + Auth0 seeds (chunk 7)
- Admin UI + TTL cleanup crons + moderation retry cron (chunk 8)
- E2E tests (chunk 9)
Flagged for PR review: `wikiMemory.getWikiMemoryForIntent` returns empty arrays. Landing together with the keep/pass UI in chunk 6, NOT as a separate migration. Explicit TODO in the file.
Requires VOYAGE_API_KEY in Vercel env vars before real /recommend generation works in prod. Tests mock Voyage so CI runs without the key.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
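The slug rules listed for convex/recommend/slug.ts can be sketched roughly as follows. This is an illustration only, not the file's actual code: the function names, the hyphen separator, and the hash alphabet are assumptions. Only the listed behaviors (lowercase names preserved, punctuation stripped, "tour" fallback, 4-char hash with an 8-char retry) come from the commit message.

```typescript
// Hypothetical sketch of slug composition; names and separator are assumed.
const HASH_ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789";

// Lowercase the name, drop punctuation, and join words with hyphens.
// Names that reduce to nothing (e.g. "!!!") fall back to the "tour" prefix.
export function slugifyArtistName(name: string): string {
  const slug = name
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, "")
    .trim()
    .replace(/[\s-]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return slug.length > 0 ? slug : "tour";
}

// Random suffix; `rand` is injectable so tests can make it deterministic.
export function randomHash(length = 4, rand: () => number = Math.random): string {
  let out = "";
  for (let i = 0; i < length; i++) {
    out += HASH_ALPHABET.charAt(Math.floor(rand() * HASH_ALPHABET.length));
  }
  return out;
}

// Default is a 4-char hash; a caller that hits a collision would retry with 8.
export function composeSlug(artist: string, hashLength = 4): string {
  return `${slugifyArtistName(artist)}-${randomHash(hashLength)}`;
}
```

A caller that detects a collision on insert would call `composeSlug(artist, 8)` for the retry, which matches the 4-then-8 policy described above.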
…hunk 4)
Closes the gap between chunk 3b's Convex action and the browser. Two
small additions:
src/app/api/recommend/generate/route.ts (~100 LOC)
Thin Vercel proxy per v1-scope.md Issue 1.1 architecture lock.
Reads Clerk auth, gets the Convex JWT via getToken({template: "convex"})
(template already configured in convex/auth.config.ts), calls the
generateTour action via ConvexHttpClient.setAuth. Returns { tourId, slug }
so the client can navigate to a loading page and subscribe to tourStatus.
Maps Convex action errors to specific HTTP codes:
- "Daily tour limit reached" → 429
- "User not found" → 404
- "Not authenticated" → 401
- everything else → 500 with friendly message
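The error mapping above might look something like this. It is a sketch under assumptions: matching by substring, the `toHttpError` name, and the wording of the 500 message are illustrative and not taken from the route handler.

```typescript
// Hypothetical mapping from Convex action error text to an HTTP response.
interface HttpError {
  status: number;
  message: string;
}

const ERROR_MAP: ReadonlyArray<{ needle: string; status: number }> = [
  { needle: "Daily tour limit reached", status: 429 },
  { needle: "User not found", status: 404 },
  { needle: "Not authenticated", status: 401 },
];

export function toHttpError(err: unknown): HttpError {
  const raw = err instanceof Error ? err.message : String(err);
  for (const { needle, status } of ERROR_MAP) {
    // Substring match is an assumption: wrapped errors often carry a prefix.
    if (raw.includes(needle)) return { status, message: needle };
  }
  // Anything unrecognized becomes a 500 with a generic, user-safe message.
  return { status: 500, message: "Something went wrong generating your tour." };
}
```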
src/lib/recommend-hooks.ts (~100 LOC)
Two React hooks:
- useTourStatus(tourId): subscribes to the latest tourStatus row via
Convex useQuery. Re-renders automatically as the action writes
phase updates (real-time streaming UX per CEO review Issue 1.2).
Returns { phase, progress, detail, isComplete }. isComplete is
derived from terminal phase names (done | done_vague | failed |
timed_out | flagged).
- useTourGeneration(): state machine wrapper around POST /api/recommend
/generate. Returns { state: "idle"|"submitting"|"submitted"|"error",
tourId, slug, error }. Component (chunk 5-6) consumes this.
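The `isComplete` derivation in useTourStatus can be illustrated with a small pure function. The row shape and the `deriveTourStatus` name are assumptions; the terminal phase names are the ones listed above.

```typescript
// Terminal phases as listed in the commit message.
const TERMINAL_PHASES = ["done", "done_vague", "failed", "timed_out", "flagged"] as const;
type TerminalPhase = (typeof TERMINAL_PHASES)[number];

export function isTerminalPhase(phase: string | undefined): phase is TerminalPhase {
  return phase !== undefined && (TERMINAL_PHASES as readonly string[]).includes(phase);
}

// Assumed shape of the latest tourStatus row returned by the query.
interface TourStatusRow {
  phase: string;
  progress: number;
  detail?: string;
}

// Pure derivation a hook could apply to the query result.
// null/undefined stands for "still loading" or "no row written yet".
export function deriveTourStatus(row: TourStatusRow | null | undefined) {
  return {
    phase: row?.phase,
    progress: row?.progress ?? 0,
    detail: row?.detail,
    isComplete: isTerminalPhase(row?.phase),
  };
}
```

Keeping the derivation outside the hook makes it testable without @testing-library/react.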
No new tests in this chunk — both files are thin glue. Route handler
logic is integration-testable via chunk 9 E2E, and the React hook
requires @testing-library/react which we'll install for component tests
in chunks 5-6.
TypeScript clean. 194/194 tests still green (no regressions).
After this chunk, the backend is fully callable from the browser:
POST /api/recommend/generate { prompt } → { tourId, slug }
Client subscribes via useTourStatus(tourId) → sees live phase updates
What's missing: the UI to render the prompt box, loading screen, and tour
artifact. That's chunks 5-6.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- /recommend: prompt entry with inline loading state, subscribes to tourStatus via Convex for real-time phase updates, redirects to /r/[slug] on terminal phase.
- /r: library index of recent public tours.
- /r/[slug]: zero-login public tour view with arc-ordered artists, cited quotes, and refine CTA.
- opengraph-image.tsx: Satori OG card for shareable tour links.
- mutations.ts: listRecentPublicTours + getMyTourById queries.
Design locked: #0a0a0a + #e8b86a, Bebas Neue + Space Grotesk + Georgia italic.
Clerk v7 compatibility: useAuth pattern (no SignedIn/SignedOut exports in this version).
- schema: add tourSignals table (userId, tourId, artistPosition, signal) with by_user_tour + by_tour indexes.
- mutations: recordSignal (auth + inline rate-limit + atomic count patch), clearSignal (reversible), recordShare (anonymous best-effort counter), getMySignalsForTour query.
- TourArtifact client component with per-artist keep/pass/save buttons, optimistic UI, useTransition for pending state, inline error fallback, native share sheet + clipboard fallback, per-artist Listen↗ deep-link.
- /r/[slug] page now delegates to TourArtifact, stays SSR for SEO.
- 8 new tests covering signal lifecycle, atomic count deltas, auth rejection, and share counter. 22/22 recommend-mutation tests pass.
7a (YouTube inline play):
- Each artist stop gets a ▶ PLAY button that expands a lazy-mounted youtube-nocookie.com iframe. One player at a time (state lifted to TourArtifact). Uses the stored youtubeTrackId when present, falls back to a search-embed keyed on artist + album.
- Removes the external "LISTEN ↗" link in favor of inline play.
7b (Auth0 Spotify seeds):
- /api/recommend/generate reads the auth0_user_id_spotify cookie and (best-effort, 2.5s timeout) fetches the caller's top 5 Spotify artists via Auth0 Token Vault. Never blocks tour start on a Spotify hiccup: any failure silently yields an empty seed list.
- generateTour + runGenerationFlow accept optional spotifySeeds and thread them through WorkCtx to recommendFromPerplexity, which injects a "context only, NOT a constraint" paragraph into every intent-aware prompt builder.
- 2 new Perplexity prompt-shape tests. 204/204 pass.
Admin moderation:
- convex/recommend/admin.ts: listFlaggedTours, listPendingReports,
setTourVisibility (approve|block with audit-tagged moderationCategories),
resolveReport. Admin gating via ADMIN_EMAILS env var + users.email,
checked in a shared requireAdmin() helper. Non-admins get "Forbidden".
- /admin/recommend page: Clerk-gated shell + client shell that renders
flagged tours and pending reports with inline Approve/Block/Dismiss/Uphold.
- Uphold action chains setTourVisibility("block") + resolveReport in one
transition so the tour is taken down atomically with the outcome write.
User reports:
- reportTour mutation in recommend/mutations.ts. Auth-required,
inline-rate-limited to 5/user/day, reason trimmed + capped at 500 chars
server-side, rejects <3-char reasons.
- Report button + dialog in TourArtifact action bar. Signed-out state,
submitted state, and error state all rendered explicitly.
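The server-side reason handling in reportTour could be sketched as below. The order of operations (trim, then length check, then cap) and the error text are assumptions; the 500-char cap and 3-char minimum come from the description above.

```typescript
const MAX_REASON_LENGTH = 500;
const MIN_REASON_LENGTH = 3;

// Hypothetical helper: trims the reason, rejects very short input, caps length.
export function normalizeReportReason(raw: string): string {
  const trimmed = raw.trim();
  if (trimmed.length < MIN_REASON_LENGTH) {
    throw new Error("Report reason must be at least 3 characters");
  }
  // Cap rather than reject, so an over-long reason is still recorded.
  return trimmed.slice(0, MAX_REASON_LENGTH);
}
```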
TTL cleanup crons:
- convex/crons.ts registers three interval crons wired to internal
mutations in convex/recommend/cleanup.ts: pruneTourStatus (15min cadence,
1h retention), pruneTourEvents (6h cadence, 90d retention),
pruneCitationCache (1h cadence, 24h retention). Each sweep deletes a
bounded 200-row batch so the mutation never blows out.
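The bounded-batch idea can be shown with a pure selection function. This sketch deliberately avoids the Convex API: the row shape, the `createdAt` field, and the function name are assumptions, while the retention windows and the 200-row batch size are the ones stated above.

```typescript
// Assumed minimal row shape; real tables carry more fields.
interface TimestampedRow {
  id: string;
  createdAt: number; // epoch milliseconds
}

const HOUR_MS = 60 * 60 * 1000;

// Retention windows from the commit message.
export const RETENTION_MS = {
  tourStatus: 1 * HOUR_MS,
  tourEvents: 90 * 24 * HOUR_MS,
  citationCache: 24 * HOUR_MS,
} as const;

export const PRUNE_BATCH_SIZE = 200;

// Pick at most `batchSize` rows older than the retention cutoff.
// Anything left over is handled by the next cron tick.
export function selectExpiredBatch<T extends TimestampedRow>(
  rows: readonly T[],
  now: number,
  retentionMs: number,
  batchSize: number = PRUNE_BATCH_SIZE,
): T[] {
  const cutoff = now - retentionMs;
  return rows.filter((row) => row.createdAt < cutoff).slice(0, batchSize);
}
```

Capping each sweep keeps a single mutation's work bounded even after a backlog builds up; the short cron cadence drains the remainder.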
Tests: 7 new cases — report lifecycle, admin approve audit tag, non-admin
rejection, tourStatus pruning, citationCache pruning. 211/211 pass.
Observability (PostHog events):
- src/lib/recommend-analytics.ts: shared RECOMMEND_EVENTS constants + trackRecommendClient (posthog-js) and trackRecommendServer (posthog-node) helpers. Centralized names keep client, server, and Convex emissions consistent without drift.
- Server event recommend_tour_started fires from /api/recommend/generate on successful action return (fire-and-forget, never blocks the response).
- Convex runGenerationFlow posts recommend_tour_completed to PostHog's HTTP /capture endpoint at the end of the pipeline, which lets dashboards see async outcomes even though the Vercel route returned ~200ms after start. Uses AbortController with a 2s ceiling. Failure is silent.
- Client events in TourArtifact: recommend_tour_viewed (effect), recommend_signal_recorded (after mutation resolves), recommend_tour_shared (method: native_share | clipboard), recommend_tour_reported (after successful submit).
- Client event in /recommend: recommend_tour_started_attempt on form submit.
Smoke tests:
- src/app/api/recommend/__tests__/generate-route.test.ts: 7 route-handler tests covering auth, body validation, success forwarding, and the 429/401/500 error mappings. Mocks Clerk auth, ConvexHttpClient, Auth0 Token Vault, and PostHog; no network touched.
Eval suite:
- evals/recommend.eval.ts: opt-in (skipped without PERPLEXITY_API_KEY, not picked up by default vitest pattern). 3 golden prompts covering mood_theme, era_genre, artist_similar; they assert min pick count, min cited count, no-URL-without-publication mismatches.
- Run manually: bunx vitest run evals/recommend.eval.ts --no-coverage
Manual E2E:
- docs/recommend-v1-e2e-checklist.md: 10-section flight check covering happy path, YouTube inline play, share, report rate-limiting, admin moderate/uphold, library, OG image, Spotify seeds, observability verification, error handling. Human-gated before merging to main.
Full suite: 218/218 green. Typecheck clean.
Next.js bundled posthog-server (posthog-node + Node APIs) into the client
bundle because recommend-analytics.ts exported both client and server
helpers, and the server helper's `await import("./posthog-server")` still
pulled the transitive module graph through the client component boundary.
Split the file:
- recommend-analytics.ts: event constants + trackRecommendClient only.
Safe to import from any client component. No Node deps.
- recommend-analytics-server.ts: trackRecommendServer — imports posthog-server
directly. Only imported by route handlers.
Updated /api/recommend/generate and the route smoke test mock.
…list
Quotes were empty because sonar + loose system prompt were pulling YouTube and Spotify links as "reviews." Videos showed as unavailable because search-list embeds are deprecated. Inline iframes also fought the rest of the app: playback, queue, bar controls live in a shared global player.
Perplexity sourcing:
- Upgrade model to sonar-pro (stronger citation behavior).
- Tighten SYSTEM_PROMPT: explicit publication allow-list, explicit deny (YouTube/Spotify/Wikipedia/Genius/Discogs), ≥6 of 8-12 picks must have quotes, quote_text must be verbatim.
- Wire Perplexity's native `search_domain_filter` with a 22-domain allow-list of music publications (Pitchfork, Quietus, Bandcamp Daily, NPR, RA, FACT, Stereogum, …). Belt-and-suspenders: if the model ignores the prompt, the API strips the citations before we see the response.
- Plumb `searchDomainFilter` through callPerplexity.
YouTube resolution:
- convex/recommend/youtubeResolve.ts: YouTube Data API v3 search.list wrapper, 2.5s timeout, silent fallback on any failure (quota, 403, timeout, network). Cost ~100 units per artist; well under daily free quota for v1 traffic.
- Runs in parallel with citation verification inside phase 5, so no added wall time. Populates artifactsRecommend.artists[].youtubeTrackId.
Global player on public tour pages:
- TourPlayerShell wraps /r/[slug] with PlayerProvider + PlayerBar + YouTubePlayer so playback routes through the shared Crate audio system (same as /w workspace and /cuts/[shareId]).
- PlayerBar renders null when no track is playing, so anonymous visitors see nothing until they tap PLAY.
- TourArtifact kills the inline iframe entirely. Each artist's PLAY button queues the rest of the tour from that arc position onward via usePlayer().play + addToQueue. A top-level ▶ PLAY TOUR button queues from position 0. Tracks auto-advance through the arc when each ends.
- Artists without a youtubeTrackId keep the LISTEN↗ link-out.
Save as Crate playlist:
- saveTourAsPlaylist mutation creates a playlists row + playlistTracks rows (one per artist with a videoId). Named after tour.promptRedacted. Auth required; anonymous users see SIGN IN button instead.
- SaveAsPlaylistButton in action bar. Success state links to /w to view.
Other fixes in this session:
- convex/auth.config.ts: add known-orca-97.clerk.accounts.dev (dev Clerk instance) as a second OIDC provider so local dev JWTs validate.
- /recommend page: useEnsureUserRow hook auto-provisions the Convex users row on first mount (was throwing "User not found" for accounts that signed up before the Clerk webhook fired).
- /recommend page: redirect to the FINAL tour slug (regenerated after Perplexity) rather than the provisional one the POST returns.
Full suite: 218/218 green. Typecheck clean.
…results
Previous approach was over-engineered: 4 allowlist pools routed by intent classifier, plus jazz-keyword detection, plus a publication-to-host map. Still produced fake citations because the model was asked to generate quote_url inline, and it hallucinated URLs on real-looking publications.
Switch to denylist mode (Perplexity supports `-` prefix). Block 16 known low-signal domains (YouTube, Spotify, Wikipedia, Genius, Discogs, Last.fm, all major social, Medium, Amazon); let Perplexity pull from anywhere else. Natural genre coverage (jazz goes to JazzTimes/AllMusic/NPR, metal to Revolver/Invisible Oranges, regional to local press) without an intent pool.
Match model picks to REAL URLs from Perplexity's `search_results` field, not from the hallucinated `quote_url`. perplexity-core now exports `PerplexitySearchResult[]` with per-source title/snippet/date. The matcher ranks candidates: artist-mention + publication-host > artist-mention > publication-host > fallback citations[]. No match = no quote, even if model supplied text.
Strict-drop policy in index.ts phase 7: only attach a quote to an artist when `verified: true`. Previously we kept unverified quotes with a missing badge, which still shipped fake provenance.
Also in this session:
- convex/auth.config.ts: add known-orca-97.clerk.accounts.dev (dev Clerk)
- youtubeResolve.ts: real-video-id lookup via YouTube Data API v3 during phase 5, runs parallel with citationVerify so no added wall time
- youtube-player.tsx: defensive try/catch around YT.destroy(); iframe was already detached on React Strict Mode double-invoke, throwing NotFoundError: removeChild
- perplexityRecommend: diag action at convex/recommend/diag.ts for debugging (delete before merge to main)
Tests: 13/13 perplexity tests pass. Full /recommend suite green.
…d output
Rebuild the recommend tour citation pipeline so every per-pick URL actually points at a source about that artist, driving real click-through to music publishers instead of stapling the tour's top citation onto unrelated picks.
Key changes:
- Drop the ?? primaryCitationUrl fallback in the per-pick quote build. Quotes now attach only when matchQuoteToSnippet finds a real source.
- Tier 2 matcher accepts a match when artist name appears in the search_result title or URL path (same trust model as /i/ receipts).
- Migrate domain filter from best-effort denylist to strictly-enforced allowlist of ~15 music-criticism publications. Per Perplexity docs, allowlist is enforced; denylist leaked streaming/playlist junk on mood/activity queries.
- Adopt response_format: json_schema on sonar-pro calls, eliminating the PerplexityMalformedResponseError path that was killing emotional and other intent tours.
- Rewrite per-intent user prompts per the Perplexity prompt guide: no "Search for:" prefix, criticism-specific lexical tokens, explicit fallback clause, MUSIC anchor on vague/emotional/activity to prevent drift into visual art, art therapy, or travel content.
- Add TourSources UI (per-source cards with publication badge, snippet, artistsMentioned tags) rendered from search_results; falls back to the flat citations list for older tours.
- Add one-off stripUnverifiedQuotes internalMutation to clean tours that had fallback URLs from the old code.
- Add compareRetrieval + regenerateBySlug test actions for A/B iterating on citation host distribution.
Validated on all 8 intent types. Regen of the Bukem-failing jazz tour now attaches real DownBeat/AllMusic/Bandcamp URLs to 8 of 10 picks instead of stapling a single dmy.co URL across every pick.
Adds a pre-retrieval step that decomposes the user's StructuredQuery into 3-5 specific music-criticism queries, hits Perplexity's Search API in parallel, and merges the curated results into sonar-pro's own search_results pool. The downstream per-pick matcher then has a richer source set to attach honest citations from.
Target: fix thin retrieval on emotional/activity/mood intents where sonar-pro's single-query search returns only 2-5 hits because the search classifier misreads "music for processing grief" as a therapy query or "music for a long night drive" as a travel query.
Changes:
- src/lib/perplexity-search.ts (new): direct-fetch wrapper for the Search API's POST /search endpoint, with allowlist domain filter, max_tokens_per_page, and timeout handling.
- convex/recommend/queryDecompose.ts (new): rule-based templates per intent that mix content types (album reviews, artist interviews, feature articles, critic recommendations). Interviews matter especially on emotional/mood intents because the artist's own framing is higher-signal than third-party reviews.
- convex/recommend/perplexityRecommend.ts: fire decomposed Search queries AFTER the sonar-pro call, dedupe by URL, merge into searchResults so the matcher + TourSources UI both benefit.
- Tests updated with URL-dispatched fetch mock that lets Search API calls fall through to an empty-results fallback while sonar-pro responses stay queued via mockResolvedValueOnce.
Retrieval counts validated:
emotional "processing grief": 2 → 5 (jazztimes + quietus + stereogum + allmusic/theme)
activity "night drive": 5 → 9 (allmusic/theme + quietus + stereogum + bandcamp)
mood "jazz winter coffee": 10 → 14 (broader mix; downbeat density lower)
Jazz regen now produces 11 picks with 4 honest citations on real DownBeat URLs (Maria Schneider, Ambrose Akinmusire, Branford Marsalis, Keith Jarrett via the Branford piece). Picks lacking source coverage correctly render without a quote rather than with a fabricated one.
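The "dedupe by URL, merge into searchResults" step might look like the sketch below. The `SearchResult` shape, the function name, and the trailing-slash normalization are assumptions; the commit message only states that results are deduped by URL.

```typescript
// Assumed minimal shape of a retrieved source.
interface SearchResult {
  url: string;
  title: string;
  snippet?: string;
}

// Merge sonar-pro's own results with the decomposed Search API results.
// The first occurrence of a URL wins, so primary results keep priority.
export function mergeSearchResults(
  primary: readonly SearchResult[],
  decomposed: readonly SearchResult[],
): SearchResult[] {
  const seen = new Set<string>();
  const merged: SearchResult[] = [];
  for (const result of [...primary, ...decomposed]) {
    // Light normalization (assumption): ignore surrounding space and trailing slashes.
    const key = result.url.trim().replace(/\/+$/, "");
    if (seen.has(key)) continue;
    seen.add(key);
    merged.push(result);
  }
  return merged;
}
```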
Adds imagery to recommend tour cards. Two surfaces:
1. Album cover art per pick (iTunes Search API):
- New `itunesArtwork.ts` helper with a two-step lookup. Tries
"artist album" first; falls back to the artist's most recent
album cover when the specific album doesn't match iTunes's
catalog (common for obscure small-label releases).
- 600x600 URLs attached server-side in the tour generation
pipeline, parallel with YouTube resolution. 3s per-request
timeout; silent failure since artwork is decorative.
- Tested: 10/10 picks land cover art on a jazz tour that
previously got 0 (artist-fallback catches the obscure picks).
2. Source card hero images (Perplexity return_images):
- `callPerplexity` now accepts `returnImages` and parses the
`images[]` response field into typed PerplexityImage records.
- Recommend pipeline sets `return_images: true`, indexes images
by origin_url, and attaches heroImageUrl to source cards on
URL match. Best-effort — Perplexity's image pool and
allowlist-filtered searchResults pool are disjoint for
mood-driven tour queries, so hit rate is low. Plumbing stays
in place for queries that DO get image matches (confirmed
working for album-specific probe queries).
UI:
- tour-artifact.tsx renders a 64x64 cover thumb next to each
pick card, above the quote blockquote. Lazy-loaded.
- ReviewSourceCard renders a hero thumb on the left of each
source entry when heroImageUrl is set.
Schema adds `artworkUrl` on artist entries and `heroImageUrl` on
source entries. Both optional. Existing tours migrate by being
re-generated; no schema migration needed.
Also adds `probeImages` diag action to promptTest for directly
inspecting Perplexity's image retrieval.
Users can now type `/recommend jazz for winter morning coffee` in chat to kick off tour generation without leaving the conversation.
Implementation: server-side intercept in the chat route before the LLM routing. Extracts the prompt, gets a Convex-scoped Clerk JWT, calls the existing generateTour action, and streams back a minimal CrateEvent sequence: answer_start + an answer_token with a markdown link to the /r/[slug] tour page + done. The dedicated tour page keeps rendering in real time as picks, covers, quotes, and sources arrive.
Why bypass the agentic loop: the tour generation pipeline is a 30s+ multi-phase action (Perplexity multi-query + sonar-pro + matcher + iTunes artwork + arc ordering + moderation + Convex persistence) that lives in a Convex action. Having the LLM call it as a tool would force the tool call, tool result wait, and then reconstruct a compressed artifact in chat, losing the full-fidelity rendering the /r/ page already does. The link-back pattern preserves publisher attribution (clickable source cards) and lets the chat thread stay focused on conversation.
Registers the command in both doc surfaces:
- Public commands marketing page (`commands.tsx`)
- In-app help drawer (`commands-reference.tsx`)
Chat-rate-limit still applies (per-minute). Tour rate-limit applies via generateTour's internal check (20/day free tier). Errors from the Convex action surface as `error` CrateEvents the chat panel already renders.
feat(recommend): v1 tour builder — honest citations + /recommend command
…, allowlist breadth
Three coordinated fixes to address AllMusic/Bandcamp dominance in citations on tour pages.
Problem observed: AllMusic has a /artist/<slug> page for virtually every musician. That URL pattern always contains the artist's name slug, so the matcher's Tier 2 (artist-in-URL) matched every time, filling tours with allmusic.com/artist/... citations whose prose wasn't from the page. Same problem with <artist>.bandcamp.com subdomains (self-hosted artist marketing pages, not criticism).
Changes:
- isAggregatorBioUrl(): new helper in convex/recommend/index.ts that recognizes allmusic.com/artist/<slug> and bare <artist>.bandcamp.com as aggregator-bio URLs. Still allows allmusic.com/album/<slug> (real Thom Jurek / Richard Ginell album reviews) and daily.bandcamp.com (Bandcamp Daily editorial).
- matchQuoteToSnippet() filters the searchResults pool through the aggregator check at the TOP, before any tier runs. Previously only Tier 2 guarded against aggregator URLs, but Tier 1 could still match if a quote prefix coincidentally appeared in an AllMusic bio snippet.
- Per-publication cap: MAX_CITATIONS_PER_PUBLICATION = 3. After all picks are matched, iterate in arcPosition order and drop the quote on any pick that would push a host's count above the cap. Prevents monoculture even if retrieval skews toward one publication.
- Allowlist expansion: filled remaining cap slots (15 → 20) with long-tail critic sites that appeared in earlier retrieval tests and got excluded when we first trimmed: jerryjazzmusician.com, brooklynrail.org, popmatters.com, clashmusic.com, nme.com.
Verified on a regen of the jazz tour: host distribution went from "4 AllMusic + 2 DownBeat + Bandcamp bio mix" to "2 DownBeat, 0 AllMusic, 0 Bandcamp bio pages." Fewer quotes per tour, but every remaining quote points at actual criticism.
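A rough sketch of the aggregator check and the per-publication cap follows. The `TourPick` shape, the `hostOf` helper, and the choices to treat only the root path of an artist Bandcamp subdomain as "bare" and to drop quotes with unparseable URLs are all assumptions; the real helper in convex/recommend/index.ts may differ.

```typescript
export const MAX_CITATIONS_PER_PUBLICATION = 3;

// Hypothetical helper: normalized hostname, or null for an unparseable URL.
function hostOf(raw: string): string | null {
  try {
    return new URL(raw).hostname.toLowerCase().replace(/^www\./, "");
  } catch {
    return null;
  }
}

// True for allmusic.com/artist/<slug> and bare <artist>.bandcamp.com pages.
// allmusic.com/album/<slug> and daily.bandcamp.com stay eligible.
export function isAggregatorBioUrl(raw: string): boolean {
  const host = hostOf(raw);
  if (host === null) return false;
  const path = new URL(raw).pathname;
  if (host === "allmusic.com") return path.startsWith("/artist/");
  if (host.endsWith(".bandcamp.com") && host !== "daily.bandcamp.com") {
    return path === "/" || path === ""; // assumption: "bare" means the subdomain root
  }
  return false;
}

// Assumed pick shape; only the fields the cap needs.
interface TourPick {
  artist: string;
  arcPosition: number;
  quote?: { url: string; text: string } | undefined;
}

// Walk picks in arc order and drop quotes that would exceed the per-host cap.
export function applyPublicationCap(
  picks: readonly TourPick[],
  cap: number = MAX_CITATIONS_PER_PUBLICATION,
): TourPick[] {
  const counts = new Map<string, number>();
  return [...picks]
    .sort((a, b) => a.arcPosition - b.arcPosition)
    .map((pick) => {
      if (!pick.quote) return pick;
      const host = hostOf(pick.quote.url);
      const used = host === null ? cap : counts.get(host) ?? 0;
      // Unparseable URLs are treated as over the cap (assumption) and dropped.
      if (host === null || used >= cap) return { ...pick, quote: undefined };
      counts.set(host, used + 1);
      return pick;
    });
}
```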
Replaces the bulk "one sonar-pro call generates all 10 picks AND all 10
quotes" design with a two-phase per-pick flow that mirrors /i/
Influence Receipts — the architecture we already have shipping with
citations right every time.
Why: /i/ works because it queries Perplexity about ONE artist at a time.
Every retrieved document is on-topic; when sonar-pro writes a pullQuote
it's extracting or paraphrasing from articles that ARE about that
artist. /r/ was doing the opposite — one bulk call for all 10 picks,
retrieval diffuse, quote prose synthesized from training memory and
papered over with a post-hoc matcher. That mismatch is why we kept
needing matcher hardening, allowlist swaps, response_format tightening,
and aggregator-bio skips — all band-aids on the wrong-shape pipeline.
Phase A — convex/recommend/pickSelector.ts (new):
sonar-pro call that returns ONLY artist names + albums + years + optional
relationship/weight. No quote_text, quote_publication, or quote_url
fields in the response schema. Model still uses retrieval to ground
the SELECTION (it picks artists it has evidence for), but prose
generation is deferred to Phase B.
Phase B — convex/recommend/groundedQuote.ts (new):
Per-pick parallel call for each selected pick:
1. Perplexity Search API with multi-query scoped to THAT artist +
album ("<artist> <album> review", "<artist> interview", etc.)
against the music-publication allowlist.
2. Top 3 eligible snippets (aggregator-bio URLs pre-filtered) passed
to Claude Haiku with a locked prompt: "pick ONE source by index
and write 2 sentences explaining why the artist fits the tour,
drawing ONLY from that snippet." Model returns { citedIndex, why }.
3. Index maps back to the actual URL. Prose and URL are tightly
coupled by construction — URL chosen BEFORE the prose is written.
Returns null when retrieval is thin or Claude can't support the pick
from the retrieved snippets. Caller renders quote-less honestly.
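The "index maps back to the actual URL" step in Phase B can be illustrated as follows. The type names, the zero-based index convention, and the validation rules are assumptions made for this sketch; only the `{ citedIndex, why }` return shape and the null-on-thin-retrieval behavior come from the description above.

```typescript
// Assumed shapes for the snippets passed to the model and its structured reply.
interface Snippet {
  url: string;
  publication: string;
  text: string;
}

interface ModelChoice {
  citedIndex: number; // assumed zero-based index into the snippet list
  why: string;
}

interface GroundedQuote {
  url: string;
  publication: string;
  why: string;
}

// Resolve the model's chosen index to a real URL. The URL never comes from
// model text, so it cannot be hallucinated; a bad index yields null.
export function resolveGroundedQuote(
  snippets: readonly Snippet[],
  choice: ModelChoice | null,
): GroundedQuote | null {
  if (choice === null) return null;
  const { citedIndex, why } = choice;
  if (!Number.isInteger(citedIndex) || citedIndex < 0 || citedIndex >= snippets.length) {
    return null;
  }
  const source = snippets[citedIndex];
  const prose = why.trim();
  if (source === undefined || prose.length === 0) return null;
  return { url: source.url, publication: source.publication, why: prose };
}
```

Because the caller only ever receives a URL taken from the retrieved list, a quote-less pick is the worst outcome of a bad model response.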
Orchestrator — convex/recommend/index.ts:
runWork Phases 4 and 5 rewritten:
- Phase 4 (was "perplexity"): selectPicks() — one call, names only.
- Phase 5 (was "verify" + YouTube): per-pick Promise.all of
groundedQuoteForPick + resolveYouTubeVideoId. verifyCitation is
gone — grounded quotes are verified-by-construction (Claude
chose the URL and drew prose from its snippet).
Phase 7 artist build reads from enriched[] and attaches groundedQuote
directly to artist.quote. The old matcher path (matchQuoteToSnippet +
per-publication cap) is unused for this flow — left in place for
legacy tours that regenerate.
Sources section now rebuilds from grounded quotes: every card is one
the pipeline actually drew prose from, deduped on URL, with multiple
artists listed when picks share a source.
Tradeoffs:
- 10x Perplexity Search API volume (per-pick calls). Same vendor, same
key, so the budget impact is a per-search-request charge × 10.
- +1 Claude Haiku call per pick. Parallel, so added wall-clock is
bottlenecked by the slowest pick (~2-4s).
- Fewer quotes per tour on thin-retrieval picks — intentional. The
only way to get universal citation coverage under this architecture
is to have real retrieval for every pick. Honest floor over
fabricated ceiling.
Verified: regen of the jazz/winter-contemplation tour returns 2/6
grounded picks (Maria Schneider → JazzTimes live review, Ambrose
Akinmusire → DownBeat Owl Song review). Quote prose paraphrases the
retrieved snippet content; URL points at the exact article that
prose came from. The 4 picks without grounding render quote-less
instead of being stapled with synthesized prose.
feat(recommend): per-pick grounded architecture + matcher hardening
…rompts
Sonar-pro was returning 1-3 picks when the prompt's exact theme (e.g. "danceable songs for climate grief") had thin direct critical coverage, because the system prompt rule "Target 8-12 picks" gave the model no escape hatch when it couldn't verify the count for the exact theme.
Changes the rule to explicitly allow padding to 8-12 with well-documented adjacent-genre/mood artists, and calls out that returning 1-3 picks is a failure mode. The Phase B grounded-quote step is still the honesty layer: adjacent picks without themed coverage just render quote-less downstream, not stamped with fabricated citations.
Observed: prod tour "danceable songs for climate grief" returned 1 pick (Jayda G, correctly grounded to crackmagazine.net). After this fix the same prompt should return 8-12 picks with per-pick grounding attempted on all of them.
User feedback: 12-pick tours have too many quote-less filler cards next to the grounded ones. A 6-8 pick tour reads more like a curated mixtape, less like a Spotify auto-playlist.
Three-way change:
- pickSelector system prompt target 6-8, cap at 8, failure-mode language unchanged (still demand >= 6 minimum).
- All 8 intent-specific user prompts now say "Find 6-8 musical artists" instead of "Find 8-12".
- picks.slice(0, 8) replaces slice(0, 12) in both pickSelector and the runWork orchestrator. isSparse threshold shifts from <8 to <6.
Side wins:
- Per-pick cost drops ~40% (Phase B scales linearly with pick count).
- Grounding rate as a percentage should climb, since the model leads harder with its best-documented picks rather than stretching to 12.
- Tour wall-clock ~same (parallel, bottlenecked by slowest pick).
…moderation API failures
Three intertwined bugs sent /recommend users to a 404 when the moderation classifier failed transiently:
1. /recommend redirected to /r/[slug] for both phase=done and phase=flagged, but /r/[slug] gates on isPublic. Flagged tours have isPublic=false, so the destination 404'd. Now only redirect on phase=done. Flagged/timed-out tours stay on the LoadingPanel STOPPED state which already exists.
2. The Haiku moderation classifier writes "unknown-moderation-failure" on transient API errors. The finalizer was tagging these as moderationStatus=flagged, indistinguishable from real content flags. Now API failures write moderationStatus=timed_out, which surfaces in the existing admin queue (listFlaggedTours already includes timed_out) and is recoverable. Real content flags still write "flagged".
3. writeStatus always wrote phase=done regardless of moderation outcome, so the client's phase=flagged check was dead code. Now the final phase reflects the actual result so the LoadingPanel and redirect logic agree.
Plus: /recommend added to the chat slash command menu, navigates to the recommend page with the prompt prefilled via ?prompt= query param.
The per-pick grounded architecture (commit d3f3598) saturates the same wall-clock window that moderation fires in, pushing structured Haiku calls past the 5s budget on benign prompts like "70s spiritual jazz." Result: clean tours got tagged moderationStatus=timed_out and stayed private, even though content classification would have passed.
Evidence: PostHog recommend_tour_completed events show ~17-18s total wall-clock for failed runs vs ~12-13s for the one success in the same window. Errors array is sliced to 5; the moderation timeout was falling off the end of the slice, hiding the real failure.
15s is enough headroom for a structured Haiku call under load without masking genuine outages. Retry policy unchanged: HaikuTimeoutError still NOT retried (if 15s isn't enough, two of them won't be either).
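A minimal sketch of the timeout-without-retry policy, assuming the call is wrapped in a promise race. HAIKU_TIMEOUT_MS and HaikuTimeoutError are names used in this PR; withTimeout and isRetryable are hypothetical helpers, and treating every non-timeout error as retryable is an assumption, not the documented policy.

```typescript
const HAIKU_TIMEOUT_MS = 15_000; // raised from the earlier 5s budget

class HaikuTimeoutError extends Error {
  constructor(ms: number) {
    super("Haiku call exceeded " + ms + "ms");
    this.name = "HaikuTimeoutError";
  }
}

// Rejects with HaikuTimeoutError if `work` has not settled within `ms`.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new HaikuTimeoutError(ms)), ms);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}

// Timeouts are never retried: if one full budget was not enough,
// a second attempt under the same load is unlikely to succeed.
// Assumption: all other errors are eligible for retry.
function isRetryable(error: unknown): boolean {
  return !(error instanceof Error && error.name === "HaikuTimeoutError");
}
```

A caller would wrap the classifier call as `withTimeout(classify(prompt), HAIKU_TIMEOUT_MS)`, where `classify` stands in for whatever function issues the Haiku request.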
Synthesis attempts were failing with "Unexpected non-whitespace character
after JSON at position N" — Haiku returns valid JSON followed by trailing
explanatory prose, which JSON.parse rejects.
Fast-path tries pure JSON.parse on the fence-stripped text. Fallback
slices the outermost { ... } and parses that. Mirrors the brace-slice
pattern already proven in convex/recommend/haikuStructured.ts:
parseJSONFromResponse, but inlined here to avoid forcing convex/wiki.ts
into the Node runtime (it currently runs in V8).
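The fast-path/fallback parse described above might look roughly like this. It is a sketch under assumptions: parseLooseJSON is the name a later commit in this PR uses, but the body here, including the stripFences helper, is illustrative and not the code in convex/wiki.ts.

```typescript
const FENCE = "`".repeat(3); // markdown code-fence marker

// Removes a leading fence line (with optional language tag) and a trailing fence.
function stripFences(text: string): string {
  let out = text.trim();
  if (out.startsWith(FENCE)) {
    const newline = out.indexOf("\n");
    out = newline === -1 ? "" : out.slice(newline + 1);
  }
  if (out.endsWith(FENCE)) {
    out = out.slice(0, -FENCE.length);
  }
  return out.trim();
}

function parseLooseJSON(raw: string): unknown {
  const text = stripFences(raw);
  try {
    // Fast path: the whole response is valid JSON.
    return JSON.parse(text);
  } catch {
    // Fallback: parse only the outermost { ... } span, dropping any
    // explanatory prose the model appended after the object.
    const start = text.indexOf("{");
    const end = text.lastIndexOf("}");
    if (start === -1 || end <= start) {
      throw new Error("No JSON object found in model response");
    }
    return JSON.parse(text.slice(start, end + 1));
  }
}
```

One known limit of the brace slice: if the trailing prose itself contains a closing brace, the slice overshoots and the second parse still throws, so callers should keep their error handling.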
19 of 23 user records had no usernameSlug because they predate the slugify-on-upsert path. /wiki/[username]/[slug] pages 404'd for everyone because getBySlug queries the by_username_slug index and got null on the owner lookup, regardless of page visibility. backfillUsernameSlugs is idempotent, ordered by createdAt ascending (oldest user wins the canonical slug), and dedupes collisions with -2 / -3 suffixes. Run once via: bunx convex run users:backfillUsernameSlugs
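The ordering and collision rules can be illustrated with a pure function. This is a sketch only: the UserRecord shape, the slugify rule, and assignUsernameSlugs are hypothetical stand-ins, and the real backfillUsernameSlugs runs as a Convex mutation against the users table.

```typescript
interface UserRecord {
  id: string;
  name: string;
  createdAt: number;
  usernameSlug?: string;
}

// Assumed slug rule: lowercase, non-alphanumerics collapsed to hyphens.
function slugify(name: string): string {
  return name
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Returns userId -> new slug for every user that lacks one.
function assignUsernameSlugs(users: UserRecord[]): Map<string, string> {
  const taken = new Set<string>();
  for (const user of users) {
    if (user.usernameSlug) taken.add(user.usernameSlug);
  }
  const assigned = new Map<string, string>();
  // Oldest first, so the earliest account keeps the unsuffixed slug.
  const ordered = [...users].sort((a, b) => a.createdAt - b.createdAt);
  for (const user of ordered) {
    if (user.usernameSlug) continue; // idempotent: never rewrite an existing slug
    const base = slugify(user.name) || "user";
    let candidate = base;
    let suffix = 2;
    while (taken.has(candidate)) {
      candidate = `${base}-${suffix}`;
      suffix += 1;
    }
    taken.add(candidate);
    assigned.set(user.id, candidate);
  }
  return assigned;
}
```

Running it a second time over already-backfilled records returns an empty map, which is the idempotence property the commit relies on.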
After fixing the Haiku JSON-trailing-prose bug, 56 of 96 existing wiki pages stayed stuck in the unsynthesized state because the original synthesis attempt failed silently (caught, logged, never retried). resynthesizeStuckPages re-schedules every page where no section has lastSynthesizedAt. Uses 2-second stagger to avoid Anthropic rate limits. Skips already-synthesized pages and archived pages. Run via: bunx convex run wiki:resynthesizeStuckPages
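The selection and stagger logic can be sketched as a pure planning step. The WikiPage shape and planResynthesis are hypothetical; RESYNTHESIZE_STAGGER_MS is the constant name a later commit in this PR introduces. The real function would hand each delay to the Convex scheduler.

```typescript
const RESYNTHESIZE_STAGGER_MS = 2000;

interface WikiPage {
  id: string;
  archived: boolean;
  sections: { lastSynthesizedAt?: number }[];
}

// Picks non-archived pages where no section was ever synthesized and
// spaces their re-runs 2s apart to stay under provider rate limits.
// Note: a page with zero sections also counts as stuck in this sketch.
function planResynthesis(pages: WikiPage[]): { pageId: string; delayMs: number }[] {
  const stuck = pages.filter(
    (page) =>
      !page.archived &&
      page.sections.every((section) => section.lastSynthesizedAt === undefined),
  );
  return stuck.map((page, index) => ({
    pageId: page.id,
    delayMs: index * RESYNTHESIZE_STAGGER_MS,
  }));
}
```

Total elapsed time grows linearly with the stuck-page count: N pages finish scheduling after roughly (N - 1) × 2 seconds.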
… drop double cast
Five Boy-Scout cleanups on top of the prior session's fixes: no behavior change, just shape.
1. resolveModerationOutcome() in convex/recommend/index.ts replaces three replicated ternaries (moderationStatus, finalPhase, finalDetail) with one outcome resolver. Adding a 4th moderation outcome (e.g., manual-review) is now one branch instead of three edits across two files. MODERATION_FAILURE_CATEGORY constant removes the duplicated "unknown-moderation-failure" magic string.
2. LIFECYCLE_BY_MODERATION_STATUS lookup table in mutations.ts replaces the matching ternary chain in finalizeTour. Schema drift caught by `satisfies` instead of falling through to "completed".
3. HAIKU_TIMEOUT_MS, RESYNTHESIZE_STAGGER_MS, PROMPT_MAX_LENGTH: named constants for the values introduced last session. Notably PROMPT_MAX_LENGTH now lives in one place; the same 400 was repeated three times in recommend/page.tsx (initial slice, onChange slice, counter display).
4. callHaikuSynthesis: dropped the double `(parsed as Record<string, unknown>).sections` cast. parseLooseJSON already returns the right shape; let TypeScript see it.
5. Comment rot: "56 stuck pages × 2s = ~2 minute total elapsed" was wrong by the next session (actual: 101 pages, ~3.3 min). Replaced with the constant + a generic explanation.
Typecheck clean, build clean, no new tests yet (testing is the outstanding gap from the clean-code review; separate task).
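A sketch of the resolver-plus-lookup-table shape, assuming TypeScript 4.9+ for `satisfies`. The function, constant, and table names come from this PR; the "clean" status, the "held" lifecycle value, the detail strings, and the precedence of real flags over API failures are assumptions made for illustration.

```typescript
const MODERATION_FAILURE_CATEGORY = "unknown-moderation-failure";

type ModerationStatus = "clean" | "flagged" | "timed_out"; // "clean" is assumed
type FinalPhase = "done" | "flagged";

interface ModerationOutcome {
  moderationStatus: ModerationStatus;
  finalPhase: FinalPhase;
  finalDetail: string;
}

// One place decides all three fields, so a new outcome is one new branch.
function resolveModerationOutcome(categories: string[]): ModerationOutcome {
  const contentFlags = categories.filter((c) => c !== MODERATION_FAILURE_CATEGORY);
  if (contentFlags.length > 0) {
    return { moderationStatus: "flagged", finalPhase: "flagged", finalDetail: "Held for review" };
  }
  if (categories.length > 0) {
    // Only the failure marker is present: a transient API error, recoverable.
    return { moderationStatus: "timed_out", finalPhase: "flagged", finalDetail: "Moderation unavailable" };
  }
  return { moderationStatus: "clean", finalPhase: "done", finalDetail: "Tour ready" };
}

// `satisfies` makes the compiler reject this table if a status is added to
// ModerationStatus without a matching lifecycle entry.
const LIFECYCLE_BY_MODERATION_STATUS = {
  clean: "completed",
  flagged: "held",
  timed_out: "held",
} satisfies Record<ModerationStatus, "completed" | "held">;
```

The design point is that a missing key becomes a compile error rather than a silent fall-through to "completed".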
Captured codebase shape and lint state in docs/clean-code-baseline.md
as the input for the systematic clean-code review plan. Then applied
the safe mechanical fixes — no production logic changes.
ESLint config:
- Exclude docs/** from lint scope (planning docs, not production code).
Eliminates 23 false positives in docs/crate-recommend-feature/*.
- Allow underscore-prefixed unused args/vars/caught-errors (matches
TypeScript's standard "intentionally unused" convention). Eliminates
~10 false-positive no-unused-vars warnings.
- Disable react-hooks/rules-of-hooks for src/lib/openui/components.tsx
with explanatory comment. The defineComponent({ component: ({props})
=> ... }) factory pattern declares real React components inside an
object literal, but the rule cannot see them and flags every useState
call. 24 false positives gone.
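The three config changes above could be expressed as flat-config entries along these lines. This is a sketch, assuming the project uses ESLint flat config and the @typescript-eslint variant of no-unused-vars; plugin registration is omitted and the lintOverrides name is hypothetical.

```typescript
// Entries intended to be spread into the array exported from the ESLint config.
const lintOverrides = [
  // Planning docs are not production code.
  { ignores: ["docs/**"] },
  {
    rules: {
      // Underscore prefix marks a binding as intentionally unused.
      "@typescript-eslint/no-unused-vars": [
        "warn",
        {
          argsIgnorePattern: "^_",
          varsIgnorePattern: "^_",
          caughtErrorsIgnorePattern: "^_",
        },
      ],
    },
  },
  {
    // defineComponent({ component: ... }) declares real components inside an
    // object literal, which the hooks rule cannot recognize.
    files: ["src/lib/openui/components.tsx"],
    rules: { "react-hooks/rules-of-hooks": "off" },
  },
];
```

Scoping the hooks override to a single file keeps the rule active everywhere else in the codebase.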
Source cleanups:
- Removed unused ClerkProvider import in convex-provider.tsx.
- Escaped 5 unescaped quotes in commands-reference.tsx and
video-influence-chain.tsx (' → &apos;, " → &ldquo;/&rdquo;).
- Stripped 3 now-redundant eslint-disable-next-line comments left
behind by eslint --fix in components.tsx.
Result: 168 → 97 lint problems (−71, −42%). Typecheck clean. Build
passes. Remaining 97 are the real debt that needs human judgment —
input to Phase 1 hot-path reviews (no-explicit-any, no-img-element,
genuinely unused symbols).
…ion, qualified-view dwell
Ships the observation rig from the Receipt Truth gated validation sprint
(office-hours design doc 20260428). No new feature, no design changes.
All measurement.
Adds:
- src/lib/creator-id.ts: anonymous crate_creator_id cookie (random 16
chars, 1 year, SameSite=Lax). Stamped on every PostHog event so the
gate metric "unique non-Tarik views per posted receipt" can be
computed at query time by filtering Tarik's known IDs.
- src/hooks/use-active-time.ts: dwell hook that counts active foreground
ms (pauses on visibilitychange — Reddit's preview pipeline opens many
hidden tabs). Fires onQualified once at the 5s threshold.
- New PostHog events on /i/[slug]:
- receipt_generated: fired client-side when receipt.generatedAt is
within 60s. Imperfect but cheap heuristic; alternative is a schema
migration to track creator_id on the cache row.
- receipt_viewed_via_share: fired when the URL carries ?s=<token>.
PostHog can JOIN this back to the originating receipt_share_click
event to verify reshare-with-downstream-traffic.
- receipt_view_qualified: fired at 5s active dwell. THIS is the gate
metric. receipt_view (raw) still fires on mount.
- ShareButton: generates a 6-char share token, builds /i/[slug]?s=<token>
URLs (replaces the prior utm_source/utm_medium scheme), and stamps the
token on all three receipt_share_click variants.
- All existing posthog.capture calls on the receipt page now include
creator_id (receipt_view, receipt_share_click, receipt_try_another,
receipt_cta_click).
- ReceiptUI wrapped in Suspense at the page level — required for
useSearchParams in client components under Next 15.
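The active-dwell logic behind the qualified-view event can be sketched independently of React. ActiveTimeTracker is a hypothetical class, not the contents of use-active-time.ts; a real hook would call resume and pause from a visibilitychange listener and tick from a timer, passing the current time in milliseconds.

```typescript
class ActiveTimeTracker {
  private activeMs = 0;
  private resumedAt: number | null = null;
  private qualified = false;
  private readonly thresholdMs: number;
  private readonly onQualified: () => void;

  constructor(thresholdMs: number, onQualified: () => void) {
    this.thresholdMs = thresholdMs;
    this.onQualified = onQualified;
  }

  // Tab became visible: start (or keep) counting.
  resume(now: number): void {
    if (this.resumedAt === null) this.resumedAt = now;
  }

  // Tab became hidden: bank the elapsed foreground time and stop counting.
  pause(now: number): void {
    this.flush(now);
    this.resumedAt = null;
  }

  // Periodic check while visible.
  tick(now: number): void {
    this.flush(now);
  }

  private flush(now: number): void {
    if (this.resumedAt === null) return;
    this.activeMs += now - this.resumedAt;
    this.resumedAt = now;
    if (!this.qualified && this.activeMs >= this.thresholdMs) {
      this.qualified = true; // fire exactly once
      this.onQualified();
    }
  }
}
```

Because hidden time is never added to the total, a tab opened in the background by a preview pipeline never reaches the threshold on its own.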
Behavior unchanged for end users. The "Try another artist..." copy
stays put per founder direction (spec said "see another influence
chain →"; the SearchBox is already that CTA).
Decision rules locked BEFORE data per the sprint anti-criterion. Gate
will be applied on Day 4.
What changed
Complete Crate Web workspace implementation — from bare Next.js scaffold to a fully functional AI music research platform with persistent chat, dynamic UI components, team key sharing, multi-model support, and AgentMail integration.
Why
Building the web companion to Crate CLI so Radio Milwaukee team members and external users can access AI-powered music research through a browser without needing a terminal.
Changes
Core Workspace
OpenUI Dynamic Components
Multi-Model Support
ANTHROPIC_BASE_URL to OpenRouter endpoint for non-Anthropic models
Team Features
@domain teammates
y3v9l8q1c8s3d4n6@88nine.slack.com) or any email
Bug Fixes
Testing
tsc --noEmit)
next build)
Notes for CodeRabbit
OpenUI Lang); the splitContent() parser is intentionally simple.
@x402/fetch peer dependency (installed explicitly).
AGENTMAIL_API_KEY env var fallback in /api/email is intentional for team usage where not every user needs their own key.
Related
6ed38df (initial Next.js scaffold)
🤖 Generated with Claude Code