feat: enrich MP3 metadata via MusicBrainz lookup by jzstern · Pull Request #44 · jzstern/dub-rip

jzstern · 2026-01-24T02:32:43Z

Description

Enriches downloaded MP3s with album, year, genre, track number, record label, and cover art by looking up tracks on MusicBrainz. The lookup runs in parallel with the audio download, so there's no added latency. Falls back to YouTube thumbnail for artwork when no MusicBrainz match is found.

New ID3 tags added: album, year, genre, trackNumber, publisher (label), APIC (cover art)

Architecture: As soon as artist+title are parsed from the YouTube metadata, a parallel enrichMetadata() promise fires off. It queries MusicBrainz's recording search API, picks the best album release, fetches cover art from the Cover Art Archive (or YouTube thumbnail as fallback), and applies everything as ID3 tags before the file is sent to the client.

Cover art lookup iterates through all candidate release IDs across matching recordings (albums prioritized first) until artwork is found on the Cover Art Archive, maximizing the chance of getting official artwork.

Future work

Rate limiting for MusicBrainz API (1 req/sec per IP) if concurrent downloads become common
Show album artwork in the UI (currently only metadata text is displayed)
Allow users to skip enrichment or correct mismatched metadata

How to test

Navigate to the app and paste a YouTube URL for a well-known song (e.g., Queen - Bohemian Rhapsody)
Click Download

During download, album name, year, genre, and label appear below the video title
Downloaded MP3 has enriched ID3 tags (check with a tag editor or music player)
Cover art is embedded in the MP3 (official album art, not YouTube thumbnail)

Try a track where the single lacks cover art but the album has it (e.g., Aries & brakence - SLEEPWALKER)

Official album artwork is embedded (not YouTube thumbnail)

Try an obscure/non-music video (e.g., a podcast or lecture)

Download still succeeds without delay
YouTube thumbnail is embedded as artwork (no MusicBrainz match expected)

Run the test suite: bun run test

All 209 tests pass

Author checklist

This PR:

Satisfies a goal that is specific & clearly motivated
Adds value in isolation (whether user-facing or sustainability-related)
Contains a concise & easy-to-understand title + description
Adheres to SRP by default
Presents the best possible implementation to meet its goal, given constraints at hand

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Automatic music metadata enrichment during downloads: album, year, genre, track number, label, and artwork may be fetched, emitted as runtime metadata, embedded in ID3 tags, and shown in the UI near title/status; artwork falls back to thumbnails and is cleared on resets/errors.
Tests
- Added comprehensive tests covering metadata lookup, cover-art/thumbnail fallbacks, timeouts, error scenarios, release selection, primary-artist normalization, and integration with the download flow.
Chores
- Broadened test discovery, increased Node memory limit, extended token generation timeout, and added non-blocking background token refresh at startup.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

Note

Adds metadata enrichment during downloads with parallel MusicBrainz lookup and artwork embedding.

New src/lib/musicbrainz.ts: search recordings, prefer Album releases, collect candidate release IDs, fetch tags/labels, get cover art; fallback to YouTube thumbnail
download-stream/+server.ts: run enrichment in parallel, emit metadata SSE event, write ID3 tags (album, year, genre, trackNumber, publisher, APIC) and embed artwork
UI (+page.svelte): display enriched metadata during download; clear on reset/error
Tests: comprehensive unit/integration coverage for lookup, cover art fallback, timeouts, errors; update vitest.config.ts to include src/**/*.{test,spec}.ts and coverage thresholds
Ops/infra: increase yt-token Node memory, extend token generation timeout, prefetch PO token at startup in non-dev, and set PR-only COBALT_API_URL in Railway workflow

^{Written by Cursor Bugbot for commit 499ad14. This will update automatically on new commits. Configure here.}

railway-app · 2026-01-24T02:32:53Z

🚅 Deployed to the dub-rip-pr-44 environment in dub-rip

Service	Status	Web	Updated (UTC)
dub-rip	✅ Success (View Logs)	Web	Jan 30, 2026 at 5:05 am

coderabbitai · 2026-01-24T02:33:05Z

📝 Walkthrough

Walkthrough

Adds MusicBrainz-backed metadata and artwork enrichment (with YouTube-thumbnail fallback), tests for enrichment and fetch logic, integrates enrichment into download-stream (SSE metadata events and ID3 tagging), expands types and frontend UI, adjusts yt-token startup/timeout, and updates CI and test discovery.

Changes

Cohort / File(s)	Summary
MusicBrainz library & types `src/lib/musicbrainz.ts`, `src/lib/types.ts`	New MusicBrainz client: `lookupTrack`, `fetchCoverArt`, `fetchThumbnailArt`, `TrackMetadata`, `CoverArtResult`; new `EnrichedMetadata` and `StreamEvent` additions. Network calls use timeouts/AbortSignal and return `null` on failure.
Download-stream integration `src/routes/api/download-stream/+server.ts`	Starts enrichment promise, emits `"metadata"` SSE events, awaits enrichment before writing ID3 tags, embeds artwork when present, and replaces direct controller.close() calls with `closeStream()` in error paths.
Frontend UI `src/routes/+page.svelte`	Adds `enrichedMetadata` state, listens for `"metadata"` events, conditionally renders album/year/genre/label, and clears metadata on start/errors.
Tests `src/lib/musicbrainz.test.ts`, `src/lib/musicbrainz-integration.test.ts`	New unit and integration tests covering lookup, release selection, tags/labels, cover-art & thumbnail fetching, timeouts/abort handling, network errors, and fallback flows (mocked `fetch`).
YT token & startup `src/lib/yt-token.ts`, `src/hooks.server.ts`, `Dockerfile.yt-token`	Increased token generation timeout (30s→60s), call to `fetchPoToken()` at startup (non-blocking), and raised Node memory limit in Dockerfile.
Test discovery & CI `vitest.config.ts`, `.github/workflows/railway-pr.yml`	Expanded Vitest globs to include `src/*/.{test,spec}.ts`; added step to override internal env var (`COBALT_API_URL`) in PR workflow.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Server as Server (download-stream)
    participant MB as MusicBrainz API
    participant CAA as Cover Art Archive
    participant YT as YouTube
    participant ID3 as ID3 Tagger

    Client->>Server: POST /download (artist, title, videoId)
    Server->>Server: start enrichmentPromise
    par Enrichment Branch
        Server->>MB: lookupTrack(artist, title)
        alt MusicBrainz match
            MB-->>Server: TrackMetadata (incl. releaseId)
            Server->>CAA: fetchCoverArt(releaseId)
            CAA-->>Server: imageBuffer + mime
        else No MB match
            MB-->>Server: null
            Server->>YT: fetchThumbnailArt(videoId)
            YT-->>Server: imageBuffer + mime
        end
    and Download Branch
        Server->>Server: start audio stream / download
    end
    Server->>Server: resolve enrichmentPromise
    Server->>Client: emit metadata event (album, year, genre, trackNumber, label, hasArtwork)
    Server->>ID3: embed metadata + artwork (if present)
    Server->>Client: stream audio response with ID3 tags

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly Related PRs

fix: use oEmbed for metadata instead of yt-dlp #24 — touches src/routes/api/download-stream/+server.ts; overlaps metadata extraction/handling and SSE emission.
fix: yt-dlp binary, Cobalt env vars, and error persistence #28 — modifies the download-stream flow and error/tagging handling; likely to conflict with enrichment placement.
feat: add Node.js-based YouTube token service #35 — changes to yt-token generation and startup; intersects with src/lib/yt-token.ts and startup token fetch.

Poem

🐰 I hopped through MusicBrainz at break of day,

Fetched covers and tags and tucked them away,
When MB is silent, YouTube lends a view,
I stitch art and metadata snug into you,
Now every rip wears a ribbon anew.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'feat: enrich MP3 metadata via MusicBrainz lookup' directly and accurately describes the main change in the PR, which adds MusicBrainz-based metadata enrichment for MP3 files.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

charliecreates

The enrichment pipeline works end-to-end, but there are behavioral mismatches between intent and implementation: lookupTrack() naively trusts recordings[0] (likely metadata mismatches in the wild), and the server always fetches a YouTube thumbnail when artist/title parsing fails, while also not emitting a metadata event in that case. The UI additionally defines an ad-hoc metadata shape instead of reusing EnrichedMetadata, increasing contract drift risk. Addressing these will improve correctness, performance, and long-term maintainability.

Additional notes (1)

Readability | src/lib/musicbrainz.ts:44-49
pickBestRelease() uses releases[0] as a fallback. If MusicBrainz returns releases in a non-deterministic order (or an order that biases toward Singles/compilations), this can degrade quality significantly.

Since you already have a preference for Albums, it’s worth adding additional tie-breakers (e.g., earliest date or presence of track number/label).

Summary of changes

Overview

This PR adds parallel MusicBrainz-based metadata enrichment for downloaded MP3s and streams the enriched fields to the UI.

Key changes

New MusicBrainz client in src/lib/musicbrainz.ts
- lookupTrack() queries MusicBrainz recording search and extracts album, year, genre, trackNumber, label, releaseId.
- fetchCoverArt() pulls release artwork from Cover Art Archive.
- fetchThumbnailArt() fetches YouTube thumbnail as artwork fallback.
Streaming contract updates in src/lib/types.ts
- Adds EnrichedMetadata and a new SSE event type: type: "metadata" with metadata payload.
Download pipeline integration in src/routes/api/download-stream/+server.ts
- Starts enrichMetadata() in parallel once artist/title are known.
- Sends a metadata SSE event and writes additional ID3 tags (including artwork/APIC) before returning the MP3.
UI updates in src/routes/+page.svelte
- Stores and displays enriched metadata during download.
Tests + Vitest config
- Adds unit/integration tests under src/lib/*.test.ts.
- Expands Vitest include globs to run tests in src/**.

charliecreates · 2026-01-24T02:34:28Z

+				let enrichmentPromise: Promise<EnrichmentResult> =
+					Promise.resolve({ metadata: null, artwork: null });
+
+				if (artist && trackTitle) {
+					enrichmentPromise = enrichMetadata(artist, trackTitle, videoId);
+				} else if (videoId) {
+					enrichmentPromise = enrichMetadata("", "", videoId);
+				}
+


enrichMetadata("", "", videoId) is effectively a no-op for MusicBrainz (since lookupTrack immediately returns null when artist/title are empty) but still always triggers the YouTube thumbnail fetch. That means for any video where parsing fails (or for non-music videos), you’ll still do an extra network call early, in parallel with the download. If the intent is “fallback artwork only when we’re confident we’re producing an MP3 and/or when enrichment is desired”, this is the wrong place to do it.

Also note: you later only send a metadata SSE event when enrichment.metadata exists, so in this branch the UI never learns hasArtwork: true even though artwork may be embedded.

Suggestion

Consider splitting the concerns:

Only perform MusicBrainz lookup when artist && trackTitle.

Perform thumbnail fallback later (or conditionally) if you actually need artwork and have a valid videoId.

If you do embed thumbnail art without MB metadata, consider still emitting a metadata event with { hasArtwork: true } so UI state matches the output file.

Example:

let enrichmentPromise = artist && trackTitle ? enrichMetadata(artist, trackTitle, videoId) : Promise.resolve({ metadata: null, artwork: null }); // ...later, before writing tags let enrichment = await enrichmentPromise; if (!enrichment.artwork) { enrichment = { ...enrichment, artwork: await fetchThumbnailArt(videoId) }; } send({ type: "metadata", metadata: { album: enrichment.metadata?.album, year: enrichment.metadata?.year, genre: enrichment.metadata?.genre, trackNumber: enrichment.metadata?.trackNumber, label: enrichment.metadata?.label, hasArtwork: enrichment.artwork !== null, }});

Reply with "@CharlieHelps yes please" if you'd like me to add a commit with this suggestion.

charliecreates · 2026-01-24T02:34:28Z

+		if (!recordings || recordings.length === 0) return null;
+
+		const recording = recordings[0];
+		const releases = recording.releases;
+
+		if (!releases || releases.length === 0) return null;
+
+		const release = pickBestRelease(releases);
+


lookupTrack() always picks recordings[0] and then picks the “best release” from that recording’s releases. This can be wrong when MusicBrainz returns a first recording that is a close-but-incorrect match (covers, live versions, remasters, karaoke, etc.). In practice, this will cause metadata mismatches that are hard for users to understand.

Given the PR description says “picks the best album release”, the implementation currently only prefers album within a single recording, not best recording overall.

Suggestion

Improve the selection logic by scoring multiple recordings (you’re already limiting to 5) and then selecting the best release from the best recording. Even a simple heuristic helps a lot:

Prefer recordings that have at least one Album release.

Prefer releases with a date.

Prefer releases that include media.track.number.

Sketch:

function scoreRecording(r: MusicBrainzRecording): number { const releases = r.releases ?? []; const hasAlbum = releases.some(x => x["release-group"]?.["primary-type"] === "Album"); const hasDate = releases.some(x => x.date); const hasTrackNo = releases.some(x => x.media?.[0]?.track?.[0]?.number); return (hasAlbum ? 100 : 0) + (hasDate ? 10 : 0) + (hasTrackNo ? 5 : 0); } const bestRecording = recordings .slice() .sort((a,b) => scoreRecording(b) - scoreRecording(a))[0];

Then pick the best release from bestRecording.releases.

Reply with "@CharlieHelps yes please" if you'd like me to add a commit with this suggestion.

charliecreates · 2026-01-24T02:34:28Z

+export async function lookupTrack(
+	artist: string,
+	title: string,
+	timeout: number = DEFAULT_TIMEOUT,
+): Promise<TrackMetadata | null> {
+	if (!artist || !title) return null;
+
+	const controller = new AbortController();
+	const timeoutId = setTimeout(() => controller.abort(), timeout);
+
+	try {
+		const query = `recording:"${escapeLucene(title)}" AND artist:"${escapeLucene(artist)}"`;
+		const url = `${API_BASE}?query=${encodeURIComponent(query)}&fmt=json&limit=5`;
+
+		const response = await fetch(url, {
+			signal: controller.signal,
+			headers: { "User-Agent": USER_AGENT },
+		});
+
+		if (!response.ok) return null;
+
+		const data = (await response.json()) as MusicBrainzResponse;
+		const recordings = data.recordings;
+
+		if (!recordings || recordings.length === 0) return null;
+
+		const recording = recordings[0];
+		const releases = recording.releases;
+
+		if (!releases || releases.length === 0) return null;
+
+		const release = pickBestRelease(releases);
+
+		return {
+			album: release.title ?? "",
+			year: release.date?.slice(0, 4) ?? "",
+			genre: extractGenre(recording.tags),
+			trackNumber: release.media?.[0]?.track?.[0]?.number ?? "",
+			label: release["label-info"]?.[0]?.label?.name ?? "",
+			releaseId: release.id,
+		};
+	} catch {
+		return null;
+	} finally {
+		clearTimeout(timeoutId);
+	}
+}
+
+export async function fetchCoverArt(
+	releaseId: string,
+	timeout: number = COVER_ART_TIMEOUT,
+): Promise<CoverArtResult | null> {
+	if (!releaseId) return null;
+
+	const controller = new AbortController();
+	const timeoutId = setTimeout(() => controller.abort(), timeout);
+
+	try {
+		const url = `${COVER_ART_BASE}/${encodeURIComponent(releaseId)}/front-250`;
+
+		const response = await fetch(url, {
+			signal: controller.signal,
+			headers: { "User-Agent": USER_AGENT },
+		});
+
+		if (!response.ok) return null;
+
+		const arrayBuffer = await response.arrayBuffer();
+		const mime = response.headers.get("Content-Type") ?? "image/jpeg";
+
+		return { imageBuffer: Buffer.from(arrayBuffer), mime };
+	} catch {
+		return null;
+	} finally {
+		clearTimeout(timeoutId);
+	}
+}
+
+export async function fetchThumbnailArt(
+	videoId: string,
+	timeout: number = COVER_ART_TIMEOUT,
+): Promise<CoverArtResult | null> {
+	if (!videoId) return null;
+
+	const controller = new AbortController();
+	const timeoutId = setTimeout(() => controller.abort(), timeout);
+
+	try {
+		const url = `https://i.ytimg.com/vi/${encodeURIComponent(videoId)}/hqdefault.jpg`;
+
+		const response = await fetch(url, {
+			signal: controller.signal,
+		});
+
+		if (!response.ok) return null;
+
+		const arrayBuffer = await response.arrayBuffer();
+		const mime = response.headers.get("Content-Type") ?? "image/jpeg";
+
+		return { imageBuffer: Buffer.from(arrayBuffer), mime };
+	} catch {
+		return null;
+	} finally {
+		clearTimeout(timeoutId);
+	}
+}


All fetch failures are swallowed via catch { return null; } in lookupTrack(), fetchCoverArt(), and fetchThumbnailArt(). That makes operational debugging difficult—especially since MusicBrainz rate-limits and may intermittently 503/429.

At minimum, distinguishing abort/timeouts from other failures (and optionally logging at debug level) would help you understand when enrichment is failing due to timeouts vs network vs status codes.

Suggestion

Capture and branch on error type:

If err is an AbortError, return null silently.

Otherwise, log a concise message (or Sentry.captureException) including which endpoint failed.

For non-OK responses, consider special-casing 429 to avoid repeated retries and to make it observable.

Reply with "@CharlieHelps yes please" if you’d like me to add a commit that adds minimal error classification + optional Sentry reporting without spamming logs.

charliecreates · 2026-01-24T02:34:28Z

+let enrichedMetadata = $state<{
+	album?: string;
+	year?: string;
+	genre?: string;
+	label?: string;
+} | null>(null);


The page stores enrichedMetadata as an ad-hoc local object type that doesn’t include trackNumber or hasArtwork, even though the SSE contract (EnrichedMetadata) includes them. This increases drift risk between client/server and encourages partial UI updates to silently break when the server evolves.

Suggestion

Import and reuse the shared type from $lib/types to keep the contract aligned, and store the full EnrichedMetadata | null.

<script lang="ts"> import type { EnrichedMetadata } from "$lib/types"; let enrichedMetadata = $state<EnrichedMetadata | null>(null); </script>

Then you can optionally render trackNumber / hasArtwork later without changing state shape.

Reply with "@CharlieHelps yes please" if you'd like me to add a commit with this suggestion.

devin-ai-integration

Devin Review found 1 potential issue.

View issue and 4 additional flags in Devin Review.

coderabbitai

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/routes/+page.svelte (1)
210-218: Clear metadata on SSE connection errors.

On connection loss, enrichedMetadata can remain stale (Line 210 onward). Consider resetting it in the error handler.
🐛 Suggested fix
eventSource.onerror = () => {
	if (!error) {
		error = "Connection lost";
	}
	errorUrl = url;
	eventSource.close();
	loading = false;
	status = "";
+	enrichedMetadata = null;
};

🤖 Fix all issues with AI agents

In `@src/lib/musicbrainz.ts`:
- Line 3: The USER_AGENT constant currently uses the wrong repo URL; update the
USER_AGENT string (constant USER_AGENT in src/lib/musicbrainz.ts) to reference
the correct GitHub repository/maintainer (e.g., "jzstern/dub-rip") or another
accurate contact per MusicBrainz API guidelines so the User-Agent contains
correct contact info.
- Around line 56-58: The escapeLucene function currently only escapes double
quotes; update it to escape all Lucene reserved characters so queries don't
break for names like "AC/DC" or "P!nk". In the escapeLucene function, replace
the simple double-quote replacement with logic that prepends a backslash to
every Lucene special character (including + - && || ! ( ) { } [ ] ^ ~ * ? : \ /
and "), ensuring the backslash itself is escaped in the pattern; use a single
regex or iteration over a character set and return the fully escaped string from
escapeLucene so callers (e.g., any code building Lucene queries) get a safe,
escaped value.
- Line 97: The current assignment to trackNumber uses
release.media?.[0]?.track?.[0]?.number which always picks the first track;
update the logic in the function that builds the release track info (referencing
the release object and the trackNumber field) to find the track that matches the
searched recording: either request recordings by adding the inc=recordings
parameter to the MusicBrainz query and match by recording id, or iterate over
release.media[*].track[*] to find the track whose recording id equals the
searched recording id and then set trackNumber to that track.number (falling
back to empty string if none found).

In `@src/routes/api/download-stream/`+server.ts:
- Around line 219-221: The CI failed due to formatting mismatch around the
initialization of enrichmentPromise (variable enrichmentPromise:
Promise<EnrichmentResult> = Promise.resolve({ metadata: null, artwork: null }));
run the project's formatter (e.g., Prettier/format script) on
src/routes/api/download-stream/+server.ts or invoke the repository's
pre-commit/format command to reformat this block so it matches the repository
style; ensure the single-line/spacing/indentation around enrichmentPromise and
its object literal follow the project's configured formatter rules and re-run
CI.
- Around line 219-226: enrichmentPromise is eagerly resolved to empty metadata
which prevents a later enrichMetadata(artist, trackTitle, videoId) run when
artist/trackTitle are discovered via yt-dlp; change to a lazy starter: remove
the immediate Promise.resolve and introduce a startEnrichment() helper that sets
enrichmentPromise = enrichMetadata(...) the first time artist/trackTitle (or
videoId-only fallback) are available, call startEnrichment() from the paths that
first determine artist/trackTitle (oEmbed success and the ytDlpEvent handler),
and ensure you keep a final fallback that only uses thumbnail if
startEnrichment() was never invoked; references: enrichmentPromise,
enrichMetadata, and the ytDlpEvent path (also adjust the analogous logic
referenced around lines 419-420).

🧹 Nitpick comments (3)

src/routes/+page.svelte (1)
35-40: Reuse EnrichedMetadata to avoid drift.

This local inline shape can silently diverge from the shared type (e.g., hasArtwork, trackNumber). Consider importing and using the shared type (or a Pick/Omit).
♻️ Suggested refactor
-import type { VideoPreview as VideoPreviewType } from "$lib/types";
+import type { EnrichedMetadata, VideoPreview as VideoPreviewType } from "$lib/types";
@@
-let enrichedMetadata = $state<{
-	album?: string;
-	year?: string;
-	genre?: string;
-	label?: string;
-} | null>(null);
+let enrichedMetadata = $state<EnrichedMetadata | null>(null);
src/lib/musicbrainz.ts (2)
150-152: Consider adding User-Agent header for consistency.

Unlike the other fetch calls in this module, fetchThumbnailArt doesn't include a User-Agent header. While YouTube thumbnails may not require it, adding it maintains consistency and is generally good practice for HTTP requests.
Proposed fix
 		const response = await fetch(url, {
 			signal: controller.signal,
+			headers: { "User-Agent": USER_AGENT },
 		});
60-106: Consider rate limiting for MusicBrainz API compliance.

The PR notes MusicBrainz rate limiting (1 req/sec per IP) as future work. For production use with concurrent users, consider implementing a queue or throttle mechanism to avoid API blocks. Libraries like p-limit or bottleneck can help manage this.

coderabbitai · 2026-01-24T02:39:12Z

+			album: release.title ?? "",
+			year: release.date?.slice(0, 4) ?? "",
+			genre: extractGenre(recording.tags),
+			trackNumber: release.media?.[0]?.track?.[0]?.number ?? "",


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

cat -n src/lib/musicbrainz.ts | head -120

Repository: jzstern/dub-rip

Length of output: 4005

🌐 Web query:

MusicBrainz API recording search track position offset documentation

💡 Result:

To filter recording search results by a track’s position/number on a medium, use the tnum field in the Lucene query:

tnum = “the position of the track on any medium including this recording (starts at 1, pre-gaps at 0)” [1]

Example (track 6, title “Blow Your Mind (Mwah)”):

GET /ws/2/recording?query=recording:"Blow Your Mind (Mwah)" AND tnum:6&fmt=json

To page through search results, use the offset parameter (this is not track position; it’s result paging):

limit = 1–100 results per request

offset = “Return search results starting at a given offset.” [2]

Example (second page of 25 results):

GET /ws/2/recording?query=recording:"Blow Your Mind (Mwah)"&limit=25&offset=25&fmt=json

Note: you may also see offset inside <track-list ... offset="..."> in release/recording lookup output examples; that offset is about paging within the returned track list, not search filtering by track position [3].

Sources: MusicBrainz RecordingSearch fields (tnum) [1], MusicBrainz API Search parameters (limit, offset) [2], Recording XML output examples showing track-list offset [3].

Track number extraction returns incorrect value.

Line 97 extracts the number of the first track on the first media disc, not the position of the searched recording within the release. For songs that aren't the first track, this will return "1" or another incorrect number.

The MusicBrainz API response structure doesn't include recording information in track data by default. To correctly identify which track corresponds to the searched recording, either:

Include inc=recordings parameter to embed recording IDs in track responses, then match against the searched recording

Or iterate through tracks to find the one matching the recording and extract its number

🤖 Prompt for AI Agents

In `@src/lib/musicbrainz.ts` at line 97, The current assignment to trackNumber uses release.media?.[0]?.track?.[0]?.number which always picks the first track; update the logic in the function that builds the release track info (referencing the release object and the trackNumber field) to find the track that matches the searched recording: either request recordings by adding the inc=recordings parameter to the MusicBrainz query and match by recording id, or iterate over release.media[*].track[*] to find the track whose recording id equals the searched recording id and then set trackNumber to that track.number (falling back to empty string if none found).

coderabbitai · 2026-01-24T02:39:12Z

+				let enrichmentPromise: Promise<EnrichmentResult> =
+					Promise.resolve({ metadata: null, artwork: null });
+
+				if (artist && trackTitle) {
+					enrichmentPromise = enrichMetadata(artist, trackTitle, videoId);
+				} else if (videoId) {
+					enrichmentPromise = enrichMetadata("", "", videoId);
+				}


⚠️ Potential issue | 🟠 Major

Avoid locking in empty enrichment before artist/title are known.

If oEmbed fails and artist/trackTitle are derived later via yt‑dlp, the current enrichmentPromise already ran with empty values and won’t re-run, so MusicBrainz metadata is skipped. Consider deferring the promise until artist/trackTitle are available, and only falling back to thumbnail if they never materialize.

🐛 Suggested fix (lazy start + final fallback)

-let enrichmentPromise: Promise<EnrichmentResult> = - Promise.resolve({ metadata: null, artwork: null }); - -if (artist && trackTitle) { - enrichmentPromise = enrichMetadata(artist, trackTitle, videoId); -} else if (videoId) { - enrichmentPromise = enrichMetadata("", "", videoId); -} +let enrichmentPromise: Promise<EnrichmentResult> | null = null; + +const startEnrichment = () => { + if (!enrichmentPromise && artist && trackTitle) { + enrichmentPromise = enrichMetadata(artist, trackTitle, videoId); + } +}; + +startEnrichment(); @@ -const enrichment = await enrichmentPromise; +const enrichment = + await (enrichmentPromise ?? enrichMetadata("", "", videoId));

Consider calling startEnrichment() when artist/trackTitle are first derived in the ytDlpEvent path so the lookup still runs in parallel once data becomes available.

Also applies to: 419-420

🧰 Tools

🪛 GitHub Actions: CI

[error] 219-221: Formatter mismatch: code would be reformatted. Run the formatter (e.g., Prettier) to fix formatting.

🤖 Prompt for AI Agents

In `@src/routes/api/download-stream/`+server.ts around lines 219 - 226, enrichmentPromise is eagerly resolved to empty metadata which prevents a later enrichMetadata(artist, trackTitle, videoId) run when artist/trackTitle are discovered via yt-dlp; change to a lazy starter: remove the immediate Promise.resolve and introduce a startEnrichment() helper that sets enrichmentPromise = enrichMetadata(...) the first time artist/trackTitle (or videoId-only fallback) are available, call startEnrichment() from the paths that first determine artist/trackTitle (oEmbed success and the ytDlpEvent handler), and ensure you keep a final fallback that only uses thumbnail if startEnrichment() was never invoked; references: enrichmentPromise, enrichMetadata, and the ytDlpEvent path (also adjust the analogous logic referenced around lines 419-420).

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

^{Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.}

github-actions · 2026-01-24T02:43:11Z

🚂 Railway Preview Deployment

Environment	URL
`pr-44`	https://dub-rip-pr-44.up.railway.app

This preview deployment will be automatically cleaned up when the PR is closed.

_{Deployed by Railway • View logs}

devin-ai-integration

Devin Review found 1 new potential issue.

View issue and 11 additional flags in Devin Review.

devin-ai-integration

Devin Review found 1 new potential issue.

View issue and 12 additional flags in Devin Review.

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@src/lib/musicbrainz.ts`:
- Around line 112-144: The current code reuses the same AbortController
(controller) and timeoutId for the initial search and the subsequent detail
lookups (fetchRecordingTags, fetchReleaseLabels), so if the search consumed most
of the timeout the follow-up requests get aborted immediately; fix this by
clearing the original timeout (clearTimeout(timeoutId)) once the search
completes and creating a new AbortController (e.g., detailController = new
AbortController()) with a fresh setTimeout for the remaining/next timeout
window, then pass detailController.signal to fetchRecordingTags and
fetchReleaseLabels (retain the conditional on recordingId) and ensure any
previous controller is cleaned up/aborted as appropriate.

♻️ Duplicate comments (1)

src/lib/musicbrainz.ts (1)
146-150: Track number extraction still uses the first track, not the matched recording.
The code pulls release.media?.[0]?.track?.[0]?.number, which often resolves to track 1 (and the search response may not even include track lists). Typically you need a release lookup with inc=recordings+media and then match the track whose recording id equals recordingId.
MusicBrainz recording search response fields: do recording search results include release media track lists? How to retrieve track positions for a recording (inc=recordings, inc=media) in release lookup?

coderabbitai · 2026-01-24T22:40:56Z

+	const controller = new AbortController();
+	const timeoutId = setTimeout(() => controller.abort(), timeout);
+
+	try {
+		const query = `recording:"${escapeLucene(title)}" AND artist:"${escapeLucene(artist)}"`;
+		const searchUrl = `${API_BASE}/recording?query=${encodeURIComponent(query)}&fmt=json&limit=5`;
+
+		const response = await fetch(searchUrl, {
+			signal: controller.signal,
+			headers: { "User-Agent": USER_AGENT },
+		});
+
+		if (!response.ok) return null;
+
+		const data = (await response.json()) as MusicBrainzSearchResponse;
+		const recordings = data.recordings;
+
+		if (!recordings || recordings.length === 0) return null;
+
+		const recording = recordings[0];
+		const releases = recording.releases;
+
+		if (!releases || releases.length === 0) return null;
+
+		const release = pickBestRelease(releases);
+		const recordingId = recording.id;
+
+		const [tags, label] = await Promise.all([
+			recordingId
+				? fetchRecordingTags(recordingId, controller.signal)
+				: Promise.resolve([]),
+			fetchReleaseLabels(release.id, controller.signal),
+		]);


⚠️ Potential issue | 🟡 Minor

Avoid sharing one timeout across search + detail lookups.

If the search call consumes most of the timeout, the tag/label lookups can be aborted immediately, yielding empty metadata even though the search succeeded. Consider resetting the timer (or using a new controller) before the follow-up calls.

💡 Possible adjustment

const controller = new AbortController(); const timeoutId = setTimeout(() => controller.abort(), timeout); try { // search... const response = await fetch(searchUrl, { signal: controller.signal, headers: { "User-Agent": USER_AGENT } }); if (!response.ok) return null; const data = (await response.json()) as MusicBrainzSearchResponse; + // reset timer for follow-up lookups + clearTimeout(timeoutId); + const detailController = new AbortController(); + const detailTimeoutId = setTimeout(() => detailController.abort(), timeout); const [tags, label] = await Promise.all([ - recordingId ? fetchRecordingTags(recordingId, controller.signal) : Promise.resolve([]), - fetchReleaseLabels(release.id, controller.signal), + recordingId ? fetchRecordingTags(recordingId, detailController.signal) : Promise.resolve([]), + fetchReleaseLabels(release.id, detailController.signal), ]); return { /* ... */ }; } finally { - clearTimeout(timeoutId); + clearTimeout(timeoutId); + clearTimeout(detailTimeoutId); }

🤖 Prompt for AI Agents

In `@src/lib/musicbrainz.ts` around lines 112 - 144, The current code reuses the same AbortController (controller) and timeoutId for the initial search and the subsequent detail lookups (fetchRecordingTags, fetchReleaseLabels), so if the search consumed most of the timeout the follow-up requests get aborted immediately; fix this by clearing the original timeout (clearTimeout(timeoutId)) once the search completes and creating a new AbortController (e.g., detailController = new AbortController()) with a fresh setTimeout for the remaining/next timeout window, then pass detailController.signal to fetchRecordingTags and fetchReleaseLabels (retain the conditional on recordingId) and ensure any previous controller is cleaned up/aborted as appropriate.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Start parallel MusicBrainz lookup when artist+title are known, await after download completes, send metadata SSE event, and apply enriched ID3 tags (album, year, genre, trackNumber, label, artwork) to MP3. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Fix Biome formatting of enrichmentPromise initialization - Escape all Lucene special characters (not just quotes) for queries like "AC/DC" or "P!nk" - Use imported EnrichedMetadata type instead of inline shape - Reset enrichedMetadata on SSE connection error - Add User-Agent header to fetchThumbnailArt for consistency - Fix User-Agent URL to correct repository Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

When oEmbed fails to parse artist/title, the enrichment promise was started with empty strings (returning only a YouTube thumbnail). Now when yt-dlp later successfully parses artist/title from the filename, the enrichment promise is restarted with the actual values so MusicBrainz metadata is still fetched in the fallback path. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The MusicBrainz search endpoint does not return tags or label-info fields. Add separate lookup calls to /recording/{id}?inc=tags and /release/{id}?inc=labels after the initial search to retrieve genre and label data correctly. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

devin-ai-integration

Devin Review found 1 new potential issue.

View issue and 12 additional flags in Devin Review.

Replace direct controller.close() call with the closeStream() helper at the file-not-found error path for consistency. The helper properly sets the isClosed flag and wraps the close in try-catch for error handling. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

…search YouTube video titles often use "Artist & Other" or "Artist feat. Other" format for collaborations. MusicBrainz stores these differently (e.g., "Aries feat. brakence" in artist credits). Searching for the full collaborative name returns no results. Add extractPrimaryArtist() to split on common collaboration patterns (&, feat., ft., x, and, with, comma) and use only the primary artist name for the MusicBrainz search query. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

PR environments copy production variables, but the internal COBALT_API_URL (cobalt.railway.internal) is unreachable from isolated PR environments. Clearing it lets the app fall back to the public Cobalt API. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In @.github/workflows/railway-pr.yml:
- Around line 45-47: The workflow uses an invalid Railway CLI subcommand;
replace the incorrect call with the proper CLI flag for setting/clearing
variables by using the Railway variables command with --set for the
COBALT_API_URL variable (i.e., change the invocation that references
COBALT_API_URL to use the --set flag such as railway variables --set
COBALT_API_URL="" --service dub-rip), and confirm with the deployment
environment whether clearing to an empty string is acceptable or the variable
must be removed via the Railway dashboard.

coderabbitai · 2026-01-26T04:21:59Z

+      - name: Clear internal-only variables
+        run: |
+          railway variables set COBALT_API_URL="" --service dub-rip


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Railway CLI variables delete unset command documentation

💡 Result:

Railway’s official CLI reference currently documents railway variables as a single command to view variables and set variables via --set, and it does not document any delete, unset, or remove subcommand/flag. The documented interface is: railway variables [OPTIONS] with options like --service, --environment, --kv, and --set key=value. [1]

If you’re seeing references to railway variables delete/unset, those appear to come from third‑party CLI completion/docs (e.g., Fig) and don’t match Railway’s current published CLI API reference. [2]

What to do instead (per current docs):

Set/update: railway variables --set "KEY=value" [1]

For deletion: use the Railway dashboard (the CLI reference does not provide a delete/unset operation). [1]

Sources
[1] Railway Docs — CLI Reference (railway variables options) https://docs.railway.com/reference/cli-api
[2] Fig manual page showing railway variables delete/get/set (likely outdated vs official docs) https://www.fig.io/manual/railway/variables

🏁 Script executed:

# Search for railway-related files in the repo that might show correct usage rg -i "railway variables" --type yaml --type yml -A 2 -B 2

Repository: jzstern/dub-rip

Length of output: 86

Use correct Railway CLI syntax; delete subcommand does not exist.
railway variables delete is not a valid Railway CLI command. Per official Railway documentation, variable deletion must be performed via the Railway dashboard. For CLI-based clearing, use railway variables --set COBALT_API_URL="" --service dub-rip (note the --set flag). Verify with the deployment environment whether clearing via empty string is sufficient or if dashboard deletion is required.

🤖 Prompt for AI Agents

In @.github/workflows/railway-pr.yml around lines 45 - 47, The workflow uses an invalid Railway CLI subcommand; replace the incorrect call with the proper CLI flag for setting/clearing variables by using the Railway variables command with --set for the COBALT_API_URL variable (i.e., change the invocation that references COBALT_API_URL to use the --set flag such as railway variables --set COBALT_API_URL="" --service dub-rip), and confirm with the deployment environment whether clearing to an empty string is acceptable or the variable must be removed via the Railway dashboard.

Railway CLI rejects empty variable values. Use `railway variables delete` to remove the internal-only URL so the app falls back to the public Cobalt API default. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

`railway variables delete` silently failed. Use `railway variables set` with the public Cobalt API URL instead, which is the same approach used to manually fix the pr-44 environment. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Point PR environments to the production Cobalt instance via its public domain instead of the unreachable internal URL. Same API key, same instance, just accessible over public network. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

devin-ai-integration

Devin Review found 1 new potential issue.

View issue and 13 additional flags in Devin Review.

Add enrichedMetadata = null to the $effect reset block so stale metadata from a previous download doesn't persist in component state when the user changes the URL. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Increase yt-token-service heap limit from 512MB to 1024MB to prevent OOM crashes during jsdom-based BotGuard challenge solving - Increase PO token generation timeout from 30s to 60s for cold starts - Pre-warm PO token cache on server startup so the token is ready before the first user download request Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

When MusicBrainz returns multiple recordings, the cover art was only checked for the first recording's release. Now collects release IDs from all matching recordings (albums first) and iterates through them until cover art is found on the Cover Art Archive. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The fetchPoToken() pre-warm imports youtube-po-token-generator (jsdom, ~500MB) at startup. On CI runners also running Playwright's Chromium, this OOMs the Vite process silently, causing a 120s webServer timeout. Pre-warming is only needed in production. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

charliecreates Bot requested a review from CharlieHelps January 24, 2026 02:32

railway-app Bot temporarily deployed to dub-rip / dub-rip-pr-44 January 24, 2026 02:32 Destroyed

charliecreates Bot reviewed Jan 24, 2026

View reviewed changes

charliecreates Bot removed the request for review from CharlieHelps January 24, 2026 02:34

devin-ai-integration Bot reviewed Jan 24, 2026

View reviewed changes

Comment thread src/lib/musicbrainz.ts

coderabbitai Bot reviewed Jan 24, 2026

View reviewed changes

cursor Bot reviewed Jan 24, 2026

View reviewed changes

Comment thread src/lib/musicbrainz.ts

Comment thread src/routes/api/download-stream/+server.ts

railway-app Bot temporarily deployed to dub-rip / dub-rip-pr-44 January 24, 2026 18:05 Destroyed

railway-app Bot temporarily deployed to dub-rip / dub-rip-pr-44 January 24, 2026 18:16 Destroyed

devin-ai-integration Bot reviewed Jan 24, 2026

View reviewed changes

Comment thread src/routes/api/download-stream/+server.ts

railway-app Bot temporarily deployed to dub-rip / dub-rip-pr-44 January 24, 2026 22:10 Destroyed

devin-ai-integration Bot reviewed Jan 24, 2026

View reviewed changes

Comment thread src/lib/musicbrainz.ts Outdated

railway-app Bot temporarily deployed to dub-rip / dub-rip-pr-44 January 24, 2026 22:36 Destroyed

coderabbitai Bot reviewed Jan 24, 2026

View reviewed changes

jzstern and others added 10 commits January 25, 2026 12:42

feat: add MusicBrainz API client for track metadata lookup

04cca4f

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

feat: add cover art fetching from Cover Art Archive

1cb115c

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

feat: add YouTube thumbnail fallback for cover art

1ee9ac3

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

feat: add EnrichedMetadata type and metadata stream event

15c347e

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

feat: display enriched metadata in download progress UI

6f3b2ba

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

test: add integration tests for metadata enrichment flow

815a8f7

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

jzstern force-pushed the feat/musicbrainz-metadata branch from 0cd610b to 44dc3f5 Compare January 25, 2026 19:44

railway-app Bot temporarily deployed to dub-rip / dub-rip-pr-44 January 25, 2026 19:44 Destroyed

devin-ai-integration Bot reviewed Jan 25, 2026

View reviewed changes

Comment thread src/routes/api/download-stream/+server.ts

railway-app Bot temporarily deployed to dub-rip / dub-rip-pr-44 January 25, 2026 19:54 Destroyed

railway-app Bot temporarily deployed to dub-rip / dub-rip-pr-44 January 25, 2026 20:11 Destroyed

railway-app Bot temporarily deployed to dub-rip / dub-rip-pr-44 January 26, 2026 04:15 Destroyed

coderabbitai Bot reviewed Jan 26, 2026

View reviewed changes

fix: delete COBALT_API_URL instead of setting empty string

997f371

Railway CLI rejects empty variable values. Use `railway variables delete` to remove the internal-only URL so the app falls back to the public Cobalt API default. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

railway-app Bot temporarily deployed to dub-rip / dub-rip-pr-44 January 26, 2026 04:35 Destroyed

railway-app Bot temporarily deployed to dub-rip / dub-rip-pr-44 January 26, 2026 06:22 Destroyed

railway-app Bot temporarily deployed to dub-rip / dub-rip-pr-44 January 26, 2026 06:50 Destroyed

devin-ai-integration Bot reviewed Jan 26, 2026

View reviewed changes

Comment thread src/routes/+page.svelte

fix: reset enrichedMetadata when download state is cleared

754c1bc

Add enrichedMetadata = null to the $effect reset block so stale metadata from a previous download doesn't persist in component state when the user changes the URL. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

railway-app Bot temporarily deployed to dub-rip / dub-rip-pr-44 January 26, 2026 06:59 Destroyed

railway-app Bot temporarily deployed to dub-rip / dub-rip-pr-44 January 26, 2026 07:51 Destroyed

railway-app Bot temporarily deployed to dub-rip / dub-rip-pr-44 January 26, 2026 08:02 Destroyed

railway-app Bot deployed to dub-rip / dub-rip-pr-44 January 27, 2026 18:01 View deployment

jzstern mentioned this pull request Apr 23, 2026

chore(deps): bump youtube-po-token-generator to 0.6.0 #51

Merged

Conversation

jzstern commented Jan 24, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Future work

How to test

Author checklist

Summary by CodeRabbit

Uh oh!

railway-app Bot commented Jan 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot commented Jan 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly Related PRs

Poem

Uh oh!

charliecreates Bot left a comment

Choose a reason for hiding this comment

Overview

Key changes

Uh oh!

charliecreates Bot Jan 24, 2026

Choose a reason for hiding this comment

Uh oh!

charliecreates Bot Jan 24, 2026

Choose a reason for hiding this comment

Uh oh!

charliecreates Bot Jan 24, 2026

Choose a reason for hiding this comment

Uh oh!

charliecreates Bot Jan 24, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Jan 24, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Jan 24, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

github-actions Bot commented Jan 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🚂 Railway Preview Deployment

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jan 24, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

Uh oh!

jzstern commented Jan 24, 2026 •

edited by coderabbitai Bot

Loading

railway-app Bot commented Jan 24, 2026 •

edited

Loading

coderabbitai Bot commented Jan 24, 2026 •

edited

Loading

github-actions Bot commented Jan 24, 2026 •

edited

Loading