feat: enrich MP3 metadata via MusicBrainz lookup#44
Conversation
|
🚅 Deployed to the dub-rip-pr-44 environment in dub-rip
|
📝 WalkthroughWalkthroughAdds MusicBrainz-backed metadata and artwork enrichment (with YouTube-thumbnail fallback), tests for enrichment and fetch logic, integrates enrichment into download-stream (SSE metadata events and ID3 tagging), expands types and frontend UI, adjusts yt-token startup/timeout, and updates CI and test discovery. Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client
participant Server as Server (download-stream)
participant MB as MusicBrainz API
participant CAA as Cover Art Archive
participant YT as YouTube
participant ID3 as ID3 Tagger
Client->>Server: POST /download (artist, title, videoId)
Server->>Server: start enrichmentPromise
par Enrichment Branch
Server->>MB: lookupTrack(artist, title)
alt MusicBrainz match
MB-->>Server: TrackMetadata (incl. releaseId)
Server->>CAA: fetchCoverArt(releaseId)
CAA-->>Server: imageBuffer + mime
else No MB match
MB-->>Server: null
Server->>YT: fetchThumbnailArt(videoId)
YT-->>Server: imageBuffer + mime
end
and Download Branch
Server->>Server: start audio stream / download
end
Server->>Server: resolve enrichmentPromise
Server->>Client: emit metadata event (album, year, genre, trackNumber, label, hasArtwork)
Server->>ID3: embed metadata + artwork (if present)
Server->>Client: stream audio response with ID3 tags
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly Related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Comment |
There was a problem hiding this comment.
The enrichment pipeline works end-to-end, but there are behavioral mismatches between intent and implementation: lookupTrack() naively trusts recordings[0] (likely metadata mismatches in the wild), and the server always fetches a YouTube thumbnail when artist/title parsing fails, while also not emitting a metadata event in that case. The UI additionally defines an ad-hoc metadata shape instead of reusing EnrichedMetadata, increasing contract drift risk. Addressing these will improve correctness, performance, and long-term maintainability.
Additional notes (1)
- Readability |
src/lib/musicbrainz.ts:44-49
pickBestRelease()usesreleases[0]as a fallback. If MusicBrainz returns releases in a non-deterministic order (or an order that biases toward Singles/compilations), this can degrade quality significantly.
Since you already have a preference for Albums, it’s worth adding additional tie-breakers (e.g., earliest date or presence of track number/label).
Summary of changes
Overview
This PR adds parallel MusicBrainz-based metadata enrichment for downloaded MP3s and streams the enriched fields to the UI.
Key changes
- New MusicBrainz client in
src/lib/musicbrainz.tslookupTrack()queries MusicBrainz recording search and extractsalbum,year,genre,trackNumber,label,releaseId.fetchCoverArt()pulls release artwork from Cover Art Archive.fetchThumbnailArt()fetches YouTube thumbnail as artwork fallback.
- Streaming contract updates in
src/lib/types.ts- Adds
EnrichedMetadataand a new SSE event type:type: "metadata"withmetadatapayload.
- Adds
- Download pipeline integration in
src/routes/api/download-stream/+server.ts- Starts
enrichMetadata()in parallel once artist/title are known. - Sends a
metadataSSE event and writes additional ID3 tags (including artwork/APIC) before returning the MP3.
- Starts
- UI updates in
src/routes/+page.svelte- Stores and displays enriched metadata during download.
- Tests + Vitest config
- Adds unit/integration tests under
src/lib/*.test.ts. - Expands Vitest include globs to run tests in
src/**.
- Adds unit/integration tests under
| let enrichmentPromise: Promise<EnrichmentResult> = | ||
| Promise.resolve({ metadata: null, artwork: null }); | ||
|
|
||
| if (artist && trackTitle) { | ||
| enrichmentPromise = enrichMetadata(artist, trackTitle, videoId); | ||
| } else if (videoId) { | ||
| enrichmentPromise = enrichMetadata("", "", videoId); | ||
| } | ||
|
|
There was a problem hiding this comment.
enrichMetadata("", "", videoId) is effectively a no-op for MusicBrainz (since lookupTrack immediately returns null when artist/title are empty) but still always triggers the YouTube thumbnail fetch. That means for any video where parsing fails (or for non-music videos), you’ll still do an extra network call early, in parallel with the download. If the intent is “fallback artwork only when we’re confident we’re producing an MP3 and/or when enrichment is desired”, this is the wrong place to do it.
Also note: you later only send a metadata SSE event when enrichment.metadata exists, so in this branch the UI never learns hasArtwork: true even though artwork may be embedded.
Suggestion
Consider splitting the concerns:
- Only perform MusicBrainz lookup when
artist && trackTitle. - Perform thumbnail fallback later (or conditionally) if you actually need artwork and have a valid
videoId. - If you do embed thumbnail art without MB metadata, consider still emitting a
metadataevent with{ hasArtwork: true }so UI state matches the output file.
Example:
let enrichmentPromise = artist && trackTitle
? enrichMetadata(artist, trackTitle, videoId)
: Promise.resolve({ metadata: null, artwork: null });
// ...later, before writing tags
let enrichment = await enrichmentPromise;
if (!enrichment.artwork) {
enrichment = { ...enrichment, artwork: await fetchThumbnailArt(videoId) };
}
send({ type: "metadata", metadata: {
album: enrichment.metadata?.album,
year: enrichment.metadata?.year,
genre: enrichment.metadata?.genre,
trackNumber: enrichment.metadata?.trackNumber,
label: enrichment.metadata?.label,
hasArtwork: enrichment.artwork !== null,
}});Reply with "@CharlieHelps yes please" if you'd like me to add a commit with this suggestion.
| if (!recordings || recordings.length === 0) return null; | ||
|
|
||
| const recording = recordings[0]; | ||
| const releases = recording.releases; | ||
|
|
||
| if (!releases || releases.length === 0) return null; | ||
|
|
||
| const release = pickBestRelease(releases); | ||
|
|
There was a problem hiding this comment.
lookupTrack() always picks recordings[0] and then picks the “best release” from that recording’s releases. This can be wrong when MusicBrainz returns a first recording that is a close-but-incorrect match (covers, live versions, remasters, karaoke, etc.). In practice, this will cause metadata mismatches that are hard for users to understand.
Given the PR description says “picks the best album release”, the implementation currently only prefers album within a single recording, not best recording overall.
Suggestion
Improve the selection logic by scoring multiple recordings (you’re already limiting to 5) and then selecting the best release from the best recording. Even a simple heuristic helps a lot:
- Prefer recordings that have at least one
Albumrelease. - Prefer releases with a
date. - Prefer releases that include
media.track.number.
Sketch:
function scoreRecording(r: MusicBrainzRecording): number {
const releases = r.releases ?? [];
const hasAlbum = releases.some(x => x["release-group"]?.["primary-type"] === "Album");
const hasDate = releases.some(x => x.date);
const hasTrackNo = releases.some(x => x.media?.[0]?.track?.[0]?.number);
return (hasAlbum ? 100 : 0) + (hasDate ? 10 : 0) + (hasTrackNo ? 5 : 0);
}
const bestRecording = recordings
.slice()
.sort((a,b) => scoreRecording(b) - scoreRecording(a))[0];Then pick the best release from bestRecording.releases.
Reply with "@CharlieHelps yes please" if you'd like me to add a commit with this suggestion.
| export async function lookupTrack( | ||
| artist: string, | ||
| title: string, | ||
| timeout: number = DEFAULT_TIMEOUT, | ||
| ): Promise<TrackMetadata | null> { | ||
| if (!artist || !title) return null; | ||
|
|
||
| const controller = new AbortController(); | ||
| const timeoutId = setTimeout(() => controller.abort(), timeout); | ||
|
|
||
| try { | ||
| const query = `recording:"${escapeLucene(title)}" AND artist:"${escapeLucene(artist)}"`; | ||
| const url = `${API_BASE}?query=${encodeURIComponent(query)}&fmt=json&limit=5`; | ||
|
|
||
| const response = await fetch(url, { | ||
| signal: controller.signal, | ||
| headers: { "User-Agent": USER_AGENT }, | ||
| }); | ||
|
|
||
| if (!response.ok) return null; | ||
|
|
||
| const data = (await response.json()) as MusicBrainzResponse; | ||
| const recordings = data.recordings; | ||
|
|
||
| if (!recordings || recordings.length === 0) return null; | ||
|
|
||
| const recording = recordings[0]; | ||
| const releases = recording.releases; | ||
|
|
||
| if (!releases || releases.length === 0) return null; | ||
|
|
||
| const release = pickBestRelease(releases); | ||
|
|
||
| return { | ||
| album: release.title ?? "", | ||
| year: release.date?.slice(0, 4) ?? "", | ||
| genre: extractGenre(recording.tags), | ||
| trackNumber: release.media?.[0]?.track?.[0]?.number ?? "", | ||
| label: release["label-info"]?.[0]?.label?.name ?? "", | ||
| releaseId: release.id, | ||
| }; | ||
| } catch { | ||
| return null; | ||
| } finally { | ||
| clearTimeout(timeoutId); | ||
| } | ||
| } | ||
|
|
||
| export async function fetchCoverArt( | ||
| releaseId: string, | ||
| timeout: number = COVER_ART_TIMEOUT, | ||
| ): Promise<CoverArtResult | null> { | ||
| if (!releaseId) return null; | ||
|
|
||
| const controller = new AbortController(); | ||
| const timeoutId = setTimeout(() => controller.abort(), timeout); | ||
|
|
||
| try { | ||
| const url = `${COVER_ART_BASE}/${encodeURIComponent(releaseId)}/front-250`; | ||
|
|
||
| const response = await fetch(url, { | ||
| signal: controller.signal, | ||
| headers: { "User-Agent": USER_AGENT }, | ||
| }); | ||
|
|
||
| if (!response.ok) return null; | ||
|
|
||
| const arrayBuffer = await response.arrayBuffer(); | ||
| const mime = response.headers.get("Content-Type") ?? "image/jpeg"; | ||
|
|
||
| return { imageBuffer: Buffer.from(arrayBuffer), mime }; | ||
| } catch { | ||
| return null; | ||
| } finally { | ||
| clearTimeout(timeoutId); | ||
| } | ||
| } | ||
|
|
||
| export async function fetchThumbnailArt( | ||
| videoId: string, | ||
| timeout: number = COVER_ART_TIMEOUT, | ||
| ): Promise<CoverArtResult | null> { | ||
| if (!videoId) return null; | ||
|
|
||
| const controller = new AbortController(); | ||
| const timeoutId = setTimeout(() => controller.abort(), timeout); | ||
|
|
||
| try { | ||
| const url = `https://i.ytimg.com/vi/${encodeURIComponent(videoId)}/hqdefault.jpg`; | ||
|
|
||
| const response = await fetch(url, { | ||
| signal: controller.signal, | ||
| }); | ||
|
|
||
| if (!response.ok) return null; | ||
|
|
||
| const arrayBuffer = await response.arrayBuffer(); | ||
| const mime = response.headers.get("Content-Type") ?? "image/jpeg"; | ||
|
|
||
| return { imageBuffer: Buffer.from(arrayBuffer), mime }; | ||
| } catch { | ||
| return null; | ||
| } finally { | ||
| clearTimeout(timeoutId); | ||
| } | ||
| } |
There was a problem hiding this comment.
All fetch failures are swallowed via catch { return null; } in lookupTrack(), fetchCoverArt(), and fetchThumbnailArt(). That makes operational debugging difficult—especially since MusicBrainz rate-limits and may intermittently 503/429.
At minimum, distinguishing abort/timeouts from other failures (and optionally logging at debug level) would help you understand when enrichment is failing due to timeouts vs network vs status codes.
Suggestion
Capture and branch on error type:
- If
erris anAbortError, returnnullsilently. - Otherwise, log a concise message (or
Sentry.captureException) including which endpoint failed. - For non-OK responses, consider special-casing
429to avoid repeated retries and to make it observable.
Reply with "@CharlieHelps yes please" if you’d like me to add a commit that adds minimal error classification + optional Sentry reporting without spamming logs.
| let enrichedMetadata = $state<{ | ||
| album?: string; | ||
| year?: string; | ||
| genre?: string; | ||
| label?: string; | ||
| } | null>(null); |
There was a problem hiding this comment.
The page stores enrichedMetadata as an ad-hoc local object type that doesn’t include trackNumber or hasArtwork, even though the SSE contract (EnrichedMetadata) includes them. This increases drift risk between client/server and encourages partial UI updates to silently break when the server evolves.
Suggestion
Import and reuse the shared type from $lib/types to keep the contract aligned, and store the full EnrichedMetadata | null.
<script lang="ts">
import type { EnrichedMetadata } from "$lib/types";
let enrichedMetadata = $state<EnrichedMetadata | null>(null);
</script>Then you can optionally render trackNumber / hasArtwork later without changing state shape.
Reply with "@CharlieHelps yes please" if you'd like me to add a commit with this suggestion.
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/routes/+page.svelte (1)
210-218: Clear metadata on SSE connection errors.On connection loss,
enrichedMetadatacan remain stale (Line 210 onward). Consider resetting it in the error handler.🐛 Suggested fix
eventSource.onerror = () => { if (!error) { error = "Connection lost"; } errorUrl = url; eventSource.close(); loading = false; status = ""; + enrichedMetadata = null; };
🤖 Fix all issues with AI agents
In `@src/lib/musicbrainz.ts`:
- Line 3: The USER_AGENT constant currently uses the wrong repo URL; update the
USER_AGENT string (constant USER_AGENT in src/lib/musicbrainz.ts) to reference
the correct GitHub repository/maintainer (e.g., "jzstern/dub-rip") or another
accurate contact per MusicBrainz API guidelines so the User-Agent contains
correct contact info.
- Around line 56-58: The escapeLucene function currently only escapes double
quotes; update it to escape all Lucene reserved characters so queries don't
break for names like "AC/DC" or "P!nk". In the escapeLucene function, replace
the simple double-quote replacement with logic that prepends a backslash to
every Lucene special character (including + - && || ! ( ) { } [ ] ^ ~ * ? : \ /
and "), ensuring the backslash itself is escaped in the pattern; use a single
regex or iteration over a character set and return the fully escaped string from
escapeLucene so callers (e.g., any code building Lucene queries) get a safe,
escaped value.
- Line 97: The current assignment to trackNumber uses
release.media?.[0]?.track?.[0]?.number which always picks the first track;
update the logic in the function that builds the release track info (referencing
the release object and the trackNumber field) to find the track that matches the
searched recording: either request recordings by adding the inc=recordings
parameter to the MusicBrainz query and match by recording id, or iterate over
release.media[*].track[*] to find the track whose recording id equals the
searched recording id and then set trackNumber to that track.number (falling
back to empty string if none found).
In `@src/routes/api/download-stream/`+server.ts:
- Around line 219-221: The CI failed due to formatting mismatch around the
initialization of enrichmentPromise (variable enrichmentPromise:
Promise<EnrichmentResult> = Promise.resolve({ metadata: null, artwork: null }));
run the project's formatter (e.g., Prettier/format script) on
src/routes/api/download-stream/+server.ts or invoke the repository's
pre-commit/format command to reformat this block so it matches the repository
style; ensure the single-line/spacing/indentation around enrichmentPromise and
its object literal follow the project's configured formatter rules and re-run
CI.
- Around line 219-226: enrichmentPromise is eagerly resolved to empty metadata
which prevents a later enrichMetadata(artist, trackTitle, videoId) run when
artist/trackTitle are discovered via yt-dlp; change to a lazy starter: remove
the immediate Promise.resolve and introduce a startEnrichment() helper that sets
enrichmentPromise = enrichMetadata(...) the first time artist/trackTitle (or
videoId-only fallback) are available, call startEnrichment() from the paths that
first determine artist/trackTitle (oEmbed success and the ytDlpEvent handler),
and ensure you keep a final fallback that only uses thumbnail if
startEnrichment() was never invoked; references: enrichmentPromise,
enrichMetadata, and the ytDlpEvent path (also adjust the analogous logic
referenced around lines 419-420).
🧹 Nitpick comments (3)
src/routes/+page.svelte (1)
35-40: ReuseEnrichedMetadatato avoid drift.This local inline shape can silently diverge from the shared type (e.g.,
hasArtwork,trackNumber). Consider importing and using the shared type (or aPick/Omit).♻️ Suggested refactor
-import type { VideoPreview as VideoPreviewType } from "$lib/types"; +import type { EnrichedMetadata, VideoPreview as VideoPreviewType } from "$lib/types"; @@ -let enrichedMetadata = $state<{ - album?: string; - year?: string; - genre?: string; - label?: string; -} | null>(null); +let enrichedMetadata = $state<EnrichedMetadata | null>(null);src/lib/musicbrainz.ts (2)
150-152: Consider adding User-Agent header for consistency.Unlike the other
fetchcalls in this module,fetchThumbnailArtdoesn't include aUser-Agentheader. While YouTube thumbnails may not require it, adding it maintains consistency and is generally good practice for HTTP requests.Proposed fix
const response = await fetch(url, { signal: controller.signal, + headers: { "User-Agent": USER_AGENT }, });
60-106: Consider rate limiting for MusicBrainz API compliance.The PR notes MusicBrainz rate limiting (1 req/sec per IP) as future work. For production use with concurrent users, consider implementing a queue or throttle mechanism to avoid API blocks. Libraries like
p-limitorbottleneckcan help manage this.
| album: release.title ?? "", | ||
| year: release.date?.slice(0, 4) ?? "", | ||
| genre: extractGenre(recording.tags), | ||
| trackNumber: release.media?.[0]?.track?.[0]?.number ?? "", |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n src/lib/musicbrainz.ts | head -120Repository: jzstern/dub-rip
Length of output: 4005
🌐 Web query:
MusicBrainz API recording search track position offset documentation
💡 Result:
To filter recording search results by a track’s position/number on a medium, use the tnum field in the Lucene query:
tnum= “the position of the track on any medium including this recording (starts at 1, pre-gaps at 0)” [1]
Example (track 6, title “Blow Your Mind (Mwah)”):
GET /ws/2/recording?query=recording:"Blow Your Mind (Mwah)" AND tnum:6&fmt=json
To page through search results, use the offset parameter (this is not track position; it’s result paging):
limit= 1–100 results per requestoffset= “Return search results starting at a given offset.” [2]
Example (second page of 25 results):
GET /ws/2/recording?query=recording:"Blow Your Mind (Mwah)"&limit=25&offset=25&fmt=json
Note: you may also see offset inside <track-list ... offset="..."> in release/recording lookup output examples; that offset is about paging within the returned track list, not search filtering by track position [3].
Sources: MusicBrainz RecordingSearch fields (tnum) [1], MusicBrainz API Search parameters (limit, offset) [2], Recording XML output examples showing track-list offset [3].
Track number extraction returns incorrect value.
Line 97 extracts the number of the first track on the first media disc, not the position of the searched recording within the release. For songs that aren't the first track, this will return "1" or another incorrect number.
The MusicBrainz API response structure doesn't include recording information in track data by default. To correctly identify which track corresponds to the searched recording, either:
- Include
inc=recordingsparameter to embed recording IDs in track responses, then match against the searched recording - Or iterate through tracks to find the one matching the recording and extract its number
🤖 Prompt for AI Agents
In `@src/lib/musicbrainz.ts` at line 97, The current assignment to trackNumber
uses release.media?.[0]?.track?.[0]?.number which always picks the first track;
update the logic in the function that builds the release track info (referencing
the release object and the trackNumber field) to find the track that matches the
searched recording: either request recordings by adding the inc=recordings
parameter to the MusicBrainz query and match by recording id, or iterate over
release.media[*].track[*] to find the track whose recording id equals the
searched recording id and then set trackNumber to that track.number (falling
back to empty string if none found).
| let enrichmentPromise: Promise<EnrichmentResult> = | ||
| Promise.resolve({ metadata: null, artwork: null }); | ||
|
|
||
| if (artist && trackTitle) { | ||
| enrichmentPromise = enrichMetadata(artist, trackTitle, videoId); | ||
| } else if (videoId) { | ||
| enrichmentPromise = enrichMetadata("", "", videoId); | ||
| } |
There was a problem hiding this comment.
Avoid locking in empty enrichment before artist/title are known.
If oEmbed fails and artist/trackTitle are derived later via yt‑dlp, the current enrichmentPromise already ran with empty values and won’t re-run, so MusicBrainz metadata is skipped. Consider deferring the promise until artist/trackTitle are available, and only falling back to thumbnail if they never materialize.
🐛 Suggested fix (lazy start + final fallback)
-let enrichmentPromise: Promise<EnrichmentResult> =
- Promise.resolve({ metadata: null, artwork: null });
-
-if (artist && trackTitle) {
- enrichmentPromise = enrichMetadata(artist, trackTitle, videoId);
-} else if (videoId) {
- enrichmentPromise = enrichMetadata("", "", videoId);
-}
+let enrichmentPromise: Promise<EnrichmentResult> | null = null;
+
+const startEnrichment = () => {
+ if (!enrichmentPromise && artist && trackTitle) {
+ enrichmentPromise = enrichMetadata(artist, trackTitle, videoId);
+ }
+};
+
+startEnrichment();
@@
-const enrichment = await enrichmentPromise;
+const enrichment =
+ await (enrichmentPromise ?? enrichMetadata("", "", videoId));Consider calling startEnrichment() when artist/trackTitle are first derived in the ytDlpEvent path so the lookup still runs in parallel once data becomes available.
Also applies to: 419-420
🧰 Tools
🪛 GitHub Actions: CI
[error] 219-221: Formatter mismatch: code would be reformatted. Run the formatter (e.g., Prettier) to fix formatting.
🤖 Prompt for AI Agents
In `@src/routes/api/download-stream/`+server.ts around lines 219 - 226,
enrichmentPromise is eagerly resolved to empty metadata which prevents a later
enrichMetadata(artist, trackTitle, videoId) run when artist/trackTitle are
discovered via yt-dlp; change to a lazy starter: remove the immediate
Promise.resolve and introduce a startEnrichment() helper that sets
enrichmentPromise = enrichMetadata(...) the first time artist/trackTitle (or
videoId-only fallback) are available, call startEnrichment() from the paths that
first determine artist/trackTitle (oEmbed success and the ytDlpEvent handler),
and ensure you keep a final fallback that only uses thumbnail if
startEnrichment() was never invoked; references: enrichmentPromise,
enrichMetadata, and the ytDlpEvent path (also adjust the analogous logic
referenced around lines 419-420).
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
🚂 Railway Preview Deployment
This preview deployment will be automatically cleaned up when the PR is closed. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/lib/musicbrainz.ts`:
- Around line 112-144: The current code reuses the same AbortController
(controller) and timeoutId for the initial search and the subsequent detail
lookups (fetchRecordingTags, fetchReleaseLabels), so if the search consumed most
of the timeout the follow-up requests get aborted immediately; fix this by
clearing the original timeout (clearTimeout(timeoutId)) once the search
completes and creating a new AbortController (e.g., detailController = new
AbortController()) with a fresh setTimeout for the remaining/next timeout
window, then pass detailController.signal to fetchRecordingTags and
fetchReleaseLabels (retain the conditional on recordingId) and ensure any
previous controller is cleaned up/aborted as appropriate.
♻️ Duplicate comments (1)
src/lib/musicbrainz.ts (1)
146-150: Track number extraction still uses the first track, not the matched recording.
The code pullsrelease.media?.[0]?.track?.[0]?.number, which often resolves to track 1 (and the search response may not even include track lists). Typically you need a release lookup withinc=recordings+mediaand then match the track whose recording id equalsrecordingId.MusicBrainz recording search response fields: do recording search results include release media track lists? How to retrieve track positions for a recording (inc=recordings, inc=media) in release lookup?
| const controller = new AbortController(); | ||
| const timeoutId = setTimeout(() => controller.abort(), timeout); | ||
|
|
||
| try { | ||
| const query = `recording:"${escapeLucene(title)}" AND artist:"${escapeLucene(artist)}"`; | ||
| const searchUrl = `${API_BASE}/recording?query=${encodeURIComponent(query)}&fmt=json&limit=5`; | ||
|
|
||
| const response = await fetch(searchUrl, { | ||
| signal: controller.signal, | ||
| headers: { "User-Agent": USER_AGENT }, | ||
| }); | ||
|
|
||
| if (!response.ok) return null; | ||
|
|
||
| const data = (await response.json()) as MusicBrainzSearchResponse; | ||
| const recordings = data.recordings; | ||
|
|
||
| if (!recordings || recordings.length === 0) return null; | ||
|
|
||
| const recording = recordings[0]; | ||
| const releases = recording.releases; | ||
|
|
||
| if (!releases || releases.length === 0) return null; | ||
|
|
||
| const release = pickBestRelease(releases); | ||
| const recordingId = recording.id; | ||
|
|
||
| const [tags, label] = await Promise.all([ | ||
| recordingId | ||
| ? fetchRecordingTags(recordingId, controller.signal) | ||
| : Promise.resolve([]), | ||
| fetchReleaseLabels(release.id, controller.signal), | ||
| ]); |
There was a problem hiding this comment.
Avoid sharing one timeout across search + detail lookups.
If the search call consumes most of the timeout, the tag/label lookups can be aborted immediately, yielding empty metadata even though the search succeeded. Consider resetting the timer (or using a new controller) before the follow-up calls.
💡 Possible adjustment
const controller = new AbortController();
const timeoutId = setTimeout(() => controller.abort(), timeout);
try {
// search...
const response = await fetch(searchUrl, { signal: controller.signal, headers: { "User-Agent": USER_AGENT } });
if (!response.ok) return null;
const data = (await response.json()) as MusicBrainzSearchResponse;
+ // reset timer for follow-up lookups
+ clearTimeout(timeoutId);
+ const detailController = new AbortController();
+ const detailTimeoutId = setTimeout(() => detailController.abort(), timeout);
const [tags, label] = await Promise.all([
- recordingId ? fetchRecordingTags(recordingId, controller.signal) : Promise.resolve([]),
- fetchReleaseLabels(release.id, controller.signal),
+ recordingId ? fetchRecordingTags(recordingId, detailController.signal) : Promise.resolve([]),
+ fetchReleaseLabels(release.id, detailController.signal),
]);
return { /* ... */ };
} finally {
- clearTimeout(timeoutId);
+ clearTimeout(timeoutId);
+ clearTimeout(detailTimeoutId);
}🤖 Prompt for AI Agents
In `@src/lib/musicbrainz.ts` around lines 112 - 144, The current code reuses the
same AbortController (controller) and timeoutId for the initial search and the
subsequent detail lookups (fetchRecordingTags, fetchReleaseLabels), so if the
search consumed most of the timeout the follow-up requests get aborted
immediately; fix this by clearing the original timeout (clearTimeout(timeoutId))
once the search completes and creating a new AbortController (e.g.,
detailController = new AbortController()) with a fresh setTimeout for the
remaining/next timeout window, then pass detailController.signal to
fetchRecordingTags and fetchReleaseLabels (retain the conditional on
recordingId) and ensure any previous controller is cleaned up/aborted as
appropriate.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Start parallel MusicBrainz lookup when artist+title are known, await after download completes, send metadata SSE event, and apply enriched ID3 tags (album, year, genre, trackNumber, label, artwork) to MP3. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix Biome formatting of enrichmentPromise initialization - Escape all Lucene special characters (not just quotes) for queries like "AC/DC" or "P!nk" - Use imported EnrichedMetadata type instead of inline shape - Reset enrichedMetadata on SSE connection error - Add User-Agent header to fetchThumbnailArt for consistency - Fix User-Agent URL to correct repository Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When oEmbed fails to parse artist/title, the enrichment promise was started with empty strings (returning only a YouTube thumbnail). Now when yt-dlp later successfully parses artist/title from the filename, the enrichment promise is restarted with the actual values so MusicBrainz metadata is still fetched in the fallback path. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The MusicBrainz search endpoint does not return tags or label-info
fields. Add separate lookup calls to /recording/{id}?inc=tags and
/release/{id}?inc=labels after the initial search to retrieve genre
and label data correctly.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
0cd610b to
44dc3f5
Compare
Replace direct controller.close() call with the closeStream() helper at the file-not-found error path for consistency. The helper properly sets the isClosed flag and wraps the close in try-catch for error handling. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…search YouTube video titles often use "Artist & Other" or "Artist feat. Other" format for collaborations. MusicBrainz stores these differently (e.g., "Aries feat. brakence" in artist credits). Searching for the full collaborative name returns no results. Add extractPrimaryArtist() to split on common collaboration patterns (&, feat., ft., x, and, with, comma) and use only the primary artist name for the MusicBrainz search query. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
PR environments copy production variables, but the internal COBALT_API_URL (cobalt.railway.internal) is unreachable from isolated PR environments. Clearing it lets the app fall back to the public Cobalt API. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @.github/workflows/railway-pr.yml:
- Around line 45-47: The workflow uses an invalid Railway CLI subcommand;
replace the incorrect call with the proper CLI flag for setting/clearing
variables by using the Railway variables command with --set for the
COBALT_API_URL variable (i.e., change the invocation that references
COBALT_API_URL to use the --set flag such as railway variables --set
COBALT_API_URL="" --service dub-rip), and confirm with the deployment
environment whether clearing to an empty string is acceptable or the variable
must be removed via the Railway dashboard.
| - name: Clear internal-only variables | ||
| run: | | ||
| railway variables set COBALT_API_URL="" --service dub-rip |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
Railway CLI variables delete unset command documentation
💡 Result:
Railway’s official CLI reference currently documents railway variables as a single command to view variables and set variables via --set, and it does not document any delete, unset, or remove subcommand/flag. The documented interface is: railway variables [OPTIONS] with options like --service, --environment, --kv, and --set key=value. [1]
If you’re seeing references to railway variables delete/unset, those appear to come from third‑party CLI completion/docs (e.g., Fig) and don’t match Railway’s current published CLI API reference. [2]
What to do instead (per current docs):
- Set/update:
railway variables --set "KEY=value"[1] - For deletion: use the Railway dashboard (the CLI reference does not provide a delete/unset operation). [1]
Sources
[1] Railway Docs — CLI Reference (railway variables options) https://docs.railway.com/reference/cli-api
[2] Fig manual page showing railway variables delete/get/set (likely outdated vs official docs) https://www.fig.io/manual/railway/variables
🏁 Script executed:
# Search for railway-related files in the repo that might show correct usage
rg -i "railway variables" --type yaml --type yml -A 2 -B 2Repository: jzstern/dub-rip
Length of output: 86
Use correct Railway CLI syntax; delete subcommand does not exist.
railway variables delete is not a valid Railway CLI command. Per official Railway documentation, variable deletion must be performed via the Railway dashboard. For CLI-based clearing, use railway variables --set COBALT_API_URL="" --service dub-rip (note the --set flag). Verify with the deployment environment whether clearing via empty string is sufficient or if dashboard deletion is required.
🤖 Prompt for AI Agents
In @.github/workflows/railway-pr.yml around lines 45 - 47, The workflow uses an
invalid Railway CLI subcommand; replace the incorrect call with the proper CLI
flag for setting/clearing variables by using the Railway variables command with
--set for the COBALT_API_URL variable (i.e., change the invocation that
references COBALT_API_URL to use the --set flag such as railway variables --set
COBALT_API_URL="" --service dub-rip), and confirm with the deployment
environment whether clearing to an empty string is acceptable or the variable
must be removed via the Railway dashboard.
Railway CLI rejects empty variable values. Use `railway variables delete` to remove the internal-only URL so the app falls back to the public Cobalt API default. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
`railway variables delete` silently failed. Use `railway variables set` with the public Cobalt API URL instead, which is the same approach used to manually fix the pr-44 environment. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Point PR environments to the production Cobalt instance via its public domain instead of the unreachable internal URL. Same API key, same instance, just accessible over public network. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add enrichedMetadata = null to the $effect reset block so stale metadata from a previous download doesn't persist in component state when the user changes the URL. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Increase yt-token-service heap limit from 512MB to 1024MB to prevent OOM crashes during jsdom-based BotGuard challenge solving - Increase PO token generation timeout from 30s to 60s for cold starts - Pre-warm PO token cache on server startup so the token is ready before the first user download request Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When MusicBrainz returns multiple recordings, the cover art was only checked for the first recording's release. Now collects release IDs from all matching recordings (albums first) and iterates through them until cover art is found on the Cover Art Archive. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The fetchPoToken() pre-warm imports youtube-po-token-generator (jsdom, ~500MB) at startup. On CI runners also running Playwright's Chromium, this OOMs the Vite process silently, causing a 120s webServer timeout. Pre-warming is only needed in production. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Description
Enriches downloaded MP3s with album, year, genre, track number, record label, and cover art by looking up tracks on MusicBrainz. The lookup runs in parallel with the audio download, so there's no added latency. Falls back to YouTube thumbnail for artwork when no MusicBrainz match is found.
New ID3 tags added: album, year, genre, trackNumber, publisher (label), APIC (cover art)
Architecture: As soon as artist+title are parsed from the YouTube metadata, a parallel
enrichMetadata()promise fires off. It queries MusicBrainz's recording search API, picks the best album release, fetches cover art from the Cover Art Archive (or YouTube thumbnail as fallback), and applies everything as ID3 tags before the file is sent to the client.Cover art lookup iterates through all candidate release IDs across matching recordings (albums prioritized first) until artwork is found on the Cover Art Archive, maximizing the chance of getting official artwork.
Future work
How to test
bun run testAuthor checklist
This PR:
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Tests
Chores
✏️ Tip: You can customize this high-level summary in your review settings.
Note
Adds metadata enrichment during downloads with parallel MusicBrainz lookup and artwork embedding.
src/lib/musicbrainz.ts: search recordings, prefer Album releases, collect candidate release IDs, fetch tags/labels, get cover art; fallback to YouTube thumbnaildownload-stream/+server.ts: run enrichment in parallel, emitmetadataSSE event, write ID3 tags (album,year,genre,trackNumber,publisher, APIC) and embed artwork+page.svelte): display enriched metadata during download; clear on reset/errorvitest.config.tsto includesrc/**/*.{test,spec}.tsand coverage thresholdsCOBALT_API_URLin Railway workflowWritten by Cursor Bugbot for commit 499ad14. This will update automatically on new commits. Configure here.