Skip to content

Audio Backend#6824

Open
bob12224 wants to merge 20 commits intostashapp:developfrom
bob12224:audio
Open

Audio Backend#6824
bob12224 wants to merge 20 commits intostashapp:developfrom
bob12224:audio

Conversation

@bob12224
Copy link
Copy Markdown
Contributor

@bob12224 bob12224 commented Apr 13, 2026

This is a copy-paste setup for the backend to support AUDIO, similar to how Scene's and Images are handled.

Please see the docs/dev/AUDIO.md for the SCOPE and as a place for general discussion.

There is no new design, it copies form SCENES and then removes what is unrelated
Exception: FrameRate has been changed to SampleRate (FFPROBE already grabs this)

Relates to: #1258

TODOs that require Reviewer input

└─ stash
   ├─ audio.graphql
   │  ├─ line 11: TODO (audio|AudioCaption): need to update IF AudioCaption required
   │  └─ line 29: TODO (audio|AudioCaption): need to update IF AudioCaption required
   ├─ resolver_model_audio.go
   │  └─ line 119: TODO (audio|AudioCaption): need to update IF AudioCaption required
   ├─ audio.go
   │  └─ line 84: TODO (audio): can we return no urls?
   ├─ config.go
   │  └─ line 857: TODO (audio): update this to AudioFileNamingAlgorithm?
   ├─ fingerprint.go
   │  └─ line 67: TODO (audio): should Audio's also use OSHash instead of md5 for default (if so, then will need to update Audios)
   ├─ manager_tasks.go
   │  └─ line 34: TODO (audio): figure out this IF condition
   ├─ repository.go
   │  └─ line 30: TODO (audio): is this only used for stashbox?
   ├─ delete.go
   │  └─ line 42: TODO (future|audio generated files): add paths here
   ├─ migrate_hash.go
   │  └─ line 1: TODO (audio): should this file be deleted since there are no transcodes?
   ├─ scan.go
   │  └─ line 46: TODO (audio): this PR has no generation
   ├─ browser.go
   │  └─ line 91: TODO (audio): do we need to check ProbeAudioCodec for audio containers? (i.e. can `.mp3` contain a codec we need to transcode for?
   ├─ container.go
   │  └─ line 17: TODO (audio): better way to do this, without suffix this clashes with `Mp3 ProbeAudioCodec`
   ├─ caption.go
   │  └─ line 1: TODO (audio): Can this file be deleted if we utilize VideoCaptions?
   ├─ caption_test.go
   │  └─ line 1: TODO (audio): Can this file be deleted if we utilize audioCaptions?
   ├─ model_audio.go
   │  └─ line 235: TODO (audio): don't know if we need this, using VideoCaption for now due to `pkg/models/repository_file.go` and `FileReader` using
   └─ file.go
      └─ line 27: TODO (audio|AudioCaption): need to update IF AudioCaption required

Comment thread docs/dev/AUDIO.md
Copy link
Copy Markdown
Collaborator

@Gykes Gykes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to preface this by saying I'm not the main dev and, quite frankly, have limited experience with design implementation so take what I say with a grain of salt. If something sounds dumb or wrong it probably is and feel free to call me on it.

Comment thread docs/dev/AUDIO.md
Comment thread docs/dev/AUDIO.md Outdated
- duration
- audio codec
- OPTIONAL (can be added now or later)
- channels (mono, stereo, 5.1, 7.1)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would tthis really be needed? Currently scenes have audio and we don't track this at all.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest looking at bitrate again, as this is tracked for scenes (the video portion).
And it might be better to add now, instead of planning to add later.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly I was referring to the channels. I think bitrate would be fine.

Comment thread pkg/sqlite/migrations/86_audio.up.sql Outdated
`created_at` datetime not null,
`updated_at` datetime not null,
`code` text,
`artists` text,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we decide to go with performers this would need to be changed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

Comment thread pkg/sqlite/migrations/86_audio.up.sql Outdated
`updated_at` datetime not null,
`code` text,
`artists` text,
`album` text,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Studio was in the Audio Metadata but album is used here. How and where would we store albums? In the case of audio, studios seem like the best fit.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

`title` varchar(255),
`details` text,
`date` date,
`rating` tinyint,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probs better to use rating100 similar to scenes.

Copy link
Copy Markdown
Contributor Author

@bob12224 bob12224 Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the migrations for scenes uses rating (ref: pkg/sqlite/migrations/1_initial.up.sql). I can only find rating100 in the graphql, which I do provide in graphql/schema/types/audio.graphql (line 51)

Comment thread pkg/sqlite/migrations/86_audio.up.sql Outdated
foreign key(`audio_id`) references `audios`(`id`) on delete cascade,
PRIMARY KEY("group_id", `audio_id`)
);
CREATE INDEX `index_movies_audios_on_movie_id` on "groups_audios" ("group_id");
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Misnamed

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed to index_group_audios_on_group_id

@rezreal
Copy link
Copy Markdown

rezreal commented Apr 16, 2026

tbh there is no reason why audios should not just be scenes and why there would be a dedicated section for them.

Lots of errors to fix and TODO notes
Comment thread .golangci.yml
- Removed funscripts, they are for interactive
- updated the scanner to correctly create `audio_files` row
- Adding Audio to `paths`
- Updated sqlite to add AudioFile

Need to update mutations next
New instructions since newest go version does not work well with v1 of linters
Comment thread docs/DEVELOPMENT.md Outdated
- removed todos
- tested mutations
- updated the autotagger
- added missing sort options
  - open question generated from this
@bob12224 bob12224 marked this pull request as ready for review April 28, 2026 00:51
@bob12224
Copy link
Copy Markdown
Contributor Author

@Gykes I can see nothing else required for this PR. In your court. Please see my TODO notes that require your input

@Gykes
Copy link
Copy Markdown
Collaborator

Gykes commented Apr 29, 2026

Will give it a gander tomorrow probably. I don't have the brain power right now to look over a 13k addition lol

@bob12224
Copy link
Copy Markdown
Contributor Author

bob12224 commented May 3, 2026

Will give it a gander tomorrow probably. I don't have the brain power right now to look over a 13k addition lol

Any luck, haven't seen any PR comments yet

@Gykes
Copy link
Copy Markdown
Collaborator

Gykes commented May 3, 2026

Will give it a gander tomorrow probably. I don't have the brain power right now to look over a 13k addition lol

Any luck, haven't seen any PR comments yet

I have various thoughts lined up. Been thinking on stuff the last few days passively. I will start putting my thoughts in now so you can think on it as well.

Copy link
Copy Markdown
Collaborator

@Gykes Gykes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are just initial thoughts that I remembered off the top of my head. Adding testing would probably be a good idea. Review process will be slow on this. Sorry but prepping to move cross country and time is becoming more and more limited.

I went over some of your TODO questions in the review. Also I think, if we are doing transcoding in a separate PR, then we should probably rip out the commented trans coding support sections. Currently all it is doing is increasing LOC count.

--

-- TODO(audio): think of better name for this, too close to `audios_files`
CREATE TABLE `audio_files` (
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

audio_file_info? Not married to this was just the first thing that popped into my head. Happy to hear recs.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • files_audio (would need to migrate video_files -> files_video to match naming convention)
  • audio_only_files
  • audiofiles (not a fan)

Comment on lines +33 to +36
if instance.Config.IsCreateImageClipsFromVideos() && stash != nil && stash.ExcludeVideo {
// TODO(audio): figure out this IF condition
return isImage(pathname) || isVideo(pathname)
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this even need to exist?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused by this as well and would value input here

Comment thread internal/manager/audio.go
var (
directAudioEndpointType = endpointType{
label: "Direct stream",
mimeType: ffmpeg.MimeMp3Audio,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we hardcode this for all direct streams wont that also serve all other formats? Flac,m4a, etc. This could break audio playback. We could resolve this at the container instead

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is copied from internal/manager/scene.go, I don't see any benefit making audio different from scenes unless it requires it to meet feature-set.

Copied to make it easier to compare:

directEndpointType = endpointType{
  label:     "Direct stream",
  mimeType:  ffmpeg.MimeMp4Video,
  extension: "",
}

Comment thread internal/manager/audio.go
endpoints = append(endpoints, makeStreamEndpoint(directAudioEndpointType))
}

// TODO(audio): can we return no urls?
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the scheme declares a non null array. You would need to return it as an empty slice. Something like []*AudioStreamEndpoint{}

Comment thread graphql/schema/types/audio.graphql Outdated
title: String
code: String
details: String
url: String @deprecated(reason: "Use urls")
Copy link
Copy Markdown
Collaborator

@Gykes Gykes May 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could probably remove this from the audio side. It was never implemented on here so don't see the need for it to be deprecated but there's not issue with having it, just extra LOC.

EDIT: This could apply to the other deprecated fields as well

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

Comment thread pkg/audio/migrate_hash.go Outdated
@@ -0,0 +1,39 @@
// TODO(audio): should this file be deleted since there are no transcodes?
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are doing transcodes at a later date then yea, should probably remove this to keep scope and LOC down.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File should continue existing, it is used to find duplicate files with the same hash and merge them into the same audio

Comment thread pkg/ffmpeg/ffprobe.go
Comment on lines +512 to +527
func (a *AudioFile) getStreamIndex(fileType string, probeJSON FFProbeJSON) int {
ret := -1
for i, stream := range probeJSON.Streams {
// skip cover art/thumbnails
if stream.CodecType == fileType && stream.Disposition.AttachedPic == 0 {
// prefer default stream
if stream.Disposition.Default == 1 {
return i
}

// backwards compatible behaviour - fallback to first matching stream
if ret == -1 {
ret = i
}
}
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a copy of the video version at roughly line 284. Could break this out to a helper to reduce LOC changes.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can do this, but I think that might make things more confusing. Since this is a tradition copy-paste+rename PR, making new changes would require more thinking than just accepting that what works for Scenes will work for Audio.

Perhaps we can defer this change to when the feature is more accepted and new audio specific features are added (transcodes/waveform sprites/ui/etc)?

@Gykes
Copy link
Copy Markdown
Collaborator

Gykes commented May 8, 2026

I see you making review comments. I am currently beginning my move and wont be setup back up fully till around the 16th. I still have my laptop and MAY get to them if I'm bored or have time to kill during transit. I haven't forgotten but just ping me after the 15th if you think I have

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants