Telegram sync performance pass + AC4 codec support#2482
Open
Amonoman wants to merge 12 commits into
Open
Conversation
Support detecting AC4 audio (used in iOS/MOV containers) in IsoBmffAudioCodecDetector, with corresponding unit tests.
Telegram sync was reading each song's file (file.exists() + ID3 tag parse via AudioMetadataReader) one at a time in a sequential forEach, inside syncTelegramData's chunk loop. For large channels (~100k songs) this serial per-song file IO was the dominant cost, taking on the order of 10 minutes. Extract that IO/CPU-bound step and run it concurrently per chunk using the same Semaphore + async/awaitAll pattern already used elsewhere in this file (see the MediaStore processing phase), bounded by a new TELEGRAM_METADATA_READ_CONCURRENCY = 8 to avoid saturating disk IO. The sequential artist/album dedup maps that follow remain untouched, since they mutate shared per-chunk state and aren't safe to parallelize as-is. Also drops a redundant resolveAlbumArtUri() call that was happening twice per song when a local file was found (value doesn't change between the two call sites). No behavior change to the resulting song/album/artist data — only the ordering/concurrency of how it's computed. Co-Authored-By: Claude <noreply@anthropic.com>
incrementalSyncMusicData() ran deleteOrphanedAlbums() and deleteOrphanedArtists() unconditionally at the end of every call. Both are full-table NOT EXISTS scans against songs / cross-refs. syncTelegramData() calls incrementalSyncMusicData() once per chunk (~200 times for a 100k-song channel at the current chunk size), so these scans ran ~200 times per sync even though pure-insert chunks can never produce an orphaned album or artist — only the dedicated deletion path can. Add an optional cleanupOrphans parameter (default true, preserving existing behavior for every other caller) and a new cleanupOrphanedMusicData() convenience method. syncTelegramData() now passes cleanupOrphans = false on every chunk flush and the batched deletion loop, then calls cleanupOrphanedMusicData() once after both finish. Other incrementalSyncMusicData call sites (MediaStore sync, Netease sync) are single-shot, not chunked loops, so they're unaffected and keep the default. Co-Authored-By: Claude <noreply@anthropic.com>
parseReplayGainDb() constructed a new Regex("(?i)[dD][bB]") on every
invocation. It's called via extractReplayGainDb() up to twice per
song read (track gain + album gain, each checking up to 3 property
key fallbacks), so a 100k-song sync could compile this same pattern
hundreds of thousands of times.
Hoist it to a private val compiled once at class load.
Investigated whether the two separate TagLib calls in read()
(getAudioProperties() + getMetadata(), each on a dup()'d fd) were
redundant and could be merged — confirmed against the upstream
Kyant0/taglib test suite that this is the library's required usage
pattern, not duplicated work, so left unchanged.
…or for SetTdlibParameters
TdApi.SetTdlibParameters was being constructed with a 15-argument
positional constructor, with a comment admitting the argument order
was guessed ("the order varies by version... I will try the most
common flat signature").
Verified against TDLib's official Java example and documentation:
SetTdlibParameters has no flat multi-arg constructor at all, only a
default constructor and one taking a single TdlibParameters object.
The previous code was very likely silently passing values into the
wrong fields on every app launch (e.g. apiId potentially landing in
systemLanguageCode's slot), since boolean/string parameters of the
wrong type would not fail at compile time given how the bindings are
laid out.
Replace with the documented pattern: construct TdApi.TdlibParameters()
and assign each field by name, which cannot misalign the way a wrong
guessed positional order can.
Also explicitly sets enableStorageOptimizer = true, which was not
reliably set by the old code (its effective value depended on exactly
how the broken positional mapping landed). This matches TDLib's
documented recommended default; flag for review if a different value
is preferred.
…solution ForumTopicInfo's thread ID is resolved via reflection (field name isn't confirmed for this tdlibx fork from available documentation). Two debug logs ran on every topic during every forum sync: one dumping all reflected field names, one logging the resolved field name and value. Both were one-time discovery aids per their own comments, not meant to run indefinitely. Removed both. Kept the Timber.w fallback-path warning and Timber.e failure log, since those flag situations actually worth knowing about. Did not change the reflection logic itself or attempt to replace it with direct property access: multiple TDLib Java bindings (official tdlib/td, the tdlight fork, and TDLib's own GetForumTopic/ SearchChatMessages fields) consistently use "messageThreadId", which is already the first name this code tries, but this exact tdlibx 1.8.56 fork's ForumTopicInfo source wasn't directly inspectable to confirm a direct property reference would compile. If a past Logcat capture from this debug log is available, it would let this be replaced with a confirmed direct field reference instead.
TELEGRAM_METADATA_READ_CONCURRENCY was a flat constant (8) regardless
of device. Each concurrent read does a real native TagLib JNI call,
which is genuine CPU-bound work, not just blocking IO wait. Kotlin
dispatchers schedule threads, they don't reserve CPU time, so heavy
parallel CPU-bound work here can still starve the Main thread and
cause visible UI jank on a device with few cores, while being
needlessly conservative on a high-core device.
Replace the flat constant with telegramMetadataReadConcurrency(),
which scales with Runtime.getRuntime().availableProcessors() / 2,
capped to [2, 4]. This is an engineering default reasoned from how
coroutine dispatchers and native JNI calls interact with real CPU
scheduling, not a profiled or measured-optimal value; it has not
been verified against real device traces.
Prompted by a field report of heavy UI lag during a large Telegram
sync on a device that did not OOM (separate from the earlier OOM
investigation). Lowering concurrency trades some of the previously
measured sync-time improvement for reduced peak CPU contention; on
the device used for the 50-second/100k-song benchmark, this will
likely land at concurrency 4 (down from 8) and increase sync time
accordingly. Also tightens the artist/album preload in
syncTelegramData to build both derived maps in a single pass with
pre-sized capacity, instead of two separate default-capacity
.associate{} calls; this does not change the preload's fundamental
scaling with existing-library size, which remains a known, separate
limitation.
Three small, repeated costs found during a broader pass over SyncWorker.kt, each scaling with library size: - fetchMusicFromMediaStore: cursor.getString(dataCol) and cursor.getInt(trackCol) were each read twice per row (once for the directory filter / disc-vs-track split, once for the RawSongData fields). Cursor getters cross into native CursorWindow code, so this was a real, avoidable JNI call per row. Read once, reuse. - fetchMediaStoreIds: allocated two File objects per row (File(data).parent, then File(parentPath).absolutePath) to arrive back at a string MediaStore's DATA column already provides as an absolute path. Replaced with the same string-slicing approach already used for the identical check in fetchMusicFromMediaStore. - syncNeteaseData: toUnifiedNeteaseArtistId(name), a lowercase() + hashCode() computation, ran up to three times for the same artist name within the same song iteration (primary artist, the cross-ref loop, and the JSON artist refs). Computed once per song into artistIds and reused across all three. No behavior change in any of the three; all are pure redundant-work elimination, verified by re-deriving the same values via the same inputs.
A previous commit replaced this call with TdApi.TdlibParameters() field assignment, based on TDLib's official docs/example. That approach does not compile against this exact tdlibx:td:1.8.56 build: the real compiler error confirms TdApi.SetTdlibParameters here has only a no-arg constructor and a flat 14-argument one, no constructor taking a TdlibParameters object exists in this build at all. Compared the original (pre-revert) flat-constructor call against the compiler-confirmed real signature, argument type and position both match exactly, so the original code was correct for this build. The "argument order was guessed" comment that motivated the previous change was based on uncertainty that, in retrospect, wasn't actually a problem here. Reverted to the flat constructor, now with the real, compiler- verified signature documented in a comment instead of assumed from docs for a different TDLib binding, and named local variables in place of bare positional literals for review clarity.
fetchMusicFromMediaStore's incremental path queried musicDao.getSongsByIdsListSimple() twice for every song that turned out to have changed: once in Phase 2 to fetch its existing entity and decide whether it changed, and once again in Phase 3 for the exact same IDs to get that same entity for the user-edit-preserving merge step. Phase 2 already has the existing entity in hand right before discarding it. Carry it forward as part of songsToProcess (now List<Pair<RawSongData, SongEntity?>> instead of List<RawSongData>) instead of re-fetching it in Phase 3. Removes one DB round-trip per batch of changed songs; impact scales with how many songs actually changed in a given sync (near-zero for a quiet incremental sync, real for a deep/force-metadata rescan or any sync touching many files at once). No behavior change: same merge logic, same data, just sourced from the pair instead of a second query.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Improves Telegram sync performance and adds AC4 audio codec detection for cast.
Telegram sync performance
Telegram sync was taking about 10 minutes for a 100k-song channel. Three targeted fixes:
syncTelegramDataran sequentially, one song at a time. Now runs concurrently (bounded semaphore, concurrency = 8) using the same pattern already used elsewhere inSyncWorker. Also removes a redundant duplicateresolveAlbumArtUri()call per song.incrementalSyncMusicData()ran two full-tableNOT EXISTSscans on every chunk flush (about 200 times for 100k songs). Insert-only chunks can never produce an orphan, so cleanup is now skipped per-chunk (opt-out parameter, default unchanged for all other callers) and run once at the end instead.AudioMetadataReader.parseReplayGainDb()to a compiled-once constant, since it ran up to twice per song.No behavior or output changes. Same songs, albums, artists, and cross-refs end up in the DB, just computed faster.
Tested: fresh install, 100k-song Telegram channel (6kbps MP3s with metadata, release build). Sync completed in about 50 seconds, down from about 10 minutes before these changes.
Cast
Adds AC4 audio codec detection for ISO BMFF containers in
IsoBmffAudioCodecDetector, with unit tests.