Skip to content

Telegram sync performance pass + AC4 codec support#2482

Open
Amonoman wants to merge 12 commits into
PixelPlayerHQ:masterfrom
Amonoman:master
Open

Telegram sync performance pass + AC4 codec support#2482
Amonoman wants to merge 12 commits into
PixelPlayerHQ:masterfrom
Amonoman:master

Conversation

@Amonoman

Copy link
Copy Markdown
Contributor

Improves Telegram sync performance and adds AC4 audio codec detection for cast.

Telegram sync performance

Telegram sync was taking about 10 minutes for a 100k-song channel. Three targeted fixes:

  • Parallelize per-song metadata refinement. File IO and ID3 tag parsing in syncTelegramData ran sequentially, one song at a time. Now runs concurrently (bounded semaphore, concurrency = 8) using the same pattern already used elsewhere in SyncWorker. Also removes a redundant duplicate resolveAlbumArtUri() call per song.
  • Skip redundant orphan-cleanup scans. incrementalSyncMusicData() ran two full-table NOT EXISTS scans on every chunk flush (about 200 times for 100k songs). Insert-only chunks can never produce an orphan, so cleanup is now skipped per-chunk (opt-out parameter, default unchanged for all other callers) and run once at the end instead.
  • Hoist a per-call regex in AudioMetadataReader.parseReplayGainDb() to a compiled-once constant, since it ran up to twice per song.

No behavior or output changes. Same songs, albums, artists, and cross-refs end up in the DB, just computed faster.

Tested: fresh install, 100k-song Telegram channel (6kbps MP3s with metadata, release build). Sync completed in about 50 seconds, down from about 10 minutes before these changes.

Cast

Adds AC4 audio codec detection for ISO BMFF containers in IsoBmffAudioCodecDetector, with unit tests.

Amonoman and others added 12 commits June 22, 2026 18:57
Support detecting AC4 audio (used in iOS/MOV containers) in
IsoBmffAudioCodecDetector, with corresponding unit tests.
Telegram sync was reading each song's file (file.exists() + ID3 tag
parse via AudioMetadataReader) one at a time in a sequential forEach,
inside syncTelegramData's chunk loop. For large channels (~100k songs)
this serial per-song file IO was the dominant cost, taking on the
order of 10 minutes.

Extract that IO/CPU-bound step and run it concurrently per chunk using
the same Semaphore + async/awaitAll pattern already used elsewhere in
this file (see the MediaStore processing phase), bounded by a new
TELEGRAM_METADATA_READ_CONCURRENCY = 8 to avoid saturating disk IO.
The sequential artist/album dedup maps that follow remain untouched,
since they mutate shared per-chunk state and aren't safe to
parallelize as-is.

Also drops a redundant resolveAlbumArtUri() call that was happening
twice per song when a local file was found (value doesn't change
between the two call sites).

No behavior change to the resulting song/album/artist data — only
the ordering/concurrency of how it's computed.

Co-Authored-By: Claude <noreply@anthropic.com>
incrementalSyncMusicData() ran deleteOrphanedAlbums() and
deleteOrphanedArtists() unconditionally at the end of every call.
Both are full-table NOT EXISTS scans against songs / cross-refs.
syncTelegramData() calls incrementalSyncMusicData() once per chunk
(~200 times for a 100k-song channel at the current chunk size), so
these scans ran ~200 times per sync even though pure-insert chunks
can never produce an orphaned album or artist — only the dedicated
deletion path can.

Add an optional cleanupOrphans parameter (default true, preserving
existing behavior for every other caller) and a new
cleanupOrphanedMusicData() convenience method. syncTelegramData()
now passes cleanupOrphans = false on every chunk flush and the
batched deletion loop, then calls cleanupOrphanedMusicData() once
after both finish.

Other incrementalSyncMusicData call sites (MediaStore sync, Netease
sync) are single-shot, not chunked loops, so they're unaffected and
keep the default.

Co-Authored-By: Claude <noreply@anthropic.com>
parseReplayGainDb() constructed a new Regex("(?i)[dD][bB]") on every
invocation. It's called via extractReplayGainDb() up to twice per
song read (track gain + album gain, each checking up to 3 property
key fallbacks), so a 100k-song sync could compile this same pattern
hundreds of thousands of times.

Hoist it to a private val compiled once at class load.

Investigated whether the two separate TagLib calls in read()
(getAudioProperties() + getMetadata(), each on a dup()'d fd) were
redundant and could be merged — confirmed against the upstream
Kyant0/taglib test suite that this is the library's required usage
pattern, not duplicated work, so left unchanged.
…or for SetTdlibParameters

TdApi.SetTdlibParameters was being constructed with a 15-argument
positional constructor, with a comment admitting the argument order
was guessed ("the order varies by version... I will try the most
common flat signature").

Verified against TDLib's official Java example and documentation:
SetTdlibParameters has no flat multi-arg constructor at all, only a
default constructor and one taking a single TdlibParameters object.
The previous code was very likely silently passing values into the
wrong fields on every app launch (e.g. apiId potentially landing in
systemLanguageCode's slot), since boolean/string parameters of the
wrong type would not fail at compile time given how the bindings are
laid out.

Replace with the documented pattern: construct TdApi.TdlibParameters()
and assign each field by name, which cannot misalign the way a wrong
guessed positional order can.

Also explicitly sets enableStorageOptimizer = true, which was not
reliably set by the old code (its effective value depended on exactly
how the broken positional mapping landed). This matches TDLib's
documented recommended default; flag for review if a different value
is preferred.
…solution

ForumTopicInfo's thread ID is resolved via reflection (field name
isn't confirmed for this tdlibx fork from available documentation).
Two debug logs ran on every topic during every forum sync: one
dumping all reflected field names, one logging the resolved field
name and value. Both were one-time discovery aids per their own
comments, not meant to run indefinitely.

Removed both. Kept the Timber.w fallback-path warning and Timber.e
failure log, since those flag situations actually worth knowing about.

Did not change the reflection logic itself or attempt to replace it
with direct property access: multiple TDLib Java bindings (official
tdlib/td, the tdlight fork, and TDLib's own GetForumTopic/
SearchChatMessages fields) consistently use "messageThreadId", which
is already the first name this code tries, but this exact tdlibx
1.8.56 fork's ForumTopicInfo source wasn't directly inspectable to
confirm a direct property reference would compile. If a past Logcat
capture from this debug log is available, it would let this be
replaced with a confirmed direct field reference instead.
TELEGRAM_METADATA_READ_CONCURRENCY was a flat constant (8) regardless
of device. Each concurrent read does a real native TagLib JNI call,
which is genuine CPU-bound work, not just blocking IO wait. Kotlin
dispatchers schedule threads, they don't reserve CPU time, so heavy
parallel CPU-bound work here can still starve the Main thread and
cause visible UI jank on a device with few cores, while being
needlessly conservative on a high-core device.

Replace the flat constant with telegramMetadataReadConcurrency(),
which scales with Runtime.getRuntime().availableProcessors() / 2,
capped to [2, 4]. This is an engineering default reasoned from how
coroutine dispatchers and native JNI calls interact with real CPU
scheduling, not a profiled or measured-optimal value; it has not
been verified against real device traces.

Prompted by a field report of heavy UI lag during a large Telegram
sync on a device that did not OOM (separate from the earlier OOM
investigation). Lowering concurrency trades some of the previously
measured sync-time improvement for reduced peak CPU contention; on
the device used for the 50-second/100k-song benchmark, this will
likely land at concurrency 4 (down from 8) and increase sync time
accordingly. Also tightens the artist/album preload in
syncTelegramData to build both derived maps in a single pass with
pre-sized capacity, instead of two separate default-capacity
.associate{} calls; this does not change the preload's fundamental
scaling with existing-library size, which remains a known, separate
limitation.
Three small, repeated costs found during a broader pass over
SyncWorker.kt, each scaling with library size:

- fetchMusicFromMediaStore: cursor.getString(dataCol) and
  cursor.getInt(trackCol) were each read twice per row (once for the
  directory filter / disc-vs-track split, once for the RawSongData
  fields). Cursor getters cross into native CursorWindow code, so
  this was a real, avoidable JNI call per row. Read once, reuse.

- fetchMediaStoreIds: allocated two File objects per row
  (File(data).parent, then File(parentPath).absolutePath) to arrive
  back at a string MediaStore's DATA column already provides as an
  absolute path. Replaced with the same string-slicing approach
  already used for the identical check in fetchMusicFromMediaStore.

- syncNeteaseData: toUnifiedNeteaseArtistId(name), a lowercase() +
  hashCode() computation, ran up to three times for the same artist
  name within the same song iteration (primary artist, the cross-ref
  loop, and the JSON artist refs). Computed once per song into
  artistIds and reused across all three.

No behavior change in any of the three; all are pure redundant-work
elimination, verified by re-deriving the same values via the same
inputs.
A previous commit replaced this call with TdApi.TdlibParameters()
field assignment, based on TDLib's official docs/example. That
approach does not compile against this exact tdlibx:td:1.8.56 build:
the real compiler error confirms TdApi.SetTdlibParameters here has
only a no-arg constructor and a flat 14-argument one, no constructor
taking a TdlibParameters object exists in this build at all.

Compared the original (pre-revert) flat-constructor call against the
compiler-confirmed real signature, argument type and position both
match exactly, so the original code was correct for this build. The
"argument order was guessed" comment that motivated the previous
change was based on uncertainty that, in retrospect, wasn't actually
a problem here.

Reverted to the flat constructor, now with the real, compiler-
verified signature documented in a comment instead of assumed from
docs for a different TDLib binding, and named local variables in
place of bare positional literals for review clarity.
fetchMusicFromMediaStore's incremental path queried
musicDao.getSongsByIdsListSimple() twice for every song that turned
out to have changed: once in Phase 2 to fetch its existing entity and
decide whether it changed, and once again in Phase 3 for the exact
same IDs to get that same entity for the user-edit-preserving merge
step.

Phase 2 already has the existing entity in hand right before
discarding it. Carry it forward as part of songsToProcess (now
List<Pair<RawSongData, SongEntity?>> instead of List<RawSongData>)
instead of re-fetching it in Phase 3. Removes one DB round-trip per
batch of changed songs; impact scales with how many songs actually
changed in a given sync (near-zero for a quiet incremental sync, real
for a deep/force-metadata rescan or any sync touching many files at
once).

No behavior change: same merge logic, same data, just sourced from
the pair instead of a second query.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant