[#71] perf: ~2x faster coverage quant (byte-identical) + lossy-input denoise (pigeon 17→12 colours) #72
Merged
…nflation

Two optimizations for the Tier 2 coverage engine, both verified to leave clean (PNG) output byte-identical to 0.4.0.

PERF: perceptual-coverage quantization built its LAB-bin histogram with np.unique(..., axis=0), whose lexsort dominated runtime on large inputs. Encode each (L,a,b) bin into one integer key (L highest-order, a/b offset into range) and dedup with a 1-D np.unique. The key is monotonic in (L,a,b), so bins/inverse/counts are bit-for-bit identical to the lexsort: output is unchanged, ~2x faster (17-gradient 4.4s -> 2.4s, 12-sticker 2.6s -> 1.1s; all corpus images byte-identical).

DENOISE (#71): on lossy (JPEG/MPO) sources, compression noise inflates the CIELAB volume so coverage emits near-duplicate colours (pigeon 17 vs the reference's ~10). A *stronger* flatten paradoxically inflates the palette further, so for lossy inputs the coverage path swaps its default bilateral-flatten for a lighter, edge-preserving bilateral pre-filter (sigma_color 0.02): pigeon 17 -> 12 colours, outlines stay continuous, SSIM unchanged (0.896 -> 0.895), bytes smaller. Gated strictly to lossy sources: clean PNG/flat inputs never enter it and are byte-identical (verified by md5 on logo/gradient/flat). The source format is re-attached after load_image's convert() drops it. coverage_denoise_lossy=False is the safe-revert.

Regression test asserts the gate is load-bearing (clean PNG identical with the option on vs off) and _is_lossy_source classification. Full suite green; ruff clean. Version 0.4.0 -> 0.4.1. Closes #71.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
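The PERF change in the commit message above swaps a row-wise np.unique(..., axis=0) for a 1-D dedup over packed integer keys. The following is a minimal sketch of that idea, not the project's actual code: the function names are hypothetical, and it offsets all three channels by their minimum (the commit only says a/b are offset), which keeps every digit non-negative so the key order matches lexicographic (L,a,b) order.

```python
import numpy as np


def dedup_rows(bins):
    """Baseline: row-wise dedup of an (N, 3) integer array of (L, a, b) bins."""
    uniq, inverse, counts = np.unique(
        bins, axis=0, return_inverse=True, return_counts=True
    )
    # Some NumPy releases return the inverse as (N, 1) when axis is given.
    return uniq, inverse.ravel(), counts


def dedup_keys(bins):
    """Pack each (L, a, b) row into one int64 key, then dedup in 1-D.

    Assumes a non-empty (N, 3) integer array whose value ranges are small
    enough that the packed key fits in int64.
    """
    bins = np.asarray(bins, dtype=np.int64)
    lo = bins.min(axis=0)                 # per-channel offset
    span = bins.max(axis=0) - lo + 1      # per-channel radix
    L, a, b = (bins - lo).T               # non-negative digits
    key = (L * span[1] + a) * span[2] + b  # L is the highest-order digit

    uniq_key, inverse, counts = np.unique(
        key, return_inverse=True, return_counts=True
    )

    # Decode the sorted keys back into (L, a, b) rows.
    rest, b_u = np.divmod(uniq_key, span[2])
    L_u, a_u = np.divmod(rest, span[1])
    uniq = np.stack([L_u, a_u, b_u], axis=1) + lo
    return uniq, inverse, counts


rng = np.random.default_rng(0)
sample = np.stack(
    [
        rng.integers(0, 20, 5000),    # L bins
        rng.integers(-12, 13, 5000),  # a bins (can be negative)
        rng.integers(-12, 13, 5000),  # b bins (can be negative)
    ],
    axis=1,
)
ref = dedup_rows(sample)
fast = dedup_keys(sample)
same = all(np.array_equal(x, y) for x, y in zip(ref, fast))
```

Because the packed key is strictly increasing in (L,a,b), sorting keys visits the bins in the same order as the row-wise sort, so the unique bins, the inverse map, and the counts should agree element for element on the sample above.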
Two optimizations for the Tier 2 coverage engine. Both leave clean (PNG) output byte-identical to 0.4.0 (verified by md5 on logo/gradient/flat).
Perf: ~2× faster coverage quantization, output unchanged
The LAB-bin histogram used np.unique(..., axis=0), whose lexsort dominated runtime on large inputs. Encode each (L,a,b) bin into one integer key (L highest-order; a/b offset into range) and dedup with a 1-D np.unique. The key is monotonic in (L,a,b), so bins/inverse/counts are bit-for-bit identical to the lexsort.
Denoise (#71): cut lossy-input palette inflation
On lossy (JPEG/MPO) sources, compression noise inflates the CIELAB volume so coverage emits near-duplicate colours (pigeon 17 vs the reference's ~10). A stronger flatten paradoxically inflates the palette further (measured), so for lossy inputs the coverage path swaps its default bilateral-flatten for a lighter edge-preserving bilateral (sigma_color=0.02):
- pigeon 17 → 12 colours, outlines stay continuous, SSIM unchanged (0.896 → 0.895), bytes smaller.
- Gated strictly to lossy sources: clean PNG/flat inputs never enter it and are byte-identical (the source format is re-attached after load_image's convert() drops it).
- coverage_denoise_lossy=False is the safe-revert.

Note: the denoise bake-off initially favoured an additive bilateral pre-filter, but end-to-end testing showed that it is redundant with the existing flatten (no effect in the real pipeline); the working fix is to replace flatten with the lighter bilateral for lossy inputs only.
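The gating described above can be sketched as a small decision function. This is an illustration under assumptions, not the project's code: is_lossy_source and pick_prefilter are hypothetical stand-ins (the real helper is _is_lossy_source, whose signature is not shown here), and the pre-filters are represented by name only. The JPEG/MPO set, the sigma_color value, and the coverage_denoise_lossy option come from the description.

```python
# Formats the description names as lossy sources.
LOSSY_FORMATS = frozenset({"JPEG", "MPO"})


def is_lossy_source(source_format):
    """Hypothetical classifier: True only for a known lossy format name."""
    return (source_format or "").upper() in LOSSY_FORMATS


def pick_prefilter(source_format, coverage_denoise_lossy=True):
    """Return (filter_name, params) for the coverage path's pre-filter.

    Lossy inputs get the lighter bilateral in place of the default flatten;
    everything else, or any input when the option is off, keeps the default.
    """
    if coverage_denoise_lossy and is_lossy_source(source_format):
        return ("bilateral", {"sigma_color": 0.02})
    return ("flatten", {})


jpeg_choice = pick_prefilter("JPEG")   # lighter bilateral
png_choice = pick_prefilter("PNG")     # default flatten
lost_format = pick_prefilter(None)     # format dropped: gate cannot fire
```

The last line shows why the format has to be re-attached after conversion: with no format recorded, a JPEG input would be classified as clean and silently fall back to the default flatten.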
Verification
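- Regression test asserts the gate is load-bearing (clean PNG identical with the option on vs off) and checks _is_lossy_source classification.
- Full suite green; ruff clean.
- z-design/.../test-v0.4.0/denoise-proto/pigeon_final_0.4.1.png.

Version 0.4.0 → 0.4.1. Closes #71.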
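The byte-identity claim for clean inputs can be checked with a digest comparison between outputs from the old and new versions. A minimal sketch, assuming two directories of rendered outputs with matching file names; the helper names and directory layout are hypothetical, not taken from the repository.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_outputs(baseline_dir, candidate_dir, names):
    """List the file names whose bytes differ between the two directories."""
    baseline_dir, candidate_dir = Path(baseline_dir), Path(candidate_dir)
    return [
        name
        for name in names
        if md5_of(baseline_dir / name) != md5_of(candidate_dir / name)
    ]
```

An empty list from changed_outputs for the clean corpus (for example the logo, gradient, and flat images) would correspond to the "byte-identical" result reported here.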