refactor: optimize image handling and simplify module architecture #6
…tching
- Add calculateTextOverlap() to match Readability output fingerprints against CMS containers
- Refactor findBestContentContainer() to use hybrid approach: text overlap + image count tiebreaker
- Improves container detection on multi-column and gallery-heavy pages by better distinguishing article containers from sidebars
- Add includeImages option to extractContent() and skip fallback when disabled for performance
- Update content-script.js to pass includeImages setting through extraction pipeline
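The hybrid approach in this commit message can be sketched as follows. This is an illustration only, not the code from the PR: the token-set overlap metric, the `tokenize` helper, and the candidate shape (`id`, `text`, `imageCount`) are assumptions made so the example runs outside a browser.

```javascript
// Hypothetical sketch: function names mirror the commit message,
// but the bodies and data shapes are assumptions.

// Split text into a set of lowercase alphanumeric tokens.
function tokenize(text) {
  return new Set(text.toLowerCase().match(/[a-z0-9]+/g) || []);
}

// Fraction of the reference (Readability) tokens that also occur in the candidate.
function calculateTextOverlap(referenceText, candidateText) {
  const reference = tokenize(referenceText);
  if (reference.size === 0) return 0;
  const candidate = tokenize(candidateText);
  let shared = 0;
  for (const token of reference) {
    if (candidate.has(token)) shared += 1;
  }
  return shared / reference.size;
}

// Pick the candidate with the highest overlap; break ties on image count.
function findBestContentContainer(referenceText, candidates) {
  let best = null;
  for (const candidate of candidates) {
    const overlap = calculateTextOverlap(referenceText, candidate.text);
    if (
      best === null ||
      overlap > best.overlap ||
      (overlap === best.overlap && candidate.imageCount > best.imageCount)
    ) {
      best = { ...candidate, overlap };
    }
  }
  return best;
}
```

With this scoring, a sidebar that holds many images but little of the article text loses to the article container, and the image count only matters when two containers match the text equally well.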
Consolidate image-related functions and eliminate unnecessary module layers:
- Replace separate extractImagesFromHtml() and countImagesInHtml() with unified analyzeImagesInHtml() for single-pass extraction
- Move image analysis into extractContent() to pre-populate images in extraction result (eliminates redundant parsing)
- Unify generateFilename() and generateSubfolder() into single generateBaseName() function
- Remove redundant platform.js wrapper layer, use EzyCopyFiles directly in content-script.js
- Update content-script.js to consume pre-analyzed images from extraction result instead of extracting separately
- Remove generatePageSubfolder() from file-helpers.js (replaced by generateBaseName)
- Remove lib/platform.js from injection list
- Update documentation to reflect simplified architecture
- Fix popup.js toggle synchronization to handle both checkboxes

This reduces code complexity, eliminates redundant DOM parsing, and streamlines the data flow.
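To illustrate the single-pass idea behind analyzeImagesInHtml(), here is a minimal sketch. It assumes a return shape of `{ images, count }` and uses regular expressions so it can run without a DOM; the extension itself may well parse the HTML differently, and the `src`/`alt` fields are illustrative.

```javascript
// Hypothetical sketch: one pass over the HTML string yields both the image
// list and the count, so callers do not need to scan the same markup twice.
function analyzeImagesInHtml(html) {
  const images = [];
  for (const [tag] of html.matchAll(/<img\b[^>]*>/gi)) {
    // Require whitespace before the attribute name so data-src is not matched.
    const src = /\ssrc\s*=\s*(["'])(.*?)\1/i.exec(tag);
    if (!src || src[2] === "") continue; // skip images without a usable source
    const alt = /\salt\s*=\s*(["'])(.*?)\1/i.exec(tag);
    images.push({ src: src[2], alt: alt ? alt[2] : "" });
  }
  // The count is derived from the same list, so the two cannot disagree.
  return { images, count: images.length };
}
```

Deriving the count from the collected list, rather than counting tags separately, keeps the two values consistent when some `<img>` tags are skipped.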
Claude finished @gupsammy's task
Code Review
Review Summary
This PR looks good to ship after fixing the image count bug below. The refactoring achieves its goal of eliminating redundant HTML parsing while simplifying the module architecture.
CRITICAL Issues
1. Image count calculation bug in
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Claude encountered an error
Re-reviewing PR Changes
✅ Critical bug fixed! Line 91 now correctly returns
Claude encountered an error
I'll analyze this and get back to you.
Claude finished @gupsammy's task
Re-reviewing PR After Latest Push
Final Review: ✅ Ready to Merge
The critical bug has been fixed correctly. Line 91 in lib/ezycopy.js now returns
Verification Summary
Critical bug fix confirmed:
Architecture review:
No critical/high issues found. The refactoring achieves its performance goals without introducing bugs or Chrome Extension API violations.
Summary
This refactor consolidates image handling logic and eliminates unnecessary module abstraction layers. The primary improvement is moving image analysis into extractContent() to return pre-analyzed image data, eliminating redundant HTML parsing when handling images later.
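The data flow described here might look roughly like the sketch below. The signature of extractContent(), the result fields (`html`, `images`, `imageCount`), the simplified analyzer stand-in, and the `collectDownloadUrls` helper are all assumptions for illustration; only the function names extractContent and analyzeImagesInHtml and the includeImages option come from the PR.

```javascript
// Hypothetical, simplified stand-in for the unified image analyzer.
function analyzeImagesInHtml(html) {
  const images = Array.from(
    html.matchAll(/<img\s+src="([^"]+)"[^>]*>/gi),
    (match) => ({ src: match[1] })
  );
  return { images, count: images.length };
}

// Hypothetical extraction step: image analysis happens once, here, and the
// result carries the pre-analyzed images. With includeImages disabled the
// analysis is skipped entirely.
function extractContent(html, { includeImages = true } = {}) {
  const result = { html, images: [], imageCount: 0 };
  if (includeImages) {
    const { images, count } = analyzeImagesInHtml(html);
    result.images = images;
    result.imageCount = count;
  }
  return result;
}

// Hypothetical downstream consumer: it reads the pre-analyzed list
// instead of parsing the HTML a second time.
function collectDownloadUrls(extraction) {
  return extraction.images.map((image) => image.src);
}
```

A caller such as the content script would then pass its includeImages setting once and hand the same result object to every later step.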
Key Changes
Files Changed
This reduces complexity, improves performance by eliminating redundant parsing, and streamlines the data flow from extraction to download.