
feat: improve TMDB matching with Filmweb original titles and years #31

Merged
mfrszpiotro merged 10 commits into main from feat/filmweb-original-titles on May 9, 2026

mfrszpiotro commented May 8, 2026

Summary

  • Extracts original_title and production year from Filmweb showtimes pages.
  • Prioritizes original_title for TMDB searches to improve accuracy for international films.
  • Uses production year in TMDB searches to filter results.
  • Implements a +/- 1 year search range in TMDBScraper to handle release date discrepancies between Filmweb and TMDB.
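
The matching strategy described above could look roughly like the sketch below. It is not the code from this PR: `find_match`, `candidate_years`, and the `search` callable are hypothetical names, and the order in which neighbouring years are tried is an assumption. The `search` argument stands in for whatever TMDBScraper uses to query TMDB.

```python
from typing import Callable, List, Optional

# Hypothetical stand-in for a TMDB search call: (query, year) -> list of results.
SearchFn = Callable[[str, Optional[int]], List[dict]]


def candidate_years(year: Optional[int]) -> List[Optional[int]]:
    """Years to try: the exact year first, then one year either side."""
    if year is None:
        return [None]  # no year known, so search without a year filter
    return [year, year - 1, year + 1]


def find_match(
    search: SearchFn,
    title: str,
    original_title: Optional[str] = None,
    year: Optional[int] = None,
) -> Optional[dict]:
    """Return the first result found, preferring the original title as query."""
    query = original_title or title  # fall back to the local title if needed
    for candidate in candidate_years(year):
        results = search(query, candidate)
        if results:
            return results[0]
    return None


# Usage with a fake catalogue (made-up data, no network access).
def fake_search(query: str, year: Optional[int]) -> List[dict]:
    catalogue = {("Original Title", 2021): [{"id": 1, "title": "Original Title"}]}
    return catalogue.get((query, year), [])


match = find_match(
    fake_search, "Tytuł polski", original_title="Original Title", year=2022
)
```

Here the source lists the film under 2022 while the fake catalogue has 2021, so the exact-year search finds nothing and the match comes from the year - 1 attempt.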

Test Plan

  • Unit tests for Filmweb extraction logic with mock HTML (scraper/tests/test_filmweb_unit.py).
  • Unit tests for TMDB range search logic (scraper/tests/test_tmdb.py).
  • Integration test verifying original_title and year presence in scraped data (scraper/tests/test_filmweb.py).
  • All scraper tests passing (pytest scraper/tests/).
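
As an illustration of the mock-HTML approach, a unit test along these lines is one possibility. The markup, the `data-*` attribute names, and `extract_films` are invented for this sketch and do not reflect Filmweb's real page structure or the scraper's actual API; only the standard library `html.parser` module is used.

```python
from html.parser import HTMLParser

# Made-up markup for illustration; real Filmweb pages are structured differently.
MOCK_HTML = """
<div class="film" data-title="Tytuł polski"
     data-original-title="Original Title" data-year="2021"></div>
<div class="film" data-title="Inny film"></div>
"""


class FilmParser(HTMLParser):
    """Collects title, original_title, and year from hypothetical film nodes."""

    def __init__(self):
        super().__init__()
        self.films = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag != "div" or attr.get("class") != "film":
            return
        year = attr.get("data-year")
        self.films.append(
            {
                "title": attr.get("data-title"),
                "original_title": attr.get("data-original-title"),
                # Missing or non-numeric years become None instead of raising.
                "year": int(year) if year and year.isdigit() else None,
            }
        )


def extract_films(html: str) -> list:
    parser = FilmParser()
    parser.feed(html)
    return parser.films


def test_extracts_original_title_and_year():
    films = extract_films(MOCK_HTML)
    assert films[0]["original_title"] == "Original Title"
    assert films[0]["year"] == 2021
    # Entries without the optional fields still parse.
    assert films[1]["original_title"] is None
    assert films[1]["year"] is None
```

Treating both fields as optional keeps the test honest about listings that carry only a local title.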

mfrszpiotro linked an issue on May 8, 2026 that may be closed by this pull request
mfrszpiotro self-assigned this on May 8, 2026

mfrszpiotro merged commit 990ac4f into main on May 9, 2026
4 checks passed


Development

Successfully merging this pull request may close these issues.

improve filmweb scraper matching - "Vanishing"/"Zniknięcie" case
