chore(deps): update trafilatura requirement from >=1.0.0 to >=2.0.0 #32

Open
dependabot[bot] wants to merge 1 commit into main from dependabot/pip/trafilatura-gte-2.0.0

@dependabot dependabot bot commented on behalf of github Apr 13, 2026

Updates the requirements on trafilatura to permit the latest version.
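
In practice the change raises the minimum accepted version, since `>=1.0.0` already admitted 2.0.0. A minimal sketch of that comparison follows. The helpers `parse_version` and `satisfies_minimum` are hypothetical names used only for illustration, and they handle plain dotted releases only; real installers apply the full version-specifier rules.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a plain dotted release such as '1.12.2' into a tuple of ints.

    Simplified on purpose: pre-releases and other specifier forms are not handled.
    """
    return tuple(int(part) for part in text.split("."))


def satisfies_minimum(version: str, minimum: str) -> bool:
    """Return True when `version` meets a '>=minimum' requirement."""
    return parse_version(version) >= parse_version(minimum)


# Compare the old and new floors against two releases named in the notes.
for candidate in ("1.12.2", "2.0.0"):
    old_ok = satisfies_minimum(candidate, "1.0.0")
    new_ok = satisfies_minimum(candidate, "2.0.0")
    print(f"{candidate}: >=1.0.0 -> {old_ok}, >=2.0.0 -> {new_ok}")
```

Under the new requirement, an environment pinned to a 1.x release would no longer resolve, so the breaking changes in the release notes apply to every install.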

Release notes

Sourced from trafilatura's releases.

trafilatura-2.0.0

Breaking changes:

  • Python 3.6 and 3.7 deprecated (#709)
  • bare_extraction():
    • now returns an instance of the Document class by default
    • as_dict deprecation warning → use .as_dict() method on return value (#730)
  • bare_extraction() and extract(): no_fallback deprecation warning → use fast instead (#730)
  • downloads: remove decode argument in fetch_url() → use fetch_response instead (#724)
  • deprecated graphical user interface now removed (#713)
  • extraction: move max_tree_size parameter to settings.cfg (#742)
  • use type hinting (#721, #723, #748)
  • see Python and CLI deprecations in the docs
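
For reviewers gauging how this bump affects calling code, the keyword changes to `bare_extraction()` listed above can be sketched as a small translation step. This is an illustration only: `translate_bare_extraction_kwargs` is a hypothetical helper, not part of trafilatura. It assumes that `fast` carries the meaning `no_fallback` had, and that 1.x callers received a dict unless they passed `as_dict=False`.

```python
from typing import Any, Dict, Tuple


def translate_bare_extraction_kwargs(
    kwargs: Dict[str, Any],
) -> Tuple[Dict[str, Any], bool]:
    """Map 1.x-style keyword arguments onto the 2.0.0 names.

    Hypothetical helper for illustration; it is not part of trafilatura.
    Returns the updated kwargs and a flag telling the caller whether to
    call .as_dict() on the returned Document.
    """
    updated = dict(kwargs)  # work on a copy, leave the caller's dict untouched

    # no_fallback is deprecated in 2.0.0; the notes say to use fast instead.
    # An explicit fast value, if present, takes precedence.
    if "no_fallback" in updated:
        legacy_value = updated.pop("no_fallback")
        updated.setdefault("fast", legacy_value)

    # as_dict is deprecated; conversion moves to a method on the result.
    # Assumption: 1.x returned a dict by default, so the flag defaults to True.
    wants_dict = bool(updated.pop("as_dict", True))
    return updated, wants_dict


new_kwargs, wants_dict = translate_bare_extraction_kwargs(
    {"no_fallback": True, "as_dict": True, "include_comments": False}
)
print(new_kwargs)
print(wants_dict)
```

With trafilatura 2.0.0 installed, the result would be used roughly as `doc = bare_extraction(html, **new_kwargs)`, followed by `doc.as_dict()` when `wants_dict` is true.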

Fixes:

  • set options.source before raising error on empty doc tree by @dmoklaf (#707)
  • robust encoding in options.source (#717)
  • more robust mapping for conversion to HTML (#721)
  • CLI downloads: use all information in settings file (#734)
  • downloads: cleaner urllib3 code (#736)
  • refine table markdown output by @unsleepy22 (#752)
  • extraction fix: images in text nodes by @unsleepy22 (#757)

Metadata:

  • more robust URL extraction (#710)

Command-line interface:

  • CLI: print URLs early for feeds and sitemaps with --list with @gremid (#744)
  • CLI: add 126 exit code for high error ratio (#747)

Maintenance:

  • remove already deprecated functions and args (#716)
  • add type hints (#723, #728)
  • setup: use pyproject.toml file (#715)
  • simplify code (#708, #709, #727)
  • better debug messages in main_extractor (#714)
  • evaluation: review data, update packages, add magic_html (#731)
  • setup: explicit exports through __all__ (#740)
  • tests: extend coverage (#753)

Documentation:

Changelog

Sourced from trafilatura's changelog.

2.0.0

Same entries as the trafilatura-2.0.0 release notes above.

1.12.2

  • downloads: add support for SOCKS proxies with @gremid (#682)
  • extraction fix: ValueError in table spans (#685)

... (truncated)

Commits

@dependabot dependabot bot added the dependencies (Pull requests that update a dependency file) and python (Pull requests that update python code) labels Apr 13, 2026
@dependabot dependabot bot requested a review from dinesh-git17 as a code owner April 13, 2026 14:05
Updates the requirements on [trafilatura](https://github.com/adbar/trafilatura) to permit the latest version.
- [Release notes](https://github.com/adbar/trafilatura/releases)
- [Changelog](https://github.com/adbar/trafilatura/blob/master/HISTORY.md)
- [Commits](adbar/trafilatura@v1.0.0...v2.0.0)

---
updated-dependencies:
- dependency-name: trafilatura
  dependency-version: 2.0.0
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot bot force-pushed the dependabot/pip/trafilatura-gte-2.0.0 branch from c7e985e to 07b034e April 13, 2026 22:34
