chore: refactor parse_new_files #41

Merged: eschmidt42 merged 4 commits into main from chore/refactor-parse_new_files on Jun 8, 2026

eschmidt42 (Owner) commented Jun 8, 2026

This pull request refactors the provider parsers by introducing a shared utility function for parsing new files. The code previously repeated in each parser is replaced with calls to this utility, which reduces duplication and improves maintainability.

Key changes:

Introduction of shared parsing utility:

  • Added a new module parse_utils.py in src/fintl/etl/engine containing a generic parse_new_files function that handles parsing, error catching, and result storage for new files. This utility centralizes the logic previously duplicated across different parsers.
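
The shared loop described above might look roughly like the following. This is a minimal sketch, not the code merged in this PR: the signature, the `catch_errors` flag, the return value, and the `parse` and `store` callables are assumptions made for illustration. Only the function name `parse_new_files` and its responsibilities (parsing, error catching, result storage) come from the description.

```python
import logging
from pathlib import Path
from typing import Callable, Iterable, TypeVar

logger = logging.getLogger(__name__)

T = TypeVar("T")


def parse_new_files(
    files: Iterable[Path],
    target_dir: Path,
    parse: Callable[[Path], T],
    store: Callable[[T, Path], None],
    catch_errors: bool = False,
) -> list[Path]:
    """Parse each file and store its result; return the files that failed.

    Hypothetical signature: the real helper in parse_utils.py may differ.
    """
    # Create the output directory once instead of in every provider.
    target_dir.mkdir(parents=True, exist_ok=True)
    failed: list[Path] = []
    for path in files:
        try:
            result = parse(path)
        except Exception as exc:
            if not catch_errors:
                raise  # default: let the caller see the failure
            logger.warning("Failed to parse %s: %s", path, exc)
            failed.append(path)
            continue
        # The ".out" suffix is a placeholder for whatever format is stored.
        store(result, target_dir / f"{path.stem}.out")
    return failed
```

Passing `parse` and `store` as callables keeps the loop independent of any one provider; each parser would only supply its own parsing function and storage function.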

codecov (Bot) commented Jun 8, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.97%. Comparing base (890e5c6) to head (01c9848).

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #41      +/-   ##
==========================================
- Coverage   99.97%   99.97%   -0.01%     
==========================================
  Files         101      103       +2     
  Lines        6984     6952      -32     
  Branches      258      220      -38     
==========================================
- Hits         6982     6950      -32     
  Misses          1        1              
  Partials        1        1              


Copilot AI (Contributor) left a comment

Pull request overview

This pull request centralizes the per-provider “parse newly discovered files” loop into a shared fintl.etl.engine.parse_utils.parse_new_files helper, and updates multiple provider parsers to delegate to it. This reduces duplicated directory creation, per-file parsing loops, and storage calls across providers.

Changes:

  • Added src/fintl/etl/engine/parse_utils.py implementing a generic parse_new_files loop with optional per-file error catching and logging.
  • Refactored multiple provider modules (DKB, Postbank, GLS, Scalable) to use the shared utility.
  • Added unit tests for parse_utils, plus provider-level tests asserting parquet outputs and error propagation behavior.
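
The two error-handling modes mentioned in the changes (propagating a parse error versus catching and logging it) can be exercised with the standard library alone. The sketch below is illustrative only: `run_loop` is a hypothetical stand-in for the shared helper, not its actual API, and the mocks replace the parse and store callables.

```python
import logging
from unittest import mock

logger = logging.getLogger("parse_demo")


def run_loop(paths, parse, store, catch_errors=False):
    """Hypothetical stand-in for a shared parse loop (not the fintl API)."""
    for path in paths:
        try:
            result = parse(path)
        except Exception:
            if not catch_errors:
                raise  # propagate to the caller
            logger.warning("skipping %s", path, exc_info=True)
            continue
        store(result)


def check_propagation():
    """Without catch_errors, the first parse failure reaches the caller."""
    parse = mock.Mock(side_effect=ValueError("bad file"))
    store = mock.Mock()
    try:
        run_loop(["a.csv"], parse, store)
    except ValueError:
        store.assert_not_called()
        return True
    return False


def check_catching():
    """With catch_errors, failures are skipped and good files are stored."""
    # The first call raises, the second call returns a parsed result.
    parse = mock.Mock(side_effect=[ValueError("bad file"), {"rows": 3}])
    store = mock.Mock()
    run_loop(["bad.csv", "good.csv"], parse, store, catch_errors=True)
    store.assert_called_once_with({"rows": 3})
    return parse.call_count
```

In a pytest suite, `check_propagation` would more naturally be written with `pytest.raises(ValueError)`; the plain `try`/`except` form is used here only to keep the sketch dependency-free.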

Reviewed changes

Copilot reviewed 31 out of 31 changed files in this pull request and generated 2 comments.

Show a summary per file
| File | Description |
| --- | --- |
| src/fintl/etl/engine/parse_utils.py | New shared parsing loop utility with optional error catching/logging and pluggable store functions. |
| src/fintl/etl/providers/dkb/credit0.py | Delegates DKB credit parsing loop to parse_utils.parse_new_files. |
| src/fintl/etl/providers/dkb/festgeld0.py | Delegates DKB festgeld parsing loop to shared utility. |
| src/fintl/etl/providers/dkb/giro0.py | Delegates DKB giro parsing loop to shared utility. |
| src/fintl/etl/providers/dkb/giro202307.py | Delegates DKB giro202307 parsing loop to shared utility. |
| src/fintl/etl/providers/dkb/giro202312.py | Delegates DKB giro202312 parsing loop to shared utility. |
| src/fintl/etl/providers/dkb/tagesgeld0.py | Delegates DKB tagesgeld parsing loop to shared utility. |
| src/fintl/etl/providers/dkb/tagesgeld202307.py | Delegates DKB tagesgeld202307 parsing loop to shared utility. |
| src/fintl/etl/providers/dkb/tagesgeld202312.py | Delegates DKB tagesgeld202312 parsing loop to shared utility. |
| src/fintl/etl/providers/postbank/giro0.py | Delegates Postbank giro parsing loop to shared utility with caught parse errors. |
| src/fintl/etl/providers/postbank/giro202305.py | Delegates Postbank giro202305 parsing loop to shared utility with caught parse errors. |
| src/fintl/etl/providers/gls/giro0.py | Delegates GLS giro parsing loop to shared utility. |
| src/fintl/etl/providers/gls/credit0.py | Delegates GLS credit parsing loop to shared utility. |
| src/fintl/etl/providers/scalable/broker0.py | Delegates Scalable broker0 HTML parsing loop to shared utility (custom store fns). |
| src/fintl/etl/providers/scalable/broker20231028.py | Delegates Scalable broker20231028 HTML parsing loop to shared utility (custom store fns). |
| src/fintl/etl/providers/scalable/broker20260309.py | Delegates Scalable broker20260309 PNG parsing loop to shared utility with error catching and warning logs. |
| tests/etl/engine/test_parse_utils.py | New unit tests covering parse-utils directory creation, storage calls, and error handling behavior. |
| tests/etl/providers/dkb/test_dkb_credit0.py | Updates mocks to patch parse_utils.store_* (moved storage calls). |
| tests/etl/providers/dkb/test_dkb_festgeld0.py | Updates mocks to patch parse_utils.store_*. |
| tests/etl/providers/dkb/test_dkb_giro0.py | Updates mocks to patch parse_utils.store_*. |
| tests/etl/providers/dkb/test_dkb_giro202307.py | Updates mocks to patch parse_utils.store_*. |
| tests/etl/providers/dkb/test_dkb_giro202312.py | Updates mocks to patch parse_utils.store_*. |
| tests/etl/providers/dkb/test_dkb_tagesgeld0.py | Updates mocks to patch parse_utils.store_*. |
| tests/etl/providers/dkb/test_dkb_tagesgeld202307.py | Updates mocks to patch parse_utils.store_*. |
| tests/etl/providers/dkb/test_dkb_tagesgeld202312.py | Updates mocks to patch parse_utils.store_*. |
| tests/etl/providers/postbank/test_postbank_giro0.py | Updates mocks to patch parse_utils.store_* (moved storage calls). |
| tests/etl/providers/postbank/test_postbank_giro202305.py | Updates mocks to patch parse_utils.store_*. |
| tests/etl/providers/gls/test_giro0.py | Adds tests verifying parse_new_files writes expected parquet outputs and propagates errors. |
| tests/etl/providers/gls/test_credit0.py | Adds tests verifying parse_new_files writes expected parquet outputs and propagates errors. |
| tests/etl/providers/scalable/test_scalable_broker0.py | Adds tests verifying parse_new_files writes expected parquet outputs and propagates errors. |
| tests/etl/providers/scalable/test_scalable_broker20231028.py | Adds tests verifying parse_new_files writes expected parquet outputs and propagates errors. |
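
Several test files above change their mock targets to `parse_utils.store_*` because the storage calls moved. The general rule in `unittest.mock` is that a name has to be patched in the module where it is looked up, not where it is defined. The sketch below demonstrates this with two throwaway in-memory modules; `demo_io`, `demo_parse_utils`, and `run_with_patch` are hypothetical names used only for illustration and are not part of fintl.

```python
import sys
import types
from unittest import mock

# Hypothetical stand-in for the module that defines the storage function.
io_mod = types.ModuleType("demo_io")
io_mod.store_balance = lambda value: f"stored {value}"
sys.modules["demo_io"] = io_mod

# Hypothetical stand-in for the shared utility module. It binds its own
# reference to store_balance at import time via "from ... import ...".
utils_mod = types.ModuleType("demo_parse_utils")
exec(
    "from demo_io import store_balance\n"
    "def parse_new_files(values):\n"
    "    return [store_balance(v) for v in values]\n",
    utils_mod.__dict__,
)
sys.modules["demo_parse_utils"] = utils_mod


def run_with_patch(target):
    """Patch `target`, run the loop, and report what the loop returned."""
    with mock.patch(target, return_value="mocked"):
        return utils_mod.parse_new_files([1])
```

Patching `demo_io.store_balance` leaves the loop untouched, since `demo_parse_utils` already holds its own reference; patching `demo_parse_utils.store_balance` is what actually replaces the call made inside the loop.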


Comment thread src/fintl/etl/engine/parse_utils.py
ServiceEnum,
)
from fintl.etl.io.files.balances import store_balance
from fintl.etl.engine import parse_utils
eschmidt42 merged commit c28f8ca into main on Jun 8, 2026
4 checks passed
eschmidt42 deleted the chore/refactor-parse_new_files branch on June 8, 2026 at 12:19

2 participants