chore: refactor parse_new_files #41
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@            Coverage Diff             @@
##             main      #41      +/-   ##
==========================================
- Coverage   99.97%   99.97%   -0.01%
==========================================
  Files         101      103       +2
  Lines        6984     6952      -32
  Branches      258      220      -38
==========================================
- Hits         6982     6950      -32
  Misses          1        1
  Partials        1        1
Pull request overview
This pull request centralizes the per-provider “parse newly discovered files” loop into a shared fintl.etl.engine.parse_utils.parse_new_files helper, and updates multiple provider parsers to delegate to it. This reduces duplicated directory creation, per-file parsing loops, and storage calls across providers.
Changes:
- Added `src/fintl/etl/engine/parse_utils.py`, implementing a generic `parse_new_files` loop with optional per-file error catching and logging.
- Refactored multiple provider modules (DKB, Postbank, GLS, Scalable) to use the shared utility.
- Added unit tests for `parse_utils`, plus provider-level tests asserting parquet outputs and error propagation behavior.
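
The shared loop described above could look roughly like the sketch below. This is an illustration only: the actual signature of `parse_new_files` is not shown in this PR text, so the parameter names (`files`, `target_dir`, `parse_fn`, `store_fn`, `catch_errors`) and the returned list of failed files are assumptions.

```python
import logging
from pathlib import Path
from typing import Callable, Iterable

logger = logging.getLogger(__name__)


def parse_new_files(
    files: Iterable[Path],
    target_dir: Path,
    parse_fn: Callable[[Path], object],
    store_fn: Callable[[Path, object], None],
    catch_errors: bool = False,
) -> list[Path]:
    """Parse each file and store its result; return the files that failed.

    Hypothetical signature: the real helper's parameters may differ.
    """
    # Directory creation happens once here instead of in every provider.
    target_dir.mkdir(parents=True, exist_ok=True)
    failed: list[Path] = []
    for file in files:
        try:
            result = parse_fn(file)
        except Exception:
            if not catch_errors:
                # Default behavior: let the error propagate to the caller.
                raise
            # Opt-in behavior: log a warning and continue with the next file.
            logger.warning("Skipping %s: parsing failed", file, exc_info=True)
            failed.append(file)
            continue
        store_fn(target_dir / f"{file.stem}.parquet", result)
    return failed
```

With `catch_errors=False`, a parse failure stops the run, which matches the providers whose tests assert error propagation. With `catch_errors=True`, a bad file is logged and skipped, as described for the Postbank and Scalable PNG parsers.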
Reviewed changes
Copilot reviewed 31 out of 31 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/fintl/etl/engine/parse_utils.py | New shared parsing loop utility with optional error catching/logging and pluggable store functions. |
| src/fintl/etl/providers/dkb/credit0.py | Delegates DKB credit parsing loop to parse_utils.parse_new_files. |
| src/fintl/etl/providers/dkb/festgeld0.py | Delegates DKB festgeld parsing loop to shared utility. |
| src/fintl/etl/providers/dkb/giro0.py | Delegates DKB giro parsing loop to shared utility. |
| src/fintl/etl/providers/dkb/giro202307.py | Delegates DKB giro202307 parsing loop to shared utility. |
| src/fintl/etl/providers/dkb/giro202312.py | Delegates DKB giro202312 parsing loop to shared utility. |
| src/fintl/etl/providers/dkb/tagesgeld0.py | Delegates DKB tagesgeld parsing loop to shared utility. |
| src/fintl/etl/providers/dkb/tagesgeld202307.py | Delegates DKB tagesgeld202307 parsing loop to shared utility. |
| src/fintl/etl/providers/dkb/tagesgeld202312.py | Delegates DKB tagesgeld202312 parsing loop to shared utility. |
| src/fintl/etl/providers/postbank/giro0.py | Delegates Postbank giro parsing loop to shared utility with caught parse errors. |
| src/fintl/etl/providers/postbank/giro202305.py | Delegates Postbank giro202305 parsing loop to shared utility with caught parse errors. |
| src/fintl/etl/providers/gls/giro0.py | Delegates GLS giro parsing loop to shared utility. |
| src/fintl/etl/providers/gls/credit0.py | Delegates GLS credit parsing loop to shared utility. |
| src/fintl/etl/providers/scalable/broker0.py | Delegates Scalable broker0 HTML parsing loop to shared utility (custom store fns). |
| src/fintl/etl/providers/scalable/broker20231028.py | Delegates Scalable broker20231028 HTML parsing loop to shared utility (custom store fns). |
| src/fintl/etl/providers/scalable/broker20260309.py | Delegates Scalable broker20260309 PNG parsing loop to shared utility with error catching and warning logs. |
| tests/etl/engine/test_parse_utils.py | New unit tests covering parse-utils directory creation, storage calls, and error handling behavior. |
| tests/etl/providers/dkb/test_dkb_credit0.py | Updates mocks to patch parse_utils.store_* (moved storage calls). |
| tests/etl/providers/dkb/test_dkb_festgeld0.py | Updates mocks to patch parse_utils.store_*. |
| tests/etl/providers/dkb/test_dkb_giro0.py | Updates mocks to patch parse_utils.store_*. |
| tests/etl/providers/dkb/test_dkb_giro202307.py | Updates mocks to patch parse_utils.store_*. |
| tests/etl/providers/dkb/test_dkb_giro202312.py | Updates mocks to patch parse_utils.store_*. |
| tests/etl/providers/dkb/test_dkb_tagesgeld0.py | Updates mocks to patch parse_utils.store_*. |
| tests/etl/providers/dkb/test_dkb_tagesgeld202307.py | Updates mocks to patch parse_utils.store_*. |
| tests/etl/providers/dkb/test_dkb_tagesgeld202312.py | Updates mocks to patch parse_utils.store_*. |
| tests/etl/providers/postbank/test_postbank_giro0.py | Updates mocks to patch parse_utils.store_* (moved storage calls). |
| tests/etl/providers/postbank/test_postbank_giro202305.py | Updates mocks to patch parse_utils.store_*. |
| tests/etl/providers/gls/test_giro0.py | Adds tests verifying parse_new_files writes expected parquet outputs and propagates errors. |
| tests/etl/providers/gls/test_credit0.py | Adds tests verifying parse_new_files writes expected parquet outputs and propagates errors. |
| tests/etl/providers/scalable/test_scalable_broker0.py | Adds tests verifying parse_new_files writes expected parquet outputs and propagates errors. |
| tests/etl/providers/scalable/test_scalable_broker20231028.py | Adds tests verifying parse_new_files writes expected parquet outputs and propagates errors. |
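
The test updates in the table follow from where the storage functions are now looked up: once the loop lives in `parse_utils`, a mock has to replace the name in that module rather than in the provider module. The sketch below illustrates this with stand-in code. The module is built at runtime so the example is self-contained, and the bodies of `store_balance` and `parse_new_files` here are hypothetical, not the project's implementation.

```python
import types
from unittest import mock

# Stand-in for the shared module; the source below is illustrative only.
_SOURCE = '''
def store_balance(path, data):
    raise RuntimeError("would write to disk")

def parse_new_files(files, parse_fn):
    for f in files:
        # Resolved in this module's globals at call time.
        store_balance(f, parse_fn(f))
'''
parse_utils = types.ModuleType("parse_utils_demo")
exec(_SOURCE, parse_utils.__dict__)


def provider_parse_new_files(files):
    """Hypothetical provider wrapper that only delegates to the helper."""
    return parse_utils.parse_new_files(files, parse_fn=str.upper)


# Patch the name on the module that actually calls it.
with mock.patch.object(parse_utils, "store_balance") as fake_store:
    provider_parse_new_files(["a", "b"])

fake_store.assert_has_calls([mock.call("a", "A"), mock.call("b", "B")])
```

Patching the same name on the provider module would leave the real storage function in place, because the provider no longer calls it directly.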
    ServiceEnum,
)
from fintl.etl.io.files.balances import store_balance
from fintl.etl.engine import parse_utils
This pull request refactors the provider parsers by introducing a shared utility function for parsing new files. The loop previously repeated in each parser is replaced with a call to this utility, which improves maintainability and reduces duplication.
Key changes:
Introduction of shared parsing utility:
- Added `parse_utils.py` in `src/fintl/etl/engine`, containing a generic `parse_new_files` function that handles parsing, error catching, and result storage for new files. This utility centralizes the logic previously duplicated across the parsers.
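
To show how a provider module can shrink to a thin wrapper around the shared function, here is a minimal sketch with a default store function and a provider-specific one. All names in it (`store_default`, `store_html_tables`, `parse_rows`, and the simplified `parse_new_files` signature) are hypothetical stand-ins. The stores record their calls in a list instead of writing parquet files.

```python
from functools import partial
from pathlib import Path
from typing import Callable, Iterable

StoreFn = Callable[[Path, list], None]

# Records (store kind, file name) so the example has no filesystem side effects.
written: list[tuple[str, str]] = []


def store_default(target: Path, rows: list) -> None:
    written.append(("default", target.name))


def store_html_tables(target: Path, rows: list) -> None:
    # Hypothetical custom store, standing in for a provider-specific writer.
    written.append(("custom", target.name))


def parse_new_files(
    files: Iterable[Path],
    parse_fn: Callable[[Path], list],
    store_fn: StoreFn = store_default,
) -> None:
    # Simplified stand-in for the shared helper; the real signature may differ.
    for file in files:
        store_fn(file.with_suffix(".parquet"), parse_fn(file))


def parse_rows(file: Path) -> list:
    # Placeholder parser: a real provider would read and normalize the file.
    return [file.stem]


# Each provider supplies only what differs from the shared loop.
dkb_parse_new_files = partial(parse_new_files, parse_fn=parse_rows)
scalable_parse_new_files = partial(
    parse_new_files, parse_fn=parse_rows, store_fn=store_html_tables
)

dkb_parse_new_files([Path("giro.csv")])
scalable_parse_new_files([Path("broker.html")])
```

Keeping the store function as a parameter is one way to support providers with custom storage without adding branches to the shared loop.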