Add bundle support #9
all_dataframes: list[pd.DataFrame] = []
for bundle_link in bundles.links:
    dfs = self.one(bundle_link)
    all_dataframes.extend(dfs)
Using extend on DataFrame instead of append
The all method calls all_dataframes.extend(dfs), where dfs is the return value of one_bundle, which returns a single pd.DataFrame. Passing a DataFrame to list.extend iterates over the DataFrame, which yields its column labels rather than the DataFrame itself. all_dataframes would therefore be populated with column names instead of DataFrames. The code likely intended to use append instead of extend.
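A minimal sketch of the difference, assuming the helper returns a single pd.DataFrame as the comment describes. The column names and variable names here are hypothetical and not taken from the PR.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame returned by the bundle helper.
df = pd.DataFrame({"price": [1.0, 2.0], "load": [3.0, 4.0]})

# list.extend iterates its argument; iterating a DataFrame yields column labels.
extended: list = []
extended.extend(df)

# list.append stores the DataFrame object itself.
appended: list[pd.DataFrame] = []
appended.append(df)

print(extended)       # column labels, not DataFrames
print(len(appended))  # one DataFrame stored
```

If the helper actually returns a list of DataFrames, extend would be the correct call, so the fix depends on the helper's real return type.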
        io.BytesIO(inner_file.read()),
        inner_name,
    )
)
Double context manager closes ZipFile prematurely
The _extract_multiple_files_from_zip method wraps the passed-in zfile parameter in a with zfile as zf: block, which closes the ZipFile on exit. However, the caller at lines 226-228 already manages the same object with its own with ZipFile(bytes_io) as zfile: context manager, so the ZipFile is closed twice. Python's ZipFile handles a double close gracefully, but the pattern is error-prone because the caller can no longer read from the archive once the helper returns. The inner with statement is unnecessary.
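The ownership problem can be sketched with the standard library alone. The helper names names_closing and names_borrowing, and the archive contents, are hypothetical illustrations rather than code from the PR.

```python
import io
import zipfile


def build_zip() -> bytes:
    """Build a small in-memory archive for the demonstration."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("a.csv", "x\n1\n")
    return buf.getvalue()


def names_closing(zfile: zipfile.ZipFile) -> list[str]:
    # Re-entering the caller's ZipFile as a context manager closes it on exit.
    with zfile as zf:
        return zf.namelist()


def names_borrowing(zfile: zipfile.ZipFile) -> list[str]:
    # Uses the archive without taking ownership; the caller closes it.
    return zfile.namelist()


payload = build_zip()

with zipfile.ZipFile(io.BytesIO(payload)) as zfile:
    names_closing(zfile)
    try:
        zfile.read("a.csv")
        usable_after_closing_helper = True
    except ValueError:
        # Reading from an already closed archive raises ValueError.
        usable_after_closing_helper = False

with zipfile.ZipFile(io.BytesIO(payload)) as zfile:
    names_borrowing(zfile)
    content = zfile.read("a.csv")  # still open, so the read succeeds
```

Leaving the with statement to the caller keeps a single owner for the archive's lifetime.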
    df = pd.read_csv(io.BytesIO(content), compression="zip")
    results.append(df)

df = pd.concat(results)
pd.concat on empty list raises ValueError
When keep_docs is empty after date filtering (no documents match the date range criteria), the results list remains empty. Calling pd.concat(results) on an empty list raises a ValueError: No objects to concatenate. This affects get_system_wide_actuals_docs and _get_reports. Unlike similar patterns earlier in the file that return an empty DataFrame when no docs are found, these methods proceed to concatenate without checking if results is empty.
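One possible guard, sketched under the assumption that returning an empty DataFrame is the desired behavior (as the comment says earlier code in the file does). The helper name concat_or_empty is hypothetical.

```python
import pandas as pd


def concat_or_empty(frames: list[pd.DataFrame]) -> pd.DataFrame:
    """Concatenate frames, or return an empty DataFrame when there are none."""
    if not frames:
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)


# pd.concat on an empty list raises ValueError.
try:
    pd.concat([])
    raised = False
except ValueError:
    raised = True

empty = concat_or_empty([])
combined = concat_or_empty([pd.DataFrame({"a": [1]}), pd.DataFrame({"a": [2]})])
```

Whether ignore_index=True is appropriate depends on how the real methods use the index, which the diff excerpt does not show.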
Note
Introduces bundle downloads and refactors historical retrieval to support multi-file archives and MIS documents.
- ERCOTArchiveBundle with bundles(), one_bundle(), and all() to fetch and extract large archive bundles
- ERCOTArchive now uses ERCOT.make_request; fetch_historical returns a dict of DataFrames and parses nested zip-of-zips; ArchiveLink converted to @define
- ERCOTBase.make_request for authenticated httpx calls reused by archive/bundle helpers
- get_60d_sced_disclosure, get_system_wide_actuals_docs, get_state_estimator_load_report, get_state_estimator_dc_ties_flows_report
- get_dc_tie_flows → get_se_dc_tie_flows; replace get_total_generation with get_se_load; get_system_wide_actuals and 60-day DAM/SCED disclosures now aggregate/return parsed MIS/archive data (DAM returns mapped dict of files)
- examples/demo.py and notebooks
- returns dep and ty in dev; minor README/typos and small transform cleanup

Written by Cursor Bugbot for commit 937a0f9. This will update automatically on new commits.