@simonselbig

Adds the contributions from the AMOS WS2025 team (amos2025ws03-rtdip-timeseries-forecasting) to the RTDIP SDK:

  • Data Manipulation (pandas & spark): Chronological sort, cyclical encoding, datetime features, lag features, MAD outlier detection, rolling statistics, and more
  • Forecasting Models (spark): Prophet, LSTM, XGBoost, CatBoost, AutoGluon time series models + prediction evaluation utilities
  • Decomposition (pandas & spark): Classical, STL, and MSTL decomposition methods
  • Anomaly Detection (spark): IQR-based and MAD-based anomaly detection
  • Visualization (matplotlib & plotly): Forecasting, decomposition, anomaly detection, and model comparison plots
  • Data Sources: Azure Blob storage source

All components include tests and documentation.
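For reviewers unfamiliar with two of the data manipulation techniques listed above, here is a minimal pandas sketch of MAD outlier detection and cyclical encoding. It does not use the RTDIP SDK classes from this PR: the helper names `mad_outlier_mask` and `add_cyclical_hour`, the column names, the 3.5 threshold, and the sample data are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def mad_outlier_mask(values: pd.Series, threshold: float = 3.5) -> pd.Series:
    """Flag points whose modified z-score (based on the median absolute deviation) exceeds threshold."""
    median = values.median()
    mad = (values - median).abs().median()
    if mad == 0:
        # At least half of the points equal the median; the score is undefined, so flag nothing.
        return pd.Series(False, index=values.index)
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score for normal data.
    modified_z = 0.6745 * (values - median) / mad
    return modified_z.abs() > threshold


def add_cyclical_hour(df: pd.DataFrame, column: str = "timestamp") -> pd.DataFrame:
    """Encode the hour of day as sine/cosine so that 23:00 and 00:00 end up close together."""
    hours = df[column].dt.hour
    out = df.copy()
    out["hour_sin"] = np.sin(2 * np.pi * hours / 24)
    out["hour_cos"] = np.cos(2 * np.pi * hours / 24)
    return out


# Hypothetical sample data: one reading every three hours with a single spike.
df = pd.DataFrame(
    {
        "timestamp": pd.date_range("2026-01-01", periods=8, freq=pd.Timedelta(hours=3)),
        "value": [10.0, 10.2, 9.9, 10.1, 55.0, 10.0, 9.8, 10.3],
    }
)
df = df.sort_values("timestamp")  # chronological sort before any time-based feature
df["is_outlier"] = mad_outlier_mask(df["value"])
df = add_cyclical_hour(df)
```

The median and MAD are used instead of the mean and standard deviation because a single large spike barely moves them, so the spike cannot mask itself.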

Environment Changes

Added ML/forecasting dependencies: tensorflow, xgboost, plotly, prophet, sktime, catboost, autogluon.timeseries

Signed-off-by: simonselbig <simon.selbig@gmx.de>
Signed-off-by: simonselbig <simon.selbig@gmx.de>
@linux-foundation-easycla

linux-foundation-easycla bot commented Jan 25, 2026

CLA Signed

The committers listed above are authorized under a signed CLA.

@Amber-Rigg added the enhancement (New feature or request) and pipelines (Pipeline components and ingestion framework) labels on Jan 26, 2026
Signed-off-by: simonselbig <simon.selbig@gmx.de>
…ng instead of print statements

Signed-off-by: Amber-Rigg <amber.l.rigg25@gmail.com>
Signed-off-by: Amber-Rigg <amber.l.rigg25@gmail.com>
- Updated test files in the decomposition, forecasting, and visualization modules to replace np.random.seed with np.random.default_rng, so tests draw from a local generator instead of NumPy's global random state.
- Ensured reproducible random number generation across test cases by initializing each generator with a fixed seed.
- Adjusted assertions to use np.isclose for floating-point comparisons, so tests tolerate rounding error instead of requiring exact equality.
- Removed deprecated or commented-out code related to Prophet tests due to compatibility issues with Polars.
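
The pattern described in the list above can be sketched as follows. This is an illustration only, not code from the PR's test suite: the helper `make_series`, the seed value, and the sample sizes are assumptions.

```python
import numpy as np


def make_series(seed: int = 42, n: int = 100) -> np.ndarray:
    """Build a noisy linear trend from a locally seeded generator (no global state touched)."""
    rng = np.random.default_rng(seed)
    trend = np.linspace(0.0, 1.0, n)
    noise = rng.normal(loc=0.0, scale=0.1, size=n)
    return trend + noise


# Two calls with the same seed produce identical data, independent of test execution order.
first = make_series()
second = make_series()
reproducible = np.array_equal(first, second)

# Exact equality is fragile for floats; np.isclose compares within a tolerance.
total = 0.1 + 0.2
exactly_equal = total == 0.3
close_enough = bool(np.isclose(total, 0.3))
```

Because each generator is local, one test reseeding its data cannot change what another test sees, which is the main hazard of `np.random.seed`.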

Signed-off-by: Amber-Rigg <amber.l.rigg25@gmail.com>
Signed-off-by: Amber-Rigg <amber.l.rigg25@gmail.com>
…te evaluate method to return None for invalid samples in CatBoost; enhance logging in LSTM predictions; modify test data generation for KNN and LSTM tests.

Signed-off-by: Amber-Rigg <amber.l.rigg25@gmail.com>
Signed-off-by: Amber-Rigg <amber.l.rigg25@gmail.com>
…MTimeSeries and remove redundant tests

Signed-off-by: Amber-Rigg <amber.l.rigg25@gmail.com>
…e64' is not supported. Pass e.g. 'datetime64[ns]' instead.
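
The truncated commit title above appears to quote the error pandas raises when a cast targets a unit-less `datetime64` dtype. A minimal sketch of the kind of cast that message recommends, assuming string timestamps as input (the variable names and sample values are hypothetical and not taken from the PR):

```python
import pandas as pd

# Hypothetical raw timestamps as they might arrive from a source system.
raw = pd.Series(["2026-01-25 00:00:00", "2026-01-26 12:30:00"])

# Parse first, then cast to a dtype that names its unit explicitly.
timestamps = pd.to_datetime(raw).astype("datetime64[ns]")
```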

Signed-off-by: Amber-Rigg <amber.l.rigg25@gmail.com>
…and decomposition classes for improved DataFrame compatibility

Signed-off-by: Amber-Rigg <amber.l.rigg25@gmail.com>
… performance

Signed-off-by: Amber-Rigg <amber.l.rigg25@gmail.com>
…dd fixture for pandas compatibility with PySpark

Signed-off-by: Amber-Rigg <amber.l.rigg25@gmail.com>
…me column handling in DataFrames

Signed-off-by: Amber-Rigg <amber.l.rigg25@gmail.com>
- Removed unnecessary line breaks and adjusted formatting in multiple files to enhance code clarity.
- Simplified tuple unpacking in function calls across various modules.
- Cleaned up imports by removing unused blank lines.
- Standardized the formatting of dictionary assignments for better readability.

Signed-off-by: Amber-Rigg <amber.l.rigg25@gmail.com>
Signed-off-by: Amber-Rigg <amber.l.rigg25@gmail.com>
Signed-off-by: Amber-Rigg <amber.l.rigg25@gmail.com>
Signed-off-by: Amber-Rigg <amber.l.rigg25@gmail.com>
Signed-off-by: Amber-Rigg <amber.l.rigg25@gmail.com>
Signed-off-by: Amber-Rigg <amber.l.rigg25@gmail.com>


2 participants