Quant Lab demo #13

@rzimmerdev

Issue 1: Redesign Storage Layer for Structured Data

  • Replace JSON-per-key in HDF5 with structured tables (compound datasets).
  • Ensure datasets are appendable (e.g. h5py maxshape=(None,), chunks=True; maxshape=None would leave the dataset fixed-size).
  • Make columns explicit (date, symbol, open, close, volume, factors).

Example: HDF5 compound dataset or Parquet + Polars.
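A minimal sketch of the compound-dataset option, assuming h5py and NumPy are available. The column layout, the dataset name "bars", the integer date encoding, and the single "momentum" factor column are illustrative assumptions, not the project's actual schema.

```python
import h5py
import numpy as np

# Hypothetical row layout: one record per (date, symbol).
ROW = np.dtype([
    ("date", "i8"),      # days since 1970-01-01, stored as int64
    ("symbol", "S12"),   # fixed-width byte string
    ("open", "f8"),
    ("close", "f8"),
    ("volume", "i8"),
    ("momentum", "f8"),  # placeholder for one factor column
])

def append_rows(ds, rows):
    """Grow the dataset along axis 0 and write the new rows at the end."""
    n = ds.shape[0]
    ds.resize((n + len(rows),))
    ds[n:] = rows

# In-memory file (core driver, no backing store) so nothing is written to disk.
with h5py.File("demo.h5", "w", driver="core", backing_store=False) as f:
    ds = f.create_dataset("bars", shape=(0,), maxshape=(None,),
                          chunks=True, dtype=ROW)
    append_rows(ds, np.array([(19723, b"AAA", 10.0, 10.5, 1000, 0.1)], dtype=ROW))
    append_rows(ds, np.array([(19724, b"AAA", 10.5, 10.2, 1200, 0.2),
                              (19724, b"BBB", 20.0, 20.4, 800, -0.1)], dtype=ROW))
    n_rows = ds.shape[0]
    closes = ds[:]["close"]  # read everything, then pick one named column
```

Dates are stored as integers here because HDF5 has no native datetime type that maps cleanly onto NumPy's datetime64; a real schema might choose a different encoding.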

Issue 2: Implement Asset Metadata System

  • Store asset metadata separately: symbol, class, region, sector, market cap, liquidity, currency, dividend yield, factor scores.
  • Allow fast filtering and universe selection without scanning time-series.
  • Maintain mapping between metadata and corresponding data storage.
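One way this could look, as a sketch using only the standard library. The AssetMeta fields follow the list above, while the storage_key field, the sample assets, and the liquidity threshold are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AssetMeta:
    symbol: str
    asset_class: str
    region: str
    sector: str
    market_cap: float
    liquidity: float   # e.g. average daily traded value (assumed unit)
    currency: str
    storage_key: str   # hypothetical pointer to where the time series lives

def select_symbols(metas, predicate):
    """Filter on metadata only; no time-series data is touched."""
    return [m.symbol for m in metas if predicate(m)]

# Made-up sample records for illustration.
METADATA = [
    AssetMeta("AAA", "equity", "US", "tech", 5e10, 2e8, "USD", "/bars/AAA"),
    AssetMeta("BBB", "equity", "BR", "energy", 8e9, 1e6, "BRL", "/bars/BBB"),
    AssetMeta("CCC", "bond", "BR", "sovereign", 0.0, 5e7, "BRL", "/bars/CCC"),
]

liquid_equities = select_symbols(
    METADATA, lambda m: m.asset_class == "equity" and m.liquidity >= 1e7
)
# Mapping from the selected symbols to their storage locations.
locations = {m.symbol: m.storage_key for m in METADATA if m.symbol in liquid_equities}
```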

Issue 3: Create Hierarchical Filtering Pipeline

  • Pipeline & Filter classes
  • Implement a multi-step filtering process before strategy/backtesting:
      • Remove illiquid/extreme assets
      • Select universe by strategy/asset class
      • Compute risk metrics, factor exposures
      • Feed filtered dataset into backtest or optimizer
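The Pipeline and Filter classes could be sketched roughly as follows. The class shapes, the dict-based asset records, and the thresholds are assumptions for illustration; the risk-metric step is omitted to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Filter:
    name: str
    keep: Callable[[dict], bool]  # returns True for assets that survive

    def apply(self, assets):
        return [a for a in assets if self.keep(a)]

class Pipeline:
    def __init__(self, filters):
        self.filters = list(filters)
        self.trace = []  # (filter name, assets remaining) per stage

    def run(self, assets):
        self.trace = []
        for f in self.filters:
            assets = f.apply(assets)
            self.trace.append((f.name, len(assets)))
        return assets

# Made-up asset records.
assets = [
    {"symbol": "AAA", "asset_class": "equity", "liquidity": 2e8},
    {"symbol": "BBB", "asset_class": "equity", "liquidity": 1e6},
    {"symbol": "CCC", "asset_class": "bond", "liquidity": 5e7},
]

pipeline = Pipeline([
    Filter("liquid", lambda a: a["liquidity"] >= 1e7),
    Filter("equities", lambda a: a["asset_class"] == "equity"),
])
selected = [a["symbol"] for a in pipeline.run(assets)]
```

Keeping a per-stage trace makes it easy to see which filter removed how many assets before the result is handed to a backtest or optimizer.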

Issue 4: Define Universe Templates

  • Predefine pools of assets for repeated strategy testing (e.g., “Global Equities”, “Brazil Bonds”, “Multi-asset ETFs”).
  • Templates should include filtering criteria: liquidity, size, asset class, region.
  • Ensure templates are easily selectable and interchangeable in backtests.
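A possible shape for the templates, treating each one as plain data that a backtest selects by name. The template names come from the examples above; the field names, thresholds, and sample assets are placeholders.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class UniverseTemplate:
    name: str
    asset_classes: frozenset
    regions: frozenset = frozenset()  # empty means "any region"
    min_liquidity: float = 0.0

    def matches(self, asset: dict) -> bool:
        return (
            asset["asset_class"] in self.asset_classes
            and (not self.regions or asset["region"] in self.regions)
            and asset["liquidity"] >= self.min_liquidity
        )

# Thresholds below are illustrative, not recommended values.
TEMPLATES = {t.name: t for t in (
    UniverseTemplate("Global Equities", frozenset({"equity"}), min_liquidity=1e7),
    UniverseTemplate("Brazil Bonds", frozenset({"bond"}), frozenset({"BR"})),
    UniverseTemplate("Multi-asset ETFs", frozenset({"etf"})),
)}

def build_universe(template_name, assets):
    template = TEMPLATES[template_name]
    return [a["symbol"] for a in assets if template.matches(a)]

assets = [
    {"symbol": "AAA", "asset_class": "equity", "region": "US", "liquidity": 2e8},
    {"symbol": "BBB", "asset_class": "equity", "region": "BR", "liquidity": 1e6},
    {"symbol": "CCC", "asset_class": "bond", "region": "BR", "liquidity": 5e7},
    {"symbol": "DDD", "asset_class": "etf", "region": "US", "liquidity": 3e7},
]
```

Swapping the universe in a backtest then amounts to changing one string, for example build_universe("Brazil Bonds", assets).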

Issue 5: Strategy-Specific Views

  • Each strategy should work on its own filtered subset of the universe.

Examples: Momentum strategy → top 1000 liquid equities; Value strategy → equities by book-to-price ratio; Multi-asset → ETFs across classes/regions.

  • This supports modular strategy testing and avoids data contamination across strategies.
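To illustrate the isolation point, here is a small sketch in which each strategy receives an independent copy of its subset, so changes made by one strategy cannot leak into the shared universe or into another strategy's view. The make_view helper, the field names, and the top-2 cutoffs are hypothetical.

```python
import copy

def make_view(universe, select):
    """Return a deep copy of the selected assets so edits stay local to one strategy."""
    return copy.deepcopy(select(universe))

# Made-up universe records.
universe = [
    {"symbol": "AAA", "liquidity": 9e6, "book_to_price": 0.4},
    {"symbol": "BBB", "liquidity": 5e6, "book_to_price": 0.9},
    {"symbol": "CCC", "liquidity": 1e6, "book_to_price": 0.7},
]

# Momentum: most liquid assets. Value: highest book-to-price.
momentum_view = make_view(
    universe, lambda u: sorted(u, key=lambda a: a["liquidity"], reverse=True)[:2]
)
value_view = make_view(
    universe, lambda u: sorted(u, key=lambda a: a["book_to_price"], reverse=True)[:2]
)

# A strategy mutating its own view leaves the shared universe untouched.
momentum_view[0]["liquidity"] = 0.0
```

Deep copies are the simplest way to get isolation but cost memory; read-only views over shared storage would be an alternative for large universes.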

Issue 6: Incremental Updates & Factor Computation

  • Support appendable time-series updates.
  • Precompute and cache factor scores, correlations, and risk metrics monthly/quarterly.
  • Optionally maintain an index mapping symbols to file locations for fast access.
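The caching idea can be sketched as a cache keyed by factor name and calendar period, so a factor is recomputed at most once per month or quarter. The FactorCache class, the period_key helper, and the stand-in compute function are assumptions for illustration.

```python
from datetime import date

def period_key(d: date, freq: str) -> str:
    """Map a date to the label of the month or quarter that contains it."""
    if freq == "monthly":
        return f"{d.year}-{d.month:02d}"
    if freq == "quarterly":
        return f"{d.year}-Q{(d.month - 1) // 3 + 1}"
    raise ValueError(f"unsupported frequency: {freq}")

class FactorCache:
    def __init__(self, compute, freq="monthly"):
        self.compute = compute  # callable(factor_name, as_of) -> scores
        self.freq = freq
        self._store = {}

    def get(self, factor, as_of):
        key = (factor, period_key(as_of, self.freq))
        if key not in self._store:
            self._store[key] = self.compute(factor, as_of)
        return self._store[key]

calls = []

def fake_compute(factor, as_of):
    # Stand-in for an expensive factor computation.
    calls.append((factor, as_of))
    return {"AAA": 0.1}

cache = FactorCache(fake_compute, freq="quarterly")
cache.get("momentum", date(2024, 1, 15))
cache.get("momentum", date(2024, 3, 31))  # same quarter: served from cache
cache.get("momentum", date(2024, 4, 1))   # new quarter: recomputed
```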

Issue 7: Integrate Queryable & Searchable Storage

  • Support efficient filtering, sorting, and selection on structured datasets.
  • Evaluate HDF5 + PyTables or Parquet + Polars.
  • Include examples of common queries (filter by symbol/date, sort by factor).

Issue 8: Testing & Migration Plan

  • Plan migration from current JSON-based storage to new system.
  • Implement tests to ensure append, query, and filter operations return correct results.
  • Benchmark read/write speeds, especially for thousands of assets and years of daily data.
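A rough sketch of a migration step with a correctness check and a timing helper, using only the standard library. The legacy layout assumed here (one JSON string per symbol holding a list of daily bars) is a guess at the current format, and the migrate, check_migration, and timed helpers are hypothetical.

```python
import json
import time

def migrate(json_store):
    """Flatten {symbol: json_string} into row tuples sorted by (date, symbol)."""
    rows = []
    for symbol, blob in json_store.items():
        for bar in json.loads(blob):
            rows.append((bar["date"], symbol, bar["open"], bar["close"], bar["volume"]))
    rows.sort(key=lambda r: (r[0], r[1]))
    return rows

def check_migration(json_store, rows):
    """Every legacy bar must appear exactly once, keyed by (date, symbol)."""
    expected = sum(len(json.loads(blob)) for blob in json_store.values())
    unique_keys = {(r[0], r[1]) for r in rows}
    return len(rows) == expected and len(unique_keys) == expected

def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

# Made-up legacy store for illustration.
legacy = {
    "AAA": json.dumps([
        {"date": "2024-01-02", "open": 10.0, "close": 10.5, "volume": 1000},
        {"date": "2024-01-03", "open": 10.5, "close": 10.2, "volume": 1200},
    ]),
    "BBB": json.dumps([
        {"date": "2024-01-03", "open": 20.0, "close": 20.4, "volume": 800},
    ]),
}

rows, elapsed = timed(migrate, legacy)
ok = check_migration(legacy, rows)
```

For a real benchmark the same timed helper would be applied to reads and writes against thousands of assets and several years of daily bars, repeated enough times to smooth out noise.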
