Lack of Native Temporal (Age-Based) search fields in PangaeaDataset.search_studies #35

@doswal

DESCRIPTION:

The current implementation of PangaeaDataset.search_studies() relies on PanQuery to retrieve datasets from PANGAEA. However, unlike NOAA's search API, PANGAEA's search API does not support filtering datasets by temporal coverage (Age).

According to the PANGAEA search documentation:

  • The available "Year" field refers to publication year, not the temporal extent of the dataset.
  • There is no direct support for filtering datasets by Age (e.g., BP, CE) at the query level.

PROBLEM:

This creates a mismatch with the unified PyleoTUPS interface, where users expect:

ds.search_studies(earliest_year=..., latest_year=...)

to behave consistently across datasets (NOAA + PANGAEA).

Currently:

  • Temporal filters cannot be applied at query time
  • Even if a looped (paginated) search is run, results may include datasets outside the requested temporal range
  • This breaks consistency and user expectations

Potential Approach (Not Implemented)

One possible direction to approximate temporal filtering:

  • Perform initial search using available filters
  • Extract temporal coverage from dataset metadata/tables
  • Apply post-search filtering based on: earliest_year, latest_year, time_format (CE/BP)
  • Iteratively fetch more results (via offset) until desired limit is reached
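
The steps above could be sketched roughly as follows. This is a minimal illustration, not the pyleotups implementation: `fetch_page` and `get_coverage` are hypothetical callables standing in for the PanQuery call and the metadata/table parsing, and the function name, `page_size`, and `max_pages` are assumptions made for the example. It also assumes coverage is expressed in years CE and that BP means years before 1950.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# (earliest, latest) in years CE, or None when coverage could not be extracted
Coverage = Optional[Tuple[float, float]]


def bp_to_ce(year_bp: float) -> float:
    """Convert years BP (before 1950) to years CE."""
    return 1950.0 - year_bp


def overlaps(coverage: Tuple[float, float],
             earliest_year: Optional[float],
             latest_year: Optional[float]) -> bool:
    """True if the dataset's coverage intersects the requested range."""
    start, end = coverage
    if earliest_year is not None and end < earliest_year:
        return False
    if latest_year is not None and start > latest_year:
        return False
    return True


def search_with_temporal_filter(
    fetch_page: Callable[[int, int], Sequence[dict]],  # hypothetical: (offset, size) -> records
    get_coverage: Callable[[dict], Coverage],          # hypothetical: record -> coverage in CE
    earliest_year: Optional[float] = None,
    latest_year: Optional[float] = None,
    time_format: str = "CE",
    limit: int = 10,
    page_size: int = 50,
    max_pages: int = 10,
) -> List[dict]:
    # Normalise the requested bounds to years CE.
    if time_format.upper() == "BP":
        earliest_year = None if earliest_year is None else bp_to_ce(earliest_year)
        latest_year = None if latest_year is None else bp_to_ce(latest_year)
    # Make sure the lower bound really is the older one.
    if (earliest_year is not None and latest_year is not None
            and earliest_year > latest_year):
        earliest_year, latest_year = latest_year, earliest_year

    matches: List[dict] = []
    offset = 0
    for _ in range(max_pages):  # cap the number of extra API calls
        page = fetch_page(offset, page_size)
        if not page:
            break  # no more results
        for record in page:
            coverage = get_coverage(record)
            if coverage is None:
                continue  # unknown coverage is skipped in this sketch
            if overlaps(coverage, earliest_year, latest_year):
                matches.append(record)
                if len(matches) >= limit:
                    return matches
        offset += len(page)
    return matches
```

The `max_pages` cap is there because a narrow temporal range could otherwise keep paging through a large result set without ever reaching `limit`.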

Challenges / Considerations

  • Temporal metadata is often incomplete or inconsistent in PANGAEA datasets
  • Extraction relies on parsing dataset tables → not always reliable
  • Additional API calls may impact performance
  • Results would be approximate, not guaranteed accurate
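
One way to stay transparent about incomplete metadata is to keep datasets with unknown coverage separate instead of silently dropping them, and to warn the user about how many could not be checked. The sketch below is only an illustration under assumptions: `FilteredResults`, `partition_by_coverage`, and the `get_coverage` callable are hypothetical names, and coverage and bounds are assumed to be in the same time unit.

```python
import warnings
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass
class FilteredResults:
    matched: List[object] = field(default_factory=list)  # coverage overlaps the range
    unknown: List[object] = field(default_factory=list)  # coverage could not be extracted


def partition_by_coverage(
    records: Iterable[object],
    get_coverage: Callable[[object], Optional[Tuple[float, float]]],  # hypothetical extractor
    earliest_year: float,
    latest_year: float,
) -> FilteredResults:
    out = FilteredResults()
    for record in records:
        coverage = get_coverage(record)
        if coverage is None:
            out.unknown.append(record)
            continue
        start, end = coverage
        # Keep the record if its coverage intersects the requested range.
        if end >= earliest_year and start <= latest_year:
            out.matched.append(record)
    if out.unknown:
        warnings.warn(
            f"{len(out.unknown)} dataset(s) had no usable temporal coverage "
            "and could not be checked against the requested range.",
            UserWarning,
        )
    return out
```

Whether unknown-coverage datasets should then be returned, hidden, or flagged is a design decision this sketch leaves open.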

Additional Notes

This is a known limitation of PANGAEA search, not a bug in pyleotups.
The goal is to implement NOAA-like behavior while maintaining transparency about limitations.

Metadata

  • Assignees: No one assigned
  • Labels: wontfix (This will not be worked on)
  • Status: Backlog
  • Milestone: No milestone