
search: add random downsampling to accelerate result processing #440

@eguguchkin

Description


When a search query returns millions of candidate documents, further processing becomes unnecessarily slow. In many cases, a much smaller random sample (e.g., 1 000 documents instead of 1 000 000) approximately preserves the statistical properties of the full result set while drastically reducing compute cost.

This PR introduces a Downsample parameter in SearchParams. If Downsample > 1, the search pipeline randomly keeps only about 1/Downsample of the LIDs before fetching MIDs/RIDs.

Labels: feature (New feature or request)
