MB-62958: Binary Quantization#60

Open
Likith101 wants to merge 6 commits into bleve from BQ

@Likith101 Likith101 commented Dec 11, 2025

Support for binary indexes (IVF and Flat) with backing indexes (SQ8 and Flat)

  • Added dist_compute for SQ and Flat backing indexes
  • Added the relevant APIs to read and write binary indexes
  • Added functionality to search binary indexes with params and a selector
  • Added a binary index API for size
  • Added the relevant APIs to do pre-filtered search on binary indexes

@Likith101 Likith101 changed the title (WIP) MB-62958: Binary Quantization MB-62958: Binary Quantization Jan 27, 2026
 - Added all necessary api connections for go-faiss needs

@abhinavdangeti abhinavdangeti left a comment

@Likith101 Let's update the commit header and the description here to capture the functionality you're proposing, and which BQ classes it applies to.

}
CATCH_AND_HANDLE
}

What about faiss_IndexBinaryIVF_get_centroids_and_cardinality for ObtainKCentroidCardinalitiesFromIVFIndex? We should be supporting that API as well, right?

store_pairs(store_pairs) {}

void set_query(const uint8_t* query_vector) override {
this->query_vector = query_vector; // Set the member directly
What does this line do? Is it required to set the query_vector of BinaryInvertedListScanner?

Follow-up to this: doesn't the hammingComputer already use the query vector to compute the distance? I'm a bit confused about whether this is a redundant variable we're maintaining.
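To make the question concrete, here is a minimal sketch of a scanner whose Hamming computer captures the query in set_query, so no separate query_vector member is needed for distance computation. SketchHammingComputer and SketchScanner are hypothetical stand-ins written for this comment, not the faiss classes or the code in this PR; a separate member would only be justified if something other than the Hamming computer reads the query.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a Hamming computer: it keeps a pointer to the
// query and compares every candidate code against it.
struct SketchHammingComputer {
    const uint8_t* query = nullptr;
    size_t code_size = 0;

    void set(const uint8_t* q, size_t cs) {
        query = q;
        code_size = cs;
    }

    // Hamming distance between the stored query and one code.
    int hamming(const uint8_t* code) const {
        assert(query != nullptr);
        int dist = 0;
        for (size_t i = 0; i < code_size; i++) {
            uint8_t x = query[i] ^ code[i];
            while (x) { // clear the lowest set bit until none remain
                x &= x - 1;
                dist++;
            }
        }
        return dist;
    }
};

// Hypothetical scanner: set_query only forwards to the Hamming computer,
// so the scanner itself does not store the query pointer a second time.
struct SketchScanner {
    SketchHammingComputer hc;
    size_t code_size;

    explicit SketchScanner(size_t cs) : code_size(cs) {}

    void set_query(const uint8_t* query_vector) {
        hc.set(query_vector, code_size);
    }

    int distance_to_code(const uint8_t* code) const {
        return hc.hamming(code);
    }
};
```

If the scanner in this PR already works this way, the extra assignment can probably be dropped; if the backing-index path needs the raw query, a short comment saying so would help.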

@abhinavdangeti

@Likith101 it seems there is code submitted upstream with facebookresearch#4761 (unfortunately not part of a release just yet) that has overlap with your proposal here.

Can I recommend cherry-picking that commit into this PR, so we won't have to deal with too many merge conflicts in this area when we bring in a later version tag?

using C = CMax<int32_t, idx_t>;

size_t nup = 0;

Can you please remove these kinds of empty lines?


dc->set_query(tmp.data());

// Compute the distance
Another question at this point: I was going through the code around Hamming distance computation, and it looks like there is probably an API to compute the distance between a batch of codes. Have you looked into using that instead of computing the distance one code at a time?

hamdis_t hamming(const uint64_t* bs1, const uint64_t* bs2, size_t nwords);
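As an illustration of the batching idea, the sketch below computes the distance from one query to a contiguous block of codes in a single call instead of one call per code. hamming_words and hamming_batch are hypothetical helper names written for this comment, not existing faiss functions, and the popcount is a plain portable loop rather than whatever intrinsic the library uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Portable population count (number of set bits) of a 64-bit word.
static int popcount64(uint64_t x) {
    int n = 0;
    while (x) {
        x &= x - 1; // clear the lowest set bit
        n++;
    }
    return n;
}

// Hamming distance between two codes of nwords 64-bit words each.
int hamming_words(const uint64_t* a, const uint64_t* b, size_t nwords) {
    int dist = 0;
    for (size_t i = 0; i < nwords; i++) {
        dist += popcount64(a[i] ^ b[i]);
    }
    return dist;
}

// Distances from one query to ncodes codes stored back to back in `codes`
// (code j starts at codes + j * nwords).
std::vector<int> hamming_batch(
        const uint64_t* query,
        const uint64_t* codes,
        size_t ncodes,
        size_t nwords) {
    assert(nwords > 0);
    std::vector<int> out(ncodes);
    for (size_t j = 0; j < ncodes; j++) {
        out[j] = hamming_words(query, codes + j * nwords, nwords);
    }
    return out;
}
```

Whether a batched call helps here depends on how the candidate codes are laid out in memory; it mostly pays off when they are already contiguous, as in an inverted list.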
