Feature/sklearnex and remove faiss #33
Merged
The FAISS KMeans backend added meaningful installation weight and startup import noise for a marginal benefit. Removing it simplifies the backend selection logic to two cases:
- cuML if a GPU is available
- else sklearn

Changes
- Drop `faiss-cpu` & `faiss-gpu-cu12` from main deps and `gpu-*` extras
- Remove FAISS from backend scripts: `resolve_backend()`, `run_kmeans()` dispatch
- Remove "faiss" from clustering backend dropdowns in the web UI
- Update README & BACKEND_PIPELINE doc to reflect the changes
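The two-case selection described above could look roughly like the sketch below. This is not the repository's code: the `requested` and `has_gpu` parameters, the `"auto"` value, and the `gpu_backend_available()` probe are assumptions made for illustration, and the real `resolve_backend()` may detect the GPU differently.

```python
import importlib.util
from typing import Optional


def gpu_backend_available() -> bool:
    """Hypothetical probe: treat cuML as usable only if the package is installed."""
    return importlib.util.find_spec("cuml") is not None


def resolve_backend(requested: str = "auto", has_gpu: Optional[bool] = None) -> str:
    """Return "cuml" when a GPU backend is usable, otherwise "sklearn".

    The signature is an assumption for this sketch, not the project's API.
    """
    if has_gpu is None:
        has_gpu = gpu_backend_available()
    if requested == "sklearn":
        return "sklearn"
    if requested in ("auto", "cuml"):
        # With FAISS gone there is no third branch: GPU means cuML, else sklearn.
        return "cuml" if has_gpu else "sklearn"
    raise ValueError(f"unknown clustering backend: {requested!r}")
```

Passing `has_gpu` explicitly keeps the function testable on machines without a GPU.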
Add `scikit-learn-intelex` as a default dependency and patch sklearn at import time in `shared/utils/clustering.py`. This accelerates the existing `sklearn` PCA / TSNE / KMeans calls on CPU. UMAP is unaffected because `umap-learn` is a separate package, not one of the `sklearn` algorithms. Set EMB_EXPLORER_DISABLE_SKLEARNEX=1 to opt out when debugging vanilla sklearn behavior.
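A minimal sketch of import-time patching with the opt-out, assuming the check is a simple comparison against `"1"`. The helper name `maybe_patch_sklearn` and the `SKLEARNEX_PATCHED` flag are hypothetical; `patch_sklearn()` is the entry point that `sklearnex` provides. The actual code in `shared/utils/clustering.py` may be organized differently.

```python
import os


def maybe_patch_sklearn(env=None) -> bool:
    """Patch sklearn with sklearnex unless opted out; return True if patched."""
    env = os.environ if env is None else env
    # Opt-out switch named in the PR description.
    if env.get("EMB_EXPLORER_DISABLE_SKLEARNEX") == "1":
        return False
    try:
        from sklearnex import patch_sklearn
    except ImportError:
        # sklearnex is not installed (e.g. on arm64): keep vanilla sklearn.
        return False
    patch_sklearn()
    return True


# Run once at module import, before any `from sklearn... import ...` statement,
# so that estimators imported afterwards resolve to the accelerated versions.
SKLEARNEX_PATCHED = maybe_patch_sklearn()
```

The ordering matters: estimators imported before `patch_sklearn()` runs keep their original implementations.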
NetZissou
commented
Jun 3, 2026
Co-authored-by: Net Zhang <48858129+NetZissou@users.noreply.github.com>
egrace479
reviewed
Jun 3, 2026
Member
Co-Authored-By: egrace479 <egrace479@users.noreply.github.com>
Matrix:
- OS: ubuntu-latest, windows-latest, macos-latest
- Python: 3.10, 3.11, 3.12, 3.13

Each cell installs the CPU base (no gpu extras), runs import smoke tests for shared/utils/clustering and both Streamlit app entry points, then verifies the sklearnex platform marker: present on x86_64/AMD64, absent on arm64.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
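The platform-marker check each CI cell performs might be expressed as in the sketch below. The function names are hypothetical and the actual workflow step may be written differently; the only behavior taken from the commit message is that `sklearnex` should be installed on x86_64/AMD64 and absent on arm64.

```python
import importlib.util
import platform

# Machine names (lowercased) on which the sklearnex dependency is expected.
X86_64_MACHINES = {"x86_64", "amd64"}


def sklearnex_expected(machine=None) -> bool:
    """Return True if sklearnex should be installed on this architecture."""
    machine = (machine or platform.machine()).lower()
    return machine in X86_64_MACHINES


def check_sklearnex_marker(machine=None, installed=None) -> bool:
    """Return True when the installed state matches the platform expectation."""
    if installed is None:
        installed = importlib.util.find_spec("sklearnex") is not None
    return installed == sklearnex_expected(machine)
```

In a CI step, something like `assert check_sklearnex_marker()` would fail the cell when the package is missing on x86_64 or unexpectedly present on arm64.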
Closed
Collaborator
Author
@egrace479 thanks for the error logs and screenshots. Opened two separate issues (#36, #35) for further investigation. Applied the footnote suggestions locally to bypass the GH web error that we encountered yesterday. Added a CI matrix to check install and import across the OS x Python matrix.
egrace479
reviewed
Jun 10, 2026
egrace479
approved these changes
Jun 11, 2026
egrace479
left a comment
Member
The emb-embed-explore error exists independently of this update and is tracked in #35.
This looks good.

Closes issue #32
- Patch `sklearn` with `sklearnex`.
- Remove `faiss` entirely from backend.

About `sklearnex`

Good API stability:

`sklearnex` is powered by the oneDAL library, which provides acceleration on x86_64 Linux and Windows machines and silently falls back to vanilla `sklearn` on unsupported architectures like Apple Silicon and ARM Linux. The package is under the UXL Foundation (a Linux Foundation project), so cross-vendor support is a stated goal.