Backend Acceleration on CPU Cores #32

@NetZissou

About

scikit-learn-intelex (sklearnex) is a stable scikit-learn extension developed by Intel that speeds up traditional ML algorithms such as PCA, t-SNE, and KMeans, all of which are used in emb-explorer. In our benchmark, sklearnex-patched code provided:

  • ~4x speedup on PCA
  • ~7x speedup on t-SNE
  • ~12x speedup on KMeans

Tests were performed on the OSC Ascend cluster with 2 CPU cores and 16 GB RAM, on 2,000 x 768 float32 embeddings.
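For anyone who wants to reproduce numbers like these, here is a minimal timing sketch. The helper names `best_of` and `speedup` are hypothetical, and this is not necessarily how the figures above were measured. It assumes the baseline and the patched run are each timed in a fresh process, since patching applies to the whole process.

```python
import time
from typing import Callable


def best_of(fn: Callable[[], object], repeats: int = 3) -> float:
    """Return the fastest wall-clock time in seconds over `repeats` calls of fn."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


def speedup(baseline_s: float, patched_s: float) -> float:
    """Speedup factor: how many times faster the patched run is than the baseline."""
    return baseline_s / patched_s


# Hypothetical usage, run once without and once with sklearnex patching:
#   import numpy as np
#   from sklearn.cluster import KMeans
#   X = np.random.rand(2000, 768).astype(np.float32)
#   seconds = best_of(lambda: KMeans(n_clusters=8, n_init=10).fit(X))
```

Taking the minimum over a few repeats reduces noise from other jobs on a shared node. The cluster count and `n_init` in the usage comment are placeholders, not the benchmark's settings.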

When a GPU is available, cuML is the default backend and provides significant acceleration. However, emb-explorer may run on machines that

  • do not have a GPU available (standard workstations, HF demo space)
  • have a GPU but not enough VRAM to fit the data

In these cases, CPU cores become the only option for projection and clustering operations, and the emb-explorer backend resolves to sklearn.

Proposed Changes

  1. Patch sklearn with sklearnex.
  2. Remove faiss entirely from backend.

Currently, faiss is used only for KMeans acceleration. However, it adds meaningful installation weight and startup import noise for a marginal benefit.
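A minimal sketch of the first proposed change, assuming the patch is applied once at backend start-up. The helper name `try_patch_sklearn` is hypothetical; `patch_sklearn` is the entry point provided by sklearnex. The patch needs to run before any scikit-learn estimators are imported, otherwise those imports keep the stock implementations.

```python
def try_patch_sklearn() -> bool:
    """Patch scikit-learn with sklearnex if it is installed.

    Returns True when the patch was applied, False when sklearnex is
    missing, in which case stock scikit-learn is used unchanged.
    """
    try:
        from sklearnex import patch_sklearn
    except ImportError:
        return False
    patch_sklearn()
    return True


PATCHED = try_patch_sklearn()

# Import estimators only after the patch call, for example:
#   from sklearn.decomposition import PCA
#   from sklearn.manifold import TSNE
#   from sklearn.cluster import KMeans
```

Falling back silently when sklearnex is absent is an assumption here. The backend could instead declare it a hard dependency and let the import fail.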

With sklearnex patching, we get substantial acceleration not only on KMeans but also on projection operations (PCA, t-SNE). Removing faiss simplifies the backend selection logic to two cases:

  • cuML if a GPU is available
  • sklearn patched with sklearnex otherwise
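The two-case selection above could be sketched as follows. The function and parameter names are hypothetical and this is not emb-explorer's actual code. GPU detection and the VRAM check are left to the caller and passed in as flags.

```python
import importlib.util
from typing import Optional


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


def resolve_backend(gpu_available: bool,
                    fits_in_vram: bool = True,
                    cuml_installed: Optional[bool] = None) -> str:
    """Pick the compute backend: 'cuml' on a usable GPU, else patched 'sklearn'."""
    if cuml_installed is None:
        cuml_installed = is_installed("cuml")
    if gpu_available and fits_in_vram and cuml_installed:
        return "cuml"
    # CPU path: scikit-learn, expected to be patched with sklearnex at start-up.
    return "sklearn"
```

With faiss removed there is no third branch, so a machine without a GPU, or whose data does not fit in VRAM, always lands on the patched sklearn path.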

Labels

enhancement: New feature or request