Implement some more similarity metrics #181

@deven96

  • Hamming distance: Linear algo. The number of positions at which two equal-length vectors differ; for binary vectors, this is the number of bits that must be flipped to turn one vector into the other. Fast and useful for binary vectors
  • Minkowski: Linear algo. A generalisation of Euclidean, Manhattan and related distances, controlled by an order parameter p: p = 1 gives Manhattan and p = 2 gives Euclidean, while other values of p give other distances
  • Locality Sensitive Hashing (LSH): Nonlinear algo. Groups vectors into buckets by passing each vector through a hash function chosen so that similar vectors are likely to collide, as opposed to the usual goal of minimising collisions. Not suitable for high-dimensional vectors
  • Hierarchical navigable small world (HNSW): Nonlinear algo. A layered adaptation of navigable small world (NSW) graphs, where an NSW graph is a graph whose vertices are connected by edges to their nearest neighbors. Good for high-dimensional data
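
To make the two linear metrics above concrete, here is a minimal sketch in Python. It is only an illustration of the definitions, not a proposal for how this project should implement them; the function names `hamming_distance` and `minkowski_distance` are hypothetical and chosen for this example.

```python
from typing import Sequence


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def minkowski_distance(a: Sequence[float], b: Sequence[float], p: float) -> float:
    """Minkowski distance of order p (hypothetical helper, assumes p >= 1).

    p = 1 reduces to Manhattan distance, p = 2 to Euclidean distance.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    if p < 1:
        raise ValueError("p must be at least 1")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


if __name__ == "__main__":
    # Two binary vectors that differ in the second and third positions.
    print(hamming_distance([1, 0, 1, 1], [1, 1, 0, 1]))
    # The same pair of points under Manhattan (p=1) and Euclidean (p=2).
    print(minkowski_distance([0.0, 0.0], [3.0, 4.0], p=1))
    print(minkowski_distance([0.0, 0.0], [3.0, 4.0], p=2))
```

Both functions make a single pass over the vectors, which matches the "linear" label above. LSH and HNSW are index structures rather than pairwise metrics, so they do not fit the same function shape.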
