Simplified lots of traits #238

Closed
nishaq503 wants to merge 14 commits into URI-ABD:master from nishaq503:trait-simplification

@nishaq503 nishaq503 commented Sep 22, 2025

Collaborator

This PR greatly simplifies a lot of traits in abd-clam as well as the public API. A summary of the changes is as follows:

  1. abd-clam no longer depends on distances and does not make use of the Number trait. Instead we now use the num crate and have a DistanceValue trait with no required methods and a blanket implementation for any type that implements its super-traits.
  2. The Metric trait and its implementations have been removed. Anything that was generic over a Metric is now generic over functions using a Fn(&I, &I) -> T signature, where I is an item from a Dataset and T is a DistanceValue. This allows users much more freedom in defining their metrics as functions and they no longer have to first shoehorn them into structs and then implement the old Metric trait for their structs.
  3. The Dataset trait has been greatly simplified. We now provide a blanket implementation of Dataset for any type that implements AsRef<[I]> and AsMut<[I]>, i.e. any type that behaves like a slice of items of type I. This includes standard collections like Vecs and arrays. Dataset has no super-traits and its only required methods are get, get_mut and cardinality, making it exceedingly easy to implement for other types.
  4. All dependencies except those relating to the Python wrapper for distances have been upgraded. The only exception is bitcode, and the reason is noted in the root Cargo.toml.
  5. The disk-io feature in abd-clam is no more. The relevant functionality is now included by default. I may end up changing this in a future PR after studying how other notable crates provide a serde feature.
  6. Building a Cluster no longer uses a random seed or any internal randomness. If the user wants comparable functionality, they can simply shuffle their dataset before using it to build a tree.
  7. The Adapter trait has been removed because it was too complicated to work with. For now, all adapted Cluster types in the crate have a method called from_cluster_tree. If we come up with a simpler idea, we can start using it.
  8. The BalancedBall struct and balanced clustering have been removed. If we want to compare against it in benchmarks, we can restore it in a separate binary crate dedicated to such benchmarks.
  9. The experimental search algorithm in CAKES that used search hints has been removed. It didn't have the performance I had hoped for.
  10. There is a new Vertex trait to extend Cluster for use in chaoda. The Vertex trait makes it so that there is no longer a global constant NUM_RATIOS restricting the number of properties that can be used for anomaly detection. Instead, the Vertex trait has an associated constant NUM_FEATURES and any type that implements Vertex can define its own number of features that will be used in CHAODA. I have not yet fully tested the actual CHAODA implementation for this and will target that in future PRs.
  11. Compressive search is poor for now. I'm designing new traits to make it better and will introduce them in a later PR. I am experimenting with some Encoder and Decoder traits and things look promising. The heart of the current issue is that compressed clusters store their own data (at the leaf level) instead of the compressed tree having an associated Dataset that has been compressed. This makes it impossible to shoehorn compressive search into the current design of the SearchAlgorithm trait in CAKES.
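To make items 1 to 3 concrete, here is a minimal, standard-library-only sketch of how a marker-style DistanceValue, a slice-backed Dataset, and a plain-function metric could fit together. It is not the code from this PR: the super-traits on DistanceValue are stand-ins (the PR uses traits from the num crate), the exact signatures of get, get_mut and cardinality are assumptions, and the `nearest` and `euclidean` functions are hypothetical helpers added only for illustration.

```rust
use std::fmt::Debug;

/// Marker trait for distance values: no required methods.
/// ASSUMPTION: these super-traits are stand-ins; the PR uses traits from `num`.
pub trait DistanceValue: Copy + PartialOrd + Debug {}

/// Blanket implementation for every type that has the super-traits.
impl<T: Copy + PartialOrd + Debug> DistanceValue for T {}

/// A collection of items of type `I`.
/// ASSUMPTION: the method names come from the PR description, the signatures do not.
pub trait Dataset<I> {
    fn get(&self, index: usize) -> &I;
    fn get_mut(&mut self, index: usize) -> &mut I;
    fn cardinality(&self) -> usize;
}

/// Blanket implementation for anything that behaves like a slice of `I`,
/// for example `Vec<I>` and `[I; N]`.
impl<I, D: AsRef<[I]> + AsMut<[I]>> Dataset<I> for D {
    fn get(&self, index: usize) -> &I {
        let items: &[I] = self.as_ref();
        &items[index]
    }

    fn get_mut(&mut self, index: usize) -> &mut I {
        let items: &mut [I] = self.as_mut();
        &mut items[index]
    }

    fn cardinality(&self) -> usize {
        let items: &[I] = self.as_ref();
        items.len()
    }
}

/// Hypothetical helper: linear scan for the item closest to `query`.
/// The metric is any function or closure with the `Fn(&I, &I) -> T` shape,
/// so no wrapper struct or `Metric` implementation is needed.
pub fn nearest<I, T, D, F>(data: &D, query: &I, metric: F) -> Option<(usize, T)>
where
    D: Dataset<I>,
    T: DistanceValue,
    F: Fn(&I, &I) -> T,
{
    let mut best: Option<(usize, T)> = None;
    // Fully qualified calls pin the item type `I` explicitly.
    for i in 0..<D as Dataset<I>>::cardinality(data) {
        let d = metric(<D as Dataset<I>>::get(data, i), query);
        match best {
            // Keep the current best when it is at least as close.
            Some((_, b)) if b <= d => {}
            _ => best = Some((i, d)),
        }
    }
    best
}

/// Hypothetical metric written as a plain function over 2-d points.
pub fn euclidean(a: &[f64; 2], b: &[f64; 2]) -> f64 {
    a.iter()
        .zip(b.iter())
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f64>()
        .sqrt()
}
```

With these definitions, a call such as `nearest(&points, &[3.0, 5.0], euclidean)` works on a `Vec<[f64; 2]>` without any extra trait implementations, and a closure can be passed in place of `euclidean`. One thing to watch with a blanket implementation like this is that method-call syntax such as `vec.get(0)` can resolve to the trait method instead of the slice method when the trait is in scope, which is why the sketch uses fully qualified calls.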

@nishaq503 nishaq503 changed the title from "feat: simplified lots of traits" to "Simplified lots of traits" Sep 22, 2025
feat: reviving python experiments

feat: implemented some combos for evaluating mbed

wip: removing metric trait

wip: lots of changes

wip: most tests pass

fix: repeated rnn now passes

wip: restored mbed and musals

simplified Dataset trait

wip: restored mbed

wip: restored CHAODA

wip: restored the shell

updating versions of deps

cleaned up deps

wip: cleanup up deps
@nishaq503
Collaborator Author

I have something better in the works

@nishaq503 nishaq503 closed this Sep 29, 2025
@nishaq503 nishaq503 deleted the trait-simplification branch October 10, 2025 21:18