Hashset implementation #2625

Open
takchiks wants to merge 2 commits into super30admin:master from takchiks:master

Conversation

@takchiks takchiks commented Feb 3, 2026

No description provided.

@super30admin
Owner

Strengths:

  • You have chosen a large prime number for HASH_SIZE, which helps in distributing keys uniformly.
  • The code is structured and easy to read.
  • You have considered the constraints and allocated sufficient space.

Areas for Improvement:

  1. The removal method is incorrect. Setting the removed key to -1 and decrementing the count leaves a "hole" in the bucket, while the linear scans in add and contains only cover indices below the count. Unless the removed key was in the last slot, the element that was in the last slot now falls outside the scanned range: contains no longer finds it, and a later add can store it a second time. This leads to lost keys, duplicates, and incorrect state. Instead, you should overwrite the removed element with the last element in the bucket to maintain a contiguous array. For example:

    for (int i = 0; i < bucketCount[h]; i++) {
        if (buckets[h][i] == key) {
            // Move the last element into the freed slot, then shrink the bucket.
            buckets[h][i] = buckets[h][bucketCount[h] - 1];
            bucketCount[h]--;
            return;
        }
    }

    This way, the array remains contiguous without holes.

  2. The add method does not handle the case when the bucket is full. Since the problem states at most 10^4 calls and HASH_SIZE is 78727, the average chain length is very low, so an overflow is unlikely under the given constraints. Still, nothing in the code prevents a bucket from exceeding BUCKET_SIZE. If you want the solution to be correct rather than just safe in practice, ensure that no bucket overflows, either by resizing the bucket dynamically or by using a growable data structure (like ArrayList) for each bucket.

  3. Alternatively, you can use a simpler approach: since the key range is only 0 to 10^6, you can use a boolean array of size 1000001. This would use about 1e6 booleans (1 MB) and all operations would be O(1). This is simpler and efficient.

  4. The current solution uses more memory than necessary. Consider using a more memory-efficient design.
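
The alternatives suggested in points 2 and 3 could look roughly like the sketch below. It is not the code from this pull request: the class names `BooleanArrayHashSet` and `ChainedHashSet` are hypothetical, the key range 0 to 10^6 and the HASH_SIZE value are taken from the review text, and the hash function (key modulo HASH_SIZE) is an assumption. The chained version allocates buckets lazily and uses the same swap-with-last removal described in point 1.

```java
import java.util.ArrayList;
import java.util.List;

// Point 3: direct-address table. Assumes keys are in 0..1_000_000.
class BooleanArrayHashSet {
    private static final int MAX_KEY = 1_000_000;
    private final boolean[] present = new boolean[MAX_KEY + 1];

    void add(int key) { present[key] = true; }
    void remove(int key) { present[key] = false; }
    boolean contains(int key) { return present[key]; }
}

// Point 2: separate chaining with growable buckets, so no bucket can overflow.
class ChainedHashSet {
    private static final int HASH_SIZE = 78727; // value quoted in the review
    private final List<List<Integer>> buckets = new ArrayList<>(HASH_SIZE);

    ChainedHashSet() {
        // Fill with nulls; each bucket is created on first use to save memory.
        for (int i = 0; i < HASH_SIZE; i++) {
            buckets.add(null);
        }
    }

    // Assumed hash function; requires key >= 0.
    private int hash(int key) { return key % HASH_SIZE; }

    void add(int key) {
        int h = hash(key);
        List<Integer> bucket = buckets.get(h);
        if (bucket == null) {
            bucket = new ArrayList<>();
            buckets.set(h, bucket);
        }
        if (!bucket.contains(key)) {
            bucket.add(key);
        }
    }

    void remove(int key) {
        List<Integer> bucket = buckets.get(hash(key));
        if (bucket == null) return;
        int i = bucket.indexOf(key);
        if (i < 0) return;
        // Swap-with-last removal keeps the bucket contiguous (see point 1).
        int last = bucket.size() - 1;
        bucket.set(i, bucket.get(last));
        bucket.remove(last); // resolves to remove(int index)
    }

    boolean contains(int key) {
        List<Integer> bucket = buckets.get(hash(key));
        return bucket != null && bucket.contains(key);
    }
}
```

The boolean-array version trades a fixed allocation of about 1 MB for the simplest possible code, while the chained version only pays for the buckets that are actually used.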


2 participants