Deprecate deduplicate setting for SOBOL Generator; Set GenerationNode.should_deduplicate=True by default
#4887
+52
−403
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Summary:
Previously, we had dedup settings on both
GeneratorandGenerationNodelevel, which is unnecessary and error-prone for accidentally using 2 different dedup settings at the 2 levels (see D92094386)Given that only
SobolGeneratorhas special dedup logic, we've decided to deprecatededuplicatesetting inGeneratorand only rely onGenerationNodelevel deduplication.Furthermore, since we generally expect dedup to be True for OSS use cases and AutoML use cases, we set
GenerationNode.should_deduplicate=Trueby default while explicitly overwriting GNode dedup to False for online use cases.One Potential Issue
Currently
RandomAdapter._gensolely relies on SOBOL generator to dedup (by passing ingenerated_pointsto sobol generator and thenrejection_sample- https://fburl.com/code/zmqfhn8g) and if we remove that, anyone that's using standaloneRandomAdapter(without a GNode) may get duplicated trials if they callRandomAdapter.genmultiple times. This is partially mitigated by advancinginit_positionon sobol generator -- there shouldn't be duplicated points for continuous parameters, but if there are discrete parameters we can be producing duplicated points after rounding. The recommended path is through GenerationStrategy/GenerationNode, where dedup is properly handled. But there's no enforcement on never using standalone RandomAdapter. (ps.TorchAdapterpasses pending_observations through to the BoTorch acquisition function as X_pending so it doesn't have the same problem)Differential Revision: D92884352