[Gem] Fixes Bulk Helper update with slice#2907
Open
gitviola wants to merge 1 commit intoelastic:mainfrom
Open
Conversation
|
💚 CLA has been signed |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This fixes a bug in
Elasticsearch::Helpers::BulkHelper#updatewhensliceis used.BulkHelper#updateis meant to split a set of documents into smaller batches and send one bulk update request per slice. In that sliced code path, it ended up passing already-built bulk action hashes back intoupdateinstead of the original documents. That produced malformed update payloads and also mutated the caller's input.When this happens
This only affects calls to
BulkHelper#updatethat use thesliceoption.For example:
Calls to
BulkHelper#updatewithoutsliceare not affected.What the broken behavior looked like
Before this change, the sliced path would:
updateas if they were still the original documentsOn that second pass,
updateexpects documents shaped like:but it was actually receiving hashes shaped like bulk actions.
That caused two problems:
_idbecamenilIn practice Elasticsearch would reject the request with a
400, for example:I reproduced this with a new test (now passing after the fix).
There was also an extra side effect: the helper removed
'id'from the caller's documents, so the input array was being mutated.Fix
This change keeps the sliced path working with the original documents and only builds the bulk update actions for each slice right before sending the request.
It also duplicates each document before removing
'id', so the caller's input is left unchanged.Validation
I added integration coverage for sliced bulk updates in the existing helper spec and verified that:
idvalues after the callI also ran:
bundle exec rake test:unitTEST_ES_SERVER='https://localhost:9200' ELASTIC_PASSWORD='changeme' bundle exec rspec spec/integration/helpers/bulk_helper_spec.rbTEST_ES_SERVER='https://localhost:9200' ELASTIC_PASSWORD='changeme' bundle exec rake test:all