Skip to content

[Gem] Fixes Bulk Helper update with slice#2907

Open
gitviola wants to merge 1 commit intoelastic:mainfrom
gitviola:bulk-helper-sliced-update
Open

[Gem] Fixes Bulk Helper update with slice#2907
gitviola wants to merge 1 commit intoelastic:mainfrom
gitviola:bulk-helper-sliced-update

Conversation

@gitviola
Copy link
Copy Markdown

@gitviola gitviola commented Mar 25, 2026

Summary

This fixes a bug in Elasticsearch::Helpers::BulkHelper#update when slice is used.

BulkHelper#update is meant to split a set of documents into smaller batches and send one bulk update request per slice. In that sliced code path, it ended up passing already-built bulk action hashes back into update instead of the original documents. That produced malformed update payloads and also mutated the caller's input.

When this happens

This only affects calls to BulkHelper#update that use the slice option.

For example:

bulk_helper.update(docs, slice: 100)

Calls to BulkHelper#update without slice are not affected.

What the broken behavior looked like

Before this change, the sliced path would:

  1. Build the bulk update actions
  2. Split those actions into slices
  3. Feed those action hashes back into update as if they were still the original documents

On that second pass, update expects documents shaped like:

{ 'id' => '123', 'field' => 'value' }

but it was actually receiving hashes shaped like bulk actions.

That caused two problems:

  • the generated _id became nil
  • the bulk update payload was wrapped a second time incorrectly

In practice Elasticsearch would reject the request with a 400, for example:

[400] {"error":{"root_cause":[{"type":"action_request_validation_exception","reason":"Validation Failed: 1: id is missing;"}],"type":"action_request_validation_exception","reason":"Validation Failed: 1: id is missing;"},"status":400}

I reproduced this with a new test (now passing after the fix).

There was also an extra side effect: the helper removed 'id' from the caller's documents, so the input array was being mutated.

Fix

This change keeps the sliced path working with the original documents and only builds the bulk update actions for each slice right before sending the request.

It also duplicates each document before removing 'id', so the caller's input is left unchanged.

Validation

I added integration coverage for sliced bulk updates in the existing helper spec and verified that:

  • sliced updates succeed
  • the expected documents are updated
  • the caller's documents still retain their id values after the call

I also ran:

  • bundle exec rake test:unit
  • TEST_ES_SERVER='https://localhost:9200' ELASTIC_PASSWORD='changeme' bundle exec rspec spec/integration/helpers/bulk_helper_spec.rb
  • TEST_ES_SERVER='https://localhost:9200' ELASTIC_PASSWORD='changeme' bundle exec rake test:all

  • Contributor Agreement signed

@cla-checker-service
Copy link
Copy Markdown

cla-checker-service bot commented Mar 25, 2026

💚 CLA has been signed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant