Implement packing overflow and grouping #228

Open
voegtlel wants to merge 4 commits into develop from feature/packing_overflow

voegtlel (Collaborator) commented May 6, 2026

Packing Overflow

select_samples_to_pack can now return samples to the packing buffer by returning a PackedSamplesOutput with a pushback list:

class MyTE(TaskEncoder):
    def select_samples_to_pack(
        self, samples: list[T_encoded_sample]
    ) -> list[list[T_encoded_sample]] | PackedSamplesOutput[T_encoded_sample]:
        ...
        return PackedSamplesOutput(
            packs=[samples[:10], samples[10:-10]],
            pushback=samples[-10:],
        )
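
As a rough illustration of how a selection function might decide what to push back, the sketch below greedily fills packs up to a length budget and returns the trailing, still-open pack as pushback. It is a standalone toy, not energon code: `PackResult`, `greedy_pack`, and `max_len` are hypothetical names, and each sample is represented only by its length.

```python
from dataclasses import dataclass, field


@dataclass
class PackResult:
    """Hypothetical stand-in for PackedSamplesOutput (not the energon class)."""
    packs: list[list[int]] = field(default_factory=list)
    pushback: list[int] = field(default_factory=list)


def greedy_pack(lengths: list[int], max_len: int) -> PackResult:
    """Fill packs in order; push the last, still-open pack back to the buffer."""
    result = PackResult()
    current: list[int] = []
    for length in lengths:
        if length > max_len:
            raise ValueError(f"sample of length {length} exceeds max_len={max_len}")
        if sum(current) + length > max_len:
            # The current pack cannot take this sample: emit it and start a new one.
            result.packs.append(current)
            current = []
        current.append(length)
    # Whatever is left was never closed; return it for a later packing round.
    result.pushback = current
    return result


result = greedy_pack([4, 3, 5, 2, 6, 1], max_len=8)
print(result.packs)     # [[4, 3], [5, 2]]
print(result.pushback)  # [6, 1]
```

Pushing the open pack back, rather than emitting it half-empty, gives it a chance to be completed by samples that arrive in the buffer later.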

Packing Grouping

In a MetadatasetV2, specify a packing_group; all datasets with the same packing group are packed in the same packing buffer.
New flow: [(blend) → (shuffle) → (select_samples_to_pack) → (postencode) → (pack_selected_samples)](per group) → (blend). The bracketed stages run once per group, and the per-group outputs are then blended. Without grouping, this simplifies to (blend) → (shuffle) → (select_samples_to_pack) → (postencode) → (pack_selected_samples).

__module__: megatron.energon
__class__: MetadatasetV2
splits:
  train:
    blend:
      - path: ds1
        group: Alpha
      - path: ds2
        group: Alpha
      - path: ds3
        group: Beta
      - path: ds4

This would create three packing buffers:

  • Buffer Alpha: contains samples from ds1 + ds2
  • Buffer Beta: contains only samples from ds3
  • Buffer None (=default): contains samples from ds4
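
The mapping from blend entries to buffers can be pictured as a simple group-by on the `group` key. The following sketch is not energon code: `blend` and `buffers_by_group` are illustrative names, and the entries mirror the config above as plain dicts.

```python
from collections import defaultdict

# Blend entries mirroring the YAML above; ds4 has no group and falls into None.
blend = [
    {"path": "ds1", "group": "Alpha"},
    {"path": "ds2", "group": "Alpha"},
    {"path": "ds3", "group": "Beta"},
    {"path": "ds4"},
]


def buffers_by_group(entries):
    """Collect dataset paths per packing group (None = default group)."""
    buffers = defaultdict(list)
    for entry in entries:
        # dict.get returns None when the entry has no "group" key.
        buffers[entry.get("group")].append(entry["path"])
    return dict(buffers)


print(buffers_by_group(blend))
# {'Alpha': ['ds1', 'ds2'], 'Beta': ['ds3'], None: ['ds4']}
```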

To configure the buffer sizes:

get_train_dataset(..., packing_buffer_size={"Alpha": 100, "Beta": None, None: 10}, shuffle_buffer_size={"Alpha": 100, "Beta": None, None: 10})

This means: Alpha has a packing buffer size of 100, Beta does no packing, and the default group (None) has a packing buffer size of 10. The shuffle buffer sizes follow the same scheme.
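
One way to read such a per-group mapping is sketched below. `resolve_buffer_size` is a hypothetical helper, not an energon function, and treating a missing group as an error is an assumption rather than something this PR states.

```python
from typing import Optional

# Per-group sizes as in the call above; None as a value means "no packing".
packing_buffer_size = {"Alpha": 100, "Beta": None, None: 10}


def resolve_buffer_size(sizes: dict, group: Optional[str]) -> Optional[int]:
    """Look up the buffer size for a group; the None key is the default group."""
    # Assumption: a group missing from the mapping is a configuration error.
    if group not in sizes:
        raise KeyError(f"no buffer size configured for group {group!r}")
    return sizes[group]


for group in ("Alpha", "Beta", None):
    size = resolve_buffer_size(packing_buffer_size, group)
    print(group, "->", "packing disabled" if size is None else f"buffer of {size}")
```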

@voegtlel voegtlel requested a review from philipp-fischer May 6, 2026 08:33