Fix/field ordering bug #8

Merged
nicosuave merged 2 commits into sidequery:main from
hentzthename:fix/field-ordering-bug
Jan 25, 2026

Conversation

@hentzthename
Contributor

Hi Nico, I think I found a bug while using this package.

Problem

When loading JSON data (where field order isn't guaranteed), subsequent loads could fail with:

ValueError: Target schema's field names are not matching the table's field names: ['a', 'b', 'c'], ['c', 'b', 'a']

This occurs because PyArrow's table.cast() matches fields by position, not by name, so the cast fails even though all field names and types match.

Solution

Reorder the source table's columns to match the target schema's order before casting.

Changes

  • tests/test_schema_casting.py: Add test for field ordering
  • src/dlt_iceberg/schema_casting.py: Add column reordering before table.cast() in cast_table_safe()

Add regression test that demonstrates the field ordering bug where
cast_table_safe fails when source table fields are in a different
order than the target schema, even when all field names and types match.

Reorder source table columns to match target schema order before
casting. PyArrow's cast() matches fields by position, not name,
so tables with different field ordering would fail even when all
field names and types matched.

@nicosuave
Member

Much appreciated @hentzthename

@nicosuave merged commit ebcccd5 into sidequery:main on Jan 25, 2026
1 check passed