fix: add depth limit to get_sample_values for nested struct columns #9383
Draft
weiguangli-io wants to merge 2 commits into marimo-team:main
Deeply nested Polars Struct/List columns caused exponential blowup in dataset registration because `to_primitive()` recursively serialized without a depth cap. Add a `MAX_NESTING_DEPTH=5` limit that falls back to `str()` for deeply nested values, preventing the pathological slowdown described in marimo-team#9378.
Closes marimo-team#9378
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This pull request was authored by a coding agent.
Summary
- Add a `MAX_NESTING_DEPTH=5` limit to the `to_primitive()` helper inside `NarwhalsTableManager.get_sample_values()`, preventing exponential blowup when serializing deeply nested Polars Struct/List columns during dataset registration
- Once the limit is reached, fall back to `str()` for the remaining nested value instead of recursing further
Context
Named Polars DataFrames with deeply nested struct columns become extremely slow when registered as datasets (#9378). The root cause is `to_primitive()` recursively stringifying nested Python list/dict values without a depth cap, which becomes pathological for recursive struct/list payloads.
With this fix, even depth 20+ completes in <1ms.
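For readers without the diff at hand, the idea of the depth cap can be sketched roughly as follows. This is a minimal standalone illustration, not the code in this PR: only the names `to_primitive` and `MAX_NESTING_DEPTH` and the `str()` fallback come from the description above. The function signature, the exact level at which the cap triggers, and the `make_nested` helper are assumptions made for the example.

```python
from typing import Any

MAX_NESTING_DEPTH = 5  # value taken from the PR description


def to_primitive(value: Any, depth: int = 0) -> Any:
    """Convert a nested value to plain primitives, capping recursion depth.

    Hypothetical signature: the real helper lives inside
    NarwhalsTableManager.get_sample_values() and may differ.
    """
    if isinstance(value, (list, dict)) and depth >= MAX_NESTING_DEPTH:
        # Past the cap: stringify the remaining subtree once
        # instead of recursing into it.
        return str(value)
    if isinstance(value, list):
        return [to_primitive(item, depth + 1) for item in value]
    if isinstance(value, dict):
        return {str(k): to_primitive(v, depth + 1) for k, v in value.items()}
    if value is None or isinstance(value, (int, float, str, bool)):
        return value
    # Anything else (dates, decimals, ...) is shown as its string form.
    return str(value)


def make_nested(levels: int) -> dict:
    """Build a dict wrapped `levels` times (illustrative helper only)."""
    value: dict = {"leaf": 1}
    for _ in range(levels):
        value = {"child": value}
    return value


shallow = to_primitive({"a": [1, 2, {"b": None}]})
deep = to_primitive(make_nested(20))
```

In this sketch, values nested fewer than `MAX_NESTING_DEPTH` levels come back structurally unchanged, while anything deeper is replaced by a single string, so the amount of recursive work is bounded regardless of how deep the input goes.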
Test plan
- `get_sample_values` tests pass
Closes #9378
🤖 Generated with Claude Code