fix(mcp): ASCII chart crashes with NaN when dataset contains null values#39916
fix(mcp): ASCII chart crashes with NaN when dataset contains null values#39916aminghadersohi wants to merge 7 commits intoapache:masterfrom
Conversation
When a dataset contains NULL values they become float NaN after numeric conversion. The horizontal and vertical bar chart renderers passed NaN into int(), raising ValueError. Likewise the line chart extractor would include NaN in the value list, corrupting min/max calculations. Three-layer fix: 1. Filter NaN during value extraction in _generate_ascii_bar_chart and _extract_time_series_data so NaN rows are silently skipped. 2. Add _is_nan_value() guard before int() in _generate_horizontal_bar_chart (bar_length) and _generate_vertical_bar_chart (bar_height) as defence-in-depth; NaN bar size is treated as 0 (no bar drawn). Fixes: ValueError: cannot convert float NaN to integer
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #39916 +/- ##
==========================================
- Coverage 63.88% 63.87% -0.02%
==========================================
Files 2583 2583
Lines 136604 136629 +25
Branches 31502 31506 +4
==========================================
Hits 87276 87276
- Misses 47812 47837 +25
Partials 1516 1516
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Vertical bar chart (chosen when avg label length <= 8) truncates labels to 3 characters, so 'Alpha' never appears literally. Use longer labels (> 8 chars average) to force horizontal layout, where the full label is preserved up to 15 characters.
There was a problem hiding this comment.
Pull request overview
Fixes MCP ASCII chart preview rendering crashes when datasets contain NULLs that become NaN during numeric conversion, by filtering NaN values during data extraction and guarding the int() bar-size conversion paths. Adds unit tests to exercise bar/column/line/timeseries behavior with NaN and None inputs.
Changes:
- Filter out
NaNnumeric values during bar and time-series data extraction. - Add defense-in-depth checks to avoid
int(float('nan'))when computing bar sizes. - Add unit tests covering chart generation with
NaN/Nonevalues.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
superset/mcp_service/chart/ascii_charts.py |
Filters NaN during extraction and adds guards before converting normalized values to int() to prevent crashes. |
tests/unit_tests/mcp_service/chart/test_ascii_charts.py |
Adds unit tests for NaN/None handling across several ASCII chart types. |
Assert the exact fallback message 'No numeric data found for bar chart' instead of just checking len(result) > 0, which would pass even on unrelated error strings.
- test_horizontal_bar_chart_nan_rows_are_skipped: replace weak `or` assertion with separate `assert "Alpha" in result`, `assert "Gamma" in result`, and `assert "Beta" not in result`, making the test deterministic and verifying the NaN row is actually excluded - test_bar_chart_with_all_null_values_returns_fallback: add `isinstance(result, str)` and `assert "█" not in result` to explicitly verify no bar content is rendered in the fallback path
Add `assert "Horizontal Bar Chart" in result` to make it explicit that the horizontal renderer is chosen (avg label length 14 > 8 threshold). Horizontal mode preserves full label text; vertical would truncate to 3 chars, invalidating the "Alpha"/"Gamma" presence assertions.
…derers bool is a subclass of int, so isinstance(True, (int, float)) returns True. Without an explicit bool guard the extractor would lock onto a boolean column (is_active, flag, etc.) and ignore the real numeric metric, producing 0/1 bars instead of actual values. Add not isinstance(val, bool) to the numeric guard in both _generate_ascii_bar_chart and _extract_time_series_data.
Code Review Agent Run #f8215aActionable Suggestions - 0Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
…trengthen tests - Replace old `val != val` NaN idiom and add bool guard in _extract_scatter_data to match the pattern used in bar/line extractors - Add scatter chart NaN and boolean test coverage - Remove redundant bar fallback assertion; strengthen bool-skip negative check
✅ Deploy Preview for superset-docs-preview ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
SUMMARY
When a dataset contains NULL values, they are represented as
float('nan')after numeric conversion. The ASCII bar chart renderers passed NaN directly intoint(), raisingValueError: cannot convert float NaN to integer.Root cause:
_generate_ascii_bar_chartand_extract_time_series_dataacceptedfloat('nan')values through theisinstance(val, (int, float))check, which returnsTruefor NaN. The NaN then propagated intomax()/ normalization calculations, causing the crash atint(normalized * max_bar_width).Three-layer fix:
_generate_ascii_bar_chartand_extract_time_series_datanow call_is_nan_value()(already defined in the module) before accepting a numeric value, silently skipping NaN rows._generate_horizontal_bar_chartand_generate_vertical_bar_chartcheck_is_nan_value(normalized)before theint()call, treating NaN bar size as 0 (no bar drawn).BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
N/A — ASCII chart output; crash prevented.
TESTING INSTRUCTIONS
get_chart_previewtool.ValueError: cannot convert float NaN to integer.Unit tests added in
tests/unit_tests/mcp_service/chart/test_ascii_charts.pycovering bar, column, line, and timeseries bar chart types with NaN andNonevalues.ADDITIONAL INFORMATION