Skip to content

fix(mcp): get_chart_sql drops x_axis on echarts_timeseries_* and only renders one query for mixed_timeseries#39865

Draft
aminghadersohi wants to merge 3 commits intoapache:masterfrom
aminghadersohi:aminghadersohi/fix-get-chart-sql-x-axis-mixed-timeseries
Draft

fix(mcp): get_chart_sql drops x_axis on echarts_timeseries_* and only renders one query for mixed_timeseries#39865
aminghadersohi wants to merge 3 commits intoapache:masterfrom
aminghadersohi:aminghadersohi/fix-get-chart-sql-x-axis-mixed-timeseries

Conversation

@aminghadersohi
Copy link
Copy Markdown
Contributor

Summary

Fixes two bugs in get_chart_sql introduced after #38700:

Bug 1 — x_axis dropped for echarts_timeseries_*:
For XY/timeseries charts the temporal column is stored in form_data["x_axis"], not in groupby. When _build_query_context_from_form_data built the query dict it only used groupby, so the time axis column was silently missing from the generated SQL.

Bug 2 — mixed_timeseries renders only one query:
mixed_timeseries has two independent series layers (primary: metrics/groupby, secondary: metrics_b/groupby_b). Previously only the primary query was built and passed to QueryContextFactory, so get_chart_sql returned SQL for only half the chart.

Changes

  • _extract_x_axis_col: new helper that normalises x_axis (plain string or adhoc column dict) to a bare column name.
  • _build_single_query_dict: extracted from an inner function so complexity stays within the C901 limit.
  • _build_mixed_timeseries_secondary: new helper that builds the secondary query dict from metrics_b / groupby_b.
  • _build_query_context_from_form_data: detects echarts_timeseries_* and mixed_timeseries viz types; prepends x_axis to columns; passes queries=[primary, secondary] for mixed_timeseries.
  • New TestExtractXAxisCol and TestBuildQueryContextTimeseriesAndMixed test classes.

Testing

pytest tests/unit_tests/mcp_service/chart/tool/test_get_chart_sql.py -v

… renders one query for mixed_timeseries

- Add `_extract_x_axis_col` helper to normalise the x_axis field (plain
  string or adhoc column dict) into a bare column name.

- In `_build_query_context_from_form_data`, detect echarts_timeseries_*
  and mixed_timeseries viz types and prepend the x_axis column to the
  query's columns list so the temporal dimension appears in the generated
  SQL (was previously dropped because x_axis is stored separately from
  groupby in form_data).

- For mixed_timeseries, build a second query dict from metrics_b /
  groupby_b and pass queries=[primary, secondary] to QueryContextFactory.
  Previously only the primary query was built, so the secondary series SQL
  was silently lost.

- Add TestExtractXAxisCol and TestBuildQueryContextTimeseriesAndMixed test
  classes covering both bug fixes.
@codecov
Copy link
Copy Markdown

codecov Bot commented May 4, 2026

Codecov Report

❌ Patch coverage is 8.33333% with 33 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.36%. Comparing base (dc1c0f6) to head (213fffd).
⚠️ Report is 4 commits behind head on master.

Files with missing lines Patch % Lines
superset/mcp_service/chart/tool/get_chart_sql.py 8.33% 33 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master   #39865   +/-   ##
=======================================
  Coverage   64.35%   64.36%           
=======================================
  Files        2569     2569           
  Lines      134680   134713   +33     
  Branches    31254    31262    +8     
=======================================
+ Hits        86679    86710   +31     
  Misses      46505    46505           
- Partials     1496     1498    +2     
Flag Coverage Δ
hive 39.66% <8.33%> (?)
mysql 59.91% <8.33%> (-0.03%) ⬇️
postgres 59.99% <8.33%> (-0.03%) ⬇️
presto 41.41% <8.33%> (-0.02%) ⬇️
python 61.53% <8.33%> (+0.01%) ⬆️
sqlite 59.62% <8.33%> (-0.03%) ⬇️
unit 100.00% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes regressions in the MCP get_chart_sql tool’s query-context construction so that SQL rendering matches how Superset Explore builds queries for ECharts timeseries and mixed-timeseries charts.

Changes:

  • Add x_axis extraction and prepend it to query columns for echarts_timeseries_* and mixed_timeseries so the temporal axis is not dropped from rendered SQL.
  • Build and pass two query dicts for mixed_timeseries (primary + secondary) so both series layers’ SQL is returned.
  • Improve chart schema validation ergonomics (relax column-name regex constraints, add richer validation error formatting) and expand unit test coverage around these behaviors.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
superset/mcp_service/chart/tool/get_chart_sql.py Adds helpers to include x_axis in query columns and emits 2 queries for mixed_timeseries when building QueryContextFactory input.
superset/mcp_service/chart/validation/schema_validator.py Formats Pydantic validation errors into more actionable per-field details and deduped suggestions.
superset/mcp_service/chart/schemas.py Relaxes strict regex constraints on column-like fields (e.g., digit-prefixed / locale chars).
superset/mcp_service/chart/tool/generate_chart.py Expands tool docstring with additional request examples for several chart types.
tests/unit_tests/mcp_service/chart/tool/test_get_chart_sql.py Adds unit tests for _extract_x_axis_col and for correct query building for ECharts timeseries + mixed timeseries.
tests/unit_tests/mcp_service/chart/test_chart_schemas.py Adds tests verifying relaxed column-name acceptance and continued blocking of obvious XSS/SQL-injection attempts (where applicable).
Comments suppressed due to low confidence (1)

superset/mcp_service/chart/schemas.py:785

  • FilterConfig.column removed the regex constraint, but sanitize_column still calls sanitize_user_input() without check_sql_keywords=True. As a result, values containing SQL metacharacters/comments (e.g. ';', '--', '/*') and dangerous SQL keywords are no longer rejected at validation time, which weakens the injection protections compared to the previous regex. Consider enabling check_sql_keywords=True (similar to ColumnRef.sanitize_name) so relaxed column naming stays compatible with locale chars while still blocking SQL injection patterns.
    column: str = Field(
        ...,
        min_length=1,
        max_length=255,
        # No regex pattern: sanitize_name() already blocks XSS/SQL injection;
        # many valid column names (digit-prefixed, locale chars, etc.) would
        # be rejected by a strict pattern while posing no security risk.
        # Use get_dataset_info to find exact column names.
        validation_alias=AliasChoices("column", "col"),
    )
    op: Literal[
        "=",
        ">",
        "<",
        ">=",
        "<=",
        "!=",
        "LIKE",
        "ILIKE",
        "NOT LIKE",
        "IN",
        "NOT IN",
    ] = Field(
        ...,
        description="LIKE/ILIKE use % wildcards. IN/NOT IN take a list.",
        validation_alias=AliasChoices("op", "operator", "opr"),
    )
    value: str | int | float | bool | list[str | int | float | bool] = Field(
        ...,
        description="For IN/NOT IN, provide a list.",
        validation_alias=AliasChoices("value", "val"),
    )

    @field_validator("column")
    @classmethod
    def sanitize_column(cls, v: str) -> str:
        """Sanitize filter column name to prevent injection attacks."""
        # sanitize_user_input raises ValueError when allow_empty=False (default)
        # so the return value is guaranteed to be a non-None str
        return sanitize_user_input(v, "Filter column", max_length=255)  # type: ignore[return-value]

Comment thread superset/mcp_service/chart/tool/get_chart_sql.py Outdated
Comment thread superset/mcp_service/chart/schemas.py
Comment thread tests/unit_tests/mcp_service/chart/tool/test_get_chart_sql.py Outdated
…mplexity

- _extract_x_axis_col: also handle adhoc-SQL x_axis dicts (expressionType:
  "SQL") by returning the full dict so the SQL expression is preserved in
  the query context; column_name dicts (expressionType: "SIMPLE") still
  return the bare column name string
- extract _build_single_query_dict and _build_mixed_timeseries_secondary
  as module-level helpers to reduce McCabe complexity of
  _build_query_context_from_form_data below the C901 limit
- add tests for SQL-expression x_axis: _extract_x_axis_col unit tests and
  an integration test through _build_query_context_from_form_data
- fix test docstring: regressions introduced after apache#38700, not apache#39636
@aminghadersohi aminghadersohi force-pushed the aminghadersohi/fix-get-chart-sql-x-axis-mixed-timeseries branch from afddf60 to 213fffd Compare May 4, 2026 19:30
@pull-request-size pull-request-size Bot added size/L and removed size/XL labels May 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants