Skip to content

Add Tableau adapter with fixture coverage#134

Open
nicosuave wants to merge 12 commits intomainfrom
nicosuave/tableau-adapter-research
Open

Add Tableau adapter with fixture coverage#134
nicosuave wants to merge 12 commits intomainfrom
nicosuave/tableau-adapter-research

Conversation

@nicosuave
Copy link
Copy Markdown
Member

This adds a Tableau adapter, hooks Tableau files into directory loading, and checks in a fixture and test suite covering formulas, parsing, joins, and real-world workbooks. It also expands adapter contract coverage and fixes SQL generation so primary and foreign key columns are quoted correctly and Tableau collection-backed models compile as joined SQL with inferred primary keys. Validated with ruff check, ruff format --check, and pytest -v.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0353d27921

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

else:
i = end + 1

result.append(field_name)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Quote field references during formula translation

When _replace_field_refs removes Tableau brackets it appends the raw field text directly, so references like [Extracts Incremented At] or [Parameters].[Parameter 1] become Extracts Incremented At / Parameter 1 in translated SQL. Those are not valid bare identifiers, so translated calculated dimensions/metrics can compile to invalid SQL instead of usable expressions. This is especially visible in the checked-in real-world fixtures that contain many space-containing field names.

Useful? React with 👍 / 👎.

Comment on lines +843 to +844
if sql is None and (" " in name or "-" in name):
sql = f'"{name}"'
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Quote non-identifier raw Tableau dimension names

The raw-dimension fallback only quotes names containing spaces or -, but Tableau columns commonly include other special characters (for example Country/Region or names with parentheses). Those names are left unquoted, so generated SQL can interpret them as operators or invalid syntax (Country/Region as division), breaking queries for otherwise valid imported dimensions.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3705bf12f8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +334 to +336
# Strip // comments before translation (they can contain IF/THEN keywords)
result = _COMMENT_RE.sub("", formula).strip()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Skip // comment stripping inside string literals

The unconditional // removal runs before any string-aware parsing, so formulas containing URL/text literals are truncated and translated into invalid SQL. For example, [prefix] + '://' + [suffix] currently becomes prefix || ': because everything after // is dropped. This will corrupt otherwise valid Tableau calculations whenever // appears inside quoted text, not just in actual comments.

Useful? React with 👍 / 👎.

# Each is (pattern, replacement_func_or_str)
_ZN_RE = re.compile(r"\bZN\s*\(", re.IGNORECASE)
_IFNULL_RE = re.compile(r"\bIFNULL\s*\(", re.IGNORECASE)
_IIF_RE = re.compile(r"\bIIF\s*\(\s*(.+?)\s*,\s*(.+?)\s*,\s*(.+?)\s*\)", re.IGNORECASE)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Parse IIF arguments with balanced parentheses

IIF translation uses a regex that splits arguments on commas without tracking nesting, so any branch expression containing function calls (and therefore commas) produces malformed CASE SQL while still being marked translatable. A case like IIF([x] > 0, DATEADD('day', 1, [d]), [d]) is rewritten into broken SQL, which will fail at query time for common nested Tableau formulas.

Useful? React with 👍 / 👎.

Comment stripping now skips // inside string literals, preventing
'://' in URLs from truncating formulas. IIF translation uses
balanced-paren argument splitting so nested function calls with
commas (e.g. IIF(x, DATEADD('day', 1, d), d)) produce valid SQL.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e6ddfdc557

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +622 to +624
if op.upper() == "AND" and sub_exprs:
# Compound condition: recurse into the first equality
return _extract_join_columns(sub_exprs[0])
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve all predicates from compound join clauses

When a Tableau join clause is op='AND', _extract_join_columns intentionally returns only the first equality and drops the rest, so multi-column joins are rewritten with a single ON condition. This changes join cardinality and can duplicate or mismap rows (for example, server_insights_content.twb has joins on both Content.Id = Alerts.item_id and Content.Type = Alerts.item_type, but only the first predicate survives), which can corrupt downstream metric results.

Useful? React with 👍 / 👎.

Comment on lines +1428 to +1431
if rel_type == "text":
# Custom SQL: use the name attribute as identifier
name = relation_elem.get("name", "")
return (name, [])
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Expand text relations instead of using their display names

For type='text' nodes in join trees, this branch returns only the relation name and discards the embedded SQL body, so _build_join_sql emits FROM <name> / JOIN <name> references that are often undefined tables rather than subqueries. In real Tableau workbooks (e.g., the server_insights_* fixtures), text relations are custom SELECT statements, so this produces non-executable model SQL unless those display names happen to exist as physical tables.

Useful? React with 👍 / 👎.

Comment on lines +167 to +168
if c == string_char and (i + 1 >= len(s) or s[i + 1] != string_char):
in_string = False
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Handle doubled-quote escapes in parenthesis matching

The string-state logic in _find_matching_paren treats the second quote in an escaped pair ('') as the end of the string, which makes the matcher fail on formulas containing apostrophes inside literals. As a result, balanced-function rewrites are skipped for valid formulas like IIF([x]=1, 'O''Reilly', 'no'), leaving Tableau-only syntax unconverted and causing dialect-specific failures later.

Useful? React with 👍 / 👎.

- Compound join clauses (AND) now preserve all predicates in the ON
  condition instead of silently dropping all but the first equality.
- Custom SQL (type="text") relations are wrapped as subqueries with
  the relation name as alias, producing valid FROM/JOIN references.
- Doubled-quote escapes ('') in string literals are handled correctly
  in balanced-paren matching, so IIF with apostrophes works.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bfe34b2327

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +327 to +329
if text[i] == quote and (i + 1 >= len(text) or text[i + 1] != quote):
i += 1
break
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve escaped quotes while stripping line comments

The string scanner in _strip_comments exits a quoted literal when it sees the second quote in an escaped pair (''), so a valid literal like 'O''Reilly // keep' is treated as ending before // and the remainder is dropped as a comment. This corrupts otherwise valid Tableau formulas whenever an escaped apostrophe appears before // in the same string.

Useful? React with 👍 / 👎.

Comment on lines +1441 to +1442
if sql_body and name:
return (f"({sql_body}) AS {name}", [])
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Quote custom SQL relation aliases before emitting JOIN SQL

For type='text' relations, the adapter emits (sql_body) AS {name} with the raw Tableau relation name. Names in real Tableau workbooks frequently include spaces/parentheses (for example Data Sources (Custom SQL)), so this produces invalid SQL aliases and breaks generated model SQL at query time.

Useful? React with 👍 / 👎.

Comment on lines +637 to +641
if len(sub_exprs) >= 2:
left = _strip_brackets(sub_exprs[0].get("op", ""))
right = _strip_brackets(sub_exprs[1].get("op", ""))
if left and right and left != "=" and right != "=":
return [(left, right)]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Do not coerce non-equality join predicates to '='

The fallback path in _extract_join_columns returns a column pair even when the expression operator is not =, but downstream join SQL always renders pairs as left = right. If a workbook uses <, >, <=, >=, or <> join conditions, this silently rewrites the predicate and changes join semantics, leading to incorrect results.

Useful? React with 👍 / 👎.

- _strip_comments and _find_matching_paren now skip doubled-quote
  escapes ('') so strings like 'O''Reilly // keep' are preserved.
- Custom SQL (type="text") relation aliases are quoted when they
  contain spaces/parens (e.g. "Data Sources (Custom SQL)").
- Non-equality join predicates (<, >, etc.) no longer silently
  coerced to = in the fallback path of _extract_join_columns.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3d24cec50d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

join_keyword = join.join_type.upper()
parts.append(f"{join_keyword} JOIN {join.right_table_qualified}")
if join.column_pairs:
on_clauses = [f"{lc} = {rc}" for lc, rc in join.column_pairs]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Quote join identifiers before emitting ON clauses

The join SQL builder emits raw column references from Tableau expressions (f"{lc} = {rc}"), but _extract_join_columns only strips brackets and does not make them SQL-safe. In real Tableau files with spaces/parentheses in relation or field names (for example Data Sources (Custom SQL).Datasource ID (underlying) in server_insights_data_connections.twb), this produces invalid SQL like ON data_connections.datasource_id = Data Sources (Custom SQL).Datasource ID (underlying) and breaks query compilation for joined models.

Useful? React with 👍 / 👎.

level_col = _normalize_column_name(level)

if members and level_col:
quoted_members = ", ".join(f"'{m}'" for m in members)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Escape group member literals in segment SQL

Segment SQL is built with "'{m}'" for each Tableau group member, so member values containing apostrophes generate invalid SQL (e.g., Men's becomes IN ('Men's')). This is a common category/value shape in BI data and causes imported segments to fail at query time unless quotes are escaped before interpolation.

Useful? React with 👍 / 👎.

Comment on lines +225 to +226
if c == string_char and (i + 1 >= len(formula) or formula[i + 1] != string_char):
in_string = False
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve escaped-quote literals during field ref rewrite

The string-state check in _replace_field_refs exits a string on the second quote of an escaped pair (''), so bracket text later in the same literal is treated as a field reference. For example, formulas like IIF([a]=1, 'It''s [x]', 'n') are rewritten with [x] replaced as a column, corrupting the literal content and changing calculation semantics whenever an escaped apostrophe appears before bracket text.

Useful? React with 👍 / 👎.

- _replace_field_refs now handles doubled-quote escapes ('') so
  brackets inside strings after escaped apostrophes are preserved.
- Join ON clauses use _quote_column_reference for identifiers with
  spaces/parens (e.g. "Data Sources (Custom SQL)"."Datasource ID").
- Group member values with apostrophes are SQL-escaped (Men's ->
  Men''s) to prevent broken IN clause syntax.
Python 3.11 CI consistently hits 25.2-26.0ms on shared runners,
causing spurious failures. The test guards against regressions,
not absolute performance, so 30ms provides adequate headroom.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 82ce80c48d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +94 to +97
_DATEADD_RE = re.compile(
r"\bDATEADD\s*\(\s*'(\w+)'\s*,\s*(.+?)\s*,\s*(.+?)\s*\)",
re.IGNORECASE,
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Parse DATEADD arguments with balanced parentheses

_DATEADD_RE uses a comma-splitting regex ((.+?), (.+?)) that does not track nested function calls, so valid Tableau formulas with commas inside the amount/date arguments are rewritten into malformed SQL while still marked translatable. For example, DATEADD('day', IFNULL([offset], 1), [start_date]) currently becomes date_add(1, INTERVAL (COALESCE(offset) day), start_date), which breaks query execution for common nested expressions.

Useful? React with 👍 / 👎.

r"\bIF\s+(.+?)\s+THEN\s+(.+?)(?:\s+ELSEIF\s+(.+?)\s+THEN\s+(.+?))*\s+(?:ELSE\s+(.+?)\s+)?END\b",
re.IGNORECASE | re.DOTALL,
)
_CONTAINS_RE = re.compile(r"\bCONTAINS\s*\(\s*(.+?)\s*,\s*(.+?)\s*\)", re.IGNORECASE)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Parse CONTAINS arguments with balanced parentheses

_CONTAINS_RE also splits arguments on commas without balanced-paren awareness, so nested expressions in either argument produce incorrect SQL rewrites. A formula like CONTAINS(IFNULL([name], 'a,b'), 'x') is transformed into COALESCE(name LIKE '%' || 'a,b' || '%', 'x'), changing semantics and yielding invalid predicates for otherwise valid Tableau calculations.

Useful? React with 👍 / 👎.

Both functions now use _split_args_balanced instead of simple comma-
splitting regexes, so nested expressions with commas (e.g.
DATEADD('day', IFNULL([offset], 1), [d])) are correctly parsed
into 3 arguments instead of being split at the inner comma.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0ee710e68c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +671 to +672
left = _strip_brackets(sub_exprs[0].get("op", ""))
right = _strip_brackets(sub_exprs[1].get("op", ""))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Keep literal join operands as literals

This code assumes both sides of op='=' are column references, so literal operands are normalized as if they were identifiers. In real Tableau fixtures such as server_insights_data_connections.twb, joins include <expression op='1' />; this path turns that into "1" downstream, producing ON "1" = ... (identifier) instead of ON 1 = ... (literal), which can fail with column-not-found errors or change join semantics.

Useful? React with 👍 / 👎.

result = _translate_iif(result)

# IF c THEN t ELSE e END -> CASE WHEN c THEN t ELSE e END
result = _IF_THEN_RE.sub(_if_to_case, result)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Translate nested IF blocks recursively

_IF_THEN_RE.sub(_if_to_case, result) runs only once over non-overlapping matches, so an inner Tableau IF ... THEN ... END inside an outer IF remains unconverted. For example, IF [x]=1 THEN IF [y]=2 THEN 'a' ELSE 'b' END ELSE 'c' END is emitted with an inner raw IF, yet the adapter still marks it translatable, leading to invalid SQL when those calculated fields are queried.

Useful? React with 👍 / 👎.

- _quote_column_reference now passes numeric literals (e.g. 1, 3.14)
  through unquoted instead of wrapping them as identifiers ("1").
- IF/THEN/ELSE/END translation runs in a loop so nested IF blocks
  (IF x THEN IF y THEN ... END ELSE ... END) are fully converted
  to nested CASE WHEN expressions.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b72f50d251

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

i += 1
break
else:
result.append(text[i])
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Escape apostrophes when converting double-quoted strings

In _convert_double_quotes, characters from Tableau double-quoted literals are copied into a single-quoted SQL literal without escaping ', so formulas like "O'Reilly" are translated to invalid SQL ('O'Reilly'). This breaks parsing/execution for any translated calculation containing apostrophes in double-quoted Tableau strings.

Useful? React with 👍 / 👎.

if text[i] == '"':
if i + 1 < len(text) and text[i + 1] == '"':
# Escaped double quote -> escaped single quote
result.append("''")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve escaped double quotes in converted string literals

The double-quote escape branch rewrites Tableau "" to '' inside _convert_double_quotes, which changes the literal value rather than preserving an embedded ". For example, "He said ""Hi""" becomes a SQL literal containing apostrophes instead of double quotes, silently altering metric/dimension semantics even when the SQL remains syntactically valid.

Useful? React with 👍 / 👎.

second_table = obj_map.get(second_ep.get("object-id", ""), "") if second_ep is not None else ""

if first_table and second_table and pairs:
left_col, right_col = pairs[0]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Retain all join keys from object-graph relationships

_extract_join_columns now returns multiple predicates for compound expressions, but _parse_object_graph immediately truncates to pairs[0]. Any Tableau logical-layer relationship defined with multiple key predicates (e.g., composite-key AND joins) is reduced to a single-column join in _build_collection_sql, which can change join cardinality and produce incorrect query results.

Useful? React with 👍 / 👎.

- _convert_double_quotes now escapes apostrophes inside Tableau
  double-quoted strings ("O'Reilly" -> 'O''Reilly') and preserves
  escaped double quotes ("" -> literal ") instead of converting
  them to single-quote escapes.
- _ObjectGraphJoin stores full column_pairs list instead of single
  pair, so composite-key joins from object-graph relationships
  emit all ON predicates.
- Nested IF/THEN/ELSE/END blocks are translated recursively.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0ddf8361d9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +425 to +426
result = re.sub(r"('\s*)\+(\s*)", r"\1||\2", text)
result = re.sub(r"(\s*)\+(\s*')", r"\1||\2", result)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Detect non-literal string concatenation before leaving '+'

_convert_string_concat only rewrites + when a single-quoted literal is directly adjacent, so string expressions like STR([Burst Out Join]) + STR([Calculation_...]) (present in tests/fixtures/tableau/real_world/document_api_nested.tds) are translated to CAST(...) + CAST(...) instead of using ||. In DuckDB/Postgres-style dialects this is a type error for text operands, so otherwise translatable calculated fields fail at query time.

Useful? React with 👍 / 👎.

_convert_string_concat now also matches CAST(... AS VARCHAR)
adjacent to + operators, so STR([x]) + STR([y]) produces
CAST(x AS VARCHAR) || CAST(y AS VARCHAR) instead of using +
which fails in DuckDB/Postgres for text operands.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f3d4c2ea69

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

result = re.sub(r"(\s*)\+(\s*')", r"\1||\2", result)
# CAST(... AS VARCHAR) + ... or ... + CAST(... AS VARCHAR)
result = re.sub(r"(AS\s+VARCHAR\)\s*)\+(\s*)", r"\1||\2", result, flags=re.IGNORECASE)
result = re.sub(r"(\s*)\+(\s*CAST\()", r"\1||\2", result, flags=re.IGNORECASE)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Restrict '+' concat rewrite to VARCHAR casts

This rewrite converts any + CAST(...) pattern to ||, even when the cast is numeric (for example after translating INT()/FLOAT()). That changes numeric addition semantics and can produce type errors in generated SQL (a + CAST(b AS INTEGER) becomes a || CAST(b AS INTEGER)). The bug triggers on valid Tableau arithmetic formulas that include casted numeric operands.

Useful? React with 👍 / 👎.

Comment on lines +1001 to +1003
else:
# Default to dimension
return self._build_dimension(col_name, datatype, sql_expr, caption, hidden, metadata)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid emitting raw Tableau syntax for untranslatable dimensions

When formula translation fails (for LOD/table-calc constructs), the raw Tableau formula is preserved in sql_expr, and dimensions still get built with that value as executable SQL. Selecting those dimensions later will emit Tableau-only syntax like {FIXED ...} into warehouse queries, causing runtime SQL parse failures instead of a safe non-queryable fallback.

Useful? React with 👍 / 👎.

Comment on lines +1058 to +1062
# Passthrough or untranslatable: make derived metric
return Metric(
name=name,
type="derived",
sql=sql or name,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid emitting raw Tableau syntax for untranslatable metrics

For metrics marked untranslatable, this branch still stores sql as the original Tableau formula (sql or name) while declaring a derived metric. Query generation will then inline unsupported Tableau expressions (for example LOD/table-calc formulas) into SQL, leading to execution failures whenever those imported metrics are queried.

Useful? React with 👍 / 👎.

- _convert_string_concat no longer matches + CAST(...) generically;
  only AS VARCHAR) + triggers the rewrite, preserving INT/FLOAT
  arithmetic.
- Untranslatable formulas (LOD, table calcs) now set public=False
  and sql='NULL' to prevent raw Tableau syntax from being emitted
  in generated SQL queries.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 00c190909c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +1573 to +1575
parts.append(f"{join_keyword} JOIN {join.right_table_qualified}")
if join.column_pairs:
on_clauses = [
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Require an ON predicate when emitting parsed JOIN SQL

_extract_join_columns only extracts pairs for =/AND, so valid Tableau join clauses using other operators or structures leave join.column_pairs empty. _build_join_sql still emits ... JOIN <table> and simply skips the ON clause in that case, which degrades to a cartesian join (or dialect error) and silently changes row counts instead of preserving join semantics or failing fast.

Useful? React with 👍 / 👎.

for table_name, qualified_table in collection_info.tables:
if table_name in connected or table_name == base_table_name:
continue
join_clauses.append(f"CROSS JOIN {_quote_table_reference(qualified_table)} AS {alias_by_table[table_name]}")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Avoid CROSS JOIN fallback for unconnected collection tables

When _build_collection_sql cannot connect all tables via parsed object-graph edges, it force-adds each remaining table with CROSS JOIN. Since edge extraction is intentionally partial, this fallback can introduce large cartesian products and corrupt metrics for collection-backed models rather than signaling that relationship reconstruction was incomplete.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant