Skip to content

Conversation

@a-cong
Copy link

@a-cong a-cong commented Dec 3, 2025

What problem does this PR solve?

Issue Number: close #12434

What is changed and how it works?

[[sink.dispatchers]]
matcher = ["source_db1.*"]
schema = "dest_db1"
table = "{table}"

[[sink.dispatchers]]
matcher = ["source_db2.*"]
schema = "source_db2"
table = "{table}"
  • The Dispatcher interface routes the given sourceSchema, sourceTable to targetSchema, targetTable. SinkRouter wraps the dispatchers
  • Added TargetSchema and TargetTable fields to TableName struct in cdc/model/sink.go, which are populated when the sink router is defined. This is handled in 2 ways - for tables that already exist when the changefeed is created/started, the fields are added to the existing snapshot of the schema. For tables that are created later, the fields are added to the new table before the new snapshot is saved
  • Use TargetSchema and TargetTable for the USE statement in mysql_ddl_sink.go, and in new method QuoteSinkString() which is called to construct mysql DMLs in pkg/sqlmodel/multirow.go and pkg/sqlmodel/row_change.go
  • Use FetchDDLTables and RenameDDLTable from the dm parser pkg to rewrite DDLs after they are constructed if the schema router is specified. FetchDDLTables returns all of the tables referenced in the DDL in the order that they appear in the DDL. We then use the schema router to map each source schema to its target schema and pass this list of tables to RenameDDLTable

Check List

Tests

  • Unit test
  • Manual test - tested in our local development environment

Questions

Will it cause performance regression or break compatibility?

No, the additional latency for DDL rewrites should be minimal compared to barrier ts, and there should not be compatibility issues since this is a new feature

Do you need to update user documentation, design documentation or monitoring documentation?

Yes, this is a new feature for mysql sinks, will need to update https://docs.pingcap.com/tidb/stable/ticdc-sink-to-mysql/

Release note

Support schema and table routing for mysql-compatible sinks by extending the `dispatchers` config

@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Dec 3, 2025

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Dec 3, 2025

This cherry pick PR is for a release branch and has not yet been approved by triage owners.
Adding the do-not-merge/cherry-pick-not-approved label.

To merge this cherry pick:

  1. It must be approved by the approvers firstly.
  2. AFTER it has been approved by approvers, please wait for the cherry-pick merging approval from triage owners.
Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot ti-chi-bot bot added do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. affect-ticdc-config-docs Pull requests that affect TiCDC configuration docs. contribution This PR is from a community contributor. first-time-contributor Indicates that the PR was contributed by an external member and is a first-time contributor. needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. labels Dec 3, 2025
@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Dec 3, 2025

Hi @a-cong. Thanks for your PR.

I'm waiting for a pingcap member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Dec 3, 2025

Welcome @a-cong!

It looks like this is your first PR to pingcap/tiflow 🎉.

I'm the bot to help you request reviewers, add labels and more, See available commands.

We want to make sure your contribution gets all the attention it needs!



Thank you, and welcome to pingcap/tiflow. 😃

@pingcap-cla-assistant
Copy link

pingcap-cla-assistant bot commented Dec 3, 2025

CLA assistant check
All committers have signed the CLA.

@ti-chi-bot ti-chi-bot bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Dec 3, 2025
@a-cong a-cong force-pushed the sink-dispatcher-routing branch from 213bbd7 to 676be0e Compare December 4, 2025 20:21
sourceTables, err := dmparser.FetchDDLTables(defaultSchema, stmts[0], conn.LCTableNamesSensitive)
if err != nil {
log.Warn("failed to fetch source tables for sink routing, keeping original query",
zap.String("query", d.Query), zap.Error(err))
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to fall back here with the warning or explicitly return an error?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use an explicit error to avoid data inconsistency.

if err != nil {
// If rewriting fails, keep the original query
log.Warn("failed to rewrite DDL for sink routing, keeping original query",
zap.String("query", d.Query), zap.Error(err))
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question here

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements schema and table routing for MySQL-compatible sinks by extending the dispatchers configuration field, which was previously only used for MQ sinks (Kafka/Pulsar topic and partition routing). The implementation adds new TargetSchema and TargetTable fields to the TableName struct to maintain separation between source (for pullers) and target (for sinks) schemas and tables.

Key Changes:

  • Extended DispatchRule configuration with SchemaRule and TableRule fields supporting template expressions like {schema}, {table}, and static values
  • Implemented SinkRouter and Dispatcher abstractions for schema/table name transformation
  • Added DDL query rewriting using DM parser's FetchDDLTables and RenameDDLTable functions to maintain correct table reference ordering

Reviewed changes

Copilot reviewed 29 out of 29 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
pkg/config/sink.go Added SchemaRule and TableRule fields to DispatchRule, updated comments, added ValidateRoutingExpression function
pkg/config/replica_config.go Added validation for sink routing rules, ensuring they're only used with MySQL sinks
pkg/config/replica_config_test.go Added comprehensive tests for routing validation
cdc/sink/dispatcher/dispatcher.go Implemented Dispatcher interface and DynamicSchemaDispatcher with expression substitution
cdc/sink/dispatcher/dispatcher_test.go Added extensive unit tests for dispatcher transformations
cdc/sink/dispatcher/sink_router.go Implemented SinkRouter to manage routing rules with filter matching
cdc/sink/dispatcher/sink_router_test.go Added comprehensive tests for router behavior including case sensitivity
cdc/model/sink.go Added TargetSchema/TargetTable fields, QuoteSinkString() method, and DDL routing logic via applySinkRouting()
cdc/model/sink_test.go Added extensive tests for routing including DDL rewriting edge cases
cdc/entry/schema_storage.go Integrated SinkRouter, added routing application for both initial snapshots and new tables
cdc/entry/schema_storage_test.go Added integration tests verifying routing on pre-existing and newly created tables
cdc/entry/mounter_test.go Added tests verifying DML events have correct TargetSchema/TargetTable
pkg/sqlmodel/row_change.go Changed to use QuoteSinkString() instead of QuoteString() for SQL generation
pkg/sqlmodel/row_change_test.go Added comprehensive tests for schema, table, and combined routing in DML generation
pkg/sqlmodel/multirow.go Changed to use QuoteSinkString() for multi-row operations
pkg/sqlmodel/multirow_test.go Added tests for multi-row DML with routing
cdc/sink/ddlsink/mysql/mysql_ddl_sink.go Updated USE statement to use TargetSchema when set
cdc/model/changefeed.go Modified rmMQOnlyFields() to preserve dispatch rules with schema/table routing
cdc/api/v2/model.go Added SchemaRule and TableRule to API model's DispatchRule
cdc/processor/processor.go Integrated SinkRouter creation in DDL handler initialization
cdc/owner/changefeed.go Integrated SinkRouter creation in schema storage initialization
Various test files Updated function signatures to pass sinkRouter parameter to NewSchemaStorage

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 1008 to 1015
IndexName string `json:"index,omitempty"`
Columns []string `json:"columns,omitempty"`
TopicRule string `json:"topic,omitempty"`
SchemaRule string `json:"schema,omitempty"`
TableRule string `json:"table,omitempty"`
}
Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment for DispatchRule is outdated and doesn't reflect the extended functionality. It says "represents partition rule for a table" but it now also handles schema/table routing for MySQL sinks. This should be updated to match the comprehensive comment in pkg/config/sink.go which correctly describes both MQ (topic/partition) and MySQL (schema/table routing) use cases.

Copilot uses AI. Check for mistakes.
@wk989898
Copy link
Collaborator

Could you provide a case to demonstrate this feature? It's better to add an integration test to cover it.

@wk989898
Copy link
Collaborator

/ok-to-test

@ti-chi-bot ti-chi-bot bot added ok-to-test Indicates a PR is ready to be tested. and removed needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. labels Dec 11, 2025
@a-cong a-cong force-pushed the sink-dispatcher-routing branch from 676be0e to acded41 Compare December 11, 2025 18:48
@ti-chi-bot ti-chi-bot bot added area/dm Issues or PRs related to DM. area/engine Issues or PRs related to Dataflow Engine. labels Dec 11, 2025
@a-cong
Copy link
Author

a-cong commented Dec 17, 2025

/retest

@a-cong a-cong force-pushed the sink-dispatcher-routing branch from 9a8a32d to fe61fe8 Compare December 17, 2025 20:28
@3AceShowHand 3AceShowHand added area/ticdc Issues or PRs related to TiCDC. and removed area/dm Issues or PRs related to DM. area/engine Issues or PRs related to Dataflow Engine. labels Dec 19, 2025
@@ -0,0 +1,27 @@
-- Test DML operations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It suggested add more DDLs and DMLs mixed queries here, to cover more scenario.

@a-cong a-cong force-pushed the sink-dispatcher-routing branch from ca8c7ad to 769181d Compare December 19, 2025 17:58
@a-cong
Copy link
Author

a-cong commented Dec 19, 2025

/retest

@a-cong a-cong force-pushed the sink-dispatcher-routing branch 5 times, most recently from b594bd3 to d0fb962 Compare December 19, 2025 21:12
For TRUNCATE TABLE, job.TableID is the OLD table ID which no longer
exists in the snapshot after the DDL is applied. The snapshot only
contains the NEW table with a different ID from job.BinlogInfo.TableInfo.ID.

This fix ensures sink routing is applied to the correct (new) table ID
so that DML events after TRUNCATE are routed to the correct destination.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@a-cong a-cong force-pushed the sink-dispatcher-routing branch from d0fb962 to f94f22b Compare December 19, 2025 21:58
@wk989898
Copy link
Collaborator

/test all

@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Dec 23, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: 3AceShowHand, wk989898
Once this PR has been reviewed and has the lgtm label, please assign benmeadowcroft for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Dec 23, 2025
@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Dec 23, 2025

[LGTM Timeline notifier]

Timeline:

  • 2025-12-15 13:57:23.304563268 +0000 UTC m=+1481388.118340840: ☑️ agreed by 3AceShowHand.
  • 2025-12-23 15:01:59.913559215 +0000 UTC m=+2176464.727336807: ☑️ agreed by wk989898.

@wk989898
Copy link
Collaborator

/retest

@wk989898
Copy link
Collaborator

@benmeadowcroft PTAL

@3AceShowHand
Copy link
Contributor

I think this PR should be merged into the master branch first, and then cherry-pick to the release-8.5 branch.

At the moment, it tries to merge into the release-8.5 directly, this is not standard operation.

@a-cong
Copy link
Author

a-cong commented Jan 5, 2026

I think this PR should be merged into the master branch first, and then cherry-pick to the release-8.5 branch.

At the moment, it tries to merge into the release-8.5 directly, this is not standard operation.

Oh, I thought we were not planning to merge this since we aren't upstreaming any more features in this repo - we will merge this change internally, but for upstreaming I will just merge this one when it's ready pingcap/ticdc#3704

@3AceShowHand
Copy link
Contributor

I think this PR should be merged into the master branch first, and then cherry-pick to the release-8.5 branch.
At the moment, it tries to merge into the release-8.5 directly, this is not standard operation.

Oh, I thought we were not planning to merge this since we aren't upstreaming any more features in this repo - we will merge this change internally, but for upstreaming I will just merge this one when it's ready pingcap/ticdc#3704

got it, so let's only review the PR in the ticdc repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

affect-ticdc-config-docs Pull requests that affect TiCDC configuration docs. area/ticdc Issues or PRs related to TiCDC. contribution This PR is from a community contributor. do-not-merge/cherry-pick-not-approved do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. first-time-contributor Indicates that the PR was contributed by an external member and is a first-time contributor. lgtm ok-to-test Indicates a PR is ready to be tested. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants