
water-flower30 commented Nov 23, 2025

Purpose

Linked issue: close #5955

What is the purpose of the change

Avoid blocking on listTables operation

ChangeLog

Remove the catalog.listTables(database) call from SyncDatabaseActionBase.buildEventParserFactory(). The call lists every table in the catalog, which:
Takes a lot of time and is costly when the catalog holds many tables
Consumes unnecessary memory to maintain the createdTables set
Performs redundant work, because tables are created lazily anyway
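
To illustrate the idea behind this change, here is a minimal sketch of checking tables on demand instead of listing them all up front. It is not the code in this PR: SimpleCatalog, LazyTableTracker, and CountingCatalog are hypothetical names invented for the example, and SimpleCatalog is only a stand-in, not Paimon's Catalog interface.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class LazyTableCheckSketch {

    /** Hypothetical stand-in for a catalog; NOT Paimon's actual Catalog interface. */
    interface SimpleCatalog {
        List<String> listTables(String database);            // cost grows with table count
        boolean tableExists(String database, String table);  // single lookup
    }

    /** Checks tables on demand and remembers only the tables actually seen. */
    static class LazyTableTracker {
        private final SimpleCatalog catalog;
        private final String database;
        private final Set<String> knownTables = new HashSet<>();

        LazyTableTracker(SimpleCatalog catalog, String database) {
            this.catalog = catalog;
            this.database = database;
        }

        /** Returns true if the table must be created before records are written. */
        boolean needsCreation(String table) {
            if (knownTables.contains(table)) {
                return false; // already verified or created, no catalog call
            }
            boolean exists = catalog.tableExists(database, table);
            if (exists) {
                knownTables.add(table);
            }
            return !exists;
        }

        /** Records that the caller has created the table. */
        void markCreated(String table) {
            knownTables.add(table);
        }
    }

    /** In-memory catalog that counts calls, used only to demonstrate the behavior. */
    static class CountingCatalog implements SimpleCatalog {
        final Set<String> existing;
        int listCalls = 0;
        int existsCalls = 0;

        CountingCatalog(Set<String> existing) {
            this.existing = new HashSet<>(existing);
        }

        @Override
        public List<String> listTables(String database) {
            listCalls++;
            return new ArrayList<>(existing);
        }

        @Override
        public boolean tableExists(String database, String table) {
            existsCalls++;
            return existing.contains(table);
        }
    }
}
```

In this sketch the tracker never calls listTables; each distinct table costs at most one tableExists lookup, and only tables that actually appear in the stream are kept in memory.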

Tests

This optimization does not require additional test cases as the existing functionality is already covered by:

SyncDatabaseActionBaseTest.testSyncTablesWithoutDbLists() - validates table filtering logic
SyncDatabaseActionBaseTest.testSyncTablesWithDbList() - validates database filtering logic
SyncDatabaseActionBaseTest.testSycTablesCrossDB() - validates cross-database filtering scenarios
All these tests create and use RichCdcMultiplexRecordEventParser, ensuring the optimization doesn't break existing functionality.

When a table is loaded lazily, its existence is checked first, which takes additional time. As a result, the waitJobRunning(client) method can fail to obtain the Flink job status, which causes test errors. The following test cases should add a query timeout:
KafkaCanalSyncDatabaseActionITCase.testCaseInsensitive
KafkaOggSyncDatabaseActionITCase.testCaseInsensitive
MySqlSyncDatabaseActionITCase.testNewlyAddedTableSingleTable
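
One possible reading of "add query timeout" is to poll a status check until it succeeds or a deadline passes. The sketch below shows that pattern in isolation. waitUntil is a hypothetical helper written for this example; it is not Paimon's test utility and not the actual waitJobRunning(client) implementation.

```java
import java.time.Duration;
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

public class WaitWithTimeoutSketch {

    /**
     * Polls the condition until it returns true, sleeping pollInterval between
     * attempts. Throws TimeoutException if the timeout elapses first.
     * Hypothetical helper; the name and signature are assumptions.
     */
    static void waitUntil(BooleanSupplier condition, Duration timeout, Duration pollInterval)
            throws InterruptedException, TimeoutException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!condition.getAsBoolean()) {
            // Compare via subtraction so the check stays correct if nanoTime overflows.
            if (System.nanoTime() - deadline >= 0) {
                throw new TimeoutException("Condition not met within " + timeout);
            }
            Thread.sleep(pollInterval.toMillis());
        }
    }
}
```

A test could pass a condition such as "the job reports a running status" together with a generous timeout, so that the slower lazy existence check delays the test instead of failing it.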

@water-flower30 (Author)

@JingsongLi PTAL, TBR, thanks


Development

Successfully merging this pull request may close these issues.

[Feature] Optimize CDC sync database action to avoid blocking on listTables operation with large number of tables
