[FLINK-37578] Fix distributed schema registry exposes bad internal state accidentally #3972

yuxiqian · 2025-03-28T08:34:35Z

There's a subtle ordering bug in both SchemaCoordinators after a schema evolution coordination process finishes. The coordinator may release operators from their blocking state before it has properly restored its own internal state, which may accidentally expose unwanted internal state or freeze the entire pipeline job.

@linjianchang's optimization in #3858 is actually correct; however, it increases the chance of hitting this glitch. I baked the original commit into this PR to test whether it works well.
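
The ordering problem described above can be illustrated with a minimal, single-threaded sketch. Every name in it (`CoordinatorOrderingSketch`, `ToyCoordinator`, `State`, `finishReleaseFirst`, `finishRestoreFirst`) is hypothetical and not taken from the Flink CDC code base. It only models the general idea that releasing blocked operators before resetting coordinator state lets them observe stale state; it is not the actual change made in this PR.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model; NOT the real Flink CDC SchemaCoordinator. */
public class CoordinatorOrderingSketch {

    /** Assumed two-state lifecycle, for illustration only. */
    enum State { IDLE, EVOLVING }

    /** Toy coordinator that records which state operators see when they are unblocked. */
    static class ToyCoordinator {
        State state = State.IDLE;
        final List<State> observedOnRelease = new ArrayList<>();

        void startEvolution() {
            state = State.EVOLVING;
        }

        // Stand-in for unblocking the waiting operators: they immediately
        // observe whatever state the coordinator is in at this moment.
        private void releaseOperators() {
            observedOnRelease.add(state);
        }

        // Problematic order: operators are released while the coordinator
        // still holds the state of the evolution that just finished.
        void finishReleaseFirst() {
            releaseOperators();
            state = State.IDLE;
        }

        // Safer order: restore internal state first, then release operators.
        void finishRestoreFirst() {
            state = State.IDLE;
            releaseOperators();
        }
    }

    public static void main(String[] args) {
        ToyCoordinator releaseFirst = new ToyCoordinator();
        releaseFirst.startEvolution();
        releaseFirst.finishReleaseFirst();
        System.out.println("release first: operators saw " + releaseFirst.observedOnRelease);

        ToyCoordinator restoreFirst = new ToyCoordinator();
        restoreFirst.startEvolution();
        restoreFirst.finishRestoreFirst();
        System.out.println("restore first: operators saw " + restoreFirst.observedOnRelease);
    }
}
```

In a real, multi-threaded coordinator the stale state would only be observed intermittently, which fits the description of this as a subtle glitch whose likelihood depends on timing.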


lvyanquan · 2025-04-21T04:09:46Z

Hi, @yuxiqian. This PR looks good to me.
But I believe we lack the e2e tests needed to expose such issues. Can you create a Jira ticket to track it?

yuxiqian · 2025-04-21T04:16:01Z

Thanks to @lvyanquan for the suggestion; this is tracked in FLINK-37704. Perhaps more test cases could be added based on the changes in #3965.

leonardBang

Thanks @yuxiqian for the contribution, LGTM

…d internal state accidentally This closes apache#3972 Co-authored-by: linjc13 <linjc13@chinatelecom.cn>

github-actions bot added the runtime label Mar 28, 2025

[FLINK-37578] Fix distributed schema registry exposes bad internal st…

b0d0b2c

…ate accidentally

yuxiqian force-pushed the FLINK-37578 branch from b2763e1 to b0d0b2c Compare April 21, 2025 04:03

lvyanquan approved these changes Apr 21, 2025


github-actions bot added the reviewed label Apr 21, 2025

leonardBang approved these changes Apr 21, 2025


github-actions bot added the approved label Apr 21, 2025

leonardBang merged commit 4743399 into apache:master Apr 21, 2025
28 checks passed

linjianchang pushed a commit to linjianchang/flink-cdc that referenced this pull request May 16, 2025

[FLINK-37578][cdc-runtime] Fix distributed schema registry exposes ba…

2b8ae22

…d internal state accidentally This closes apache#3972 Co-authored-by: linjc13 <linjc13@chinatelecom.cn>

3 participants
