fix(RedisMessageStore): RedisMessageStore add lock #9680

Open
wants to merge 4 commits into
base: main

Contributor

@NaccOll NaccOll commented Nov 23, 2024

I found that when RedisMessageStore adds or removes messages, it updates two keys separately. Because these two updates are not atomic, concurrent threads can leave the keys inconsistent with each other.

Although using Redis to delay messages is not a good idea, what alerted me was that the logs showed messages being lost even when the number of requests was not large. By comparing the logs, I found that the message group, which holds the metadata, was not consistent with the messages actually stored.

I discovered this problem at least three years ago, but not many people use Redis as a message store, and there has been little feedback from the community. I worked around the problem by subclassing RedisMessageStore in my project.

My current solution is to add a lock, as SimpleMessageStore does, which is also the approach taken in this pull request. Several years of use in production have shown that it works. It is bound to cost some performance, though, which is why I had not opened a pull request against spring-integration until now. However, copying this class between multiple projects is tedious, and not all project members are aware of the problem, so I hope to solve it at the framework level.

like #463

A simple solution is to add a lock, as SimpleMessageStore does, which is also the approach taken in this pull request. This will cost some performance, and I am not sure whether a configurable switch is needed.
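
To illustrate the idea, here is a minimal sketch in plain Java with no Redis involved. The two in-memory maps stand in for the two keys, and `TwoKeyStore`, `addMessage`, `removeMessage`, `isConsistent`, and `groupSize` are hypothetical names chosen for this example, not the actual RedisMessageStore code. It only shows how a per-group lock keeps the two updates together.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a key-value message store: one map holds the
// group metadata (message ids per group), the other holds the message bodies.
class TwoKeyStore {

    private final Map<String, Set<String>> groupMetadata = new ConcurrentHashMap<>();
    private final Map<String, String> messages = new ConcurrentHashMap<>();
    // One lock per group id, so unrelated groups do not block each other.
    private final Map<String, Lock> locks = new ConcurrentHashMap<>();

    private Lock lockFor(String groupId) {
        return locks.computeIfAbsent(groupId, id -> new ReentrantLock());
    }

    void addMessage(String groupId, String messageId, String payload) {
        Lock lock = lockFor(groupId);
        lock.lock();
        try {
            // Both writes happen under the same group lock, so another caller
            // of this class never sees one update without the other.
            messages.put(messageId, payload);
            groupMetadata.computeIfAbsent(groupId, id -> new HashSet<>()).add(messageId);
        }
        finally {
            lock.unlock();
        }
    }

    void removeMessage(String groupId, String messageId) {
        Lock lock = lockFor(groupId);
        lock.lock();
        try {
            Set<String> ids = groupMetadata.get(groupId);
            if (ids != null) {
                ids.remove(messageId);
            }
            messages.remove(messageId);
        }
        finally {
            lock.unlock();
        }
    }

    // True when every id listed in the group metadata has a stored message.
    boolean isConsistent(String groupId) {
        Lock lock = lockFor(groupId);
        lock.lock();
        try {
            Set<String> ids = groupMetadata.getOrDefault(groupId, Collections.<String>emptySet());
            return messages.keySet().containsAll(ids);
        }
        finally {
            lock.unlock();
        }
    }

    // Number of message ids currently recorded in the group metadata.
    int groupSize(String groupId) {
        Lock lock = lockFor(groupId);
        lock.lock();
        try {
            Set<String> ids = groupMetadata.get(groupId);
            return ids == null ? 0 : ids.size();
        }
        finally {
            lock.unlock();
        }
    }
}
```

Note that a `ReentrantLock` like this only coordinates threads inside one JVM. If several application instances share the same store, a distributed lock would be needed instead.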
Member

@artembilan artembilan left a comment

I think this change is somehow related to #5123, at least judging by the title of that issue: the rest of the discussion there went sideways.

Can we also think about moving this locking logic into a super class AbstractKeyValueMessageStore?
This way any impl would benefit immediately.

I also wonder whether it is really safe to treat this as a fix rather than a new feature, which should go into the next 6.5 instead of risking new bugs in the existing versions...

Thank you for bringing this up!

Member

@artembilan artembilan left a comment

Well, actually, it sounds more like we have to pull the lock usage up to the AbstractMessageGroupStore. So, all the impls would benefit.
Including that SimpleMessageStore refactoring I have mentioned.

Let me know if you are OK to work on all of that!
Otherwise I'll take it from here.
But again: that is going to be part of the next 6.5 version.
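
A rough sketch of what pulling the lock up into a base class could look like, in plain Java. `LockingGroupStore` and `InMemoryGroupStore` are hypothetical names, the `String` payloads replace real messages, and the signatures are assumptions for illustration only, not the final AbstractMessageGroupStore design. The public method takes the per-group lock and delegates to a protected `do...` hook, so each implementation only supplies the storage logic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical base class: it owns the locking, subclasses implement storage.
abstract class LockingGroupStore {

    private final Map<Object, Lock> locks = new ConcurrentHashMap<>();

    // Public entry point: acquires the per-group lock, then delegates.
    // Left non-final so a subclass with special needs can still override it.
    public void addMessagesToGroup(Object groupId, String... messages) {
        Lock lock = this.locks.computeIfAbsent(groupId, id -> new ReentrantLock());
        lock.lock();
        try {
            doAddMessagesToGroup(groupId, messages);
        }
        finally {
            lock.unlock();
        }
    }

    // Storage-specific part; called with the group lock held.
    protected abstract void doAddMessagesToGroup(Object groupId, String... messages);
}

// Hypothetical in-memory implementation with no locking code of its own.
class InMemoryGroupStore extends LockingGroupStore {

    private final Map<Object, List<String>> groups = new ConcurrentHashMap<>();

    @Override
    protected void doAddMessagesToGroup(Object groupId, String... messages) {
        List<String> group = this.groups.computeIfAbsent(groupId, id -> new ArrayList<>());
        for (String message : messages) {
            group.add(message);
        }
    }

    // Number of messages stored for the group, or 0 if the group is unknown.
    int size(Object groupId) {
        List<String> group = this.groups.get(groupId);
        return group == null ? 0 : group.size();
    }
}
```

With this shape, a new key-value implementation gets the locking for free by extending the base class and implementing only the hook.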

@artembilan
Member

So, I would say it is official: this PR is about to fix #5123.

@NaccOll
Contributor Author

NaccOll commented Dec 3, 2024

@artembilan I will be working on this for a week or so. If I still cannot complete the pull request by then, please continue with the remaining work.

@artembilan
Member

No problem, @NaccOll, take your time! We have not planned 6.5.0-M1 yet, but it will definitely be sometime next month or so.

Member

@artembilan artembilan left a comment

This is really cool!
Please consider mentioning this improvement in message-store.adoc.
I'll come back to you for the whats-new.adoc change request as well when I switch main to 6.5.0, somewhere around the end of the month, once we are done with follow-up bug fixes for the 6.4.x generation.

Thanks

@NaccOll
Contributor Author

NaccOll commented Dec 22, 2024

@artembilan

In addition to the lock mechanism, SimpleMessageStore also uses a semaphore mechanism. My code causes a failure in the org.springframework.integration.store.SimpleMessageStoreTests#shouldWaitIfGroupCapacity test case, and I don’t know how to solve it. I look forward to your follow-up solution. This is as far as my work goes.
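
For context, here is a minimal sketch of how a semaphore-based capacity limit can sit next to a lock. It is plain Java with a hypothetical `BoundedGroup` class and is not the SimpleMessageStore implementation. It does not establish what breaks the test in this PR. It only shows why the two mechanisms have to be ordered with care.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical bounded group: a semaphore caps the size, a lock guards the queue.
class BoundedGroup {

    private final Semaphore capacity;
    private final Lock lock = new ReentrantLock();
    private final Queue<String> messages = new ArrayDeque<>();

    BoundedGroup(int maxSize) {
        this.capacity = new Semaphore(maxSize);
    }

    // Waits up to timeoutMillis for a free slot; returns false if none appears
    // or if the waiting thread is interrupted.
    boolean add(String message, long timeoutMillis) {
        try {
            // The permit is acquired before taking the lock, so a waiting
            // adder does not hold the lock that a remover needs.
            if (!this.capacity.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return false;
            }
        }
        catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            return false;
        }
        this.lock.lock();
        try {
            this.messages.add(message);
            return true;
        }
        finally {
            this.lock.unlock();
        }
    }

    // Removes the oldest message and frees one slot, or returns null if empty.
    String poll() {
        this.lock.lock();
        try {
            String message = this.messages.poll();
            if (message != null) {
                this.capacity.release();
            }
            return message;
        }
        finally {
            this.lock.unlock();
        }
    }
}
```

If the permit were instead awaited while holding the lock, an adder waiting for capacity would block every remover on the same lock, and the capacity could never be freed. A generic lock wrapper in a base class has to leave room for this kind of ordering.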

@@ -268,56 +266,37 @@ protected MessageGroup copy(MessageGroup group) {
}

@Override
public void addMessagesToGroup(Object groupId, Message<?>... messages) {
Member

I think it is OK to revert to the previous version of the method and simply not use the doAddMessagesToGroup() contract here.
That way the custom logic for the lock and UpperBound is preserved and the test will pass again.

Contributor Author

Reverting the code is OK, but it raises the question of whether doAddMessagesToGroup is a suitable common design.

@@ -162,6 +159,7 @@ public void setCopyOnGet(boolean copyOnGet) {
this.copyOnGet = copyOnGet;
}

@Override
public void setLockRegistry(LockRegistry lockRegistry) {
Member

We don't need this method since it is there in the super class.

Contributor Author

isUsed is a private field of SimpleMessageStore.

@@ -56,11 +68,15 @@ public abstract class AbstractMessageGroupStore extends AbstractBatchingMessageG

private boolean timeoutOnIdle;

protected LockRegistry lockRegistry;
Member

This must be private, and any use outside of this class has to go through the protected getLockRegistry() instead.
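
A small sketch of that encapsulation, in plain Java. `SimpleLockRegistry`, `MapLockRegistry`, `BaseStore`, and `CountingStore` are hypothetical stand-ins written for this example. Only the names `setLockRegistry` and `getLockRegistry()` come from this review, and their exact signatures here are assumptions.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical registry type standing in for LockRegistry.
interface SimpleLockRegistry {
    Lock obtain(Object key);
}

// Hypothetical default registry: one reusable lock per key.
class MapLockRegistry implements SimpleLockRegistry {

    private final Map<Object, Lock> locks = new ConcurrentHashMap<>();

    @Override
    public Lock obtain(Object key) {
        return this.locks.computeIfAbsent(key, k -> new ReentrantLock());
    }
}

abstract class BaseStore {

    // Private: subclasses cannot reassign the field directly.
    private SimpleLockRegistry lockRegistry = new MapLockRegistry();

    public void setLockRegistry(SimpleLockRegistry lockRegistry) {
        this.lockRegistry = Objects.requireNonNull(lockRegistry, "lockRegistry must not be null");
    }

    // Subclasses read the registry only through this accessor.
    protected SimpleLockRegistry getLockRegistry() {
        return this.lockRegistry;
    }
}

// Hypothetical subclass that needs the registry for its own locking.
class CountingStore extends BaseStore {

    private int count;

    int increment(Object groupId) {
        Lock lock = getLockRegistry().obtain(groupId);
        lock.lock();
        try {
            return ++this.count;
        }
        finally {
            lock.unlock();
        }
    }
}
```

Keeping the field private means the setter is the single place where the registry can change, so any validation or bookkeeping added there later applies to every subclass.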

Contributor Author

You are right, LockRegistry does not even need to be exposed to subclasses. But SimpleMessageStore is special: it needs the lockRegistry and has its own semaphore mechanism. I hope you can handle the follow-up work, because I feel it will involve many special considerations. I will not update this PR again.
