[Bug] Incorrect size in LocalStorageMeta
#1247
Comments
How would it be updated? The updated value may not be consistent with the real disk usage. |
You are right. So I prefer to remove |
+1 Please go ahead |
@xianjingfeng @zuston Maybe the root cause is that we change the storage or storage manager in |
To solve this issue and make the storage selection logic clearer, maybe we could try the approach below
I'm working on this |
This may be just one of the cases. |
If LocalStorageMeta is removed, how would LocalStorage mode be supported? |
Sorry, I don't get your point. |
I don't think ShuffleMetaMap is useless. I believe its intention is to track every shuffleId's shuffle size and last read time. It should be updated in a thread-safe manner. Without this metadata, it would be hard to track how much data has been written for each shuffle, and data management operations such as quota limits per app/shuffle, back pressure, or traffic throttling would be impossible.
Currently, there's a capacity configuration for shuffle server. It is possible that the shuffle server's disk is shared by other services, although not recommended. So we cannot simply check if the disk has free space to determine whether it's writable. If we are going to remove LocalStorageMeta, I think we may have to add it back in the future to support advanced features. |
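To make the point above concrete, here is a minimal sketch of thread-safe per-shuffle size tracking combined with a configured capacity check. It is not Uniffle's actual LocalStorageMeta: the class ShuffleUsageTracker, its methods, and the shuffle key format are all hypothetical names chosen for illustration. It assumes writable space is judged against a configured capacity rather than the disk's free space.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; names do not come from the Uniffle code base.
class ShuffleUsageTracker {
    // Per-shuffle bookkeeping: bytes written and last read timestamp.
    static final class ShuffleUsage {
        final AtomicLong size = new AtomicLong();
        final AtomicLong lastReadTs = new AtomicLong(-1L);
    }

    private final long capacityBytes; // configured capacity, not disk free space
    private final AtomicLong usedBytes = new AtomicLong();
    private final ConcurrentHashMap<String, ShuffleUsage> usageByShuffle =
        new ConcurrentHashMap<>();

    ShuffleUsageTracker(long capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    // compute() runs atomically per key, so a write cannot interleave
    // with a removal of the same shuffle and leave usedBytes inflated.
    void recordWrite(String shuffleKey, long bytes) {
        usageByShuffle.compute(shuffleKey, (k, u) -> {
            ShuffleUsage usage = (u == null) ? new ShuffleUsage() : u;
            usage.size.addAndGet(bytes);
            usedBytes.addAndGet(bytes);
            return usage;
        });
    }

    void recordRead(String shuffleKey, long timestampMs) {
        ShuffleUsage u = usageByShuffle.get(shuffleKey);
        if (u != null) {
            u.lastReadTs.set(timestampMs);
        }
    }

    // Removes the shuffle and returns the bytes released from the total.
    long removeShuffle(String shuffleKey) {
        long[] freed = {0L};
        usageByShuffle.computeIfPresent(shuffleKey, (k, u) -> {
            freed[0] = u.size.get();
            usedBytes.addAndGet(-freed[0]);
            return null; // returning null removes the mapping
        });
        return freed[0];
    }

    long shuffleSize(String shuffleKey) {
        ShuffleUsage u = usageByShuffle.get(shuffleKey);
        return (u == null) ? 0L : u.size.get();
    }

    long usedBytes() {
        return usedBytes.get();
    }

    // Advisory check: concurrent writers may still race past the limit.
    boolean canWrite(long bytes) {
        return usedBytes.get() + bytes <= capacityBytes;
    }
}
```

The key design choice in this sketch is that every path that adds or deletes data adjusts the per-shuffle size and the total together, which is the kind of update the bug report suspects is being missed somewhere.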
It's okay to keep it. Let's focus on how to solve this problem and how to avoid it happening again in the future. |
Can we close this? @xianjingfeng |
I don't think it can be closed. The problem still exists; we just don't use this variable anymore. |
Code of Conduct
Search before asking
Describe the bug
I found that some shuffle servers in our cluster frequently write shuffle data to HDFS. I then found that the size stored in LocalStorageMeta is incorrect. Maybe we should update the metrics in the following method. There may be other places that have been missed.
incubator-uniffle/server/src/main/java/org/apache/uniffle/server/storage/LocalStorageManager.java
Line 346 in b2154c7
Other suggestion:
shuffleMetaMap in LocalStorageMeta is useless; maybe we should remove it. Then we can remove LocalStorageMeta at the same time.
Affects Version(s)
master
Uniffle Server Log Output
No response
Uniffle Engine Log Output
No response
Uniffle Server Configurations
No response
Uniffle Engine Configurations
No response
Additional context
No response
Are you willing to submit PR?