Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Improvement] Exit on OutOfMemoryError #460

Closed
2 of 3 tasks
Tracked by #1092 ...
xianjingfeng opened this issue Jan 9, 2023 · 15 comments · Fixed by #1390
Closed
2 of 3 tasks
Tracked by #1092 ...

[Improvement] Exit on OutOfMemoryError #460

xianjingfeng opened this issue Jan 9, 2023 · 15 comments · Fixed by #1390
Labels
good first issue Good for newcomers

Comments

@xianjingfeng
Copy link
Member

Code of Conduct

Search before asking

  • I have searched in the issues and found no similar issues.

What would you like to be improved?

At present, if OutOfMemoryError is encountered, server will not exit.

How should we improve?

Add -XX:+ExitOnOutOfMemoryError and -XX:+CrashOnOutOfMemoryError to JVM argments.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
@xianjingfeng xianjingfeng added the good first issue Good for newcomers label Jan 9, 2023
@xianjingfeng
Copy link
Member Author

PTAL @jerqi @zuston @advancedxy

@jerqi
Copy link
Contributor

jerqi commented Jan 9, 2023

PTAL @jerqi @zuston @advancedxy

Do we need dump the memory when OOM occurs?

@xianjingfeng
Copy link
Member Author

PTAL @jerqi @zuston @advancedxy

Do we need dump the memory when OOM occurs?

The dump file maybe too big and disk space may be not enough to store it. So users should specify the path to store it.

@advancedxy
Copy link
Contributor

PTAL @jerqi @zuston @advancedxy

Do we need dump the memory when OOM occurs?

The dump file maybe too big and disk space may be not enough to store it. So users should specify the path to store it.

+1 from my side. oom dump should be an opt-in feature. The bin/start-shuffle-server.sh may handle that as a cli argument or env variable.

@zuston
Copy link
Member

zuston commented Jan 9, 2023

The bin/start-shuffle-server.sh may handle that as a cli argument or env variable.

Good idea. This should be supported.

@AdarshRawat1
Copy link

AdarshRawat1 commented Mar 19, 2023

Requesting permission to contribute to this feature. . Can you please advise on how I can get involved? Thanks

@jerqi
Copy link
Contributor

jerqi commented Mar 19, 2023

Requesting permission to contribute to this feature. . Can you please advise on how I can get involved? Thanks

I assigned this issue to you.
@AdarshRawat1 You can refer to https://github.com/apache/incubator-uniffle/blob/master/CONTRIBUTING.md
If you have any question, you can ask me.

@AdarshRawat1
Copy link

AdarshRawat1 commented Mar 19, 2023

Requesting permission to contribute to this feature. . Can you please advise on how I can get involved? Thanks

I assigned this issue to you. @AdarshRawat1 You can refer to https://github.com/apache/incubator-uniffle/blob/master/CONTRIBUTING.md If you have any question, you can ask me.

@jerqi Could you kindly provide me with additional resources and guidance [files to change, tests to perform], please?

@jerqi
Copy link
Contributor

jerqi commented Mar 20, 2023

1 files to change

https://github.com/apache/incubator-uniffle/blob/master/bin/start-coordinator.sh
https://github.com/apache/incubator-uniffle/blob/master/bin/start-shuffle-server.sh

2 test to perform
You build package and unzip it, set environment variable (JAVA_HOME and HADOOP_HOME) and run the cmd
sh bin/start-shuffle-server.sh and sh bin/start-coordinator.sh. If the process start successfully, the change is ok.

@jerqi
Copy link
Contributor

jerqi commented Apr 18, 2023

@AdarshRawat1 Do you need any other help from me?

@AdarshRawat1
Copy link

@AdarshRawat1 Do you need any other help from me?

@jerqi I’m currently occupied with my mid-term exams. Once I’m done with them, I’ll address this issue. I would appreciate your valuable guidance at that time. Thank you.

@jerqi
Copy link
Contributor

jerqi commented Aug 5, 2023

@AdarshRawat1 Do you need any other help from me?

@jerqi I’m currently occupied with my mid-term exams. Once I’m done with them, I’ll address this issue. I would appreciate your valuable guidance at that time. Thank you.

@AdarshRawat1 Do you want to continue this issue?

@AdarshRawat1
Copy link

@jerqi I deeply regret the delay and will commence work on it today. I look forward to receiving your guidance throughout the process.

@jerqi
Copy link
Contributor

jerqi commented Aug 7, 2023

@jerqi I deeply regret the delay and will commence work on it today. I look forward to receiving your guidance throughout the process.

OK. I will give some guidance.

@AdarshRawat1
Copy link

@jerqi I apologize, but I'm unable to work on this issue.

zuston pushed a commit that referenced this issue Dec 21, 2023
### What changes were proposed in this pull request?

Exit on OutOfMemoryError to specify JVM options 

### Why are the changes needed?

Fix: #460

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Tested on production env
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants