YARN-5924 - Resource Manager fails to load state with InvalidProtocolBufferException #164

Closed
wants to merge 3 commits into from


@ameks94 ameks94 commented Nov 22, 2016

The fix catches `InvalidProtocolBufferException`, logs a warning, and removes the application's folder that contains the invalid data, so that the RM restart does not fail.

Additionally, I've added a catch for other exceptions that can occur while recovering a specific application, so that the RM does not fail when only a single application's state cannot be loaded.
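
For illustration only, here is a minimal sketch of the skip-and-warn idea described above. It is not the code from this PR: `CorruptStateException` is a hypothetical stand-in for protobuf's `InvalidProtocolBufferException` (so the sketch compiles without the protobuf library), and `parseAppState`, `loadAll`, and the `skipped` list are assumed names. The sketch records which applications would have their folder removed rather than touching any real state store.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for protobuf's InvalidProtocolBufferException.
class CorruptStateException extends Exception {
    CorruptStateException(String message) {
        super(message);
    }
}

class SkipCorruptAppsSketch {
    // Hypothetical parser: an empty byte array stands in for unparseable data.
    static String parseAppState(byte[] raw) throws CorruptStateException {
        if (raw.length == 0) {
            throw new CorruptStateException("state is empty");
        }
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Recovers each application independently. A failure for one application
    // is logged and its id is added to 'skipped' instead of aborting the load.
    static Map<String, String> loadAll(Map<String, byte[]> stored,
                                       List<String> skipped) {
        Map<String, String> recovered = new LinkedHashMap<>();
        for (Map.Entry<String, byte[]> entry : stored.entrySet()) {
            try {
                recovered.put(entry.getKey(), parseAppState(entry.getValue()));
            } catch (CorruptStateException | RuntimeException ex) {
                System.err.println("WARN: skipping " + entry.getKey()
                        + ": " + ex.getMessage());
                skipped.add(entry.getKey());
            }
        }
        return recovered;
    }
}
```

The trade-off is that a corrupt application is silently dropped from recovery; only the warning in the log shows that anything was lost.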


ameks94 commented Nov 28, 2016

Updated the PR to fix the checkstyle and whitespace test failures.


ameks94 commented May 15, 2017

I realized that the current solution (allowing the RM to launch even with broken application data) is not a good one.
It's better to crash the RM when an application's state file is broken. In that case we can report more detailed information about which file is broken (and perhaps recommend removing the application's folder with the broken data so that the RM can launch successfully).
Second, the most important part of the fix should be to find out why the file becomes corrupted and how to prevent that.
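
As a rough sketch of the fail-fast alternative described above (not code from this PR), the load could stop with an error that names the broken file and suggests the manual workaround. The class name, method names, the empty-content check, and the example path are all hypothetical.

```java
class FailFastSketch {
    // Builds an error message that names the broken file and suggests
    // the manual workaround mentioned in the comment above.
    static String describeCorruptFile(String appId, String stateFilePath) {
        return "Cannot recover application " + appId
                + ": state file " + stateFilePath + " is corrupt. "
                + "Removing this application's folder may allow the RM to start.";
    }

    // Hypothetical check: empty content stands in for unparseable data.
    // Throws instead of skipping, so the failure is visible at startup.
    static void loadOrFail(String appId, String stateFilePath, byte[] raw) {
        if (raw.length == 0) {
            throw new IllegalStateException(
                    describeCorruptFile(appId, stateFilePath));
        }
    }
}
```

For example, calling `FailFastSketch.loadOrFail("app_2", "/rmstore/app_2/state", new byte[0])` throws an `IllegalStateException` whose message contains the file path, so the operator knows which folder to inspect.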

@ameks94 ameks94 closed this May 15, 2017
@ameks94 ameks94 deleted the YARN-5924 branch November 17, 2017 13:28
@ameks94 ameks94 restored the YARN-5924 branch November 17, 2017 13:28
This was referenced Aug 20, 2019
shanthoosh added a commit to shanthoosh/hadoop that referenced this pull request Oct 15, 2019
Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Navina Ramesh<navina@apache.org>

Closes apache#164 from shanthoosh/fix-test-jmx-server-1