SPARK-2298: Show stage attempt in UI #1384
Conversation
Can one of the admins verify this patch?
To make this a bit more concise, what about having one column on the left whose header is
/cc @rxin who is interested in this.
@pwendell Thank you for your response. You mean like this?
Hm the latest screenshot looks a little funky to me. Most stages will only have 1 attempt, so I think it makes sense to only show the attempt if this is not the first one. Something like:
@andrewor14 Thank you for your comment.
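The suggestion above (show the attempt only when it is not the first one) could be sketched roughly as follows. This is only an illustration: `stageLabel` is a hypothetical helper, the label format is made up, and it assumes the first attempt is numbered 0.

```scala
object StageLabel {
  // Hypothetical helper, not part of Spark: builds the text for the stage column.
  // Assumption: attempt IDs start at 0, so 0 means "first attempt".
  def stageLabel(stageId: Int, attemptId: Int): String =
    if (attemptId == 0) stageId.toString        // common case: show the stage ID only
    else s"$stageId (attempt $attemptId)"       // retried stage: show the attempt as well

  def main(args: Array[String]): Unit = {
    println(stageLabel(3, 0))
    println(stageLabel(3, 1))
  }
}
```

With this shape, most rows stay as short as they were before, and only resubmitted stages get the extra text.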
@tsudukim The concept of TaskSet should be internal to Spark. Users shouldn't have to be aware of task sets. Users should only care about stage + attempt.
@rxin OK, thanks. Then the attempt ID is still required in the web UI for users to know stage + attempt. Have I got that right?
Yup - but let's avoid exposing the concept of TaskSet to users in the UI. That's only for internal engineering.
@@ -478,6 +479,7 @@ private[spark] object JsonProtocol {

  def stageInfoFromJson(json: JValue): StageInfo = {
    val stageId = (json \ "Stage ID").extract[Int]
    val attemptId = (json \ "Attempt ID").extract[Int]
Is this backwards compatible?
Ah, no it's not. How about this?
val attemptId = (json \ "Attempt ID").extractOpt[Int].getOrElse(0)
Could you add a backwards compatible test in JsonProtocolSuite?
Sure.
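For reference, the backward-compatibility concern in this thread can be sketched as below. This is a minimal, standalone illustration rather than the actual JsonProtocol or JsonProtocolSuite code: `attemptIdFromJson` is a hypothetical helper, the sample JSON strings are made up, and it assumes json4s with the Jackson backend is on the classpath.

```scala
import org.json4s._
import org.json4s.jackson.JsonMethods.parse

object AttemptIdCompat {
  implicit val formats: Formats = DefaultFormats

  // Hypothetical helper: reads "Attempt ID" and falls back to 0 when the
  // field is absent, as it would be in event logs written before this change.
  def attemptIdFromJson(json: JValue): Int =
    (json \ "Attempt ID").extractOpt[Int].getOrElse(0)

  def main(args: Array[String]): Unit = {
    val oldEvent = parse("""{"Stage ID": 1}""")                  // no attempt field
    val newEvent = parse("""{"Stage ID": 1, "Attempt ID": 2}""") // field present
    assert(attemptIdFromJson(oldEvent) == 0)
    assert(attemptIdFromJson(newEvent) == 2)
  }
}
```

Using `extract[Int]` on the old event would throw because the field is missing, which is why the `extractOpt` form was proposed.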
@tsudukim @kayousterhout so I think in general here, our handling of stage re-submissions is broken in the UI. For instance, I looked in the
@rxin is this something you've thought about in your various scheduler refactoring things?
@tsudukim I created a JIRA to deal with the broader issue. If you want to take that on as well, let me know: https://issues.apache.org/jira/browse/SPARK-2501 it might make sense to wrap it into this patch.
@pwendell I agree that there is a lot of room for improvement in the handling of stageId and attemptId. It might be better to break these problems into sub-tasks, and I think this patch should be one of them. Or did you mean we should fix all of these problems in one patch?
Let's hold off merging this one until we merge #1262. Then it will be easier to index the information based on stage + attempt.
@rxin OK. After that, I think I can make this patch better.
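To illustrate what indexing on stage + attempt means here, a rough sketch follows. It is not taken from #1262 or from Spark's listener code: the `StageAttempt` key type and the task-count map are hypothetical names used only for illustration.

```scala
import scala.collection.mutable

object StageAttemptIndex {
  // Hypothetical key type: a stage is identified by its ID plus the attempt number,
  // so two attempts of the same stage get separate entries.
  final case class StageAttempt(stageId: Int, attemptId: Int)

  def main(args: Array[String]): Unit = {
    val taskCounts = mutable.HashMap.empty[StageAttempt, Int]
    taskCounts(StageAttempt(3, 0)) = 10 // first attempt of stage 3
    taskCounts(StageAttempt(3, 1)) = 4  // resubmission of stage 3
    // Keyed by stageId alone, the second write would have overwritten the first.
    assert(taskCounts.size == 2)
    assert(taskCounts(StageAttempt(3, 1)) == 4)
  }
}
```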
Added attempt ID column into stage page of webUI. Added attemptId handling code into StageInfo, JsonProtocol. Modified DAGScheduler to identify stages whose stageId is same but attemptId is different. Modified testcode for stage attempt ID.
Modified format of stageId and attemptId. Modified to backward compatible style. Added backward compatibility test.
Modified the PR per your comments. Thank you!
It turned out much trickier than I thought to add attempt id. I submitted a PR here: #1545. That PR already modifies the UI, since that's the only way I could test.
I think we can add a job ID to the stage table in the UI, because the job ID is very useful when an application has many jobs: it distinguishes each job's stages.
@rxin Surely we can also fix them all in one patch. But it would be somewhat hard to modify them all compatibly in one patch, so I thought of splitting the work into several tasks, with #1384 as the first step, showing only attemptId to distinguish attempts.
@lianhuiwang It appears to be a different problem from SPARK-2298.
@tsudukim Yes, SPARK-2298 is what I want. But I think a simple way is to add a job ID column to the stage table in this PR. It is very easy to do.
I think this was ultimately fixed by #1545 so we can close this issue. But feel free to open another PR if that one did not fix this.