use config spark.scheduler.priority for specifying TaskSet's priority on DAGScheduler #1528
Conversation
QA tests have started for PR 1528. This patch merges cleanly.
QA results for PR 1528:
QA tests have started for PR 1528. This patch merges cleanly.
So it's actually another type of scheduling in addition to FIFO/FAIR?
Also, is this preemptive or non-preemptive? From my reading of the code it's non-preemptive, so a high-priority TaskSet can easily be delayed when there are many long-running but low-priority TaskSets.
QA tests have started for PR 1528. This patch merges cleanly.
It adds a user-defined priority to FIFO. If the user does not configure a priority, it works as before. It is non-preemptive: when there are free executors and the pool is FIFO, a high-priority TaskSet's tasks are submitted before those of a lower-priority TaskSet.
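To make the ordering concrete, here is a minimal sketch (not Spark's actual code) of what "user-defined priority on top of FIFO" means: compare a user-set priority first, then fall back to the usual FIFO keys. All names here are illustrative.

```scala
// Stand-in for the scheduling-relevant fields of a TaskSet.
case class TaskSetInfo(priority: Int, jobId: Int, stageId: Int)

// Higher user priority sorts first; ties preserve plain FIFO order
// (older job, then older stage).
val fifoWithPriority: Ordering[TaskSetInfo] =
  Ordering.by((t: TaskSetInfo) => (-t.priority, t.jobId, t.stageId))

val queue = Seq(TaskSetInfo(0, 1, 1), TaskSetInfo(5, 2, 2), TaskSetInfo(0, 3, 3))
val ordered = queue.sorted(fifoWithPriority)
// The priority-5 TaskSet jumps the queue; the unprioritized ones keep FIFO order.
```

With no priorities set, every tuple's first component is 0 and the comparison degenerates to plain FIFO, which is the "works as before" behavior described above.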
    taskScheduler.submitTasks(
-     new TaskSet(tasks.toArray, stage.id, stage.newAttemptId(), stage.jobId, properties))
+     new TaskSet(tasks.toArray, stage.id, stage.newAttemptId(), stage.jobId, priority.toInt,
+       properties))
properties is already being passed to the TaskSet ctor, so I'd prefer that extraction of priority happen there or elsewhere instead of doing properties.getProperty here and adding another parameter to the TaskSet ctor.
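A hypothetical sketch of this suggestion: since properties already reaches the TaskSet constructor, the TaskSet can extract its own priority there instead of taking an extra parameter. The config key mirrors this PR's proposal, but the class is illustrative, not Spark's.

```scala
import java.util.Properties

// Illustrative TaskSet that derives its priority from the properties it
// already receives, rather than from an additional ctor parameter.
class TaskSetSketch(val stageId: Int, val properties: Properties) {
  val priority: Int =
    Option(properties)
      .flatMap(p => Option(p.getProperty("spark.scheduler.priority")))
      .map(_.trim.toInt)
      .getOrElse(0) // no property set: behave exactly like plain FIFO
}

val props = new Properties()
props.setProperty("spark.scheduler.priority", "7")
val withPriority = new TaskSetSketch(1, props)
val withoutPriority = new TaskSetSketch(2, new Properties())
```

This keeps the DAGScheduler call site unchanged, which also addresses the concern below about the DAGScheduler knowing too much about task-level details.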
Agree, as the DAGScheduler already knows too much about task-level things...
QA tests have started for PR 1528. This patch merges cleanly.
@markhamstra @CodingCat thank you for the comments. I updated the patch, please review again.
@@ -17,6 +17,8 @@

  package org.apache.spark.scheduler

+ import scala.math.Ordering.Implicits._
Pulling in these implicits can have unintended consequences; that's why in my previous comment I kept the scope of the import as small as possible.
QA tests have started for PR 1528. This patch merges cleanly.
@markhamstra thank you. I updated the patch. Any more comments?
QA results for PR 1528:
    } else {
      DEFAULT_PRIORITY
    }
  }
Is the style checker ok with val priority = if (...) {... instead of val priority = { if (...) {...? If it is, I'd rather do without the extra {}. You can also drop the : Int from val DEFAULT_PRIORITY and val priority -- the types are obvious without the annotations. Also, I'm not sure that DEFAULT_PRIORITY really gains you anything -- I'd be fine with just if (...) {...} else 0. And make sure you follow the style guide for spacing with parens and braces.
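For clarity, the shape being asked for is a bare if/else expression assigned directly, with no wrapping braces and no redundant type annotation. parsePriority and raw are illustrative stand-ins for the property lookup in the patch, not code from it.

```scala
// Preferred style: the if/else is itself an expression, so no outer { }
// around it and no ": Int" annotation are needed.
def parsePriority(raw: String): Int = {
  val priority = if (raw != null) {
    raw.toInt
  } else 0
  priority
}
```

Because if/else is an expression in Scala, the extra block and annotation add noise without changing the inferred type.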
QA results for PR 1528:
@markhamstra thank you. How about the latest code?
QA tests have started for PR 1528. This patch merges cleanly.
This looks like a clean implementation, but you still need to open a JIRA issue to explain why you want this; then edit the description of this PR to reference that JIRA. https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-ContributingCode
Sorry, looks like you already have SPARK-2618, so change the title of this PR to include that.
@markhamstra @pwendell I have updated SPARK-2618, please take a look. Thanks.
QA results for PR 1528:
Hmm... I'm thinking about whether we can achieve the same goal with the FAIR scheduler... my own answer is yes. @markhamstra your thoughts?
I don't think priority is useful for the FAIR scheduler. On YARN, priority works with FIFO and not with FAIR, so I think a Spark application's scheduler mode should match YARN's. We can see YARN's FAIR:
I mean, if we just want to prioritize some jobs, why not assign them to a pool with a higher weight?
@CodingCat maybe I misunderstood you. With FAIR, Schedulable.weight can replace priority. You mean with FAIR we can provide a weight config to the user? For example: spark.scheduler.weight. If that's right, I think we can achieve it.
Yeah, I'm wondering whether the actual problem is that creation and use of scheduler pools with different weights is unclear or too difficult; and that if we could resolve those issues, then the need for this PR would disappear.
We shouldn't expose these types of hooks into the scheduler internals. The TaskSet, for instance, is an implementation detail we don't want to be part of a public API, and priority is an internal concept. The public API of Spark for scheduling policies is the Fair Scheduler. Many different types of policies can be achieved within Fair Scheduling, including having a high-priority pool to which tasks are submitted.
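As a concrete illustration of the pool-based alternative (pool names and weights here are examples, not anything this PR adds): with spark.scheduler.mode set to FAIR, an allocation file can define a heavily weighted pool that plays the role of "high priority".

```xml
<?xml version="1.0"?>
<!-- Example fairscheduler.xml: the "highPriority" pool's larger weight gives
     it a proportionally larger share of cluster resources. -->
<allocations>
  <pool name="highPriority">
    <schedulingMode>FIFO</schedulingMode>
    <weight>10</weight>
    <minShare>1</minShare>
  </pool>
  <pool name="default">
    <schedulingMode>FIFO</schedulingMode>
    <weight>1</weight>
    <minShare>0</minShare>
  </pool>
</allocations>
```

A job is then routed to the pool from its submitting thread with sc.setLocalProperty("spark.scheduler.pool", "highPriority"), which gets priority-like behavior through the public Fair Scheduler API instead of exposing TaskSet internals.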
The current implementation of scheduling is very ugly, so I cannot find a place to add this config to implement job priority. Can anyone help me?
Hey @lianhuiwang I'd prefer to close this issue and take the discussion about scheduling to the user list if you are not sure how to configure the scheduler to do what you want. Exposing internals like this to the user is not a great idea since these APIs will likely change in the future.
https://issues.apache.org/jira/browse/SPARK-2618