Skip to content

[SPARK-47573][K8S] Support custom driver log url #45728

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 17 commits into
base: master
Choose a base branch
from

Conversation

EnricoMi
Copy link
Contributor

@EnricoMi EnricoMi commented Mar 26, 2024

What changes were proposed in this pull request?

Add config spark.ui.custom.driver.log.url to Kubernetes resource manager allow setting log url templates similar to spark.ui.custom.executor.log.url.

Allows for

--conf spark.ui.custom.driver.log.url="https://my-custom.url/api/logs?applicationId={{APP_ID}}

Supports these variables:

APP_ID: The unique application id
KUBERNETES_NAMESPACE: The namespace where the executor pods run
KUBERNETES_POD_NAME: The name of the pod that contains the executor
FILE_NAME: The name of the log, which is always "log"

Adds driver pod environment variable SPARK_DRIVER_POD_NAME.

Why are the changes needed?

Running Spark on Kubernetes requires persisting the logs elsewhere. Having the Spark UI link to those logs is very useful. This should be possible for driver logs similar to executor logs.

Does this PR introduce any user-facing change?

Spark UI provides links to logs when run on Kubernetes.

How was this patch tested?

Unit test and manually tested on minikube K8S cluster.

Was this patch authored or co-authored using generative AI tooling?

No

@EnricoMi
Copy link
Contributor Author

CC @dongjoon-hyun

@EnricoMi EnricoMi force-pushed the k8s-custom-driver-log-url branch 2 times, most recently from e12ba58 to a3fe442 Compare March 27, 2024 12:15
@EnricoMi EnricoMi force-pushed the k8s-custom-driver-log-url branch from 412e9d0 to 409ce35 Compare April 20, 2024 16:23
@EnricoMi
Copy link
Contributor Author

@dongjoon-hyun What do you think?

Copy link

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Jul 30, 2024
@EnricoMi EnricoMi force-pushed the k8s-custom-driver-log-url branch from 409ce35 to b0e1cde Compare July 30, 2024 08:07
@github-actions github-actions bot closed this Jul 31, 2024
@dongjoon-hyun
Copy link
Member

Sorry for being late. For the live log, can we try to use Spark Driver Live Log instead?

spark.driver.log.localDir=/tmp/logs

Apache Spark 4.0.0-preview1
Screenshot 2024-08-09 at 11 03 53 PM

@EnricoMi
Copy link
Contributor Author

@dongjoon-hyun thanks for the pointer! Is the live log supported by the Spark History Server? Or are they only available during runtime of the Spark app? Are they log collected? Executor logs are links to the log collections service but driver logs are not? Sounds different to the spark.ui.custom.executor.log.url approach.

@EnricoMi
Copy link
Contributor Author

Looks like the Spark History Server does not contain that live log tab. How is this then an equivalent replacement for the proposed driver log url?

This PR allows for the log link for the driver in the executors tab:
image

@dongjoon-hyun
Copy link
Member

dongjoon-hyun commented Aug 14, 2024

It's not the same of course, @EnricoMi .

Looks like the Spark History Server does not contain that live log tab. How is this then an equivalent replacement for the proposed driver log url?

What I wrote is For the live log, can we try to use Spark Driver Live Log instead?. Nothing more.
Have you use Apache Spark's Driver log feature before, @EnricoMi ?

@EnricoMi
Copy link
Contributor Author

Haven't seen the live log feature before.

@dongjoon-hyun
Copy link
Member

Ya, maybe I made you confused.

  • I didn't review this PR yet.
  • I visited SPARK-49149 first and checked the code.
    • After that, I searched the PR list and see your executor log PR.
    • Then, it lead me to here (driver log PR).

The Dirver Live UI feature is only a part of the following at Spark 4.

Screenshot 2024-08-14 at 23 13 26

So, it is not relevant to this PR technically.

@EnricoMi
Copy link
Contributor Author

Thanks for clarification.

@dongjoon-hyun
Copy link
Member

Let me reopen this for now because it seems to be closed by the bot.

@dongjoon-hyun dongjoon-hyun reopened this Aug 15, 2024
@EnricoMi
Copy link
Contributor Author

@dongjoon-hyun any thoughts on this?

@EnricoMi
Copy link
Contributor Author

EnricoMi commented Sep 2, 2024

@dongjoon-hyun can we get this into the Spark 4.0.0 release?

@dongjoon-hyun
Copy link
Member

dongjoon-hyun commented Sep 12, 2024

Let me check this PR quickly since I'm currently preparing Apache Spark 4.0.0-preview2, @EnricoMi .

@@ -220,8 +220,20 @@ private[spark] object UI {
.stringConf
.createOptional

val CUSTOM_DRIVER_LOG_URL = ConfigBuilder("spark.ui.custom.driver.log.url")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this doesn't have spark.kubernetes. prefix, this is supposed to work with YARN, @EnricoMi . Are you sure?

Copy link
Contributor Author

@EnricoMi EnricoMi Sep 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was just derived from spark.ui.custom.executor.log.url, where I suspect this should be similar. The property name does not mention Yarn, and documentation says that only K8s support this.

@@ -63,6 +63,9 @@ object Constants {
val UI_PORT_NAME = "spark-ui"

// Environment Variables
val ENV_DRIVER_ATTRIBUTE_APP_ID = "SPARK_DRIVER_ATTRIBUTE_APP_ID"
val ENV_DRIVER_ATTRIBUTE_KUBERNETES_NAMESPACE = "SPARK_DRIVER_ATTRIBUTE_KUBERNETES_NAMESPACE"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a question: we need this K8s namespace attribute additionally only for driver logs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

val driverCustomEnvs = KubernetesUtils.buildEnvVars(
Seq(ENV_APPLICATION_ID -> conf.appId) ++ conf.environment)
Seq(ENV_APPLICATION_ID -> conf.appId) ++ conf.environment ++ driverAttributes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this inevitable to inject both ENV_APPLICATION_ID and ENV_DRIVER_ATTRIBUTE_APP_ID? This looks like a redundancy for me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same pattern as in

val attributes = if (kubernetesConf.get(UI.CUSTOM_EXECUTOR_LOG_URL).isDefined) {
Map(
ENV_EXECUTOR_ATTRIBUTE_APP_ID -> kubernetesConf.appId,
ENV_EXECUTOR_ATTRIBUTE_EXECUTOR_ID -> kubernetesConf.executorId)
} else {
Map.empty[String, String]
}
KubernetesUtils.buildEnvVars(
Seq(
ENV_DRIVER_URL -> driverUrl,
ENV_EXECUTOR_CORES -> execResources.cores.get.toString,
ENV_EXECUTOR_MEMORY -> executorMemoryString,
ENV_APPLICATION_ID -> kubernetesConf.appId,
// This is to set the SPARK_CONF_DIR to be /opt/spark/conf
ENV_SPARK_CONF_DIR -> SPARK_CONF_DIR_INTERNAL,
ENV_EXECUTOR_ID -> kubernetesConf.executorId,
ENV_RESOURCE_PROFILE_ID -> resourceProfile.id.toString)
++ attributes
++ kubernetesConf.environment
++ sparkAuthSecret
++ Seq(ENV_CLASSPATH -> kubernetesConf.get(EXECUTOR_CLASS_PATH).orNull)
++ allOpts) ++
KubernetesUtils.buildEnvVarsWithFieldRef(
Seq(
(ENV_EXECUTOR_POD_IP, "v1", "status.podIP"),
(ENV_EXECUTOR_POD_NAME, "v1", "metadata.name")
))

@@ -144,6 +144,11 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: Rp
ThreadUtils.newDaemonSingleThreadScheduledExecutor("cleanup-decommission-execs")
}

override def getDriverLogUrls: Option[Map[String, String]] = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of here, please override this at KubernetesClusterSchedulerBackend instead of CoarseGrainedSchedulerBackend because this PR aims for K8s only.

For now, only K8s cluster manager supports this configuration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved

@dongjoon-hyun dongjoon-hyun removed their assignment Nov 13, 2024
@EnricoMi EnricoMi force-pushed the k8s-custom-driver-log-url branch from 4609e87 to a09526c Compare December 2, 2024 14:38
@EnricoMi EnricoMi force-pushed the k8s-custom-driver-log-url branch from a09526c to 188c968 Compare January 16, 2025 14:31
@dongjoon-hyun
Copy link
Member

Could you fix the compilation errors?

@EnricoMi EnricoMi force-pushed the k8s-custom-driver-log-url branch from 188c968 to e0872a4 Compare January 20, 2025 13:52
@EnricoMi
Copy link
Contributor Author

@dongjoon-hyun fixed

@EnricoMi EnricoMi force-pushed the k8s-custom-driver-log-url branch from e0872a4 to 9215c17 Compare March 5, 2025 13:19
"This configuration replaces original log urls in event log, which will be also effective " +
"when accessing the application on history server. The new log urls must be permanent, " +
"otherwise you might have dead link for executor log urls.")
.version("4.0.0")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we use 4.1.0 since master branch is 4.1.0-SNAPSHOT?

spark/pom.xml

Line 29 in a30bdc3

<version>4.1.0-SNAPSHOT</version>

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

<p/>
For now, only K8s cluster manager supports this configuration.
</td>
<td>4.0.0</td>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto. Let's use 4.1.0 since master branch is 4.1.0-SNAPSHOT.

spark/pom.xml

Line 29 in a30bdc3

<version>4.1.0-SNAPSHOT</version>

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@EnricoMi EnricoMi force-pushed the k8s-custom-driver-log-url branch from 9215c17 to aebcc62 Compare April 28, 2025 04:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants