Search before creating
I have searched in the task list and found no similar tasks.
Mentor
I have sufficient knowledge of and experience with this task, and I volunteer to be its mentor and guide contributors to complete it.
Skill requirements
Be familiar with the integration process of Spark plugins and the Kyuubi engine.
Understand the principles of collecting Spark JVM metrics.
Background and Goals
When memory management in the Spark engine gets out of control, we typically rely on jvmkill as a remedy: it kills the process and generates a heap dump for post-mortem analysis. However, even with jvmkill protection, we can still hit problems caused by the JVM running low on memory, such as repeated full GCs that perform no useful work during their pause time. Because the JVM never exhausts 100% of its memory in that state, jvmkill is never triggered.
Introducing jvmquake gives us more fine-grained monitoring of GC behavior, enabling early detection of memory management issues and fast failure.
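For context, the GC signal this kind of monitoring is built on is already exposed by the JVM through its GarbageCollectorMXBeans. A minimal sketch of sampling it with plain JMX (no Kyuubi- or jvmquake-specific API involved):

  import java.lang.management.ManagementFactory
  import scala.collection.JavaConverters._

  object GcTime {
    // Accumulated GC time in milliseconds, summed over all collectors
    // (e.g. G1 Young Generation + G1 Old Generation).
    def totalMs(): Long =
      ManagementFactory.getGarbageCollectorMXBeans.asScala
        .map(_.getCollectionTime)  // -1 when a collector does not report time
        .filter(_ >= 0)
        .sum
  }

Polling this value periodically and comparing its growth against wall-clock time is enough to tell whether the JVM is spending nearly all of its time in GC.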
Implementation steps
Start jvmquake for the driver and executors through Spark plugins.
Collect GC metrics with jvmquake.
Define the rules for killing the process and the path where the heap dump is saved (steps 2 and 3 are sketched below).
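A minimal sketch of how steps 2 and 3 could look if a jvmquake-style "GC deficit" rule is reimplemented on the JVM side instead of loading the native agent. The class name, the default threshold and weight, and the heap dump path are illustrative choices, not an existing Kyuubi API, and the exact option semantics of the real jvmquake agent should be taken from its own documentation:

  import java.lang.management.ManagementFactory
  import javax.management.ObjectName
  import scala.collection.JavaConverters._

  // Illustrative jvmquake-style watchdog: if the process accumulates far more
  // GC time than useful run time, dump the heap and kill the JVM early,
  // instead of waiting for an OutOfMemoryError that may never be thrown.
  class GcWatchdog(
      runtimeWeight: Long = 5,        // 1 ms of run time pays off 5 ms of GC deficit
      killThresholdMs: Long = 30000,  // act once the GC deficit exceeds 30 s
      heapDumpPath: String = "/tmp/heapdump.hprof") extends Runnable {

    // Same GarbageCollectorMXBean query as shown under Background and Goals.
    private def totalGcTimeMs(): Long =
      ManagementFactory.getGarbageCollectorMXBeans.asScala
        .map(_.getCollectionTime).filter(_ >= 0).sum

    override def run(): Unit = {
      var lastGc = totalGcTimeMs()
      var lastWall = System.currentTimeMillis()
      var deficitMs = 0L
      while (true) {
        Thread.sleep(1000)
        val gc = totalGcTimeMs()
        val wall = System.currentTimeMillis()
        val gcDelta = gc - lastGc
        val runDelta = (wall - lastWall) - gcDelta
        // GC time increases the deficit; useful run time pays it back (weighted).
        deficitMs = math.max(0L, deficitMs + gcDelta - runDelta * runtimeWeight)
        lastGc = gc
        lastWall = wall
        if (deficitMs > killThresholdMs) {
          dumpHeap(heapDumpPath)
          Runtime.getRuntime.halt(134)  // fail fast; do not wait for shutdown hooks
        }
      }
    }

    // Write an hprof heap dump via the HotSpot diagnostic MBean.
    private def dumpHeap(path: String): Unit = {
      ManagementFactory.getPlatformMBeanServer.invoke(
        new ObjectName("com.sun.management:type=HotSpotDiagnostic"),
        "dumpHeap",
        Array[AnyRef](path, java.lang.Boolean.TRUE),
        Array("java.lang.String", "boolean"))
    }
  }

Runtime.halt is used instead of System.exit so the kill does not get stuck behind shutdown hooks while the JVM is already thrashing.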
Additional context
Custom Spark Plugin example:
package example

import org.apache.spark.api.plugin.{DriverPlugin, ExecutorPlugin, SparkPlugin}

class CustomExecSparkPlugin extends SparkPlugin {

  // Instantiated once on the driver.
  override def driverPlugin(): DriverPlugin = {
    new DriverPlugin() {
      override def shutdown(): Unit = {
        // custom code
      }
    }
  }

  // Instantiated once per executor.
  override def executorPlugin(): ExecutorPlugin = {
    new ExecutorPlugin() {
      override def shutdown(): Unit = {
        // custom code
      }
    }
  }
}
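Both DriverPlugin and ExecutorPlugin also expose init(...) methods that receive a PluginContext, from which the effective SparkConf can be read; that would be the natural place to start a watchdog thread like the one sketched above, with shutdown() stopping it. The plugin itself is registered through the standard spark.plugins configuration; the jvmQuake.* keys below are hypothetical names used only for illustration:

  import org.apache.spark.SparkConf

  val conf = new SparkConf()
    .set("spark.plugins", "example.CustomExecSparkPlugin")
    // Hypothetical keys the plugin's init() could read; final names would be
    // decided during implementation.
    .set("spark.kyuubi.jvmQuake.enabled", "true")
    .set("spark.kyuubi.jvmQuake.heapDumpPath", "/tmp/spark-heapdumps")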