
[LIVY-544] Allow interpreterExecutor run in ThreadPool #135

Status: Open. Wants to merge 6 commits into master from the livy-544 branch.
Conversation

@WeiWenda commented Dec 21, 2018

What changes were proposed in this pull request?

Make interpreterExecutor run with newFixedThreadPool rather than newSingleThreadExecutor.
https://issues.apache.org/jira/browse/LIVY-544

How was this patch tested?

We ran a JMeter test with 20 threads, each posting one statement every 1/60 s, with "spark.scheduler.mode": "FAIR".
We found that setJobGroup is not thread-safe, causing some reflection errors, so we wrapped it with Session.synchronized:

this.synchronized { setJobGroup(tpe, statementId) }
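The shape of the change can be sketched as follows. This is a hedged, self-contained illustration, not Livy's actual code: `setJobGroup` below is a hypothetical stand-in for the non-thread-safe call the PR guards, and the pool size of 4 is arbitrary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class InterpreterPoolSketch {
    private static final Object lock = new Object();
    private static String lastJobGroup = "";

    // Hypothetical stand-in for Livy's setJobGroup; not atomic on its own.
    static void setJobGroup(String tpe, int statementId) {
        lastJobGroup = tpe + "-" + statementId;
    }

    // Statements run on a fixed-size pool (newFixedThreadPool) instead of
    // newSingleThreadExecutor; the non-thread-safe call is guarded by a shared
    // lock, mirroring the PR's `this.synchronized { setJobGroup(...) }`.
    static List<Integer> runStatements(int nStatements, int poolSize) throws Exception {
        ExecutorService interpreterExecutor = Executors.newFixedThreadPool(poolSize);
        List<Future<Integer>> futures = new ArrayList<>();
        for (int i = 1; i <= nStatements; i++) {
            final int id = i;
            futures.add(interpreterExecutor.submit(() -> {
                synchronized (lock) {   // guard the non-thread-safe call
                    setJobGroup("spark", id);
                }
                return id;              // stand-in for the statement's result
            }));
        }
        List<Integer> results = new ArrayList<>();
        for (Future<Integer> f : futures) {
            results.add(f.get());       // collect in submission order
        }
        interpreterExecutor.shutdown();
        return results;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runStatements(8, 4));
    }
}
```

With a single-thread executor the loop above would serialize all eight statements; with the fixed pool up to four run concurrently while `setJobGroup` remains serialized under the lock.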

@WeiWenda WeiWenda changed the title Allow interpreterExecutor run in ThreadPool [LIVY-544] Allow interpreterExecutor run in ThreadPool Dec 21, 2018
@codecov-io

Codecov Report

Merging #135 into master will increase coverage by 0.05%.
The diff coverage is 100%.


@@             Coverage Diff              @@
##             master     #135      +/-   ##
============================================
+ Coverage     68.38%   68.44%   +0.05%     
  Complexity      891      891              
============================================
  Files           100      100              
  Lines          5596     5597       +1     
  Branches        839      839              
============================================
+ Hits           3827     3831       +4     
+ Misses         1223     1220       -3     
  Partials        546      546
Impacted Files Coverage Δ Complexity Δ
.../src/main/scala/org/apache/livy/repl/Session.scala 67.18% <100%> (ø) 37 <0> (ø) ⬇️
rsc/src/main/java/org/apache/livy/rsc/RSCConf.java 85.98% <100%> (+0.13%) 7 <0> (ø) ⬇️
...cala/org/apache/livy/scalaapi/ScalaJobHandle.scala 50% <0%> (-5.89%) 7% <0%> (ø)
...scala/org/apache/livy/repl/PythonInterpreter.scala 48.12% <0%> (+0.62%) 9% <0%> (ø) ⬇️
...ain/scala/org/apache/livy/utils/SparkYarnApp.scala 73.23% <0%> (+0.7%) 33% <0%> (ø) ⬇️
...main/scala/org/apache/livy/server/LivyServer.scala 35.51% <0%> (+1.63%) 9% <0%> (ø) ⬇️


Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e2d2189...01a9e0a. Read the comment docs.

@codecov-io commented Dec 21, 2018

Codecov Report

Merging #135 into master will decrease coverage by 0.26%.
The diff coverage is 46.66%.


@@             Coverage Diff              @@
##             master     #135      +/-   ##
============================================
- Coverage     68.82%   68.56%   -0.27%     
+ Complexity      906      905       -1     
============================================
  Files           100      100              
  Lines          5662     5688      +26     
  Branches        848      856       +8     
============================================
+ Hits           3897     3900       +3     
- Misses         1214     1232      +18     
- Partials        551      556       +5
Impacted Files Coverage Δ Complexity Δ
rsc/src/main/java/org/apache/livy/rsc/RSCConf.java 86.11% <100%> (+0.26%) 7 <0> (ø) ⬇️
...in/scala/org/apache/livy/repl/SQLInterpreter.scala 73.33% <100%> (+2.96%) 7 <1> (ø) ⬇️
.../src/main/scala/org/apache/livy/repl/Session.scala 65.32% <30%> (-1.87%) 37 <0> (ø)
...main/java/org/apache/livy/rsc/ContextLauncher.java 64.22% <33.33%> (-1.8%) 14 <0> (+1)
...src/main/scala/org/apache/livy/sessions/Kind.scala 72.72% <50%> (-2.28%) 2 <0> (ø)
...in/java/org/apache/livy/rsc/driver/JobWrapper.java 82.85% <0%> (-5.72%) 8% <0%> (-1%)
...ain/scala/org/apache/livy/utils/SparkYarnApp.scala 76.05% <0%> (-3.53%) 33% <0%> (ø)
...ala/org/apache/livy/scalaapi/LivyScalaClient.scala 83.33% <0%> (-3.34%) 7% <0%> (ø)
...ain/java/org/apache/livy/rsc/driver/RSCDriver.java 77.96% <0%> (-0.85%) 41% <0%> (ø)
...main/scala/org/apache/livy/server/LivyServer.scala 35.96% <0%> (+0.49%) 11% <0%> (ø) ⬇️
... and 1 more

Powered by Codecov. Last update 5767789...936f554. Read the comment docs.

@Tagar commented Mar 12, 2019

@WeiWenda thanks for changing to concurrentSQL.

It seems all tests fail with

No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.

Is this a known issue? cc @vanzin

@WeiWenda WeiWenda closed this Mar 13, 2019
@WeiWenda WeiWenda reopened this Mar 13, 2019
@WeiWenda WeiWenda force-pushed the livy-544 branch 2 times, most recently from 78c1c90 to 07fdd14 Compare March 13, 2019 14:24
@WeiWenda (Author)

> @WeiWenda thanks for changing to concurrentSQL.
>
> It seems all tests fail with
>
> No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
>
> Is this a known issue? cc @vanzin

I have fixed all the errors. Could you merge this PR? Thank you.

@Tagar commented Mar 14, 2019

@WeiWenda I am not a committer.

Thanks for this great improvement. It would be awesome to have concurrentSQL
option in Livy, like we have in bare Zeppelin.

cc @mgaido91 @jerryshao @vanzin @ajbozarth @zjffdu - please help to review.

Thank you.

@ajbozarth (Member)

I see no issues with the code, but since I don't have any experience with the SQL side of Livy, I will leave the final review and merging to a more knowledgeable committer.

@@ -35,6 +35,8 @@ object Shared extends Kind("shared")

object SQL extends Kind("sql")

object ConcurrentSQL extends Kind("concurrentSQL")
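For context, the Kind hierarchy this diff extends can be sketched roughly as below. This is a simplified illustration, not the actual Livy source (which is Scala): each session kind wraps a wire name, and parsing maps the string from the REST payload back to a singleton.

```java
public class KindSketch {
    // Base class: a session kind is identified by its wire name.
    public static abstract class Kind {
        public final String name;
        Kind(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    public static final Kind SHARED = new Kind("shared") {};
    public static final Kind SQL = new Kind("sql") {};
    public static final Kind CONCURRENT_SQL = new Kind("concurrentSQL") {};

    // Map the string from a session-creation request back to a Kind singleton.
    public static Kind parse(String s) {
        switch (s) {
            case "shared":        return SHARED;
            case "sql":           return SQL;
            case "concurrentSQL": return CONCURRENT_SQL;
            default: throw new IllegalArgumentException("Invalid kind: " + s);
        }
    }
}
```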
@mgaido91 (Contributor)

I don't think it is a good idea to add a new session kind. I think we should rather have a configuration for enabling FAIR vs. FIFO, and I am wondering whether this shouldn't be done cross-kind, i.e. for all the kinds: why was it done only for SQL?


@mgaido91 good point. There was some discussion on this at https://issues.apache.org/jira/browse/LIVY-544
For code paragraphs, chances are there are dependencies between code items, so they can't run in parallel. For SQL there is often no dependency, e.g. when it's just a SELECT, but we let end users decide whether there is truly no dependency, and let queries execute in parallel only when that was requested explicitly. This was also modeled after Zeppelin to some degree: Zeppelin only allows parallel execution for SQL cells, not code cells like PySpark.

@WeiWenda (Author)

Running code blocks in parallel is confusing. Spark's FAIR scheduler mode means that all runnable stages can receive executors right away, no matter when they were submitted, so concurrentSQL's stages need to run in a FAIR pool. But for Scala, Python, or Java, a code block doesn't produce Spark stages most of the time, so it makes no sense to run those in a FAIR pool.

@WeiWenda (Author)

I reviewed SparkSqlInterpreter in Zeppelin; it does the same thing as I have done here.

NewSparkInterpreter

if (entry.getKey().toString().equals("zeppelin.spark.concurrentSQL")
    && entry.getValue().toString().equals("true")) {
  conf.set("spark.scheduler.mode", "FAIR");
}

SparkSqlInterpreter

public InterpreterResult interpret(String st, InterpreterContext context)
    throws InterpreterException {
  ...
  sc.setLocalProperty("spark.scheduler.pool", context.getLocalProperties().get("pool"));

InterpreterContext's localProperties come from RemoteInterpreterContext's localProperties, i.e. the RemoteInterpreter's InterpreterContext, which is created by Paragraph's getInterpreterContext and configured via %spark.sql(pool=poolname,key=value):

  private static Pattern REPL_PATTERN =
      Pattern.compile("(\\s*)%([\\w\\.]+)(\\(.*?\\))?.*", Pattern.DOTALL);
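To make the pattern quoted above concrete, here is a small demo of what its capture groups yield for a paragraph header. The pool name `fair_pool` is just an example, not from the PR; the pattern itself is the one shown above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplPatternDemo {
    // Zeppelin's paragraph-header pattern, as quoted above.
    static final Pattern REPL_PATTERN =
        Pattern.compile("(\\s*)%([\\w\\.]+)(\\(.*?\\))?.*", Pattern.DOTALL);

    // Returns {interpreter name, local-properties string} for a paragraph like
    // "%spark.sql(pool=fair_pool) select ...", or null if it doesn't match.
    static String[] parse(String paragraph) {
        Matcher m = REPL_PATTERN.matcher(paragraph);
        if (!m.matches()) return null;
        return new String[] { m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = parse("%spark.sql(pool=fair_pool) select 1");
        // group(2) captures "spark.sql"; group(3) captures "(pool=fair_pool)"
        System.out.println(parts[0] + " " + parts[1]);
    }
}
```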

@WeiWenda (Author) commented Mar 16, 2019

But unlike Zeppelin, I don't think users can provide Spark's fairscheduler.xml in a proper way, especially in cluster deploy mode, or keep in mind which SQL should run in which pool. That requires a good understanding of the Spark scheduler, so I do everything in Livy to free users from configuring those items. Of course, we could change it to:

{"code":"", "kind":"sql", "pool":"xx"}
//or
{"code":"set pool=xx, query content", "kind":"sql"}
//or
{"code":"", "kind":"sql?pool=xx"}

I have to say that the extensibility of Livy is not as good as Zeppelin's; there should be a properties part in the statement body.
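For reference, enabling FAIR scheduling on the Spark side normally takes two pieces of configuration that the comment above argues users shouldn't have to manage by hand. The pool name `fair_pool` below is illustrative, not from the PR:

```xml
<!-- fairscheduler.xml, pointed to by spark.scheduler.allocation.file -->
<allocations>
  <pool name="fair_pool">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
    <minShare>0</minShare>
  </pool>
</allocations>
```

plus `spark.scheduler.mode=FAIR` in the Spark conf; a thread then opts in with `sc.setLocalProperty("spark.scheduler.pool", "fair_pool")`, which is what the Zeppelin snippet above does with the pool taken from the paragraph's local properties.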

@Tagar commented Apr 4, 2019

Would somebody else from committers be available to review this?
cc @jerryshao @vanzin @zjffdu

my 2 cents -
It's one of the most exciting Livy improvements among the currently open PRs.
Since it's backward compatible and disabled by default, it seems fairly safe to add.

Thanks everyone.

@jcdauchy

Hello,

With this feature, would it mean that we can run several statements at the same time in one Livy session? Assuming, of course, that each statement is independent of the others.

Thanks.

@WeiWenda (Author)

> Hello,
>
> With this feature, would it mean that we can run several statements at the same time in one Livy session? Assuming, of course, that each statement is independent of the others.
>
> Thanks.

Yes, you can think so.

@jcdauchy

Hope to find some time to test. Is this the way to use this feature?

@mtNachiket commented Apr 21, 2021

@WeiWenda @mgaido91 When will this be merged, given that it's been open almost a year and a half? I have a similar use case, and this would solve my problem.

@mgaido91 (Contributor)

This is not my main area of expertise; @vanzin and @jerryshao might be better reviewers.


7 participants