Skip to content

Robertk/versions #30

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Sep 23, 2016
Merged

Robertk/versions #30

merged 2 commits into from
Sep 23, 2016

Conversation

robert3005
Copy link

No description provided.

@robert3005 robert3005 merged commit 7d0a22b into master Sep 23, 2016
@robert3005 robert3005 deleted the robertk/versions branch September 23, 2016 17:41
ash211 pushed a commit that referenced this pull request Feb 16, 2017
…it jars (#30)

* Revamp ports and service setup for the driver.

- Expose the driver-submission service on NodePort and contact that as
opposed to going through the API server proxy
- Restrict the ports that are exposed on the service to only the driver
submission service when uploading content and then only the Spark UI
after the job has started

* Move service creation down and more thorough error handling

* Fix missed merge conflict

* Add braces

* Fix bad merge

* Address comments and refactor run() more.

Method nesting was getting confusing so pulled out the inner class and
removed the extra method indirection from createDriverPod()

* Remove unused method

* Support SSL configuration for the driver application submission (#49)

* Support SSL when setting up the driver.

The user can provide a keyStore to load onto the driver pod and the
driver pod will use that keyStore to set up SSL on its server.

* Clean up SSL secrets after finishing submission.

We don't need to persist these after the pod has them mounted and is
running already.

* Fix compilation error

* Revert image change

* Address comments

* Programmatically generate certificates for integration tests.

* Address comments

* Resolve merge conflicts

* Fix bad merge

* Remove unnecessary braces

* Fix compiler error
mccheah added a commit that referenced this pull request Apr 27, 2017
…it jars (#30)

* Revamp ports and service setup for the driver.

- Expose the driver-submission service on NodePort and contact that as
opposed to going through the API server proxy
- Restrict the ports that are exposed on the service to only the driver
submission service when uploading content and then only the Spark UI
after the job has started

* Move service creation down and more thorough error handling

* Fix missed merge conflict

* Add braces

* Fix bad merge

* Address comments and refactor run() more.

Method nesting was getting confusing so pulled out the inner class and
removed the extra method indirection from createDriverPod()

* Remove unused method

* Support SSL configuration for the driver application submission (#49)

* Support SSL when setting up the driver.

The user can provide a keyStore to load onto the driver pod and the
driver pod will use that keyStore to set up SSL on its server.

* Clean up SSL secrets after finishing submission.

We don't need to persist these after the pod has them mounted and is
running already.

* Fix compilation error

* Revert image change

* Address comments

* Programmatically generate certificates for integration tests.

* Address comments

* Resolve merge conflicts

* Fix bad merge

* Remove unnecessary braces

* Fix compiler error
mattsills pushed a commit to mattsills/spark that referenced this pull request Jul 17, 2020
…enStageId

### What changes were proposed in this pull request?

Spark SQL's whole-stage codegen (WSCG) supports dumping the generated code to help with debugging. One way to get the generated code is through `df.queryExecution.debug.codegen`, or SQL `EXPLAIN CODEGEN` statement.

The generated code is currently printed without specific ordering, which can make debugging a bit annoying. This PR makes a minor improvement to sort the codegen dump by the `codegenStageId`, ascending.

After this change, the following query:
```scala
spark.range(10).agg(sum('id)).queryExecution.debug.codegen
```
will always dump the generated code in a natural, stable order. A version of this example with shorter output is:
```
spark.range(10).agg(sum('id)).queryExecution.debug.codegenToSeq.map(_._1).foreach(println)
*(1) HashAggregate(keys=[], functions=[partial_sum(id#8L)], output=[sum#15L])
+- *(1) Range (0, 10, step=1, splits=16)

*(2) HashAggregate(keys=[], functions=[sum(id#8L)], output=[sum(id)#12L])
+- Exchange SinglePartition, true, [id=palantir#30]
   +- *(1) HashAggregate(keys=[], functions=[partial_sum(id#8L)], output=[sum#15L])
      +- *(1) Range (0, 10, step=1, splits=16)
```

The number of codegen stages within a single SQL query tends to be very small, most likely < 50, so the overhead of adding the sorting shouldn't be significant.

### Why are the changes needed?

Minor improvement to aid WSCG debugging.

### Does this PR introduce any user-facing change?

No user-facing change for end-users; minor change for developers who debug WSCG generated code.

### How was this patch tested?

Manually tested the output; all other tests still pass.

Closes apache#27955 from rednaxelafx/codegen.

Authored-by: Kris Mok <kris.mok@databricks.com>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
(cherry picked from commit a177628)
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
16pierre pushed a commit to 16pierre/spark that referenced this pull request May 24, 2021
* Change policy

* Move policy bot file

* Add changelog

* Delete policy.yml

* Revert "Delete policy.yml"

This reverts commit cd84cbabc0ae9504a57907532d5d03867a7e926d.

* Delete changelog.yml
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant