Dynamically configure precision #145

Merged


Contributor

@halfabrane halfabrane commented Aug 11, 2017

There are use cases where, within the same Spark run (i.e., with the same Spark context), we want to join multiple spatial datasets at different resolutions.
This PR lets you pass in the resolution as a hint, as follows:

import org.apache.spark.sql.magellan.dsl.expressions._
point.join(polygons index 15).where($"point" within $"polygon")

This passes a hint to the optimizer that the index resolution should be set to 15.

This PR also modifies injectRules to take only the Spark session, not the parameters,
i.e., use injectRules(spark) instead of injectRules(spark, Map(...)).
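
As a rough illustration of the motivating use case, the sketch below joins the same points against two polygon datasets at different index resolutions within a single Spark session. The object name, the DataFrame parameters, and the precision values 10 and 15 are hypothetical, and it assumes that injectRules lives on magellan.Utils and that the column names match the snippet above; treat it as a sketch under those assumptions rather than the definitive usage.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.magellan.dsl.expressions._

// Hypothetical helper, for illustration only.
object MultiResolutionJoins {
  // Assumption: `points` has a "point" column and both polygon DataFrames
  // have a "polygon" column, as in the snippet in the PR description.
  def run(
      spark: SparkSession,
      points: DataFrame,
      coarsePolygons: DataFrame,
      finePolygons: DataFrame): (DataFrame, DataFrame) = {
    import spark.implicits._ // enables the $"column" syntax

    // Register the spatial join rules once per session.
    // Assumption: injectRules is exposed on magellan.Utils; it no longer
    // takes a parameter map, since precision is now supplied per join.
    magellan.Utils.injectRules(spark)

    // Each join carries its own resolution hint (values are illustrative).
    val coarse = points.join(coarsePolygons index 10).where($"point" within $"polygon")
    val fine = points.join(finePolygons index 15).where($"point" within $"polygon")
    (coarse, fine)
  }
}
```

Because the hint is attached to the DataFrame being joined rather than to the session, the two joins do not interfere with each other's precision.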

Future enhancements:

  • Ability to pass in a min/max precision to greedily cover the geometries?
  • Ability to index geometries at the right precision on the fly?

@codecov-io

Codecov Report

Merging #145 into master will increase coverage by 0.09%.
The diff coverage is 100%.


@@            Coverage Diff             @@
##           master     #145      +/-   ##
==========================================
+ Coverage   85.47%   85.56%   +0.09%     
==========================================
  Files          47       48       +1     
  Lines        1363     1372       +9     
  Branches       96       96              
==========================================
+ Hits         1165     1174       +9     
  Misses        198      198
Impacted Files                                          Coverage Δ
...main/scala/magellan/catalyst/SpatialJoinHint.scala   100% <100%> (ø)
src/main/scala/magellan/Utils.scala                     100% <100%> (ø) ⬆️
src/main/scala/magellan/dsl/package.scala               84.61% <100%> (+4.61%) ⬆️
src/main/scala/magellan/catalyst/SpatialJoin.scala      100% <100%> (ø) ⬆️


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 5b8c595...37658ad.

@harsha2010
Owner

@Perados, can you give this a try when you get a chance? This might cover your use case nicely.

@harsha2010 harsha2010 merged commit d13db84 into harsha2010:master Aug 12, 2017
@halfabrane halfabrane deleted the DYNAMICALLY-CONFIGURE-PRECISION branch August 14, 2017 07:06