-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GeoDataFrame init #909
GeoDataFrame init #909
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good initial version :) just some comments here and there
dataframe-geo/src/main/kotlin/org/jetbrains/kotlinx/dataframe/geo/GeoDataFrame.kt
Outdated
Show resolved
Hide resolved
dataframe-geo/src/main/kotlin/org/jetbrains/kotlinx/dataframe/geo/GeoDataFrame.kt
Outdated
Show resolved
Hide resolved
|
||
object Geocoder { | ||
|
||
private val url = "https://geo2.datalore.jetbrains.com/map_data/geocoding" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be included in the stock library? As far as I remember this was a reverse engineering endeavour and since it depends on an external URL behaviour might change or stop working entirely
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this experimental module, I am totally fine with including in library.
- It's stable for enough time
- We not guarantee for a while all our releases of experimental geo-module
But we could also include this solution for the @Jolanrensen question:
- Checking the API availability and handling the unavailability with some exception
- Checking the API consistency and notification for the users or raise an exception
- Customize it with a URL parameter (if this API just will be moved in another place)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Geocoder is not part of the main API and will be removed/reworked completely (via future Kotlin geocoding library we discussed). I kept it, exclusively to play with it. So no effort should be wasted on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably fine then :) You could add a small note in KDocs too telling this is experimental/temporary, so potential users won't get used to it.
dataframe-geo/src/main/kotlin/org/jetbrains/kotlinx/dataframe/jupyter/IntegrationGeo.kt
Show resolved
Hide resolved
dataframe-geo/src/main/kotlin/org/jetbrains/kotlinx/dataframe/geo/io/read.kt
Outdated
Show resolved
Hide resolved
...eo/src/main/kotlin/org/jetbrains/kotlinx/dataframe/geo/geotools/toSimpleFeatureCollection.kt
Show resolved
Hide resolved
dataframe-geo/src/main/kotlin/org/jetbrains/kotlinx/dataframe/geo/geotools/ToGeoDataFrame.kt
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally, I am approving these changes, but want to summarize some thoughts and make this piece of code more mature and maintanable
repositories { | ||
// geo repositories should come before Maven Central | ||
maven("https://maven.geotoolkit.org") | ||
maven("https://repo.osgeo.org/repository/release") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it true, that these maven repo links will be hidden in the module, and the final user don't need to refer to them somehow in its Gradle script? (just for self checking)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, the user will need to add the repository. However, I believe we can fix this with shadow jar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please check whether you can do that before this is merged
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd leave it that way for now, then I'll try to get rid of it. When using a descriptor in notebook, the user won't need to do this.
dataframe-geo/src/main/kotlin/org/jetbrains/kotlinx/dataframe/jupyter/IntegrationGeo.kt
Show resolved
Hide resolved
dataframe-geo/src/main/kotlin/org/jetbrains/kotlinx/dataframe/geo/GeoFrame.kt
Outdated
Show resolved
Hide resolved
|
||
object Geocoder { | ||
|
||
private val url = "https://geo2.datalore.jetbrains.com/map_data/geocoding" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this experimental module, I am totally fine with including in library.
- It's stable for enough time
- We not guarantee for a while all our releases of experimental geo-module
But we could also include this solution for the @Jolanrensen question:
- Checking the API availability and handling the unavailability with some exception
- Checking the API consistency and notification for the users or raise an exception
- Customize it with a URL parameter (if this API just will be moved in another place)
dataframe-geo/src/main/kotlin/org/jetbrains/kotlinx/dataframe/geo/GeoDataFrame.kt
Outdated
Show resolved
Hide resolved
dataframe-geo/src/main/kotlin/org/jetbrains/kotlinx/dataframe/geo/jts/geometryExtenstions.kt
Outdated
Show resolved
Hide resolved
dataframe-geo/src/main/kotlin/org/jetbrains/kotlinx/dataframe/geo/jts/geometryExtenstions.kt
Outdated
Show resolved
Hide resolved
} | ||
|
||
companion object { | ||
val DEFAULT_CRS = CRS.decode("EPSG:4326", true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be ENUM of different constants
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it even exists in some form in GeoTools (see DefaultEngineeringCRS). But here is a more global question - whether it is necessary to use geotools API directly or should we wrap it completely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest not wrap for now, we could not avoid the learning of that
|
||
fun GeoDataFrame<*>.writeGeoJson(path: String): Unit = writeGeoJson(File(path)) | ||
|
||
fun GeoDataFrame<*>.writeGeoJson(file: File) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add for IO part some JUnit tests for reading/writing files close to your examples https://gist.github.com/AndreiKingsley/8eada06f174c681e953f8d939248d153
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
dataframe-geo/src/main/kotlin/org/jetbrains/kotlinx/dataframe/geo/geocode/Geocoder.kt
Outdated
Show resolved
Hide resolved
dataframe-geo/src/main/kotlin/org/jetbrains/kotlinx/dataframe/geo/jts/geometryExtenstions.kt
Outdated
Show resolved
Hide resolved
dataframe-geo/src/main/kotlin/org/jetbrains/kotlinx/dataframe/geo/io/read.kt
Outdated
Show resolved
Hide resolved
dataframe-geo/src/main/kotlin/org/jetbrains/kotlinx/dataframe/geo/GeoDataFrame.kt
Show resolved
Hide resolved
dataframe-geo/src/main/kotlin/org/jetbrains/kotlinx/dataframe/geo/GeoDataFrame.kt
Outdated
Show resolved
Hide resolved
dataframe-geo/src/main/kotlin/org/jetbrains/kotlinx/dataframe/geo/GeoDataFrame.kt
Show resolved
Hide resolved
} | ||
|
||
@get:JvmName("geometry") | ||
val <T : WithGeometry> ColumnsContainer<T>.geometry: DataColumn<Geometry> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the T
is not needed, right? ColumnsContainer<WithGeometry>
has the same effect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, for the generated accessor we have both ColumnsContainer
and DataRow
extensions, so probably you should add DataRow<WithGeometry>.geometry: Geometry
etc. too :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@koperagen could plugin help here, to generate all these extensions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you mean the gradle plugin or the compiler plugin?
The compiler plugin is highly experimental still and it doesn't work yet in notebooks. So I wouldn't rely on that for now.
The gradle plugin, however, could generate these automatically based on the @DataSchema
interfaces you defined. You'd have to add the dataframe gradle plugin to the geo module, just like we did in core, and make sure the generated files are part of sources to they are published along with the rest of the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sources generated by the plugin are automatically included. Makes sense to double check, but it's expected that you don't need to do anything extra. Just add this and it will generate required properties for annotated interfaces
plugins {
id("org.jetbrains.kotlinx.dataframe")
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, forgot to add it, thank you!
Left a couple more comments :) Thanks for fixing some stuff already!
|
import org.jetbrains.kotlinx.dataframe.geo.WithGeometry | ||
import org.locationtech.jts.geom.Geometry | ||
|
||
fun SimpleFeatureCollection.toGeoDataFrame(): GeoDataFrame<*> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please document it with KDocs, looks like it will not change too much in future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's worth to add in comment the summarization of the underlying algorithm
import org.jetbrains.kotlinx.dataframe.geo.GeoDataFrame | ||
import org.locationtech.jts.geom.Geometry | ||
|
||
fun GeoDataFrame<*>.toSimpleFeatureCollection( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a KDoc, with tiny explanation (especially if some edge cases are not working for now)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me it looks much more mature today, please fix comments/linter and it could be merged, I believe
# Conflicts: # settings.gradle.kts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job :D Only thing missing is a README.md in the module folder like:
## :dataframe-geo
This module, published as `dataframe-geo`, contains all logic and tests for DataFrame to be able to work with geographical data... etc.
So people looking through the codebase know what the folder is for. You also don't have the binary.compatibility.validator
gradle plugin yet, but I think that can wait for now, as it's still experimental. Otherwise I approve
No description provided.