-
Notifications
You must be signed in to change notification settings - Fork 73
Predicate join operation #434
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
2a7be39
to
24c1749
Compare
I see you also did some stuff with HTML but I don't really understand what you did there. It seems to have added some extra spacing to each snippet file. Not too important but relevant enough to mention |
@@ -10,7 +10,9 @@ public fun <A, B> DataFrame<A>.join( | |||
other: DataFrame<B>, | |||
type: JoinType = JoinType.Inner, | |||
selector: JoinColumnsSelector<A, B>? = null | |||
): DataFrame<A> = joinImpl(other, type, true, selector) | |||
): DataFrame<A> { | |||
return joinImpl(other, type, addNewColumns = type.addNewColumns, selector) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably provide name for each argument in the function, it looks odd like this :). Also, I'm fine with it being an expression function if it's just a one line thing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wouldn't it look odd too? plus, it's not even me who wrote this initially :D
joinImpl(
other = other,
type = type,
addNewColumns = type.addNewColumns,
selector = selector
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no not at all XD I write functions always like that. IMO positional arguments are a nightmare for refactoring and I think originally they were only allowed áfter the positional ones so mixing them looks extra odd to me.
core/src/main/kotlin/org/jetbrains/kotlinx/dataframe/api/join.kt
Outdated
Show resolved
Hide resolved
type: JoinType = JoinType.Inner, | ||
joinExpression: JoinExpression<A, B> | ||
): DataFrame<A> { | ||
return predicateJoinImpl(right, type, addNewColumns = type.addNewColumns, joinExpression) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as in join, I'd make it an expression function and provide named arguments for all
public fun <A, B> DataFrame<A>.filterPredicateJoin( | ||
right: DataFrame<B>, | ||
joinExpression: JoinExpression<A, B> | ||
): DataFrame<A> = predicateJoinImpl(right, JoinType.Inner, addNewColumns = false, joinExpression) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
argument names :)
public fun <A, B> DataFrame<A>.predicateJoin( | ||
right: DataFrame<B>, | ||
type: JoinType = JoinType.Inner, | ||
joinExpression: JoinExpression<A, B> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trailing commas everywhere please :)
override val right: DataRow<B> = DataRowImpl(index1, rightOwner) | ||
} | ||
|
||
internal fun <A, B> DataFrame<A>.predicateJoinImpl( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comments would be nice XD, bit hard to check it like this
core/src/test/kotlin/org/jetbrains/kotlinx/dataframe/testSets/person/PredicateJoinTests.kt
Outdated
Show resolved
Hide resolved
|
||
Joins two [`DataFrames`](DataFrame.md) by a join expression. | ||
|
||
```kotlin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code block contains non-kotlin code, so the highlighting goes wrong
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these blocks use pseudocode intentionally everywhere we have them. So should be ok to use here as well. As for highlighting, with kotlin
at least valid parts are highlighted. Like interface :) There are no errors on the website that you see in the preview, if you're talking about it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you could also split it up in multiple blocks if you want pseudocode :)
Or do like I usually do:
// some pseudo code () -> Something
fun kotlinCode() {}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think that will work, you can see that even inside kotlin code there are symbols like [ ] for optional parts or | for enumeration, etc
} | ||
``` | ||
|
||
Unlike [join](join.md) similar to one from relational databases, where relations between tables are based on IDs and usually equality between values, predicate join allows any operations that returns Boolean value. That gives you the ability to join data based on various logical operations and heuristics. This can also be useful when dealing with data from disparate sources that are connected, but lack common identifiers or structured relations typical of SQL tables with indices. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please fold the line so I don't have to scroll horizontally to read it, haha
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also this sentence should probably be broken up into multiple and have some extra commas for readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i rewrote it completely in the end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
much better :) Just some small notes above
132a583
to
60c5da8
Compare
@@ -94,6 +94,7 @@ df.join(other) | |||
|
|||
Supported join types: | |||
* `Inner` (default) — only matched rows from left and right [`DataFrames`](DataFrame.md) | |||
* `Filter` — only matched rows from left [`DataFrame`](DataFrame.md) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably add filterJoin below to the joinSpecials as well
the name campaigns.innerJoin(visits).where {
right.date in startDate..endDate
} or in the join dsl: campaigns.innerJoin(visits) {
where { right.date in startDate..endDate }
// or
withPredicate {}
} |
I agree that
|
Yes, spacing changed because it changed in Kotlin sources. |
60c5da8
to
515d3fc
Compare
515d3fc
to
e8483ee
Compare
} | ||
``` | ||
|
||
Unlike [join](join.md) similar to one from relational databases, where relations between tables are based on IDs and usually equality between values, predicate join allows any operations that returns Boolean value. That gives you the ability to join data based on various logical operations and heuristics. This can also be useful when dealing with data from disparate sources that are connected, but lack common identifiers or structured relations typical of SQL tables with indices. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
much better :) Just some small notes above
@Jolanrensen Thank you! |
Main purpose: joining data on operations other than equals