Skip to content

Extend CS DSL support in compiler plugin #1079

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Feb 28, 2025
Merged

Conversation

koperagen
Copy link
Collaborator

I made adapter layer, so most implementations in compiler plugin just delegate to core. So most of this PR are tests for all different overloads, making sure they work

@koperagen koperagen added this to the 0.16 milestone Feb 27, 2025
@koperagen koperagen self-assigned this Feb 27, 2025
@koperagen koperagen requested a review from zaleslaw February 27, 2025 12:05
@koperagen koperagen added the Compiler plugin Anything related to the DataFrame Compiler Plugin label Feb 27, 2025
@koperagen koperagen force-pushed the compiler-plugin-dev-2 branch from beb5c3e to c5680dd Compare February 27, 2025 12:05
@@ -61,3 +64,20 @@ public interface ColumnWithPath<out T> : DataColumn<T> {
}

public val <T> ColumnWithPath<T>.depth: Int get() = path.depth()

public fun ColumnWithPath(column: DataColumn<*>, path: ColumnPath): ColumnWithPath<*> =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate logic to BaseColumn<T>.addPath(path: ColumnPath). They should probably call each other if we want to keep both

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point, thank you. I'll call addPath from ColumnWithPath

@Jolanrensen
Copy link
Collaborator

Wow really cool! :D That's so many supported functions :)

Does the compiler plugin need to understand the entire expression or can it also just understand a part of it? Like:

df.select { name and cols { false } }.name

}

interface ColumnsResolver {
interface ColumnsResolver : org.jetbrains.kotlinx.dataframe.columns.ColumnSet<Any?> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this not inherit org.jetbrains.kotlinx.dataframe.columns.ColumnsResolver<Any?> like the name suggests? I suspect this is gonna be confusing in the future (oh it's because I sealed it... oops... oh well, just have to be careful with instance checks for when some columnsResolver is a singlecolumn or columnset)

Copy link
Collaborator Author

@koperagen koperagen Feb 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yes, i picked this name before i dig into CS DSL. Now it's kind of a legacy mode for compatibility with previously implemented functions, that needs to be refactored. After that i'd rename ColumnResolver (plugin) -> ColumnSetApproximation, and remove ColumnResolver (plugin) from supertypes of SingleColumnApproximation . Sounds good?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds great :)

@koperagen
Copy link
Collaborator Author

Does the compiler plugin need to understand the entire expression or can it also just understand a part of it? Like:

There's no such partial evaluation in place now

@koperagen koperagen force-pushed the compiler-plugin-dev-2 branch from c5680dd to 8497e3e Compare February 27, 2025 16:15
}

@DataSchema
interface Person2 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sometimes I suppose it's bad that some numerical types as Long/Double are not presented in this dataset

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true, I suppose weight could be a double actually, that's pretty realistic

}
}

internal class AllAfter2 : AbstractInterpreter<ColumnsResolver>() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the difference between AllAfter1/AllAfter2

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CS DSL functions differ a little, but plugin implementation happened to be the same. You're right, i'll update it to use abstract class

val Arguments.receiver: ColumnsResolver by arg()
val Arguments.column: SingleColumnApproximation by arg()
override fun Arguments.interpret(): ColumnsResolver {
return columnsResolver { receiver.allBefore(column) }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like it could be one class with the reference on function in some cases

Copy link
Collaborator Author

@koperagen koperagen Feb 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be cool! I tried to get the reference to allBefore, but it's strange.. Hard to get the reference to an extension function inside interface, plus we have ambiguity. So this doesn't compile

  public companion object {
        
        internal val AllColumnsSelectionDsl<*>.a: ColumnSelectionDsl<*>.(ColumnSelector<*, *>) -> ColumnSet<*> get() = ColumnsSelectionDsl<*>::allBefore
    }

}
private fun List<AnyCol>.mapBack(): List<SimpleCol> = map { it.asSimpleColumn() }

fun AnyCol.asSimpleColumn(): SimpleCol {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you add some comment, when to use this function

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

public fun <T> DataFrame<T>.xs(vararg keyValues: Any?): DataFrame<T> =
xs(*keyValues) {
colsAtAnyDepth { !it.isColumnGroup() }.take(keyValues.size)
}

@Refine
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please explain, when we are using @refine and when are not using this

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refine is needed when fuction returns dataframe, datarow or groupby. It indicated that type parameter of the return type will be updated by the plugin. interpretable indicated more general situation for CS DSL for example, where plugin doesn't touch return type

Copy link
Collaborator

@zaleslaw zaleslaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please answer my comments, before or after merging

@koperagen koperagen merged commit 319191c into master Feb 28, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Compiler plugin Anything related to the DataFrame Compiler Plugin
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants