Description
Many functions in DataFrame have overloads. One of the common operations handled by many of these functions is selecting columns, for example in df.update(), df.select(), and df.fillNulls(). Overloads of these functions let the user specify the columns to select in the following ways:
- vararg KProperty (https://kotlin.github.io/dataframe/kpropertiesapi.html)
- Iterable of KProperty
- vararg String of column names (https://kotlin.github.io/dataframe/stringapi.html)
- Iterable of String column names
- vararg Column reference (https://kotlin.github.io/dataframe/columnaccessorsapi.html)
- Iterable of Column reference
- The column selector DSL {} (includes all APIs above and https://kotlin.github.io/dataframe/extensionpropertiesapi.html)
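As an illustration of the vararg and DSL styles listed above, here is a minimal sketch using select. It assumes a Kotlin DataFrame version in which these overloads are present, and that the library is on the classpath; the Person class and the sample data are made up for the example.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample schema, only for this illustration.
data class Person(val name: String, val age: Int, val city: String)

val df = listOf(
    Person("Alice", 15, "London"),
    Person("Bob", 20, "Paris"),
).toDataFrame()

// Column accessors (column references) declared via property delegates.
val name by column<String>()
val age by column<Int>()

fun main() {
    val byProperty = df.select(Person::name, Person::age) // vararg KProperty
    val byString = df.select("name", "age")               // vararg String
    val byAccessor = df.select(name, age)                  // vararg column reference
    val byDsl = df.select { name and age }                 // column selector DSL

    // All four calls are expected to keep the same two columns.
    println(listOf(byProperty, byString, byAccessor, byDsl).map { it.columnNames() })
}
```

Each call is meant to yield a frame with only the name and age columns; the Iterable variants differ only in how the same column list is passed in.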
However, not all operations seem to have all overloads at the moment. For instance, select does not have the Iterable of KProperty overload, while update and fillNaNs have neither Iterable of KProperty nor Iterable of String. Every operation therefore seems to have its own subset of these overloads.
Let's remove the Iterable overloads altogether and check that the other (vararg) overloads are present for every operation.
Of course, the removed overloads need deprecations with an easy replacement, such as ReplaceWith("this.select { columns.toColumnSet() }"), so we need some additions to the public API to make sure users of the Iterable functions can transition easily:
- Array<KProperty<T>>.toColumnSet(): ColumnSet<T>
- Iterable<KProperty<T>>.toColumnSet(): ColumnSet<T>
- Array<String>.toColumnSet(): ColumnSet<Any?>
  - potentially <T> Array<String>.toColumnSetOf(): ColumnSet<T>
- Iterable<String>.toColumnSet(): ColumnSet<Any?>
  - potentially <T> Iterable<String>.toColumnSetOf(): ColumnSet<T>
- Array<ColumnReference<T>>.toColumnSet(): ColumnSet<T>
- Iterable<ColumnReference<T>>.toColumnSet(): ColumnSet<T>
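
To make the proposal more concrete, here is a rough, library-independent sketch of how such toColumnSet() helpers and a deprecated Iterable overload could fit together. ColumnSet and MiniFrame below are hypothetical stand-ins, not the real DataFrame types, and the @JvmName values are made-up names. One thing the sketch suggests: the Iterable-receiver overloads share the same JVM erasure, so they would presumably need distinct @JvmName annotations.

```kotlin
import kotlin.reflect.KProperty

// Hypothetical stand-in for the library's ColumnSet: it only records column names.
class ColumnSet<out T>(val names: List<String>)

// Array receivers erase to different JVM types (KProperty[] vs String[]), so they do not clash.
fun <T> Array<out KProperty<T>>.toColumnSet(): ColumnSet<T> = ColumnSet(map { it.name })
fun Array<out String>.toColumnSet(): ColumnSet<Any?> = ColumnSet(toList())

// Iterable receivers all erase to Iterable on the JVM, so each overload gets its own @JvmName.
@JvmName("propertiesToColumnSet")
fun <T> Iterable<KProperty<T>>.toColumnSet(): ColumnSet<T> = ColumnSet(map { it.name })

@JvmName("stringsToColumnSet")
fun Iterable<String>.toColumnSet(): ColumnSet<Any?> = ColumnSet(toList())

// Hypothetical miniature frame: column name -> values.
class MiniFrame(val data: Map<String, List<Any?>>) {
    // The selector-based overload that remains.
    fun select(selector: () -> ColumnSet<*>): MiniFrame {
        val names = selector().names
        return MiniFrame(names.associateWith { data.getValue(it) })
    }

    // The Iterable overload slated for removal, forwarding to the selector form.
    @Deprecated(
        message = "Use the selector overload instead",
        replaceWith = ReplaceWith("this.select { columns.toColumnSet() }"),
    )
    fun select(columns: Iterable<String>): MiniFrame = select { columns.toColumnSet() }
}

// Hypothetical schema used only to obtain property references.
data class Person(val name: String, val city: String)

fun main() {
    val frame = MiniFrame(
        mapOf(
            "name" to listOf("Alice", "Bob"),
            "city" to listOf("London", "Paris"),
            "age" to listOf(15, 20),
        ),
    )
    val selected = frame.select { listOf(Person::name, Person::city).toColumnSet() }
    println(selected.data.keys) // prints [name, city]
}
```

With this shape, the IDE quick-fix for the deprecated overload rewrites a call such as frame.select(columns) into frame.select { columns.toColumnSet() }, which is the transition path described above.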