Adds the parent name in the flatten operation #378

Merged
zaleslaw merged 5 commits into Kotlin:master on May 19, 2023

Conversation

zaleslaw
Collaborator

Fixes #229

@zaleslaw zaleslaw requested a review from Jolanrensen May 16, 2023 13:41
@zaleslaw zaleslaw added this to the 0.11.0 milestone May 16, 2023
@Jolanrensen
Collaborator

Don't forget to regenerate docs, any build/assemble call will do so.

Maybe it's a good idea to make this behavior optional? I can see many cases where flattening should result in all child columns with their actual name, not including the parent name.

@Jolanrensen Jolanrensen added the enhancement New feature or request label May 16, 2023
@zaleslaw
Collaborator Author

Could you please share the case? I will add it to the tests and can make it optional.

@Jolanrensen
Collaborator

Just take any DataFrame containing some column groups whose children (value or frame columns) have unique names, and flatten it. This should produce a new data frame with all value and frame columns under their own names, regardless of whether they were nested in a column group or not.
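
To make the case above concrete, here is a minimal sketch in plain Kotlin. It does not use the Kotlin DataFrame API: columns are modelled as paths (lists of names), and `flattenNames` is a hypothetical helper introduced only for illustration. It assumes the leaf names are unique, as in the scenario described.

```kotlin
// Hypothetical model, not the Kotlin DataFrame API:
// each column is identified by its path, e.g. ["address", "city"].
// Flattening without parent names keeps only the last path segment.
fun flattenNames(paths: List<List<String>>): List<String> =
    paths.map { it.last() }

fun main() {
    val paths = listOf(
        listOf("id"),              // top-level value column
        listOf("address", "city"), // nested in the "address" column group
        listOf("address", "zip"),  // nested in the "address" column group
    )
    println(flattenNames(paths)) // [id, city, zip]
}
```

With unique child names, prefixing each column with its parent adds nothing, which is the argument for keeping the new behavior optional.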

@zaleslaw
Collaborator Author

In SQL, this is done by default; you know how it goes: "MAX(name)" and so on.
OK, on Friday I will add some examples and make it optional.

@zaleslaw zaleslaw merged commit 3101982 into Kotlin:master May 19, 2023
@Jolanrensen
Collaborator

@zaleslaw "keepParentNameForColumns" is a bit long, isn't it? Also, "forColumns" is a bit redundant IMO. I'd consider something like "renameByPath".

@zaleslaw zaleslaw removed this from the 0.11.0 milestone Jun 22, 2023
@@ -29,7 +30,8 @@ internal fun <T, C> DataFrame<T>.flattenImpl(
.into {
val targetPath = getRootPrefix(it.path).dropLast(1)
val nameGen = nameGenerators[targetPath]!!
val name = nameGen.addUnique(it.name())
val preferredName = if (keepParentNameForColumns) "${it.name()}.${it.parentName}" else it.name()
Collaborator

Looks like this is the wrong way around; it should be parentName.name instead.
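
A minimal sketch of the ordering this comment asks for, in plain Kotlin. The `preferredName` function and its parameters are hypothetical stand-ins for the expression in the diff, not the library's actual code, and the null check for top-level columns is an assumption.

```kotlin
// Hypothetical stand-in for the expression in the diff, not library code.
// The parent name comes first, followed by the column's own name.
fun preferredName(parentName: String?, name: String, keepParentName: Boolean): String =
    if (keepParentName && parentName != null) "$parentName.$name" else name

fun main() {
    println(preferredName("address", "city", keepParentName = true))  // address.city
    println(preferredName("address", "city", keepParentName = false)) // city
    println(preferredName(null, "id", keepParentName = true))         // id
}
```

The expression in the diff, "${it.name()}.${it.parentName}", would yield "city.address" for the same input.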

Labels
enhancement New feature or request
Development

Successfully merging this pull request may close these issues.

Incorrect column names when using computing aggregates
2 participants