Allow empty keys argument in by()
#1837
Currently |
In R:

    library(dplyr)
    starwars %>%
      group_by(species) %>%
      summarise(
        n = n(),
        mass = mean(mass, na.rm = TRUE)
      )

returns

    # A tibble: 38 x 3
       species       n  mass
       <chr>     <int> <dbl>
     1 <NA>          5  48
     2 Aleena        1  15
     3 Besalisk      1 102
     4 Cerean        1  82
     5 Chagrian      1 NaN
     6 Clawdite      1  55
     7 Droid         5  69.8
     8 Dug           1  40
     9 Ewok          1  20
    10 Geonosian     1  80
    # ... with 28 more rows

And leaving the group_by empty just aggregates across all:

    starwars %>%
      group_by() %>%
      summarise(
        n = n(),
        mass = mean(mass, na.rm = TRUE)
      )

returns:

    # A tibble: 1 x 2
          n  mass
      <int> <dbl>
    1    87  97.3
|
While R does (above), Pandas does not:

    > grouped = df.groupby()
    TypeError: You have to supply one of 'by' and 'level'
|
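For illustration, here is a minimal plain-Python sketch of the semantics being asked for. It is not how dplyr, pandas, or this project implement grouping. The helper `summarise` and the sample `rows` are hypothetical names made up for the example. The idea is that if each row is bucketed by a tuple of its key values, an empty key list gives every row the same empty tuple, so the whole table naturally becomes one group.

```python
from collections import defaultdict
from statistics import mean


def summarise(rows, keys=()):
    """Group `rows` (a list of dicts) by `keys`; report row count and mean mass per group."""
    groups = defaultdict(list)
    for row in rows:
        # With no keys this tuple is always (), so every row lands in the same group.
        groups[tuple(row[k] for k in keys)].append(row)

    result = []
    for key, members in groups.items():
        # Skip missing masses, similar in spirit to na.rm = TRUE in the R example.
        masses = [r["mass"] for r in members if r["mass"] is not None]
        summary = dict(zip(keys, key))
        summary["n"] = len(members)
        summary["mass"] = mean(masses) if masses else float("nan")
        result.append(summary)
    return result


# Hypothetical sample data, not the starwars dataset.
rows = [
    {"species": "Droid", "mass": 10.0},
    {"species": "Droid", "mass": 30.0},
    {"species": "Ewok", "mass": 20.0},
    {"species": None, "mass": None},
]

print(summarise(rows, ["species"]))  # one summary per species
print(summarise(rows))               # empty keys: one summary over all rows
```

With this formulation no special case is needed for the empty key list, which is what makes it convenient when the keys are passed in as a parameter.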
OK, then I guess we could support this too. It's not a high priority for us, though, so feel free to make a pull request. |
fixed |
When exploring data or creating a function to abstract the process of generating a fixed set of results for different combinations of variables, it would be nice to be able to pass an empty array which would simply "group-by" all the rows in the dataset, e.g.
would return
Currently it errors: