Currently broadcasting on a GroupedDataFrame returns a Vector. This is inconsistent with map, which returns a GroupedDataFrame. Should we change this? I'd say yes:
- a Vector doesn't carry any information about the groups, making the result almost useless
- one can always use a comprehension to get a vector
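To illustrate the second point, here is a minimal sketch of getting a plain vector from a comprehension. The column names `g` and `x` and the data are made up for illustration; the only assumption about the library is that iterating a GroupedDataFrame yields one sub-data-frame per group.

```julia
using DataFrames

# Hypothetical example data: two groups keyed by column g
df = DataFrame(g = [1, 1, 2], x = [10, 20, 5])
gdf = groupby(df, :g)

# Iterating over gdf visits one sub-data-frame per group,
# so a comprehension collects one value per group into a Vector.
sums = [sum(sdf.x) for sdf in gdf]
```

The resulting `sums` has one entry per group but no record of which group each entry came from, which is the limitation described above.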
If we agree to change this, we need to decide what to return exactly:
- a GroupedDataFrame: consistent with map, which makes sense since broadcast and map return the same kind of objects in general in Base
- a DataFrame: like combine, which may be more convenient since most operations are not supported on GroupedDataFrame
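For comparison, this is roughly what the DataFrame option looks like with combine. This sketch assumes the `combine(gdf, col => fun => name)` form from recent DataFrames.jl releases, which may differ from the API available when this issue was opened; the column names and data are again hypothetical.

```julia
using DataFrames

# Hypothetical example data: two groups keyed by column g
df = DataFrame(g = [1, 1, 2], x = [10, 20, 5])
gdf = groupby(df, :g)

# combine applies the function per group and returns a single DataFrame
# that keeps the grouping column next to the per-group result.
res = combine(gdf, :x => sum => :x_sum)
```

Unlike a bare Vector, `res` keeps the `g` column, so each result stays attached to its group key.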
Maybe the solution is to return a GroupedDataFrame, but make that type behave more like a DataFrame (#1256). One issue is that a GroupedDataFrame doesn't make a lot of sense when each group contains a single row; so it depends on whether the most common use case for broadcast is to apply a function which returns multiple rows (like describe at #1539), or a single row.
As for what to do with GroupedDataFrame, I am OK with whatever you propose that is consistent and efficient, as I guess you understand the whole split-apply-combine infrastructure best (in the worst case the user can use combine after broadcast/map).
I think a grouped data frame is nice. The big drawback is DataFrame to scalar or vector operations. Thankfully DataFrames can hold whatever kind of stuff they want.