@asmirnov82
Contributor

@asmirnov82 asmirnov82 commented May 28, 2021

Fixes #5820

Extends the DataFrame GroupBy operation with a new property, Groupings. This property returns a collection of IGrouping objects (the same way the LINQ GroupBy operation does), which enables syntax like this:

var groups = dataFrame.GroupBy<TKey>(columnName).Groupings.ToDictionary(g => g.Key, g => g.ToList());
foreach (DataFrameRow row in groups[KeyValue])
{
    // any code to work with row object
}
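For readers less familiar with the LINQ behavior this description compares against, here is a minimal sketch using plain collections only. It does not call Microsoft.Data.Analysis. The `(City, Amount)` tuple is a hypothetical stand-in for a `DataFrameRow`, and `GroupingSketch` and `GroupByCity` are names invented for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupingSketch
{
    // Hypothetical stand-in for DataFrameRow: a (City, Amount) tuple.
    // Enumerable.GroupBy yields IGrouping<string, ...> objects; each one
    // exposes a Key and enumerates the rows that share that key.
    public static Dictionary<string, List<(string City, int Amount)>> GroupByCity(
        IEnumerable<(string City, int Amount)> rows)
    {
        return rows
            .GroupBy(r => r.City)
            .ToDictionary(g => g.Key, g => g.ToList());
    }

    public static void Main()
    {
        var rows = new (string City, int Amount)[]
        {
            ("Oslo", 10),
            ("Rome", 5),
            ("Oslo", 7),
        };

        var groups = GroupByCity(rows);

        // Iterate the rows of a single group, looked up by its key.
        foreach (var row in groups["Oslo"])
        {
            Console.WriteLine($"{row.City}: {row.Amount}");
        }
    }
}
```

Assuming Groupings behaves like the LINQ result as described above, the same `ToDictionary(g => g.Key, g => g.ToList())` call would apply to it unchanged, with `DataFrameRow` in place of the tuple.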

Alexey Smirnov added 4 commits April 27, 2021 01:12
@pgovind pgovind self-requested a review May 28, 2021 21:18
@pgovind pgovind added the Microsoft.Data.Analysis All DataFrame related issues and PRs label May 28, 2021
@pgovind pgovind requested a review from eerhardt May 28, 2021 21:18
@eerhardt
Member

using System;

Need a copyright header on all new files.


Refers to: test/Microsoft.Data.Analysis.Tests/DataFrameGroupByTests.cs:1 in 0580710.

@codecov

codecov bot commented May 28, 2021

Codecov Report

Merging #5821 (4afefd0) into main (7fafbf3) will increase coverage by 0.02%.
The diff coverage is 93.50%.

❗ Current head 4afefd0 differs from pull request most recent head c4d3ad2. Consider uploading reports for the commit c4d3ad2 to get more accurate results

@@            Coverage Diff             @@
##             main    #5821      +/-   ##
==========================================
+ Coverage   68.32%   68.35%   +0.02%     
==========================================
  Files        1131     1132       +1     
  Lines      241291   241368      +77     
  Branches    25053    25059       +6     
==========================================
+ Hits       164863   164978     +115     
+ Misses      69923    69888      -35     
+ Partials     6505     6502       -3     
Flag Coverage Δ
Debug 68.35% <93.50%> (+0.02%) ⬆️
production 62.96% <80.00%> (+0.02%) ⬆️
test 89.25% <100.00%> (+0.01%) ⬆️

Impacted Files Coverage Δ
src/Microsoft.Data.Analysis/DataFrame.cs 86.23% <80.00%> (-0.17%) ⬇️
src/Microsoft.Data.Analysis/GroupBy.cs 97.68% <80.00%> (-0.81%) ⬇️
...osoft.Data.Analysis.Tests/DataFrameGroupByTests.cs 100.00% <100.00%> (ø)
...StandardTrainers/Standard/LinearModelParameters.cs 66.32% <0.00%> (+0.25%) ⬆️
src/Microsoft.ML.AutoML/Sweepers/Parameters.cs 85.59% <0.00%> (+0.84%) ⬆️
...rosoft.ML.AutoML/ColumnInference/TextFileSample.cs 62.25% <0.00%> (+2.64%) ⬆️
....ML.AutoML/PipelineSuggesters/PipelineSuggester.cs 85.03% <0.00%> (+3.14%) ⬆️
...c/Microsoft.Data.Analysis/StringDataFrameColumn.cs 70.40% <0.00%> (+6.12%) ⬆️
...c/Microsoft.ML.FastTree/Utils/ThreadTaskManager.cs 100.00% <0.00%> (+20.51%) ⬆️

@pgovind

pgovind commented Jun 3, 2021

I pushed a commit to fix the remaining comments, so we can be ready for the next preview release. Great work here @asmirnov82, thanks!

Member

@eerhardt eerhardt left a comment

:shipit:

@pgovind pgovind merged commit f9b4b08 into dotnet:main Jun 3, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Mar 17, 2022

Labels

Microsoft.Data.Analysis All DataFrame related issues and PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Extend GroupBy API of DataFrame with ability to iterate groups and rows

3 participants