Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sample of using LoadFromEnumerable with a SchemaDefinition #3696

Merged
merged 4 commits into from
May 20, 2019

Conversation

sfilipi
Copy link
Member

@sfilipi sfilipi commented May 9, 2019

I have seen a few asks for how to bypass having to define the size of the vector in the data models.
Adding an example for it.

@@ -0,0 +1,83 @@
using System;
Copy link
Contributor

@shauheen shauheen May 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add headers #Resolved

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need headers for samples? whatever we put here, will show up in the docs verbatim, so we didn't include the license header in any of the samples. .NET samples don't have headers either (example below). So I suggest not including headers for samples files.

https://docs.microsoft.com/en-us/dotnet/api/system.io.file?view=netframework-4.8


In reply to: 282706294 [](ancestors = 282706294)

Copy link
Member Author

@sfilipi sfilipi May 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i forgot, i'll remove them. #Resolved

@codecov
Copy link

codecov bot commented May 10, 2019

Codecov Report

Merging #3696 into master will increase coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #3696      +/-   ##
==========================================
+ Coverage   72.77%   72.78%   +<.01%     
==========================================
  Files         808      808              
  Lines      145588   145589       +1     
  Branches    16250    16250              
==========================================
+ Hits       105956   105968      +12     
+ Misses      35207    35199       -8     
+ Partials     4425     4422       -3
Flag Coverage Δ
#Debug 72.78% <ø> (ø) ⬆️
#production 68.28% <ø> (ø) ⬆️
#test 89.04% <ø> (ø) ⬆️
Impacted Files Coverage Δ
...soft.ML.Data/DataLoadSave/DataOperationsCatalog.cs 72.92% <ø> (ø) ⬆️
...luators/Metrics/MulticlassClassificationMetrics.cs 100% <0%> (ø) ⬆️
src/Microsoft.ML.LightGbm/LightGbmTrainerBase.cs 62.44% <0%> (ø) ⬆️
...rd/MulticlassClassification/OneVersusAllTrainer.cs 74.93% <0%> (+0.06%) ⬆️
...ML.Transforms/Text/StopWordsRemovingTransformer.cs 86.26% <0%> (+0.15%) ⬆️
...soft.ML.TestFramework/DataPipe/TestDataPipeBase.cs 73.93% <0%> (+0.32%) ⬆️
src/Microsoft.ML.Transforms/Text/LdaTransform.cs 89.89% <0%> (+0.62%) ⬆️
src/Microsoft.ML.Maml/MAML.cs 26.21% <0%> (+1.45%) ⬆️

@sfilipi sfilipi requested review from yaeldekel and codemzs May 10, 2019 19:00
};

// The feature dimension retrievable at runtime.
int featureDimension = 3;
Copy link
Member

@abgoswam abgoswam May 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this a misplaced comment ? .. featureDimension is not used until line 61 for setting , not retrieving .. #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sample is for cases when we don't know the size at compile time, so this is is set to a constant, but in real life would be the size of some array.

But good point, i don't have all that language there, and i don't think it is necessary; so I'll remove the comment :)


In reply to: 283514291 [](ancestors = 283514291)


namespace Samples.Dynamic
{
public static class LoadFromEnumerable
Copy link
Contributor

@glebuk glebuk May 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LoadFromEnumerable [](start = 24, length = 18)

Add comment explaining that you are not only loading from enumerable but also setting vector metadata properties. Why whould this be useful IRL? #Resolved

Copy link

@shmoradims shmoradims left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

var vectorItemType = ((VectorDataViewType)definedSchema[0].ColumnType).ItemType;
definedSchema[0].ColumnType = new VectorDataViewType(vectorItemType, featureDimension);

// Read the data into an IDataView with the schema supplied in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Read the data into an IDataView with the schema supplied in
// Read the data into an IDataView with the modified schema supplied in

@sfilipi sfilipi merged commit 706299b into dotnet:master May 20, 2019
@sfilipi sfilipi deleted the runtimeVectorSize branch May 20, 2019 22:42
@ghost ghost locked as resolved and limited conversation to collaborators Mar 21, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants