API reference - Samples for Transforms #1209

Closed
@sfilipi

We need to add samples showing how to use the new transformers and estimators, then reference those samples from the XML documentation so that users on docs.microsoft.com can copy and paste a sample and get a head start.

Most of the tests that were added as part of the transformer work are a good starting point for creating a sample.
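
As a rough illustration of referencing a sample from the XML documentation, a doc comment could embed the sample file through a DocFX code-snippet include. This is only a sketch: the class name, method name, and sample path below are hypothetical placeholders, not actual ML.NET APIs or repository paths.

```csharp
namespace Hypothetical.Docs
{
    // Hypothetical catalog class, used only to host the doc comment.
    public static class HypotheticalTransformsCatalog
    {
        /// <summary>
        /// Placeholder for a transform extension method (not a real ML.NET API).
        /// </summary>
        /// <example>
        /// <format type="text/markdown">
        /// <![CDATA[
        /// [!code-csharp[MyTransform](~/path/to/samples/MyTransform.cs)]
        /// ]]>
        /// </format>
        /// </example>
        public static void MyTransform() { }
    }
}
```

With this shape, the sample lives in its own compilable file (often adapted from an existing test), and the documentation build pulls it into the API page.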

MLContext Catalogs

| Catalog | Total APIs | Samples Owner | Samples Status / ETA |
| --- | --- | --- | --- |
| MLContext.Transforms (root) | 19 | Senja | Remaining: 4 overrides for the normalizer multicolumn examples |
| MLContext.Transforms.Categorical | 2 | ZeeshanA | Done v1 |
| MLContext.Transforms.Conversion | 6 | Senja | DoneV1 |
| MLContext.Transforms.FeatureSelection | 4 | ZeeshanA | Done v1 |
| MLContext.Transforms.TimeSeries | 4 | Senja | Done V1 |
| MLContext.Transforms.Text | 29 | ZeeshanA | Done V1 |
| MLContext.Data | 10 | Senja | DoneV1 |
| MLContext.Model (root) | 4 | ZeeshanS | DoneV1 |

P0+P1 Public API (extension methods) per Catalog

| MLContext.Transforms (root) | Num Overloads | Documentation | Sample | API Owner |
| --- | --- | --- | --- | --- |
| CopyColumns | 2 | Yes | 2 Can remove dependency on DatasetUtils. | Zeeshan |
| Concatenate | 1 | Yes, needs improvement. | 1 - Can remove dependency on DatasetUtils. | Zeeshan |
| DropColumns | 1 | Yes | 1 Can remove dependency on DatasetUtils. | Zeeshan |
| SelectColumns | 2 | Yes, needs improvement. | 2 - Can remove dependency on DatasetUtils. | Zeeshan |
| Normalize | 1 | Done. | 1 #3244 | Ivan |
| CustomMapping | 1 | Yes, needs improvement. | Done-v1 #3275 | Artidoro |
| IndicateMissingValues | 2 | | Done-v1 #3275 | Artidoro |
| ReplaceMissingValues | 2 | | Done-v1 #3275 | Artidoro |
| ConvertToGrayscale | 1 | Yes, needs fixes. Example not displaying. | 1 #3165 | Abhishek |
| LoadImages | 1 | Yes, needs fixes. Example not displaying. | 1 #3165 | Abhishek |
| ExtractPixels | 2 | Yes, needs fixes. Example not displaying. | 1 #3165 | Abhishek |
| ResizeImages | 2 | Yes. Example not displaying. | 1 #3165 | Abhishek |
| ConvertToImage | 2 | Yes. | 1 #3165 | Abhishek |
| IidChangePointEstimator | 1 | | 1 - Done | Senja |
| IidSpikeEstimator | 1 | | 1 - Done | Senja |
| SsaChangePointEstimator | 1 | | 1 - Done | Senja |
| SsaSpikeEstimator | 1 | | 1 - Done | Senja |
| ApplyOnnxModel | 3 | | DoneV1 #3349 | Gani |
| DnnFeaturizeImage | 1 | Yes, needs improvement. | 1 - Done | Senja |
| NormalizeGlobalContrast | 1 | Done | 0 #3232 | Ivan |
| NormalizeLpNorm | 1 | Done. | 0 #3232 | Ivan |
| ApproximatedKernelMap | 1 | Done | 0 #3232 | Ivan |
| mlContext.Transforms.CalculateFeatureContribution | 1 | Yes, needs improvement | | Rogan |

| MLContext.Transforms.Categorical | Num Overloads | Documentation | Sample | API Owner |
| --- | --- | --- | --- | --- |
| OneHotEncoding | 2 | | 2 #3179 | Abhishek |
| OneHotHashEncoding | 2 | | 2 #3179 | Abhishek |

| MLContext.Transforms.Conversion | Num Overloads | Documentation | Sample | API Owner |
| --- | --- | --- | --- | --- |
| Hash | 2 | Can't find the API | Done | Senja |
| ConvertType | 2 | Yes, needs improvement. | Done | Senja |
| MapKeyToValue | 2 | Yes, needs improvement. | Done | Senja |
| MapKeyToVector | 2 | Yes, needs improvement. | Done | Senja |
| MapValueToKey | 2 | Yes. | Done | Senja |
| MapKeyToBinaryVector | 2 | Yes, needs improvement. | Done | Senja |

| MLContext.Transforms.FeatureSelection | Num Overloads | Documentation | Sample | API Owner |
| --- | --- | --- | --- | --- |
| SelectFeaturesBasedOnMutualInformation | 2 | Needs a better example to show MI computation, something like this | 2 #3184 | Abhishek |
| SelectFeaturesBasedOnCount | 2 | | 2 #3184 | Abhishek |

| MLContext.Transforms.Text | Num Overloads | Documentation | Sample | API Owner |
| --- | --- | --- | --- | --- |
| FeaturizeText | 2 | | #3120 | Zeeshan |
| TokenizeCharacters | 1 | | #3123 | Zeeshan |
| NormalizeText | 1 | | #3133 | Zeeshan |
| ExtractWordEmbeddings | 1 | | #3142 | Zeeshan |
| TokenizeWords | 1 | | #3156 | Zeeshan |
| ProduceNgrams | 3 | | #3177 | Zeeshan |
| RemoveDefaultStopWords | 2 | | #3156 | Zeeshan |
| RemoveStopWords | 2 | | #3156 | Zeeshan |
| ProduceWordBags | 3 | | #3183 | Zeeshan |
| ProduceHashedWordBags | 3 | | #3183 | Zeeshan |
| ProduceHashedNgrams | 3 | | #3177 | Zeeshan |
| LatentDirichletAllocation | 2 | | #3191 | Zeeshan |

For the Data catalog, the documentation of every API needs to be augmented with suggestions for when one would use that API.

| MLContext.Data | Num Overloads | Documentation | Sample | API Owner |
| --- | --- | --- | --- | --- |
| LoadFromEnumerable | 1 | Done. | 1 - Done. | Senja |
| CreateEnumerable | 2 | Done. The second overload of this API is a P4 scenario. The use case for that overload would be: a user has a model that preserves slot names for the features; when they load the model, they also get the schema out of the loaded model and pass that schema, together with the TRow type they want to load the data into, to this API. The API will then populate the Annotations (formerly metadata) for the feature column. | 1 | Senja |
| BootstrapSample | 1 | Done. | 1 - Done. | Senja |
| Cache | 1 | Done. | 1 - Done. | Senja |
| FilterRowsByColumn | 1 | Done. | 1 - Done. | Senja |
| FilterRowsByKeyColumnFraction | 1 | Done. | 1 - Done. | Senja |
| FilterRowsByMissingValues | 1 | Done. | 1 - Done. | Senja |
| ShuffleRows | 1 | Done. | 1 - Done. | Senja |
| SkipRows | 1 | Done. | 1 - Done. | Senja |
| TakeRows | 1 | Done. | 1 - Done. | Senja |

| Other | Num Overloads | Documentation | Sample | API Owner |
| --- | --- | --- | --- | --- |
| Permutation Feature Importance | 4 | Yes, but needs work | Yes, but needs work | Rogan |

Labels

documentation: Related to documentation of ML.NET
