
ONNX API documentation. #419

Merged: 6 commits merged into dotnet:master from the onnx branch on Jun 26, 2018

Conversation

@codemzs (Member) commented Jun 26, 2018

fixes #418 by adding XML documentation for converting ML.NET models to ONNX.
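For context (not part of this PR), here is a minimal sketch of how the documented options might be invoked through the legacy OnnxConverter API. The Convert(model) call and the trained `model` variable are assumptions for illustration; the property names mirror the arguments described in the XML docs below.

        // Hypothetical usage sketch, assuming a previously trained legacy-API model in `model`.
        var converter = new OnnxConverter
        {
            Onnx = "model.onnx",            // optional: path for the ONNX protocol buffer file
            Json = "model.json",            // optional: path for the JSON representation
            Name = "MyModel",               // optional: defaults to the extension-less Onnx file name
            Domain = "com.contoso.models"   // reverse domain name space indicator
        };
        converter.Convert(model);           // conversion entry point (assumed name)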

///
/// This API converts the model to ONNX format by inspecting the transform pipeline
/// from the end, checking for components that know how to save themselves as ONNX.
/// The first data view in the transform pipeline that does not know how to save itself
@TomFinley (Contributor), Jun 26, 2018

data view [](start = 22, length = 9)

While data view is a central concept in the underlying implementation, I don't think that the current API actually exposes this concept in any meaningful fashion, does it? Which means that this is an undefined concept, from the point of view of an API user. #Closed

Contributor:

Maybe just "item" for now, since from context it is clear that you're talking about transforms.


In reply to: 198256303 [](ancestors = 198256303)

/// as ONNX, is considered the "input" to the ONNX pipeline. (Ideally this would be the
/// original loader itself, but this may not be possible if the user used unsavable
/// transforms in defining the pipe.) All the columns in the source that are a type the
/// ONNX knows what to deal with will be tracked. Intermediate transformations of the
@TomFinley (Contributor), Jun 26, 2018

what to deal [](start = 23, length = 12)

"what to deal" => "how to deal" #Closed

///
/// This API supports the following arguments:
/// <see cref="Onnx"/> indicates the file to write the ONNX protocol buffer file to.
/// <see cref="Json"/> indicates the file to write the JSON representation of the ONNX model.
@TomFinley (Contributor), Jun 26, 2018

indicates the file to write the JSON representation of the ONNX model. [](start = 31, length = 70)

Is that optional? #Closed

@codemzs (Member Author):

Yes.


In reply to: 198256809 [](ancestors = 198256809)

/// path defined through the Json option.
///
/// This API supports the following arguments:
/// <see cref="Onnx"/> indicates the file to write the ONNX protocol buffer file to. This is optional.
@zeahmed (Contributor), Jun 26, 2018

At least one of the Onnx or Json files is required, right? From the description, it sounds like both are optional. #Resolved

@codemzs (Member Author), Jun 26, 2018

Both are optional; if you don't specify the Onnx path, the model won't be saved anywhere.

        if (_outputModelPath != null)
        {
            using (var file = Host.CreateOutputFile(_outputModelPath))
            using (var stream = file.CreateWriteStream())
                model.WriteTo(stream);
        }

        if (_outputJsonModelPath != null)
        {
            using (var file = Host.CreateOutputFile(_outputJsonModelPath))
            using (var stream = file.CreateWriteStream())
            using (var writer = new StreamWriter(stream))
            {
                var parsedJson = JsonConvert.DeserializeObject(model.ToString());
                writer.Write(JsonConvert.SerializeObject(parsedJson, Formatting.Indented));
            }
        } #Resolved

@TomFinley (Contributor), Jun 26, 2018

That's fairly peculiar, is it not? Is there a use case that justifies that strange behavior? #Resolved

@codemzs (Member Author), Jun 26, 2018

Unless the user wants to inspect the model under the debugger without actually saving it, maybe to see which components got converted, etc. We should open an issue if we want to make this required. #Resolved

Contributor:

I have opened issue #423 against this.

/// <see cref="Name"/> indicates the name property in the ONNX model. If left unspecified, it will
/// be the extension-less name of the file specified in the onnx indicates the protocol buffer file
/// to write the ONNX representation to.
/// <see cref="Domain"/> indicates the domain name of the model. We use reverse domain name space indicators.
@TomFinley (Contributor), Jun 26, 2018

We [](start = 73, length = 2)

ONNX, not us, right? #Resolved

@codemzs codemzs requested a review from GalOshri June 26, 2018 21:36
/// Learners that can be exported to ONNX
/// 1. FastTree
/// 2. LightGBM
/// 3. LibSVM
@TomFinley (Contributor), Jun 26, 2018

LibSVM [](start = 15, length = 6)

This has not yet been migrated, so should probably not be included. #Resolved

/// 2. LightGBM
/// 3. LibSVM
/// 4. Multi Class Logistic Regression
/// 5. Logistic Regression
@TomFinley (Contributor), Jun 26, 2018

This is somewhat peculiar, that we call out both LR and multi-class LR, when we haven't done the same for all the tasks FastTree can accomplish, for example. #Resolved

@@ -10,9 +10,50 @@ namespace Microsoft.ML.Models
public sealed partial class OnnxConverter
{
/// <summary>
/// Converts the model to ONNX format.
///
/// This API converts the model to ONNX format by inspecting the transform pipeline
@TomFinley (Contributor), Jun 26, 2018

Is there a purpose to this blank line here? #Resolved

@@ -10,9 +10,50 @@ namespace Microsoft.ML.Models
public sealed partial class OnnxConverter
{
/// <summary>
/// Converts the model to ONNX format.
///
/// This API converts the model to ONNX format by inspecting the transform pipeline
@GalOshri (Contributor), Jun 26, 2018

Might be worth clarifying that it converts an ML.NET model #Resolved

/// transforms in defining the pipe.) All the columns in the source that are a type the
/// ONNX knows how to deal with will be tracked. Intermediate transformations of the
/// data appearing as new columns will appear in the output block of the ONNX, with names
/// derived from the corresponding column names. The ONNX json will be serialized to a
@GalOshri (Contributor), Jun 26, 2018

json -> JSON #Resolved

@@ -10,9 +10,50 @@ namespace Microsoft.ML.Models
public sealed partial class OnnxConverter
{
/// <summary>
/// Converts the model to ONNX format.
///
@TomFinley (Contributor), Jun 26, 2018

/// [](start = 8, length = 3)

Would a link to ONNX project itself be appropriate, to motivate why someone might care? #Resolved

@@ -10,9 +10,50 @@ namespace Microsoft.ML.Models
public sealed partial class OnnxConverter
{
/// <summary>
/// Converts the model to ONNX format.
///
@GalOshri (Contributor), Jun 26, 2018

It might be worth providing some context on ONNX (at least linking to the ONNX website) and explaining why this is valuable. Similarly for Windows ML. #Resolved

@codemzs codemzs closed this Jun 26, 2018
@codemzs codemzs deleted the onnx branch June 26, 2018 22:14
@codemzs codemzs restored the onnx branch June 26, 2018 22:15
@codemzs codemzs reopened this Jun 26, 2018
/// <see href="https://onnx.ai/">ONNX</see> is an intermediate representation format
/// for machine learning models. It is used to make models portable such that you can
/// train a model using a toolkit and run it in another toolkit's runtime, for example,
/// you can create a model using ML.NET(or any ONNX compatible toolkit), convert it to ONNX and
@GalOshri (Contributor), Jun 26, 2018

missing a space after ML.NET #Resolved

/// from the end, checking for components that know how to save themselves as ONNX.
/// The first item in the transform pipeline that does not know how to save itself
/// as ONNX, is considered the "input" to the ONNX pipeline. (Ideally this would be the
/// original loader itself, but this may not be possible if the user used unsavable
/// transforms in defining the pipe.) All the columns in the source that are a type the
/// ONNX knows how to deal with will be tracked. Intermediate transformations of the
/// data appearing as new columns will appear in the output block of the ONNX, with names
/// derived from the corresponding column names. The ONNX json will be serialized to a
/// derived from the corresponding column names. The ONNX JSON will be serialized to a
/// path defined through the Json option.
@GalOshri (Contributor), Jun 26, 2018

Json -> JSON #WontFix

@codemzs (Member Author), Jun 26, 2018

Json is the name of the command line option. #Resolved

/// you can create a model using ML.NET(or any ONNX compatible toolkit), convert it to ONNX and
/// then the ONNX model can be converted into say, CoreML, TensorFlow or WinML model
/// to run on the respective runtime.
/// This API converts an ML.NET model to ONNX format by inspecting the transform pipeline
@GalOshri (Contributor), Jun 26, 2018

Add newline #Resolved

/// 5. Term
/// 6. Categorical
///
/// Learners that can be exported to ONNX
Contributor:

Is it worth referring to what it would take to expand this list?

@codemzs (Member Author), Jun 26, 2018

I'm not sure if it is worth revealing anything that we are not sure about or isn't public.


In reply to: 198317926 [](ancestors = 198317926)

@codemzs (Member Author):

But we can say these are the most used learners.


In reply to: 198318122 [](ancestors = 198318122,198317926)

@codemzs (Member Author) commented Jun 26, 2018

@shauheen @TomFinley @asthana86 @zeahmed Can y'all please take a look and sign-off?

@GalOshri (Contributor) left a comment

Looks good. Thank you for adding this!

@TomFinley (Contributor) left a comment

Would prefer signoff from @asthana86 or @OliaG rather than me, but we'll let this stand for now. Thanks @codemzs .

/// <see cref="InputsToDrop"/> is a string array of input column names to omit from the input mapping.
/// A common scenario might be to drop the label column, for instance, since it may not be practically
/// useful for the pipeline. Note that any columns depending on these naturally cannot be saved.
/// <see cref="OutputsToDrop"/> is similar, except for the output schema. Note that the pipeline handler
Member:

even though you can guess from the context, I'd say "similar to InputsToDrop"
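A small illustrative sketch (not from the PR) of the scenario described in the quoted docs above: dropping the label column from both the input and output mappings. The Convert call and the `model` variable are assumptions for illustration.

        // Hypothetical sketch: omit the Label column, which is typically not
        // needed at inference time; columns depending on it cannot be saved either.
        var converter = new OnnxConverter
        {
            Onnx = "model.onnx",
            InputsToDrop = new[] { "Label" },
            OutputsToDrop = new[] { "Label" }
        };
        converter.Convert(model);   // `model` is a previously trained ML.NET model (assumed)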

@OliaG (Member) commented Jun 26, 2018

Looks great, thanks for doing it!

@OliaG OliaG self-requested a review June 26, 2018 23:50
@codemzs codemzs merged commit 17f944c into dotnet:master Jun 26, 2018
@codemzs codemzs deleted the onnx branch June 26, 2018 23:52
eerhardt pushed a commit that referenced this pull request Jun 27, 2018
* Bump master to v0.3 (#269)

* RocketEngine fix for selecting top learners (#270)

* Changes to RocketEngine to fix take top k logic.

* Add namespace information to allow file to reference correct version of Formatting object.

* small code cleanup (#271)

* Preparation for syncing sources with internal repo (#275)

* make class partial so I can add constructor in separate file. add constructors for testing

* formatting

* Changes to use evaluator metrics names in PipelineSweeperSupportedMetrics. Made the private const strings in two classes public. (#276)

* add missing subcomponents to sweepers (#278)

* add missing subcomponents

* right one

* more cleanup

* remove lotus references. (#252)

* Random seed and concurrency for tests (#277)

* first attempt

* add comments

* specify seed for random.
make constructor internal.

* Fix SupportedMetric.ByName() method (#280)

* Fix for SupportedMetric.ByName() method. Include new unit test for function.

* Fix for SupportedMetric.ByName() method. Include new unit test for function.

* Fix for SupportedMetric.ByName() method. Include new unit test for function.

* Removed unnecessary field filter, per review comment.

* ML.NET-242: FastTreeRanking per-iteration loss metrics are empty (#289)

When training a FastTreeRanker using the `testFrequency` parameter, it is expected that NDCG is printed every testFrequency iterations. However, instead of NDCG, only empty strings are printed.

The root cause was that the MaxDCG property of the dataset was never calculated, so the NDCG calculation is aborted, leaving an empty string as a result.

This PR fixes the problem by computing the MaxDCG for the dataset when the Tests are defined (so that if the tests are not defined, the MaxDCG will never be calculated).

Closes #242

* Fixed typo in the method summary (#296)

* Remove stale line of code from test. (#297)

* Update release notes link to use aka.ms. (#294)

Our release notes link is broken because the `Documentation` was renamed to `docs`. Fix this for the future to use a redirection link.

* Add release notes for ML.NET 0.2 (#301)

* Add release notes for ML.NET 0.2

* Adding release note about TextLoader changes and additional issue/PR references

* Addressing comments: fixing typos, changing formatting, and adding references

* Get the cross validation macro to work with non-default column names (#291)

* Add label/group/weight column name arguments to CV and train-test macros

* Fix unit test.

* Merge.

* Update CSharp API.

* Fix EntryPointCatalog test.

* Address PR comments.

* update sample in README.MD with 0.2 features. (#304)

* update sample with new text loader API.

* update with 0.2 stuff.

* OVA should respect normalization in underlying learner (#310)

* Respect normalization in OVA.

* some cleanup

* fix copypaste issues

* Export to ONNX and cross-platform command-line tool to script ML.NET training and inference (#248)

* Export to ONNX and Maml cross-platform executable.

* Add Cluster evaluator (#316)

* Add Cluster evaluator

* fix copypaste

* address comments

* formatting

* Fixes locale dependent test output comparisons (#109)

The tests do not pass on systems with locale other than en-US.
The error happens since the results are written to files and the
contents of the files are compared to a set of correct results produced
under en-US locale.

The fix is to imbue en-US culture to the test thread so that results
will be output in format that is comparable with the test format.

This patch fixes only the tests, but does not guarantee calculation will be
correct in production systems using a locale different than en-US. In
particular, there can be problems in reading data and then converting
data from characters to numeric format.

Fixes #74

* Add PartitionedFileLoader (#61)

* Remove unexisting project from solution (#335)

* GetSummaryDataView/Row implementation for Pca and Linear Predictors (#185)

* Implement `ICanGetSummaryAsIDataView` on `PcaPredictor` class
* Implement `ICanGetSummaryAsIRow` on `LinearPredictor` class

* Disable ordinary least squares by removing the entry point (#286)

* Disable ols by temporarily removing the entry point. It may be added again once we figure out how to ship MKL as part of this project.

* add append function to pipeline (#284)

Add `Append` function to pipeline for more fluent API than that allowed by `Add`

* Removed field/column name checking of input type in TextLoader.  (#327)

* fix namespace issue in CSharpGenerator and some refactoring (#339)

fix namespace issue and refactoring

* Using named-tuple in OneToOneTransforms' constructor to make API more readable. (#324)

* Minor formatting in CollectionDataSourceTests.cs (#348)

* Create CalibratedPredictor instead of SchemaBindableCalibratedPredictor (#338)

`CalibratorUtils.TrainCalibrator` and `TrainCalibratorIfNeeded` now creates `CalibratedPredictor` instead of `SchemaBindableCalibratedPredictor` whenever the predictor implements `IValueMapper`.

* Remove reference and dependency on System.ValueTuple (#351)

* Add link to samples (#355)

* Use HideEnumValueAttribute for both manifest and C# API generation. (#356)

* Use HideEnumValueAttribute for both manifest and C# API generation.
* Unhide NAReplaceTransform.ReplacementKind.SpecifiedValue. This may require some other PR to resolve the corresponding issues.

* Move the NuGet package build files into a TFM specific directory. (#370)

When installing Microsoft.ML on an unsupported framework (like net452), it is currently getting installed successfully. However, users should be getting an error stating that net452 is not supported by this package.

The cause is the build files exist for any TFM, which NuGet interprets as this package supports any TFM. Moving the build files to be consistent with the 'lib' folder support.

Fix #357

* `Stream` subclasses now have `Close` call `base.Close` to ensure disposal. (#369)

* Subclasses of `Stream` now have `Close` call `base.Close` to ensure disposal.
* Add DeleteOnClose to File opening.
* Remove explicit delete of file.
* Remove explicit close of substream.
* Since no longer deleting explicitly, no longer need `_overflowPath` member.

* Return distinct array of ParameterSet when ProposeSweep is called (#368)

* Changed List to HashSet to ensure that there are no duplicates

* Update fast tree argument help text (#372)

* Update fast tree argument help text

* Update wording

* Update API to fix test

* Update core manifest JSON to update help text

* Combine multiple tree ensemble models into a single tree ensemble (#364)

* Add a way to create a single tree ensemble model from multiple tree ensemble models.

* Address PR comments, and fix bugs in serializing/deserializing RegressionTrees.

* Address PR comments.

* add pipelineitem for Ova (#363)

add pipelineitem for Ova

* Fix CV macro to output the warnings data view properly. (#385)

* Link to an example on using converting ML.NET model to ONNX. (#386)

* Adding documentation about entry points, and entry points graphs: EntryPoints.md and GraphRunner.md (#295)

* Adding EntryPoints.md and GraphRunner.md

* addressing PR feedback

* Updating the title of the GraphRunner.md file

* adressing Tom's feedback

* adressing feedback

* code formatting for class names

* Addressing Gal's comments

* Adding an example of an entry point. Fixing casing on ML.NET

* fixing link

* Adding LDA Transform (#377)

* Revert to using the native code (#413)

Corrects an unintentional "typo" in FastTreeRanking.cs where there was mistakenly a USE_FASTTREENATIVE2 instead of USE_FASTTREENATIVE. This resulted in some obscure hidden ranking options (distance weighting, normalize query lambdas, and a few others) being unavailable. These are important for some applications.

* LightGBM  (#392)

* LightGBM and test.

* add test baselines and nuget source for lightGBM binaries.

* Add entrypoint for lightGBM.

* add unsafe flag for release build.

* update nuget version.

* make lightgbm test single threaded.

* install gcc on OS machines to resolve dependencies on openmp that is needed by lightgbm native code.

* PR comments. Leave BREW and GCC in bash script to verify macOS tests work.

* remove brew and gcc from build script.

* PR feedback.

* disable test on macOS.

* disable test on macOS.

* PR feedback.

* Adding Factorization Machines  (#383)

* Adding Factorization Machines

* ONNX API documentation. (#419)

* ONNX API documentation.

* Bring ensembles into codebase (#379)

Introduce Ensemble codebase

* enable macOS tests for LightGBM. (#422)

* Create a shorter temp file name for model loading. (#397)

Create a shorter temp file name for model loading, as well as remove the potential for a race condition among multiple openings by using the creation of a lock file.

* removing extraneous character that broke the Linux build, and with it unnecessary cmake version requirement (#425)

* EvaluatorUtils to handle label column of type key without text key values (#394)

* Fix EvaluatorUtils to handle label column of type key without text key values.

* Removing non source files from solution (#362)
eerhardt pushed a commit to eerhardt/machinelearning that referenced this pull request Jul 27, 2018
* ONNX API documentation.
@ghost ghost locked as resolved and limited conversation to collaborators Mar 30, 2022

Successfully merging this pull request may close these issues.

Documentation for convert to ONNX API
5 participants