Update to origin #3

Merged
merged 82 commits into from
Jul 29, 2018

Conversation

justinormont
Owner

No description provided.

Ivanidzo4ka and others added 30 commits June 12, 2018 10:20
…or (#338)

`CalibratorUtils.TrainCalibrator` and `TrainCalibratorIfNeeded` now create `CalibratedPredictor` instead of `SchemaBindableCalibratedPredictor` whenever the predictor implements `IValueMapper`.
…356)

* Use HideEnumValueAttribute for both manifest and C# API generation.
* Unhide NAReplaceTransform.ReplacementKind.SpecifiedValue. This may require some other PR to resolve the corresponding issues.
Installing Microsoft.ML on an unsupported framework (like net452) currently succeeds. Users should instead get an error stating that net452 is not supported by this package.

The cause is that the build files exist for any TFM, which NuGet interprets as the package supporting any TFM. This moves the build files so that they are consistent with the 'lib' folder support.

Fix #357
…osal. (#369)

* Subclasses of `Stream` now have `Close` call `base.Close` to ensure disposal.
* Add DeleteOnClose to File opening.
* Remove explicit delete of file.
* Remove explicit close of substream.
* Since no longer deleting explicitly, no longer need `_overflowPath` member.
* Changed List to HashSet to ensure that there are no duplicates
* Update fast tree argument help text

* Update wording

* Update API to fix test

* Update core manifest JSON to update help text
* Add a way to create a single tree ensemble model from multiple tree ensemble models.

* Address PR comments, and fix bugs in serializing/deserializing RegressionTrees.

* Address PR comments.
add pipelineitem for Ova
…ryPoints.md and GraphRunner.md (#295)

* Adding EntryPoints.md and GraphRunner.md

* addressing PR feedback

* Updating the title of the GraphRunner.md file

* addressing Tom's feedback

* addressing feedback

* code formatting for class names

* Addressing Gal's comments

* Adding an example of an entry point. Fixing casing on ML.NET

* fixing link
Corrects an unintentional "typo" in FastTreeRanking.cs where there was mistakenly a USE_FASTTREENATIVE2 instead of USE_FASTTREENATIVE. This resulted in some obscure hidden ranking options (distance weighting, normalize query lambdas, and a few others) being unavailable. These are important for some applications.
* LightGBM and test.

* add test baselines and nuget source for lightGBM binaries.

* Add entrypoint for lightGBM.

* add unsafe flag for release build.

* update nuget version.

* make lightgbm test single threaded.

* install gcc on OS machines to resolve dependencies on openmp that is needed by lightgbm native code.

* PR comments. Leave BREW and GCC in bash script to verify macOS tests work.

* remove brew and gcc from build script.

* PR feedback.

* disable test on macOS.

* disable test on macOS.

* PR feedback.
* Adding Factorization Machines
* ONNX API documentation.
Introduce Ensemble codebase
Create a shorter temp file name for model loading, as well as remove the potential for a race condition among multiple openings by using the creation of a lock file.
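
The lock-file idea in the commit above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the ML.NET implementation. The function name `acquire_temp_path`, the eight-character name length, and the `.lock` suffix are all assumptions made for illustration. The key point is that creating the lock file with an exclusive flag either succeeds for exactly one caller or fails, so two concurrent openers cannot end up with the same short name.

```python
import os
import tempfile
import uuid
from typing import Tuple


def acquire_temp_path(directory: str, attempts: int = 100) -> Tuple[str, str]:
    """Reserve a short, unique temp path by atomically creating a lock file.

    Returns (temp_path, lock_path). The caller removes both when finished.
    All names here are hypothetical, chosen only for this sketch.
    """
    for _ in range(attempts):
        # Eight hex characters keep the name short; collisions are handled below.
        name = uuid.uuid4().hex[:8]
        temp_path = os.path.join(directory, name)
        lock_path = temp_path + ".lock"
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only one
            # caller can ever win a given name.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            continue  # another opener holds this name; try a new one
        os.close(fd)
        return temp_path, lock_path
    raise RuntimeError("could not reserve a temp path")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path, lock = acquire_temp_path(d)
        print(path, lock)
```

A short random name alone would leave a window in which two processes pick the same name; the exclusive create closes that window without requiring a long name.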
…lues (#394)

* Fix EvaluatorUtils to handle label column of type key without text key values.
* removing extraneous character that broke the linux build, and with it the unnecessary cmake version requirement

* Removing the BOM from the file
sharwell and others added 28 commits July 9, 2018 13:52
* Update error messages to point to GitHub issues instead of support group
…, in xml files. (#510)

* Moving from xml strings to having the documentation details in xml files.
For the summary text that is common between several learners, the examples will be added on a separate node.
An example of how that will look is in the LogisticRegressionBinaryClassifier and LogisticRegressionClassifier.

* fixing the aftermath of renaming the XML file.

* removing the Desc from the EntryPoint attribute is a bad idea.

* removing the XML docs from the doc folder, and added them under the respective projects.

* Some OS get picky about casing.

* file name should be vanilla

* Fixing comment
…#428)

* Fix iris.txt dataset and modify Iris Classification tests accordingly

* Modify baseline test files for multiclass classification with Iris dataset and LightGBM
The test `TrainAndPredictIrisModelUsingDirectInstantiationTest` now has analogous changes to the `TrainAndPredictIrisModelTest` test
* First attempt at removing extra code comments

* Round #2

* Removing Microsoft.ML.InternalStreams per comment on #513

* Address notes from @Ivanidzo4ka

* Remove TreeOrderedCandidatesSearch

* Remove whitespace and reinstate commented out tests
* Adding arguments to PipelineSweep Macro

* updating the unit tests

* taking care of review comments; adding validations for Label, Weight, GroupId and Name columns

* taking care of some review comments

* some code cleanup

* addressing PR comments

* API changes to use RoleMappedData

* taking care of review comments

* using pipeline.UniqueId

* taking care of review comments. update ColumnPurpose 'Group' so it is consistent with Role 'Group'
This fixes a couple of dangling `cref` in the XML Docs. This commit
doesn't contain functional changes to the code.

Issue:
  This closes #434
…ithout files. (#472)

* Save Schema to context to support loading the model without files.
* Use the input file's schema if the file is available.
…#522)

* Conversion of ITrainer.Train returns predictor, accepts TrainContext

* `ITrainer.Train` returns a predictor. There is no `CreatePredictor` method
  on the interface.

* `ITrainer.Train` always accepts a `TrainContext`. Dataset type is no longer
  a generic parameter. This context object replaces the functionality
  previously offered by the combination of `ITrainer`, `IValidatingTrainer`,
  `IIncrementalTrainer`, and `IIncrementalValidatingTrainer`, which is now
  captured in one `ITrainer.Train` method with differently configured
  contexts.

* All trainers updated to these two new idioms. Many trainers correspondingly
  improved to no longer be stateful objects. (The exceptions are those that
  are just too far gone to be done with less than herculean effort at
  refactoring them to no longer use instance fields for their computation.
  Most notably, LBFGS and FastTree based trainers.)

* Utility code meant to deal with the complexity of the aforementioned
  `IT/IVT/IIT/IIVT` idiom reduced considerably.

* Opportunistic improvements to `ITrainer` implementors where observed.

* TrainerInfo introduction, ITrainerEx destruction

* Remove `IMetaLinearTrainer`
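
The shape of the `ITrainer.Train` change described in this commit can be sketched roughly as follows. This is a Python illustration, not the actual C# interface. The field names `training_set`, `validation_set`, and `initial_predictor` are assumptions inferred from the names of the interfaces being replaced (validating and incremental trainers), and `MeanTrainer` is a hypothetical toy trainer.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Protocol, Sequence, Tuple

# Hypothetical stand-ins for a labeled example and a trained predictor.
Example = Tuple[Sequence[float], float]
Predictor = Callable[[Sequence[float]], float]


@dataclass(frozen=True)
class TrainContext:
    """One context object instead of four trainer interfaces (assumed fields)."""
    training_set: List[Example]
    validation_set: Optional[List[Example]] = None   # validating scenario
    initial_predictor: Optional[Predictor] = None    # incremental scenario


class Trainer(Protocol):
    def train(self, context: TrainContext) -> Predictor:
        """Train and return the predictor directly; no separate create step."""
        ...


class MeanTrainer:
    """Toy trainer: predicts the mean label and keeps no state between calls."""

    def train(self, context: TrainContext) -> Predictor:
        labels = [label for _, label in context.training_set]
        mean = sum(labels) / len(labels)
        return lambda features: mean


if __name__ == "__main__":
    ctx = TrainContext(training_set=[([0.0], 1.0), ([1.0], 3.0)])
    predictor = MeanTrainer().train(ctx)
    print(predictor([5.0]))
```

Because every scenario goes through the same method with a differently configured context, a trainer like this one can stay stateless: everything it needs arrives in the context and everything it produces leaves in the returned predictor.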
…ting the CSharpAPI (#529)

* Moving from xml strings to having the documentation details in xml files.
For the summary text that is common between several learners, the examples will be added on a separate node.
An example of how that will look is in the LogisticRegressionBinaryClassifier and LogisticRegressionClassifier.

* fixing the aftermath of renaming the XML file.

* removing the Desc from the EntryPoint attribute is a bad idea.

* removing the XML docs from the doc folder, and added them under the respective projects.

* Some OS get picky about casing.

* file name should be vanilla

* Adding documentation for the first group of transforms

* adding more documentation.
changing the root of the XML documents from docs -> doc, since it's only one.
Switching all <see href /> to the valid <see cref />

* formatting tweaks, and addressing most of the code comments.

* Extracted the examples outside of the member nodes in the xml, so that they only appear in the CSharpApi classes, and not on the runtime classes.

* small fixes

* addressing code comments

* addressing Pete's comments.

* Fixing language around the CharTokenizer description.

Closes #389
* Allow CpuMath to reference C# Hardware Intrinsics APIs.

Need to multi-target CpuMath for netstandard and netcoreapp3.0.  Also, since we are going to move CpuMath into its own NuGet package, remove the dependency from CpuMath to the ML.Core project.

Add a build parameter to enable building against .NET Core 3.0's Runtime Intrinsics APIs.

Fix #534

* Respond to PR feedback.
* failing test case for multiclass

* Refactored PipelineSweeperSupportedMetrics Class; added unit test for MultiClassClassification; refactored out unit tests for the PipelineSweeper

* take care of review comments; display transforms/learners + metrics in pipeline

* taking care of PR comments + refactor PipelineSweeperRunSummary

* taking care of review comments
* remove domain from onnx operators for non-ML types.

* Make ONNX compatible with Windows RS5 and add more tests.

* PR feedback.

* PR feedback.

* fix build.
* Remove Windows and Linux configurations from netci.groovy

* Add end of line to yml files

* Add badges and change leg name to Linux

* Not merge test results

* Add searchFolder to publish test results task
…upport for type conversion (#555)

* Don't fail in case of const field in Collection source.
Extend support for basic C# types for DataView<->collection conversion.
* Added a doc for schema comprehension
* Initial code analyzer for Microsoft.ML

Adds code analysis initially for correct usage of common Contracts.Except/Check
patterns, naming conventions, variable usage and initializations, and other idioms
used throughout the Microsoft.ML codebase. Enables analysis on Microsoft.ML projects.

Also added StyleCop, with most rules currently disabled, but rules on whitespace,
declaration modifier ordering, explicit access modifiers, and interface naming check.

* Analyzer is .NET Standard 1.3 to avoid problems with IDE1003.

dotnet/roslyn#22368

* Projects in `src` can opt out of the analyzer by setting the `UseMLCodeAnalyzer` project property to false.
* Changed range of L2RegularizerWeight parameter in AveragedPerceptron
…erently when using with/without word tokenizer. (#548)
… name attribute for some examples (#592)

* Fixes issue 591: typos, adding the type to lists, and fixing the name attribute in OGD and Poisson

* getting just the content under the member node, not the member itself.

* merging from master
#587)

`DataViewConstructionUtils`'s methods to create dataviews over .NET types will now have correctly inferred "getters" in the case of sparse vectors.
justinormont merged commit 24d939a into justinormont:sdca-l2-fix Jul 29, 2018
justinormont pushed a commit that referenced this pull request Nov 20, 2018
* Added placeholder

* Cleaned up Infos (replaced with ColumnPairs)

* Added ColumnInfo

* Added all the Create() methods.

* Added Mapper

* Commented out the EntryPoint

* Added PcaEstimator2

* PcaWorkout test passes

* Added pigsty api

* Fixed EntryPoint

* Fixed the arguments

* Fixed tests and added pigsty test

* Deleted Wrapped PCA transform

* Float -> float

* Cleaned docstrings

* Removed some unnecessary checks

* Simplified unnecessary code

* Moved some fields to ColumnInfo for simplifications

* Simplified weight columns

* Address PR comments #1

* Addressed PR comments #2

* Moved the static test

* PR comments #3

* Moved schema related information out of ColumnInfo and into Mapper.ColumnSchemaInfo.

* PR comments

* PR comments

* Updated manifest for entrypoint PcaCalculator

* Fixed schema exceptions