
Pull from upstream #1

Merged
merged 65 commits into from
Jun 4, 2018
Conversation

justinormont
Owner

No description provided.

forki and others added 30 commits May 9, 2018 09:00
Added a badge for NuGet status and fixed a few typos on the readme.md
We've shipped 0.1.0, we should start producing higher versions now.

Fix #85
…lpful. (#50)

* Comments added to LearningPipeline class in accordance with #Bug 240636: Intellisense is not helpful with filling in pipeline components.

* Comments added to LearningPipeline class in accordance with #Bug 240636: Intellisense is not helpful with filling in pipeline components.

* Fixed a typo in namespace

* Addressed reviewers' comments.

* Addressed reviewers' comments.

* Addressed reviewers' comments.
* Issue #104: Update the build tools to 2.1.200

Issues:
  This closes #104

* Updating .NET Version in the right file.
… '+' signs in them. (#102)

* Handle generic types and types with multiple '+' signs in them.

* Delete commented code.

* Address PR comments.

* Skip codegen unit test.

* Add example to comment.

* Assert that the full type name has a dot in it.

* Trigger build.
… required but not found." (#121)

* Checking for both ColumnAttribute and ColumnNameAttribute when creating schema in CreateBatchPredictionEngine.

* Addressed reviewers' comments.
Make the 'not supported field type' exception more readable, so the developer can figure out why the data can't be loaded
This closes #128
…there is '#' in front of header row in Iris.txt.

* Removed '#' from the start of header line in Iris.txt. '#' causes header to be ignored.

* Updating test instead of iris.txt because the file is being used at many places.
…stic (#133)

* Removed calls to DateTime.Now

The codebase now uses DateTime.UtcNow instead of DateTime.Now to be locale agnostic, except in cases where timezone info is actually needed. Also replaced one start-time measurement with a Stopwatch.
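
The idea behind this change can be sketched in Python. This is only an analog of the pattern described above, not the ML.NET code: the helper names `utc_timestamp` and `timed` are hypothetical, `datetime.now(timezone.utc)` stands in for `DateTime.UtcNow`, and `time.perf_counter()` stands in for a stopwatch.

```python
import time
from datetime import datetime, timezone


def utc_timestamp():
    """Return the current time as a timezone-aware UTC datetime.

    Unlike a local-time reading, this does not depend on the machine's
    timezone settings.
    """
    return datetime.now(timezone.utc)


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds).

    Elapsed time comes from a monotonic performance counter rather than
    from subtracting two wall-clock readings, so clock adjustments
    cannot skew it.
    """
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start
```

Wall-clock values are still appropriate for log timestamps; the counter is meant only for measuring durations.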
…ized Additive Models predictor resilient when feature map is not available. (#122)

* Instantiate feature map for disk transpose and make Generalized Additive Models predictor resilient when feature map is not available.
* Update NuGet packages to fill out all metadata.

Also, a minor build change (move property ordering) to fix SourceLink with our packages.

Fix #43
Fix #103

* Adding package icon URL.

* Update Parquet package description.

* Add source code control properties to the NuGet packages.

Also, fix a small bug with the nupkgproj files. The intermediate output folders conflict between the nupkgproj and csproj with the same name. This causes issues because the project.assets.json file is being shared between the two projects, which isn't correct.
…er entry point names (#113)

* Update suffix of trainer entry point names by trainer kind.

* Address PR comments.

* Add unit test.

* Update C# API

* Move unit test to TestAutoInference and fix EntryPointCatalog test.

* Trigger build.

* Add reference to the test project to make the sweeper entry point visible to EntryPointCatalog test.
The previous 2 changes conflicted. Resolving the break that happened between them.
Add symbols package for ML.Parquet package.
Put common NuGet package logic in props file.

Fix #144
…ading data from file) (#106)

* in memory loader

* add test file for memory collection

* even in afterlife EntryPointCatalog will chase me down.

* Address some comments.

* update tests

* address more comments.

* remove empty param description

* hide collectionloader

* refactor classes a little.

* pesky new lines!

* slightly better comments. but only slightly

* rename it

* make class static

* not a loader

* remove alias in entrypoint

* address comments
…eline (#154)

* Prevent learning pipeline from adding null transform model to the pipeline.
* Add test.
* Benchmark

* Changed to .NET Core app

* Added Accuracy Reporting

* fixed build

* Feedback from Gleb

* Added batch prediction tests

* Resolved conflicts in the sln file

* Renamed the new file to match type name

* Removed duplicated method
* Publishing nuget packages to myget feed.

Also - set the symbols expiration days default based on feedback from the .NET core-eng team.

Fixes #11

* Shorten nuget push timeout to match corefx and coreclr.
…#131)

* Fix a bug in Tree leaf featurizer entry point, and add a test for it.

* Improve unit test

* Update unit test

* Decrease number of trees and leaves in unit test
* no need to add combiner if you don't have transforms.

* fix NextSigned
* Changing name "Documentation" to "docs" for consistency in the repo.
Fixes #143
Removed two NextFloat() extension methods from RandomUtils and replaced all usage of them with `IRandom.NextSingle()`.
Copy our native assemblies using MSBuild when a consumer is using NuGet packages.config, since NuGet doesn't do this automatically.

Also, add an error when a project is not targeting x64. ML.NET only supports x64.

Fix #93
 handle boolean type in construction utils.
* Code generate support for IDataLoader
* Make TextLoader API code generated so that it's at functional parity with the text loader in the ML.Net infrastructure.
* Move TextLoader API under Microsoft.ML.Data namespace
* Add convenience TextLoader API.
* Add error checking for invalid loader arguments such as ordinal, column names.
* Update baselines.
* Update samples with new loader API and backward compatibility with old loader API.
Anipik and others added 25 commits May 24, 2018 15:05
* Test Enabled, Zbaseline files added for debug and release
* Test Enabled, Zbaseline files added
* Test Enabled, added Debug and release zbaselines
* Test Enabled, Debug and Release baseline files added
* example

* add Clusters tests

* cleanup

* address comments

* bring clustering reference back

* rephrasing
Add small fix in Microsoft.ML.sln
* Scores to label mapping for multi-class classification problem.
* Cross Validation.
* Train Test.
* refactor code from test into functions
make it more readable

* sprinkle some vars
…263)

* Move ZBaselines to test/BaselineOutput
* Fix the path in BaseTestBaseline
* Move Samples\UCI to test\data
…assifierTesterThresholdingTest (#255)

* Tests Enabled & Dataset Moved to correct place in test\BaselineOutput
* Correcting path for adult data set for autoInference class, and removing @ from path
)

* Linear classifier test enabled
* Files added to test\BaselineOutput
* Extra space removed
* Average Perceptron Pav Calibrator test enabled
Spaces in build scripts now properly quoted.
* introduce IUnsupervisedLearningWithWeights

* add test to check that KMeans doesn't need a label and can handle the presence of a weight column.
also extract real weight value from cursor.
* Changes to RocketEngine to fix take top k logic.

* Add namespace information to allow file to reference correct version of Formatting object.
* make class partial so I can add a constructor in a separate file. add constructors for testing

* formatting
…rics. Made the private const strings in two classes public. (#276)
* add missing subcomponents

* right one

* more cleanup
* first attempt

* add comments

* specify seed for random.
make constructor internal.
* Fix for SupportedMetric.ByName() method. Include new unit test for function.

* Fix for SupportedMetric.ByName() method. Include new unit test for function.

* Fix for SupportedMetric.ByName() method. Include new unit test for function.

* Removed unnecessary field filter, per review comment.
When training a FastTreeRanker using the `testFrequency` parameter, NDCG is expected to be printed every `testFrequency` iterations. Instead of NDCG, however, only empty strings are printed.

The root cause was that the MaxDCG property of the dataset was never calculated, so the NDCG calculation was aborted, leaving an empty string as a result.

This PR fixes the problem by computing the MaxDCG for the dataset when the Tests are defined (so that if the tests are not defined, the MaxDCG will never be calculated).
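
To illustrate why a missing MaxDCG leaves NDCG without a value, here is a minimal Python sketch of the relationship between the two. It is not the FastTree implementation: the function names are hypothetical, and the gain `2**rel - 1` with a `log2` position discount is one common definition of DCG that is assumed here.

```python
import math


def dcg(labels, k):
    """Discounted cumulative gain of the first k relevance labels.

    `labels` are relevance grades listed in ranked order (best-ranked
    document first). Gain and discount follow an assumed, commonly used
    definition: (2**rel - 1) / log2(position + 1), positions from 1.
    """
    return sum((2 ** rel - 1) / math.log2(i + 2)
               for i, rel in enumerate(labels[:k]))


def max_dcg(labels, k):
    """DCG of the ideal ordering: labels sorted from most to least relevant."""
    return dcg(sorted(labels, reverse=True), k)


def ndcg(labels, k):
    """NDCG@k, or None when MaxDCG is zero and the ratio is undefined."""
    ideal = max_dcg(labels, k)
    if ideal == 0:
        return None  # nothing to normalize by, so no metric can be reported
    return dcg(labels, k) / ideal
```

In this sketch NDCG is a ratio against MaxDCG, so a ranking that is already ideal scores 1.0, and a query with no usable MaxDCG yields no value at all, which mirrors the empty output described above.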

Closes #242
@justinormont justinormont merged commit f03189d into justinormont:master Jun 4, 2018
justinormont pushed a commit that referenced this pull request Nov 20, 2018
* Added placeholder

* Cleaned up Infos (replaced with ColumnPairs)

* Added ColumnInfo

* Added all the Create() methods.

* Added Mapper

* Commented out the EntryPoint

* Added PcaEstimator2

* PcaWorkout test passes

* Added pigsty api

* Fixed EntryPoint

* Fixed the arguments

* Fixed tests and added pigsty test

* Deleted Wrapped PCA transform

* Float -> float

* Cleaned docstrings

* Removed some unnecessary checks

* Simplified unnecessary code

* Moved some fields to ColumnInfo for simplifications

* Simplified weight columns

* Address PR comments #1

* Addressed PR comments #2

* Moved the static test

* PR comments #3

* Moved schema related information out of ColumnInfo and into Mapper.ColumnSchemaInfo.

* PR comments

* PR comments

* Updated manifest for entrypoint PcaCalculator

* Fixed schema exceptions
justinormont pushed a commit that referenced this pull request Nov 20, 2018
* Implement VBuffer master plan WIP #1

* Getting everything to build and tests passing

* Keep moving to the master plan of VBuffer.

* Remove the rest of the VBuffer.Count usages in ML.Data

* Remove the rest of the VBuffer.Count usages and make VBuffer.Count private.

* Fix two failing tests.

* Fix FastTreeBinaryClassificationCategoricalSplitTest by remembering the underlying arrays in the column buffer in Transposer.

Also enable a Transposer test, since it passes.
justinormont pushed a commit that referenced this pull request Apr 15, 2019
justinormont pushed a commit that referenced this pull request Jul 11, 2020
* Draft PR for SrCnn batch detection API interface (#1)

* POC Batch transform

* SrCnn batch interface

* Removed comment

* Handled some API review comments.

* Handled other review comments.

* Resolved review comments. Added sample.

Co-authored-by: Yael Dekel <yaeld@microsoft.com>

* Implement SrCnn entire API by function

* Fix bugs and add test

* Resolve comments

* Change names and add documentation

* Handling review comments

* Resolve the array allocating issue

* Move modeler initializing to CreateBatch and other minor fix.

* Fix 3 remaining comments

* Fixed code analysis issue.

* Fixed minor comments

Co-authored-by: klausmh <klausmh@microsoft.com>
Co-authored-by: Yael Dekel <yaeld@microsoft.com>