This repository was archived by the owner on Nov 16, 2023. It is now read-only.

Upgrade to ML.NET version 1.0.0 #100

Merged: 80 commits, May 27, 2019

Commits
b260bd1 ref v0.10 ML.NET (ganik, Feb 19, 2019)
d958ddb fix build (ganik, Mar 1, 2019)
abb082a hook up to v0.11.0 ML.NET (ganik, Mar 12, 2019)
a530a19 fix build errors (ganik, Mar 12, 2019)
aeb0377 fix build (ganik, Mar 13, 2019)
6eba3ee include Microsoft.Data.DataView.dll in build (ganik, Mar 13, 2019)
dd7c9f1 typo (ganik, Mar 13, 2019)
8a3e682 remove protobuf dll (ganik, Mar 13, 2019)
821c08a Regenerate code due to manifest changes (ganik, Mar 18, 2019)
588ead8 fix missing ep (ganik, Mar 18, 2019)
abd541f Update to ML.NET 1.0.0-preview (ganik, Apr 3, 2019)
d447aec fix .net build (ganik, Apr 3, 2019)
70d6fef update nuget for ML.NET (ganik, Apr 5, 2019)
a78227e remove Data namespace dll (ganik, Apr 5, 2019)
d385780 rollback nuget changes (ganik, Apr 5, 2019)
25b81f7 move to final RC ML.NET (ganik, Apr 29, 2019)
60da070 Merge branch 'master' into ganik/v11 (ganik, Apr 29, 2019)
49b8673 Regenerate classes as per updated manifest (ganik, Apr 29, 2019)
6a91319 fix maximum_number_of_iterations param name (ganik, Apr 29, 2019)
8eecfa5 fix parameter names (ganik, May 15, 2019)
c20436e Merge branch 'master' into ganik/v11 (ganik, May 15, 2019)
d972907 fix names (ganik, May 15, 2019)
f43ba06 Merge branch 'ganik/v11' of https://github.com/ganik/NimbusML into ga… (ganik, May 15, 2019)
8bb2d50 reference official v1.0 of ML.NET (ganik, May 18, 2019)
da8f247 fix tests (ganik, May 19, 2019)
47b9aaa fix label column (ganik, May 19, 2019)
df963f8 Fix tests (ganik, May 19, 2019)
7c81b4b fix lightgbm tests (ganik, May 19, 2019)
4ba9154 fix OLS (ganik, May 19, 2019)
eae45a3 fix tests (ganik, May 19, 2019)
afb94c5 fix more tests (ganik, May 19, 2019)
cfaa4fd fix more tests (ganik, May 19, 2019)
2c294d6 fix weight column name (ganik, May 19, 2019)
34f9124 more tests (ganik, May 19, 2019)
48c464d fix normalized metrics (ganik, May 19, 2019)
2f45cde more errors (ganik, May 19, 2019)
29a7f98 Fix CV (ganik, May 19, 2019)
298a66c rename feature_column to feature_column_name (ganik, May 19, 2019)
89afa9e fix cv ranker (ganik, May 19, 2019)
161b8de Fix lightgbm tests (ganik, May 22, 2019)
c2852d3 fix changes due to upgrade of NGramFeaturizer (ganik, May 22, 2019)
cb3d36b fix ngram featurizer (ganik, May 22, 2019)
024c5bc fix FactorizationMachine assert error (ganik, May 22, 2019)
35e66e7 disable test which is not working now due to change in LightGbm version (ganik, May 22, 2019)
ce5e462 fix model name (ganik, May 23, 2019)
02a6d3e typo (ganik, May 23, 2019)
68df630 handle nan in arrays (ganik, May 23, 2019)
36d55b2 fix tests (ganik, May 23, 2019)
eecf057 fix tests (ganik, May 24, 2019)
8307a98 fix more tests (ganik, May 24, 2019)
694f45d fix data type (ganik, May 24, 2019)
a8325ae fix AUC exception (ganik, May 24, 2019)
49c42d6 kick the build (ganik, May 24, 2019)
72efea0 fix tests due to data change (ganik, May 24, 2019)
b0f7d0b fix ngram test (ganik, May 24, 2019)
9d3b8fb fix mutual info tests (ganik, May 24, 2019)
decf18f copy libiomp lib (ganik, May 24, 2019)
a1edbdb fix mac build (ganik, May 24, 2019)
3c550de disable SymSgdNative for now (ganik, May 24, 2019)
cd2934a disable SymSgdBinary classifier tests for Linux (ganik, May 24, 2019)
0a8b281 fix linux tests (ganik, May 25, 2019)
3f3fc2c fix linux tests (ganik, May 25, 2019)
f04388e try linux (ganik, May 25, 2019)
a62e91d fix linux (ganik, May 25, 2019)
69a8067 skip SymSgdBinaryClassifier checks (ganik, May 25, 2019)
bf84317 fix entrypoint compiler (ganik, May 25, 2019)
7c0def9 fix entry point generation (ganik, May 25, 2019)
f73565a fix example tests run (ganik, May 25, 2019)
a249563 fix typo (ganik, May 25, 2019)
4f1d94d fix documentation regression (ganik, May 25, 2019)
458e77b fix parameter name (ganik, May 25, 2019)
fba52a4 fix examples (ganik, May 26, 2019)
6e330b9 fix examples (ganik, May 26, 2019)
20ae0cb fix tests (ganik, May 26, 2019)
6e9141c fix tests (ganik, May 26, 2019)
0153687 fix linux (ganik, May 26, 2019)
7fed56e kick build (ganik, May 26, 2019)
bf9ce19 Fix code_fixer (ganik, May 26, 2019)
63a64c8 fix skip take filters (ganik, May 27, 2019)
4bc8fd3 fix estimator checks (ganik, May 27, 2019)

2 changes: 1 addition & 1 deletion build.cmd
@@ -46,7 +46,7 @@ if /i [%1] == [--skipDotNetBridge] (
  echo "Usage: build.cmd [--configuration <Configuration>] [--runTests] [--buildDotNetBridgeOnly] [--skipDotNetBridge]"
  echo ""
  echo "Options:"
- echo " --configuration <Configuration> Build Configuration (DbgWinPy3.6,DbgWinPy3.5,DbgWinPy2.7,RlsWinPy3.6,RlsWinPy3.5,RlsWinPy2.7)"
+ echo " --configuration <Configuration> Build Configuration (DbgWinPy3.7,DbgWinPy3.6,DbgWinPy3.5,DbgWinPy2.7,RlsWinPy3.7,RlsWinPy3.6,RlsWinPy3.5,RlsWinPy2.7)"
  echo " --runTests Run tests after build"
  echo " --buildDotNetBridgeOnly Build only DotNetBridge"
  echo " --skipDotNetBridge Build everything except DotNetBridge"

2 changes: 1 addition & 1 deletion build/ci/phase-template.yml
@@ -24,7 +24,7 @@ phases:
  - script: $(_buildScript) --configuration $(_configuration) --runTests
  # Mac phases
  - ${{ if eq(parameters.name, 'Mac') }}:
- - script: brew update && brew install libomp mono-libgdiplus gettext && brew link gettext --force
+ - script: brew update && brew install https://raw.githubusercontent.com/Homebrew/homebrew-core/f5b1ac99a7fba27c19cee0bc4f036775c889b359/Formula/libomp.rb mono-libgdiplus gettext && brew link gettext --force
  - ${{ if eq(parameters.testDistro, 'noTests') }}:
  - script: chmod 777 $(_buildScript) && $(_buildScript) --configuration $(_configuration)
  - ${{ if eq(parameters.testDistro, '') }}:

1 change: 0 additions & 1 deletion build/libs_linux.txt
@@ -1,4 +1,3 @@
- Google.Protobuf.dll
  Newtonsoft.Json.dll
  libCpuMathNative.so
  libFactorizationMachineNative.so

1 change: 0 additions & 1 deletion build/libs_mac.txt
@@ -1,4 +1,3 @@
- Google.Protobuf.dll
  Newtonsoft.Json.dll
  libCpuMathNative.dylib
  libFactorizationMachineNative.dylib

1 change: 1 addition & 0 deletions build/libs_win.txt
@@ -5,6 +5,7 @@ FactorizationMachineNative.dll
  FastTreeNative.dll
  LdaNative.dll
  lib_lightgbm.dll
+ libiomp5md.dll
  MklImports.dll
  SymSgdNative.dll
  tensorflow.dll

187 changes: 94 additions & 93 deletions src/DotNetBridge/Bridge.cs
@@ -10,14 +10,12 @@
  using Microsoft.ML;
  using Microsoft.ML.Data;
  using Microsoft.ML.EntryPoints;
- using Microsoft.ML.ImageAnalytics;
- using Microsoft.ML.LightGBM;
- using Microsoft.ML.Model.Onnx;
+ using Microsoft.ML.Model.OnnxConverter;
+ using Microsoft.ML.Runtime;
  using Microsoft.ML.Trainers;
+ using Microsoft.ML.Trainers.Ensemble;
  using Microsoft.ML.Trainers.FastTree;
- using Microsoft.ML.Trainers.KMeans;
- using Microsoft.ML.Trainers.PCA;
- using Microsoft.ML.Trainers.SymSgd;
+ using Microsoft.ML.Trainers.LightGbm;
  using Microsoft.ML.Transforms;

  namespace Microsoft.MachineLearning.DotNetBridge
@@ -307,107 +305,110 @@ private static unsafe IntPtr GetFn(FnId id)
         /// </summary>
         private static unsafe int GenericExec(EnvironmentBlock* penv, sbyte* psz, int cdata, DataSourceBlock** ppdata)
         {
-            using (var env = new RmlEnvironment(MarshalDelegate<CheckCancelled>(penv->checkCancel), penv->seed,
-                verbose: penv != null && penv->verbosity > 3, conc: penv != null ? penv->maxThreadsAllowed : 0))
+            var env = new RmlEnvironment(MarshalDelegate<CheckCancelled>(penv->checkCancel), penv->seed, verbose: penv != null && penv->verbosity > 3);
+            var host = env.Register("ML.NET_Execution");
+
+            env.ComponentCatalog.RegisterAssembly(typeof(TextLoader).Assembly); // ML.Data
+            env.ComponentCatalog.RegisterAssembly(typeof(LinearModelParameters).Assembly); // ML.StandardLearners
+            env.ComponentCatalog.RegisterAssembly(typeof(CategoricalCatalog).Assembly); // ML.Transforms
+            env.ComponentCatalog.RegisterAssembly(typeof(FastTreeRegressionTrainer).Assembly); // ML.FastTree
+
+            //env.ComponentCatalog.RegisterAssembly(typeof(EnsembleModelParameters).Assembly); // ML.Ensemble
+            env.ComponentCatalog.RegisterAssembly(typeof(KMeansModelParameters).Assembly); // ML.KMeansClustering
+            env.ComponentCatalog.RegisterAssembly(typeof(PcaModelParameters).Assembly); // ML.PCA
+            env.ComponentCatalog.RegisterAssembly(typeof(CVSplit).Assembly); // ML.EntryPoints
+
+            env.ComponentCatalog.RegisterAssembly(typeof(OlsModelParameters).Assembly);
+            env.ComponentCatalog.RegisterAssembly(typeof(LightGbmBinaryModelParameters).Assembly);
+            env.ComponentCatalog.RegisterAssembly(typeof(TensorFlowTransformer).Assembly);
+            //env.ComponentCatalog.RegisterAssembly(typeof(SymSgdClassificationTrainer).Assembly);
+            //env.ComponentCatalog.RegisterAssembly(typeof(AutoInference).Assembly); // ML.PipelineInference
+            env.ComponentCatalog.RegisterAssembly(typeof(DataViewReference).Assembly);
+            env.ComponentCatalog.RegisterAssembly(typeof(ImageLoadingTransformer).Assembly);
+            //env.ComponentCatalog.RegisterAssembly(typeof(SaveOnnxCommand).Assembly);
+            //env.ComponentCatalog.RegisterAssembly(typeof(TimeSeriesProcessingEntryPoints).Assembly);
+            //env.ComponentCatalog.RegisterAssembly(typeof(ParquetLoader).Assembly);
+
+            using (var ch = host.Start("Executing"))
             {
-                var host = env.Register("ML.NET_Execution");
-                env.ComponentCatalog.RegisterAssembly(typeof(TextLoader).Assembly); // ML.Data
-                env.ComponentCatalog.RegisterAssembly(typeof(StochasticGradientDescentClassificationTrainer).Assembly); // ML.StandardLearners
-                env.ComponentCatalog.RegisterAssembly(typeof(CategoricalCatalog).Assembly); // ML.Transforms
-                env.ComponentCatalog.RegisterAssembly(typeof(FastTreeRegressionTrainer).Assembly); // ML.FastTree
-                env.ComponentCatalog.RegisterAssembly(typeof(KMeansPlusPlusTrainer).Assembly); // ML.KMeansClustering
-                env.ComponentCatalog.RegisterAssembly(typeof(RandomizedPcaTrainer).Assembly); // ML.PCA
-                //env.ComponentCatalog.RegisterAssembly(typeof(Experiment).Assembly); // ML.Legacy
-                env.ComponentCatalog.RegisterAssembly(typeof(LightGbmRegressorTrainer).Assembly);
-                env.ComponentCatalog.RegisterAssembly(typeof(TensorFlowTransformer).Assembly);
-                env.ComponentCatalog.RegisterAssembly(typeof(ImageLoaderTransformer).Assembly);
-                env.ComponentCatalog.RegisterAssembly(typeof(SymSgdClassificationTrainer).Assembly);
-                //env.ComponentCatalog.RegisterAssembly(typeof(AutoInference).Assembly); // ML.PipelineInference
-                env.ComponentCatalog.RegisterAssembly(typeof(OnnxExportExtensions).Assembly); // ML.Onnx
-                env.ComponentCatalog.RegisterAssembly(typeof(DataViewReference).Assembly);
-                //env.ComponentCatalog.RegisterAssembly(typeof(EnsemblePredictor).Assembly); // // ML.Ensemble BUG https://github.com/dotnet/machinelearning/issues/1078 Ensemble isn't in a NuGet package
-
-                using (var ch = host.Start("Executing"))
+                var sw = new System.Diagnostics.Stopwatch();
+                sw.Start();
+                try
                 {
-                    var sw = new System.Diagnostics.Stopwatch();
-                    sw.Start();
-                    try
-                    {
-                        // code, pszIn, and pszOut can be null.
-                        ch.Trace("Checking parameters");
+                    // code, pszIn, and pszOut can be null.
+                    ch.Trace("Checking parameters");

-                        host.CheckParam(penv != null, nameof(penv));
-                        host.CheckParam(penv->messageSink != null, "penv->message");
+                    host.CheckParam(penv != null, nameof(penv));
+                    host.CheckParam(penv->messageSink != null, "penv->message");

-                        host.CheckParam(psz != null, nameof(psz));
+                    host.CheckParam(psz != null, nameof(psz));

-                        ch.Trace("Converting graph operands");
-                        var graph = BytesToString(psz);
+                    ch.Trace("Converting graph operands");
+                    var graph = BytesToString(psz);

-                        ch.Trace("Wiring message sink");
-                        var message = MarshalDelegate<MessageSink>(penv->messageSink);
-                        var messageValidator = new MessageValidator(host);
-                        var lk = new object();
-                        Action<IMessageSource, ChannelMessage> listener =
-                            (sender, msg) =>
+                    ch.Trace("Wiring message sink");
+                    var message = MarshalDelegate<MessageSink>(penv->messageSink);
+                    var messageValidator = new MessageValidator(host);
+                    var lk = new object();
+                    Action<IMessageSource, ChannelMessage> listener =
+                        (sender, msg) =>
+                        {
+                            byte[] bs = StringToNullTerminatedBytes(sender.FullName);
+                            string m = messageValidator.Validate(msg);
+                            if (!string.IsNullOrEmpty(m))
                             {
-                                byte[] bs = StringToNullTerminatedBytes(sender.FullName);
-                                string m = messageValidator.Validate(msg);
-                                if (!string.IsNullOrEmpty(m))
+                                byte[] bm = StringToNullTerminatedBytes(m);
+                                lock (lk)
                                 {
-                                    byte[] bm = StringToNullTerminatedBytes(m);
-                                    lock (lk)
-                                    {
-                                        fixed (byte* ps = bs)
-                                        fixed (byte* pm = bm)
-                                            message(penv, msg.Kind, (sbyte*)ps, (sbyte*)pm);
-                                    }
+                                    fixed (byte* ps = bs)
+                                    fixed (byte* pm = bm)
+                                        message(penv, msg.Kind, (sbyte*)ps, (sbyte*)pm);
                                 }
-                            };
-                        env.AddListener(listener);
+                            }
+                        };
+                    env.AddListener(listener);

-                        host.CheckParam(cdata >= 0, nameof(cdata), "must be non-negative");
-                        host.CheckParam(ppdata != null || cdata == 0, nameof(ppdata));
-                        for (int i = 0; i < cdata; i++)
+                    host.CheckParam(cdata >= 0, nameof(cdata), "must be non-negative");
+                    host.CheckParam(ppdata != null || cdata == 0, nameof(ppdata));
+                    for (int i = 0; i < cdata; i++)
+                    {
+                        var pdata = ppdata[i];
+                        host.CheckParam(pdata != null, "pdata");
+                        host.CheckParam(0 <= pdata->ccol && pdata->ccol <= int.MaxValue, "ccol");
+                        host.CheckParam(0 <= pdata->crow && pdata->crow <= long.MaxValue, "crow");
+                        if (pdata->ccol > 0)
                         {
-                            var pdata = ppdata[i];
-                            host.CheckParam(pdata != null, "pdata");
-                            host.CheckParam(0 <= pdata->ccol && pdata->ccol <= int.MaxValue, "ccol");
-                            host.CheckParam(0 <= pdata->crow && pdata->crow <= long.MaxValue, "crow");
-                            if (pdata->ccol > 0)
-                            {
-                                host.CheckParam(pdata->names != null, "names");
-                                host.CheckParam(pdata->kinds != null, "kinds");
-                                host.CheckParam(pdata->keyCards != null, "keyCards");
-                                host.CheckParam(pdata->vecCards != null, "vecCards");
-                                host.CheckParam(pdata->getters != null, "getters");
-                            }
+                            host.CheckParam(pdata->names != null, "names");
+                            host.CheckParam(pdata->kinds != null, "kinds");
+                            host.CheckParam(pdata->keyCards != null, "keyCards");
+                            host.CheckParam(pdata->vecCards != null, "vecCards");
+                            host.CheckParam(pdata->getters != null, "getters");
                         }
+                    }

-                        ch.Trace("Validating number of data sources");
+                    ch.Trace("Validating number of data sources");

-                        // Wrap the data sets.
-                        ch.Trace("Wrapping native data sources");
-                        ch.Trace("Executing");
-                        ExecCore(penv, host, ch, graph, cdata, ppdata);
-                    }
-                    catch (Exception e)
-                    {
-                        // Dump the exception chain.
-                        var ex = e;
-                        while (ex.InnerException != null)
-                            ex = ex.InnerException;
-                        ch.Error("*** {1}: '{0}'", ex.Message, ex.GetType());
-                        return -1;
-                    }
-                    finally
-                    {
-                        sw.Stop();
-                        if (penv != null && penv->verbosity > 0)
-                            ch.Info("Elapsed time: {0}", sw.Elapsed);
-                        else
-                            ch.Trace("Elapsed time: {0}", sw.Elapsed);
-                    }
+                    // Wrap the data sets.
+                    ch.Trace("Wrapping native data sources");
+                    ch.Trace("Executing");
+                    ExecCore(penv, host, ch, graph, cdata, ppdata);
+                }
+                catch (Exception e)
+                {
+                    // Dump the exception chain.
+                    var ex = e;
+                    while (ex.InnerException != null)
+                        ex = ex.InnerException;
+                    ch.Error("*** {1}: '{0}'", ex.Message, ex.GetType());
+                    return -1;
+                }
+                finally
+                {
+                    sw.Stop();
+                    if (penv != null && penv->verbosity > 0)
+                        ch.Info("Elapsed time: {0}", sw.Elapsed);
+                    else
+                        ch.Trace("Elapsed time: {0}", sw.Elapsed);
                 }
             }
             return 0;
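
The `catch` block in `GenericExec` walks `InnerException` links and reports only the innermost exception's type and message. The following is a minimal standalone sketch of that pattern, not code from this PR: the class name `ExceptionChainSketch`, the helper `Innermost`, and the sample exception chain are all hypothetical, and `Console.WriteLine` stands in for the channel's `ch.Error` call.

```csharp
using System;

public static class ExceptionChainSketch
{
    // Hypothetical helper (not part of Bridge.cs): follows InnerException
    // links until it reaches an exception that has no inner exception.
    public static Exception Innermost(Exception e)
    {
        var ex = e;
        while (ex.InnerException != null)
            ex = ex.InnerException;
        return ex;
    }

    public static void Main()
    {
        // Made-up two-level chain, used only for illustration.
        var wrapped = new InvalidOperationException(
            "graph execution failed",
            new ArgumentException("root cause"));

        var root = Innermost(wrapped);

        // Same format string as the ch.Error call in the diff:
        // {1} is the exception type, {0} is its message.
        Console.WriteLine("*** {1}: '{0}'", root.Message, root.GetType());
        // prints: *** System.ArgumentException: 'root cause'
    }
}
```

For ordinary exceptions, `Exception.GetBaseException()` follows the same chain; the explicit loop is kept here only because it mirrors the code in the diff. One consequence of this approach is that the outer exceptions' messages never reach the log.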
18 changes: 10 additions & 8 deletions src/DotNetBridge/DotNetBridge.csproj
@@ -31,13 +31,15 @@
  <PrivateAssets>all</PrivateAssets>
  <IncludeAssets>runtime; build; native; contentfiles; analyzers</IncludeAssets>
  </PackageReference>
- <PackageReference Include="Microsoft.ML" Version="0.10.0-preview-27310-10" />
- <PackageReference Include="Microsoft.ML.CpuMath" Version="0.10.0-preview-27310-10" />
- <PackageReference Include="Microsoft.ML.EntryPoints" Version="0.10.0-preview-27310-10" />
- <PackageReference Include="Microsoft.ML.HalLearners" Version="0.10.0-preview-27310-10" />
- <PackageReference Include="Microsoft.ML.ImageAnalytics" Version="0.10.0-preview-27310-10" />
- <PackageReference Include="Microsoft.ML.LightGBM" Version="0.10.0-preview-27310-10" />
- <PackageReference Include="Microsoft.ML.Onnx" Version="0.10.0-preview-27310-10" />
- <PackageReference Include="Microsoft.ML.TensorFlow" Version="0.10.0-preview-27310-10" />
+ <PackageReference Include="Microsoft.ML" Version="1.0.0" />
+ <PackageReference Include="Microsoft.ML.CpuMath" Version="1.0.0" />
+ <PackageReference Include="Microsoft.ML.EntryPoints" Version="0.12.0" />
+ <PackageReference Include="Microsoft.ML.Mkl.Components" Version="1.0.0" />
+ <PackageReference Include="Microsoft.ML.Mkl.Redist" Version="1.0.0" />
+ <PackageReference Include="Microsoft.ML.ImageAnalytics" Version="1.0.0" />
+ <PackageReference Include="Microsoft.ML.LightGBM" Version="1.0.0" />
+ <PackageReference Include="Microsoft.ML.OnnxTransformer" Version="0.12.0" />
+ <PackageReference Include="Microsoft.ML.TensorFlow" Version="0.12.0" />
+ <PackageReference Include="Microsoft.ML.Ensemble" Version="0.12.0" />
  </ItemGroup>
  </Project>

2 changes: 1 addition & 1 deletion src/DotNetBridge/MessageValidator.cs
@@ -5,7 +5,7 @@

  using System;
  using System.Globalization;
- using Microsoft.ML;
+ using Microsoft.ML.Runtime;

  namespace Microsoft.MachineLearning.DotNetBridge
  {