This repository has been archived by the owner on Nov 16, 2023. It is now read-only.

Native featurizers for AutoML #317

Merged 87 commits on Oct 9, 2019
ebf03e9
Draft, adding CategoryImputer, ToKeyImputer, ToString transformers
ganik Aug 23, 2019
600f5a5
Merge branch 'master' into ganik/aml
ganik Aug 23, 2019
ac12b2f
add tests
ganik Aug 23, 2019
62d3874
prelim commit
ganik Aug 24, 2019
2dec203
update manifest, fix unit tests/examples
ganik Aug 28, 2019
bb9dd73
upgrade version
ganik Aug 30, 2019
4034c8a
fix tests
ganik Aug 30, 2019
deecf05
temp hack fix for native libs
ganik Aug 30, 2019
e2ef393
copy libFeaturizers.so
ganik Aug 30, 2019
6c85423
fix version
ganik Aug 30, 2019
7fc9e08
fix cp
ganik Aug 30, 2019
2dff122
fix version
ganik Aug 30, 2019
ab32ad1
Update ML.Net version number.
Sep 4, 2019
96904c4
Update the examples and unit tests.
Sep 4, 2019
74df1e7
Update to latest version of the Featurizers library.
Sep 4, 2019
b310fc2
Fix test_tostring unit test.
Sep 4, 2019
f10ac43
Temporarily skip the estimator checks unit tests.
Sep 4, 2019
e13263a
Upgrade pip to the latest version when installing the Python
Sep 5, 2019
fe37274
Update test_estimator_checks for the three new transformers.
Sep 6, 2019
c928293
Remove extra comma from test_estimator_checks.
Sep 6, 2019
4d6807e
Update the ML.Net version.
Sep 6, 2019
f3e8417
Merge pull request #4 from pieths/aml
ganik Sep 6, 2019
7f20259
Merge branch 'master' into ganik/aml
ganik Sep 6, 2019
9c23cc4
Merge branch 'ganik/aml' of https://github.com/ganik/NimbusML into ga…
ganik Sep 11, 2019
46b5716
Merge branch 'master' into ganik/aml
ganik Sep 11, 2019
2c4630c
Add TimeSeriesImputer
ganik Sep 11, 2019
03a218c
Add country param to DateTimeSplitter
ganik Sep 17, 2019
1454256
Upgrade TensorFlow.NET version. Required by latest version of Microso…
Sep 18, 2019
e0bf89e
Update ML.Net version and import new AutoMLFeaturizers package.
Sep 18, 2019
a69cf6e
Add back in the accidentally removed tests from test_data_with_missin…
Sep 18, 2019
b1a8073
Update the DateTimeSplitter examples.
Sep 18, 2019
2d89f49
Update the ToKeyImputer examples.
Sep 18, 2019
d434492
Update the ToString examples.
Sep 18, 2019
a726b23
Merge pull request #5 from pieths/aml
ganik Sep 19, 2019
b420f79
Merge branch 'master' into ganik/aml
ganik Sep 19, 2019
b3e78dd
Merge branch 'master' into ganik/aml
ganik Sep 19, 2019
088d437
Update build to support latest nuget packages and updates.
Sep 20, 2019
feef418
Remove copy of libFeaturizers from linux build script.
Sep 20, 2019
6b8c5c5
Merge pull request #6 from pieths/aml
ganik Sep 20, 2019
ecb5f3b
Add TimeSeriesImputer to the NimbusML project.
Sep 20, 2019
b1f473f
Merge pull request #7 from pieths/aml
ganik Sep 20, 2019
7f1917f
Add initial DataFrame based example for TimeSeriesImputer.
Sep 23, 2019
d4a78ae
Update to the latest version of manifest.json.
Sep 23, 2019
ed9ec73
Add missing project include for the TimeSeriesImputer example.
Sep 23, 2019
eb36294
Update the DateTimeSplitter examples.
Sep 23, 2019
15d026c
Update build files to copy over the Data folder which is required for…
Sep 23, 2019
4904acf
Add a unit test for testing the holiday name return value for DateTim…
Sep 23, 2019
5edf0d8
Add unit test for ToKeyImputer.
Sep 23, 2019
a775a92
Update to latest version of manifest.json. Makes grain input required…
Sep 23, 2019
34f6ba6
Update TimeSeriesImputer_df example.
Sep 23, 2019
18ee975
Remove TimeSeriesImputer from test_estimator_checks.
Sep 23, 2019
7b05312
Update nuget.config to point to relative directory for ml.net packages.
Sep 24, 2019
b0fb48d
Add unit test for TimeSeriesImputer.
Sep 24, 2019
f62c17a
Use environmental variable to specify the local ml.net nuget package …
Sep 24, 2019
a4cf299
Update to the latest version of ml.net.
Sep 24, 2019
db36336
Add latest version of nuget packages for building.
Sep 24, 2019
9ef9baf
Merge pull request #8 from pieths/aml
ganik Sep 24, 2019
913e785
Merge branch 'master' into ganik/aml
ganik Sep 24, 2019
a697082
Update to the latest windows ml.net binaries.
Sep 24, 2019
5c8fbc4
Add linux ml.net binaries.
Sep 24, 2019
e3b5473
Merge pull request #9 from pieths/aml
ganik Sep 24, 2019
13cb163
adding correct nuget packages/location
michaelgsharp Sep 24, 2019
5b2fc0c
adding correct ML.NET signed packages
michaelgsharp Sep 24, 2019
765164f
adding correct ML.NET signed packages
michaelgsharp Sep 24, 2019
b9da669
Merge pull request #10 from michaelgsharp/ganik/aml
ganik Sep 24, 2019
3d5a973
Merge branch 'master' into ganik/aml
ganik Sep 24, 2019
6b24db6
Merge branch 'master' into ganik/aml
ganik Sep 26, 2019
afcdda1
Update the referenced ML.Net versions.
Oct 4, 2019
28c8e16
Update to the latest version of the manifest.
Oct 4, 2019
7910a8e
Add RobustScaler to the public API.
Oct 4, 2019
678ce5b
Fix spacing bug in RobustScalar in manifest.json.
Oct 4, 2019
a79ba3e
Merge branch 'upstream-master' into aml
Oct 4, 2019
b3e61be
Merge pull request #13 from pieths/aml
ganik Oct 4, 2019
3c08c41
Merge branch 'master' into ganik/aml
ganik Oct 4, 2019
cbae53a
Update to the latest version of manifest.json which contains naming f…
Oct 7, 2019
76b00c3
Merge pull request #14 from pieths/aml
ganik Oct 7, 2019
3cc1c75
Update to latest unsigned nuget packages for testing RobustScaler and…
Oct 8, 2019
907f6b5
Add RobustScaler unit tests and examples.
Oct 8, 2019
f0a3c95
Merge branch 'upstream-master' into aml
Oct 8, 2019
ebb1c7f
Merge pull request #15 from pieths/aml
ganik Oct 8, 2019
f8d1d9e
Update to the latest signed ML.Net nugets.
Oct 8, 2019
5e839d3
Merge pull request #16 from pieths/aml
ganik Oct 8, 2019
aeb8e6e
Merge branch 'master' into ganik/aml
ganik Oct 9, 2019
0952d52
Fix RobustScaler checks in test_estimator_checks.
Oct 9, 2019
90ff473
Merge pull request #17 from pieths/aml
ganik Oct 9, 2019
31416f3
Merge branch 'master' into ganik/aml
ganik Oct 9, 2019
932ae12
up version
ganik Oct 9, 2019
4 changes: 4 additions & 0 deletions build.cmd
@@ -173,6 +173,8 @@ if "%AzureBuild%" == "True" (
echo ##vso[task.prependpath]%_dotnetRoot%
)

set LOCAL_NUGET_PACKAGES_DIR=.\local-nuget-packages

:: Build managed code
echo ""
echo "#################################"
@@ -311,6 +313,7 @@ copy "%BuildOutputDir%%Configuration%\pybridge.pyd" "%__currentScriptDir%src\py

if %PythonVersion% == 2.7 (
copy "%BuildOutputDir%%Configuration%\Platform\win-x64\publish\*.dll" "%__currentScriptDir%src\python\nimbusml\internal\libs\"
xcopy /S /E /I "%BuildOutputDir%%Configuration%\Platform\win-x64\publish\Data" "%__currentScriptDir%src\python\nimbusml\internal\libs\Data"
:: remove dataprep dlls as its not supported in python 2.7
del "%__currentScriptDir%src\python\nimbusml\internal\libs\Microsoft.DPrep.*"
del "%__currentScriptDir%src\python\nimbusml\internal\libs\Microsoft.Data.*"
@@ -321,6 +324,7 @@ if %PythonVersion% == 2.7 (
del "%__currentScriptDir%src\python\nimbusml\internal\libs\Microsoft.Workbench.Messaging.SDK.dll"
) else (
for /F "tokens=*" %%A in (build/libs_win.txt) do copy "%BuildOutputDir%%Configuration%\Platform\win-x64\publish\%%A" "%__currentScriptDir%src\python\nimbusml\internal\libs\"
xcopy /S /E /I "%BuildOutputDir%%Configuration%\Platform\win-x64\publish\Data" "%__currentScriptDir%src\python\nimbusml\internal\libs\Data"
)

if "%DebugBuild%" == "True" (
4 changes: 4 additions & 0 deletions build.sh
@@ -175,6 +175,8 @@ then
echo "Installing dotnet SDK ... "
curl -sSL https://dot.net/v1/dotnet-install.sh | bash /dev/stdin -Version 2.1.701 -InstallDir ./cli

export LOCAL_NUGET_PACKAGES_DIR=./local-nuget-packages

# Build managed code
echo "Building managed code ... "
_dotnet="${__currentScriptDir}/cli/dotnet"
@@ -213,6 +215,7 @@ then
cp "${BuildOutputDir}/${__configuration}/Platform/${PublishDir}"/publish/System.Native.a "${__currentScriptDir}/src/python/nimbusml/internal/libs/"
cp "${BuildOutputDir}/${__configuration}/Platform/${PublishDir}"/publish/createdump "${__currentScriptDir}/src/python/nimbusml/internal/libs/" || :
cp "${BuildOutputDir}/${__configuration}/Platform/${PublishDir}"/publish/sosdocsunix.txt "${__currentScriptDir}/src/python/nimbusml/internal/libs/"
cp -r "${BuildOutputDir}/${__configuration}/Platform/${PublishDir}"/publish/Data "${__currentScriptDir}/src/python/nimbusml/internal/libs/."
ext=*.so
if [ "$(uname -s)" = "Darwin" ]
then
@@ -241,6 +244,7 @@ then
cat build/${libs_txt} | while read i; do
cp "${BuildOutputDir}/${__configuration}/Platform/${PublishDir}"/publish/$i "${__currentScriptDir}/src/python/nimbusml/internal/libs/"
done
cp -r "${BuildOutputDir}/${__configuration}/Platform/${PublishDir}"/publish/Data "${__currentScriptDir}/src/python/nimbusml/internal/libs/."
fi

if [[ $__configuration = Dbg* ]]
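
Review note: the list-driven copy that the last build.sh hunk extends (copy every file named in the libs list, then the Data folder) can be sketched in Python for readers who want to reason about it outside the shell script. This is an illustrative sketch only, not code from this PR: `copy_listed_libs` and its parameters are hypothetical names, and it assumes the list file holds one file name per line, as build/libs_linux.txt does.

```python
import shutil
from pathlib import Path


def copy_listed_libs(publish_dir, libs_txt, dest_dir):
    """Copy each file named in libs_txt, then the Data folder, into dest_dir.

    Hypothetical helper mirroring the shell loop; returns the copied names.
    """
    publish_dir, dest_dir = Path(publish_dir), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for line in Path(libs_txt).read_text().splitlines():
        name = line.strip()
        if not name:
            continue  # skip blank lines in the list
        # Unlike the shell loop, a missing file raises FileNotFoundError here.
        shutil.copy2(publish_dir / name, dest_dir / name)
        copied.append(name)
    data_src = publish_dir / "Data"
    if data_src.is_dir():
        # Counterpart of `cp -r .../publish/Data .../libs/.` (Python 3.8+).
        shutil.copytree(data_src, dest_dir / "Data", dirs_exist_ok=True)
    return copied
```

A call such as `copy_listed_libs("publish", "build/libs_linux.txt", "src/python/nimbusml/internal/libs")` would approximate the Linux branch; the paths here are shortened placeholders rather than the script's real variables.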
1 change: 1 addition & 0 deletions build/libs_linux.txt
@@ -1,6 +1,7 @@
Newtonsoft.Json.dll
libCpuMathNative.so
libFastTreeNative.so
libFeaturizers.so
libLdaNative.so
libMklImports.so
libMklProxyNative.so
1 change: 1 addition & 0 deletions build/libs_mac.txt
@@ -9,6 +9,7 @@ lib_lightgbm.dylib
libtensorflow.dylib
libonnxruntime.dylib
libtensorflow_framework.1.dylib
Featurizers.dll
System.Drawing.Common.dll
TensorFlow.NET.dll
NumSharp.Core.dll
1 change: 1 addition & 0 deletions build/libs_win.txt
@@ -8,6 +8,7 @@ libiomp5md.dll
MklImports.dll
MklProxyNative.dll
SymSgdNative.dll
Featurizers.dll
tensorflow.dll
TensorFlow.NET.dll
NumSharp.Core.dll
45 binary files not shown.
3 changes: 2 additions & 1 deletion nuget.config
@@ -5,6 +5,7 @@
</config>
<packageSources>
<add key="nuget.org" value="https://api.nuget.org/v3/index.json" />
-<add key="MlNet_Daily" value="https://dotnet.myget.org/F/dotnet-core/api/v3/index.json" />
+<!--add key="MlNet_Daily" value="https://dotnet.myget.org/F/dotnet-core/api/v3/index.json" /-->
+<add key="local_packages" value="%LOCAL_NUGET_PACKAGES_DIR%" />
</packageSources>
</configuration>
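
Review note: the `local_packages` source relies on `LOCAL_NUGET_PACKAGES_DIR`, which build.cmd and build.sh set before the managed build. The sketch below shows how such a `%NAME%` placeholder would resolve, assuming NuGet's `%NAME%` environment-variable substitution in nuget.config values. It is a stand-alone illustration, not NuGet's implementation: `resolve_sources` is a hypothetical helper, and leaving unset variables untouched is an assumption of this sketch.

```python
import os
import re
import xml.etree.ElementTree as ET

# Trimmed copy of the packageSources section shown in the diff above.
NUGET_CONFIG = """<configuration>
  <packageSources>
    <add key="nuget.org" value="https://api.nuget.org/v3/index.json" />
    <add key="local_packages" value="%LOCAL_NUGET_PACKAGES_DIR%" />
  </packageSources>
</configuration>"""


def resolve_sources(config_xml, env=None):
    """Return {key: value} for packageSources with %VAR% placeholders expanded."""
    env = os.environ if env is None else env

    def expand(value):
        # Assumption: variables missing from env are left as the raw placeholder.
        return re.sub(r"%([^%]+)%", lambda m: env.get(m.group(1), m.group(0)), value)

    root = ET.fromstring(config_xml)
    return {
        add.get("key"): expand(add.get("value"))
        for add in root.findall("./packageSources/add")
    }


sources = resolve_sources(
    NUGET_CONFIG, {"LOCAL_NUGET_PACKAGES_DIR": "./local-nuget-packages"}
)
print(sources["local_packages"])
```

If the variable is not exported before restore, the placeholder stays unresolved in this sketch, which is one reason both build scripts set it ahead of the managed build.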
3 changes: 3 additions & 0 deletions src/DotNetBridge/Bridge.cs
@@ -7,6 +7,8 @@
using System.Runtime.InteropServices;
using System.Text;
using System.Threading;
using Microsoft.ML;
using Microsoft.ML.Featurizers;
using Microsoft.ML.Data;
using Microsoft.ML.EntryPoints;
using Microsoft.ML.Runtime;
@@ -300,6 +302,7 @@ private static unsafe int GenericExec(EnvironmentBlock* penv, sbyte* psz, int cd
//env.ComponentCatalog.RegisterAssembly(typeof(TimeSeriesProcessingEntryPoints).Assembly);
//env.ComponentCatalog.RegisterAssembly(typeof(ParquetLoader).Assembly);
env.ComponentCatalog.RegisterAssembly(typeof(SsaChangePointDetector).Assembly);
env.ComponentCatalog.RegisterAssembly(typeof(CategoryImputerTransformer).Assembly);
env.ComponentCatalog.RegisterAssembly(typeof(DotNetBridgeEntrypoints).Assembly);

using (var ch = host.Start("Executing"))
24 changes: 13 additions & 11 deletions src/DotNetBridge/DotNetBridge.csproj
@@ -32,17 +32,19 @@
<PrivateAssets>all</PrivateAssets>
<IncludeAssets>runtime; build; native; contentfiles; analyzers</IncludeAssets>
</PackageReference>
-<PackageReference Include="Microsoft.ML" Version="1.4.0-preview2" />
-<PackageReference Include="Microsoft.ML.CpuMath" Version="1.4.0-preview2" />
-<PackageReference Include="Microsoft.ML.EntryPoints" Version="0.16.0-preview2" />
-<PackageReference Include="Microsoft.ML.Mkl.Components" Version="1.4.0-preview2" />
-<PackageReference Include="Microsoft.ML.ImageAnalytics" Version="1.4.0-preview2" />
-<PackageReference Include="Microsoft.ML.LightGBM" Version="1.4.0-preview2" />
-<PackageReference Include="Microsoft.ML.OnnxTransformer" Version="1.4.0-preview2" />
-<PackageReference Include="Microsoft.ML.TensorFlow" Version="1.4.0-preview2" />
-<PackageReference Include="Microsoft.ML.Dnn" Version="0.16.0-preview2" />
-<PackageReference Include="Microsoft.ML.Ensemble" Version="0.16.0-preview2" />
-<PackageReference Include="Microsoft.ML.TimeSeries" Version="1.4.0-preview2" />
+<PackageReference Include="Microsoft.ML" Version="1.6.2-preview2-28208-8" />
+<PackageReference Include="Microsoft.ML.CpuMath" Version="1.6.2-preview2-28208-8" />
+<PackageReference Include="Microsoft.ML.EntryPoints" Version="0.18.2-preview2-28208-8" />
+<PackageReference Include="Microsoft.ML.Mkl.Components" Version="1.6.2-preview2-28208-8" />
+<PackageReference Include="Microsoft.ML.ImageAnalytics" Version="1.6.2-preview2-28208-8" />
+<PackageReference Include="Microsoft.ML.LightGBM" Version="1.6.2-preview2-28208-8" />
+<PackageReference Include="Microsoft.ML.OnnxTransformer" Version="1.6.2-preview2-28208-8" />
+<PackageReference Include="Microsoft.ML.TensorFlow" Version="1.6.2-preview2-28208-8" />
+<PackageReference Include="Microsoft.ML.Dnn" Version="0.18.2-preview2-28208-8" />
+<PackageReference Include="Microsoft.ML.Ensemble" Version="0.18.2-preview2-28208-8" />
+<PackageReference Include="Microsoft.ML.TimeSeries" Version="1.6.2-preview2-28208-8" />
+<PackageReference Include="Microsoft.ML.Featurizers" Version="0.18.2-preview2-28208-8" />
+<PackageReference Include="MicrosoftMLFeaturizers" Version="0.1.0" />
<PackageReference Include="Microsoft.DataPrep" Version="0.0.1.12-preview" />
<PackageReference Include="TensorFlow.NET" Version="0.11.3" />
<PackageReference Include="SciSharp.TensorFlow.Redist" Version="1.14.0" />
24 changes: 13 additions & 11 deletions src/Platforms/build.csproj
@@ -11,17 +11,19 @@
</PropertyGroup>

<ItemGroup>
-<PackageReference Include="Microsoft.ML" Version="1.4.0-preview2" />
-<PackageReference Include="Microsoft.ML.CpuMath" Version="1.4.0-preview2" />
-<PackageReference Include="Microsoft.ML.EntryPoints" Version="0.16.0-preview2" />
-<PackageReference Include="Microsoft.ML.Mkl.Components" Version="1.4.0-preview2" />
-<PackageReference Include="Microsoft.ML.ImageAnalytics" Version="1.4.0-preview2" />
-<PackageReference Include="Microsoft.ML.LightGBM" Version="1.4.0-preview2" />
-<PackageReference Include="Microsoft.ML.OnnxTransformer" Version="1.4.0-preview2" />
-<PackageReference Include="Microsoft.ML.TensorFlow" Version="1.4.0-preview2" />
-<PackageReference Include="Microsoft.ML.Dnn" Version="0.16.0-preview2" />
-<PackageReference Include="Microsoft.ML.Ensemble" Version="0.16.0-preview2" />
-<PackageReference Include="Microsoft.ML.TimeSeries" Version="1.4.0-preview2" />
+<PackageReference Include="Microsoft.ML" Version="1.6.2-preview2-28208-8" />
+<PackageReference Include="Microsoft.ML.CpuMath" Version="1.6.2-preview2-28208-8" />
+<PackageReference Include="Microsoft.ML.EntryPoints" Version="0.18.2-preview2-28208-8" />
+<PackageReference Include="Microsoft.ML.Mkl.Components" Version="1.6.2-preview2-28208-8" />
+<PackageReference Include="Microsoft.ML.ImageAnalytics" Version="1.6.2-preview2-28208-8" />
+<PackageReference Include="Microsoft.ML.LightGBM" Version="1.6.2-preview2-28208-8" />
+<PackageReference Include="Microsoft.ML.OnnxTransformer" Version="1.6.2-preview2-28208-8" />
+<PackageReference Include="Microsoft.ML.TensorFlow" Version="1.6.2-preview2-28208-8" />
+<PackageReference Include="Microsoft.ML.Dnn" Version="0.18.2-preview2-28208-8" />
+<PackageReference Include="Microsoft.ML.Ensemble" Version="0.18.2-preview2-28208-8" />
+<PackageReference Include="Microsoft.ML.TimeSeries" Version="1.6.2-preview2-28208-8" />
+<PackageReference Include="Microsoft.ML.Featurizers" Version="0.18.2-preview2-28208-8" />
+<PackageReference Include="MicrosoftMLFeaturizers" Version="0.1.0" />
<PackageReference Include="Microsoft.DataPrep" Version="0.0.1.12-preview" />
<PackageReference Include="TensorFlow.NET" Version="0.11.3" />
<PackageReference Include="SciSharp.TensorFlow.Redist" Version="1.14.0" />
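
Review note: src/DotNetBridge/DotNetBridge.csproj and src/Platforms/build.csproj carry the same PackageReference list, so a version bump like this one has to be applied to both. A small check along the following lines could flag drift between them. It is a hedged sketch, not part of this PR: `package_versions` and `version_mismatches` are hypothetical helpers, and the parsing assumes SDK-style project files without an XML namespace.

```python
import xml.etree.ElementTree as ET


def package_versions(csproj_xml):
    """Map each PackageReference Include name to its Version attribute."""
    root = ET.fromstring(csproj_xml)
    return {
        ref.get("Include"): ref.get("Version")
        for ref in root.iter("PackageReference")
    }


def version_mismatches(a_xml, b_xml):
    """Return {package: (version_in_a, version_in_b)} where the two differ."""
    a, b = package_versions(a_xml), package_versions(b_xml)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Tiny stand-ins for the two project files; one was left on the old version.
BRIDGE = """<Project><ItemGroup>
  <PackageReference Include="Microsoft.ML" Version="1.6.2-preview2-28208-8" />
  <PackageReference Include="MicrosoftMLFeaturizers" Version="0.1.0" />
</ItemGroup></Project>"""
PLATFORMS = """<Project><ItemGroup>
  <PackageReference Include="Microsoft.ML" Version="1.4.0-preview2" />
  <PackageReference Include="MicrosoftMLFeaturizers" Version="0.1.0" />
</ItemGroup></Project>"""

print(version_mismatches(BRIDGE, PLATFORMS))
```

A package present in only one file is reported with `None` for the other side, so missing references surface the same way as version differences.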