v0.13
New Functionality:
- Export trained LightGBM models for evaluation outside of Spark
- LightGBM on Spark supports multiple cores per executor
- CNTKModel works with multi-input multi-output models of any CNTK datatype
- Added Minibatching and Flattening transformers for adding flexible batching logic to pipelines, deep networks, and web clients
- Added Benchmark test API for tracking model performance across versions
- Added PartitionConsolidator function for aggregating streaming data onto one partition per executor (for use with connection/rate-limited HTTP services)
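
The idea behind the Minibatching and Flattening transformers can be illustrated with a minimal sketch in plain Python. This is not the MMLSpark API: the function names `minibatch` and `flatten` are hypothetical, and the sketch works on ordinary iterables rather than Spark DataFrames. It only shows the round trip of grouping rows into fixed-size batches (for example, before a deep network or an HTTP call) and then restoring one row per record.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def minibatch(rows: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Group consecutive rows into lists of at most batch_size items.

    Hypothetical helper for illustration; not part of MMLSpark.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def flatten(batches: Iterable[List[T]]) -> Iterator[T]:
    """Undo minibatch: emit the rows of each batch one at a time."""
    for batch in batches:
        yield from batch


rows = list(range(7))
batches = list(minibatch(rows, 3))   # the last batch may be smaller
restored = list(flatten(batches))    # same rows, same order
```

Flattening after the batched stage keeps the rest of a pipeline unaware that batching happened, which is presumably why the two transformers are introduced together.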
Updates and Improvements:
- Updated to Spark 2.3.0
- Added Databricks notebook tests to build system
- CNTKModel uses significantly less memory
- Simplified example notebooks
- Simplified APIs for MMLSpark Serving
- Simplified APIs for CNTK on Spark
- LightGBM stability improvements
- ComputeModelStatistics stability improvements
Acknowledgements:
We would like to acknowledge the external contributors who helped create
this version of MMLSpark (in order of commit history):
- 严伟, @terrytangyuan, @ywskycn, @dvanasseldonk, Jilong Liao, @chappers, @ekaterina-sereda-rf