New functionality:
* We are now uploading MMLSpark as an "Azure/mmlspark" Spark package.
Use `--packages Azure:mmlspark:0.8` with the Spark command-line tools.
* Add a bi-directional LSTM medical entity extractor to the
`ModelDownloader`, and a new Jupyter notebook for medical entity
extraction using NLTK, PubMed word embeddings, and the Bi-LSTM.
* Add `ImageSetAugmenter` for easy dataset augmentation within image
processing pipelines.
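
As a minimal sketch of how the `--packages Azure:mmlspark:0.8` option mentioned above attaches to the Spark command-line tools, the snippet below builds the argument lists without launching Spark. The helper `spark_command` and the script name `my_job.py` are hypothetical and used only for illustration; only the package coordinate comes from these notes.

```python
import shlex

# Package coordinate taken from the release note above.
MMLSPARK_PACKAGE = "Azure:mmlspark:0.8"


def spark_command(tool, *args, package=MMLSPARK_PACKAGE):
    """Return the argument list for a Spark CLI tool with the package attached.

    `spark_command` is a hypothetical helper, not part of MMLSpark or Spark.
    """
    return [tool, "--packages", package, *args]


# `my_job.py` is a placeholder script name.
submit_cmd = spark_command("spark-submit", "my_job.py")
shell_cmd = spark_command("pyspark")

# Print the commands as they would be typed in a terminal.
print(shlex.join(submit_cmd))
print(shlex.join(shell_cmd))
```

The resulting lists can be passed to `subprocess.run` on a machine where the Spark tools are on the `PATH`.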
Improvements:
* Optimize the performance of `CNTKModel`. It now broadcasts a loaded
model to workers and shares model weights between partitions on the
same worker. Minibatch padding (an internal workaround for a CNTK bug)
is no longer used, eliminating excess computation when there is a
mismatch between the partition size and the minibatch size.
* Bugfix: `CNTKModel` now works with models that have unnamed outputs.
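
To illustrate why dropping minibatch padding saves work, here is a rough sketch that counts how many rows get evaluated for one partition. It assumes that padding meant filling the final, partial minibatch up to the full minibatch size; these notes do not describe the actual mechanism, and `rows_computed` is a hypothetical helper, not part of `CNTKModel`.

```python
import math


def rows_computed(partition_size, minibatch_size, pad_last_batch):
    """Count rows evaluated for one partition, with or without padding.

    Assumption: padding fills the last partial minibatch to full size.
    """
    if partition_size == 0:
        return 0
    if pad_last_batch:
        # Every minibatch, including the last one, is evaluated at full size.
        return math.ceil(partition_size / minibatch_size) * minibatch_size
    # Without padding, only the rows actually present are evaluated.
    return partition_size


# Illustrative numbers only: 10 rows per partition, minibatches of 4.
padded = rows_computed(10, 4, pad_last_batch=True)     # batches of 4, 4, 4
unpadded = rows_computed(10, 4, pad_last_batch=False)  # batches of 4, 4, 2
print(padded, unpadded, padded - unpadded)
```

Under this assumption the overhead disappears whenever the partition size is an exact multiple of the minibatch size, which matches the note that the cost arises only on a mismatch.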
Docker image improvements:
* Environment variables are now part of the Docker image (in addition to
being set in bash).
* New Docker images:
- `microsoft/mmlspark:latest`: plain image, as always.
- `microsoft/mmlspark:gpu`: GPU variant based on an `nvidia/cuda` image.
- `microsoft/mmlspark:plus` and `microsoft/mmlspark:plus-gpu`: these
images contain additional packages for internal use; in future
releases they will probably also be based on an older Conda version.
Updates:
* The Conda environment now includes NLTK.
* Updated Java and SBT versions.