Commit 7eed833

Bump version to v0.14

1 parent 25f3601

4 files changed (+32, -27 lines)

README.md

Lines changed: 23 additions & 18 deletions
@@ -5,15 +5,20 @@ Microsoft Machine Learning for Apache Spark
 <img title="Build Status" align="right"
 src="https://mmlspark.azureedge.net/icons/BuildStatus.svg" />

-MMLSpark provides a number of deep learning and data science tools for [Apache
-Spark](https://github.com/apache/spark), including seamless integration of
-Spark Machine Learning pipelines with [Microsoft Cognitive Toolkit
-(CNTK)](https://github.com/Microsoft/CNTK) and
-[OpenCV](http://www.opencv.org/), enabling you to quickly create powerful,
-highly-scalable predictive and analytical models for large image and text
-datasets.
-
-MMLSpark requires Scala 2.11, Spark 2.1+, and either Python 2.7 or Python 3.5+.
+MMLSpark is an ecosystem of tools that aims to expand the distributed computing framework
+[Apache Spark](https://github.com/apache/spark) in several new directions.
+MMLSpark adds a number of deep learning and data science tools to the Spark ecosystem,
+including seamless integration of Spark Machine Learning pipelines with [Microsoft Cognitive Toolkit
+(CNTK)](https://github.com/Microsoft/CNTK), [LightGBM](https://github.com/Microsoft/LightGBM) and
+[OpenCV](http://www.opencv.org/). This enables powerful and highly scalable predictive and analytical models
+for a variety of data sources.
+
+MMLSpark also brings new networking capabilities to the Spark ecosystem. With the HTTP on Spark project, users
+can embed **any** web service into their SparkML models. In this vein, MMLSpark provides easy-to-use
+SparkML transformers for a wide variety of [Microsoft Cognitive Services](https://azure.microsoft.com/en-us/services/cognitive-services/). For production-grade deployment, the Spark Serving project enables high-throughput,
+sub-millisecond-latency web services, backed by your Spark cluster.
+
+MMLSpark requires Scala 2.11, Spark 2.3+, and either Python 2.7 or Python 3.5+.
 See the API documentation [for
 Scala](http://mmlspark.azureedge.net/docs/scala/) and [for
 PySpark](http://mmlspark.azureedge.net/docs/pyspark/).
@@ -151,9 +156,9 @@ MMLSpark can be conveniently installed on existing Spark clusters via the
 `--packages` option, examples:

 ```bash
-spark-shell --packages Azure:mmlspark:0.13
-pyspark --packages Azure:mmlspark:0.13
-spark-submit --packages Azure:mmlspark:0.13 MyApp.jar
+spark-shell --packages Azure:mmlspark:0.14
+pyspark --packages Azure:mmlspark:0.14
+spark-submit --packages Azure:mmlspark:0.14 MyApp.jar
 ```

 This can be used in other Spark contexts too, for example, you can use MMLSpark
@@ -168,14 +173,14 @@ cloud](http://community.cloud.databricks.com), create a new [library from Maven
 coordinates](https://docs.databricks.com/user-guide/libraries.html#libraries-from-maven-pypi-or-spark-packages)
 in your workspace.

-For the coordinates use: `Azure:mmlspark:0.13`. Ensure this library is
+For the coordinates use: `Azure:mmlspark:0.14`. Ensure this library is
 attached to all clusters you create.

 Finally, ensure that your Spark cluster has at least Spark 2.1 and Scala 2.11.

 You can use MMLSpark in both your Scala and PySpark notebooks. To get started with our example notebooks import the following databricks archive:

-```https://mmlspark.blob.core.windows.net/dbcs/MMLSpark%20Examples%20v0.13.dbc```
+```https://mmlspark.blob.core.windows.net/dbcs/MMLSpark%20Examples%20v0.14.dbc```


 ### Docker
@@ -208,7 +213,7 @@ the above example, or from python:
 ```python
 import pyspark
 spark = pyspark.sql.SparkSession.builder.appName("MyApp") \
-        .config("spark.jars.packages", "Azure:mmlspark:0.13") \
+        .config("spark.jars.packages", "Azure:mmlspark:0.14") \
         .getOrCreate()
 import mmlspark
 ```
@@ -224,7 +229,7 @@ running script actions, see [this
 guide](https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-customize-cluster-linux#use-a-script-action-during-cluster-creation).

 The script action url is:
-<https://mmlspark.azureedge.net/buildartifacts/0.13/install-mmlspark.sh>.
+<https://mmlspark.azureedge.net/buildartifacts/0.14/install-mmlspark.sh>.

 If you're using the Azure Portal to run the script action, go to `Script
 actions` → `Submit new` in the `Overview` section of your cluster blade. In
@@ -240,7 +245,7 @@ your `build.sbt`:

 ```scala
 resolvers += "MMLSpark Repo" at "https://mmlspark.azureedge.net/maven"
-libraryDependencies += "com.microsoft.ml.spark" %% "mmlspark" % "0.13"
+libraryDependencies += "com.microsoft.ml.spark" %% "mmlspark" % "0.14"
 ```

 ### Building from source
@@ -314,4 +319,4 @@ PMML](https://github.com/alipay/jpmml-sparkml-lightgbm)

 *Apache®, Apache Spark, and Spark® are either registered trademarks or
 trademarks of the Apache Software Foundation in the United States and/or other
-countries.*
+countries.*
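For orientation, the new README text above highlights LightGBM on Spark, and the diff shows how the `--packages` coordinates change to `Azure:mmlspark:0.14`. The following sketch (not part of this commit) ties the two together in PySpark. The `LightGBMClassifier` import and its parameter names are best-effort assumptions about the v0.14 API, and `train.parquet` is a hypothetical featurized dataset; confirm names against the PySpark API documentation linked in the README.

```python
import pyspark

# Pull the MMLSpark package at the coordinates this commit bumps to.
spark = (pyspark.sql.SparkSession.builder
         .appName("MyApp")
         .config("spark.jars.packages", "Azure:mmlspark:0.14")
         .getOrCreate())

# Assumption: LightGBMClassifier is exported at the package top level in v0.14
# and follows SparkML Estimator conventions (featuresCol holds an assembled vector).
from mmlspark import LightGBMClassifier

train = spark.read.parquet("train.parquet")  # hypothetical featurized dataset
model = LightGBMClassifier(featuresCol="features",
                           labelCol="label",
                           numIterations=100).fit(train)
```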

docs/R-setup.md

Lines changed: 3 additions & 3 deletions
@@ -10,7 +10,7 @@ To install the current MMLSpark package for R use:

 ```R
 ...
-devtools::install_url("https://mmlspark.azureedge.net/rrr/mmlspark-0.13.zip")
+devtools::install_url("https://mmlspark.azureedge.net/rrr/mmlspark-0.14.zip")
 ...
 ```

@@ -23,7 +23,7 @@ It will take some time to install all dependencies. Then, run:
 library(sparklyr)
 library(dplyr)
 config <- spark_config()
-config$sparklyr.defaultPackages <- "Azure:mmlspark:0.13"
+config$sparklyr.defaultPackages <- "Azure:mmlspark:0.14"
 sc <- spark_connect(master = "local", config = config)
 ...
 ```
@@ -83,7 +83,7 @@ and then use spark_connect with method = "databricks":

 ```R
 install.packages("devtools")
-devtools::install_url("https://mmlspark.azureedge.net/rrr/mmlspark-0.13.zip")
+devtools::install_url("https://mmlspark.azureedge.net/rrr/mmlspark-0.14.zip")
 library(sparklyr)
 library(dplyr)
 sc <- spark_connect(method = "databricks")

docs/docker.md

Lines changed: 4 additions & 4 deletions
@@ -29,7 +29,7 @@ You can now select one of the sample notebooks and run it, or create your own.
 In the above, `microsoft/mmlspark` specifies the project and image name that you
 want to run. There is another component implicit here which is the *tag* (=
 version) that you want to use — specifying it explicitly looks like
-`microsoft/mmlspark:0.13` for the `0.13` tag.
+`microsoft/mmlspark:0.14` for the `0.14` tag.

 Leaving `microsoft/mmlspark` by itself has an implicit `latest` tag, so it is
 equivalent to `microsoft/mmlspark:latest`. The `latest` tag is identical to the
@@ -47,7 +47,7 @@ that you will probably want to use can look as follows:
 -e ACCEPT_EULA=y \
 -p 127.0.0.1:80:8888 \
 -v ~/myfiles:/notebooks/myfiles \
-microsoft/mmlspark:0.13
+microsoft/mmlspark:0.14
 ```

 In this example, backslashes are used to break things up for readability; you
@@ -59,7 +59,7 @@ path and line breaks looks a little different:
 -e ACCEPT_EULA=y `
 -p 127.0.0.1:80:8888 `
 -v C:\myfiles:/notebooks/myfiles `
-microsoft/mmlspark:0.13
+microsoft/mmlspark:0.14
 ```

 Let's break this command down and go over the meaning of each part:
@@ -143,7 +143,7 @@ Let's break this command down and go over the meaning of each part:
 model.write().overwrite().save('myfiles/myTrainedModel.mml')
 ```

-* **`microsoft/mmlspark:0.13`**
+* **`microsoft/mmlspark:0.14`**

 Finally, this specifies an explicit version tag for the image that we want to
 run.
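A practical note on the `-v ~/myfiles:/notebooks/myfiles` mapping in the commands above: anything saved under `myfiles/` from inside the container persists on the host, so a trained model can be reloaded in a later session. A minimal sketch, assuming an active SparkSession and that the object saved in the docs' example is a standard SparkML `PipelineModel`:

```python
from pyspark.ml import PipelineModel

# 'myfiles/myTrainedModel.mml' is the path used in the save example above;
# relative paths resolve under /notebooks inside the container.
model = PipelineModel.load("myfiles/myTrainedModel.mml")
```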

docs/gpu-setup.md

Lines changed: 2 additions & 2 deletions
@@ -26,7 +26,7 @@ to check availability in your data center.
 MMLSpark provides an Azure Resource Manager (ARM) template to create a
 default setup that includes an HDInsight cluster and a GPU machine for
 training. The template can be found here:
-<https://mmlspark.azureedge.net/buildartifacts/0.13/deploy-main-template.json>.
+<https://mmlspark.azureedge.net/buildartifacts/0.14/deploy-main-template.json>.

 It has the following parameters that configure the HDI Spark cluster and
 the associated GPU VM:
@@ -69,7 +69,7 @@ GPU VM setup template at experimentation time.
 ### 1. Deploy an ARM template within the [Azure Portal](https://ms.portal.azure.com/)

 [Click here to open the above main
-template](https://portal.azure.com/#create/Microsoft.Template/uri/https%3A%2F%2Fmmlspark.azureedge.net%2Fbuildartifacts%2F0.13%2Fdeploy-main-template.json)
+template](https://portal.azure.com/#create/Microsoft.Template/uri/https%3A%2F%2Fmmlspark.azureedge.net%2Fbuildartifacts%2F0.14%2Fdeploy-main-template.json)
 in the Azure portal.

 (If needed, you can click the **Edit template** button to view and edit the
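The portal flow above is the documented path; for completeness, the same template can be deployed programmatically. This is only a sketch: it assumes the `azure-identity` and `azure-mgmt-resource` (track 2) packages, and the resource group, deployment name, and parameter values are placeholders — the real parameter names are the ones listed in this doc's parameter table.

```python
# Sketch: deploy the v0.14 main template via the Azure Python SDK (assumed packages).
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient
from azure.mgmt.resource.resources.models import (
    Deployment, DeploymentProperties, DeploymentMode, TemplateLink)

client = ResourceManagementClient(DefaultAzureCredential(), "<subscription-id>")
poller = client.deployments.begin_create_or_update(
    "my-resource-group",          # placeholder resource group
    "mmlspark-gpu-setup",         # placeholder deployment name
    Deployment(properties=DeploymentProperties(
        mode=DeploymentMode.INCREMENTAL,
        template_link=TemplateLink(
            uri="https://mmlspark.azureedge.net/buildartifacts/0.14/deploy-main-template.json"),
        # Fill in the HDI cluster / GPU VM parameters from the table in this doc.
        parameters={},
    )))
poller.wait()
```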
