[![Build Status](https://msazure.visualstudio.com/Cognitive%20Services/_apis/build/status/Azure.mmlspark?branchName=master)](https://msazure.visualstudio.com/Cognitive%20Services/_build/latest?definitionId=83120&branchName=master) [![codecov](https://codecov.io/gh/Azure/mmlspark/branch/master/graph/badge.svg)](https://codecov.io/gh/Azure/mmlspark) [![Gitter](https://badges.gitter.im/Microsoft/MMLSpark.svg)](https://gitter.im/Microsoft/MMLSpark?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge)
[![Release Notes](https://img.shields.io/badge/release-notes-blue)](https://github.com/Azure/mmlspark/releases) [![version](https://img.shields.io/badge/version-0.18.0-blue)](https://github.com/Azure/mmlspark/releases) [![version](https://mmlspark.blob.core.windows.net/icons/badges/master_version3.svg)](#sbt)
MMLSpark is an ecosystem of tools aimed at expanding the distributed computing framework [Apache Spark](https://github.com/apache/spark) in several new directions.

MMLSpark can be conveniently installed on existing Spark clusters via the
`--packages` option, for example:
```bash
spark-shell --packages com.microsoft.ml.spark:mmlspark_2.11:0.18.0
pyspark --packages com.microsoft.ml.spark:mmlspark_2.11:0.18.0
spark-submit --packages com.microsoft.ml.spark:mmlspark_2.11:0.18.0 MyApp.jar
```
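The argument passed to `--packages` is a standard Maven coordinate of the form `groupId:artifactId:version`, where the artifact name carries the Scala binary version suffix (`_2.11`). A quick sanity check of the coordinate string used above (an illustrative sketch, not part of MMLSpark itself):

```python
# Split the Maven coordinate used with --packages into its three parts:
# groupId : artifactId : version
coord = "com.microsoft.ml.spark:mmlspark_2.11:0.18.0"
group_id, artifact_id, version = coord.split(":")
print(group_id)     # com.microsoft.ml.spark
print(artifact_id)  # mmlspark_2.11 (Scala binary version suffix)
print(version)      # 0.18.0
```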
This can be used in other Spark contexts too. For example, you can use MMLSpark
in [AZTK](https://github.com/Azure/aztk) by [adding it to the
`.aztk/spark-defaults.conf`
file](https://github.com/Azure/aztk/wiki/PySpark-on-Azure-with-AZTK#optional-set-up-mmlspark).

### Databricks cloud

To install MMLSpark on the [Databricks
cloud](http://community.cloud.databricks.com), create a new [library from Maven
coordinates](https://docs.databricks.com/user-guide/libraries.html#libraries-from-maven-pypi-or-spark-packages)
in your workspace.

For the coordinates use: `com.microsoft.ml.spark:mmlspark_2.11:0.18.0`. Ensure this library is
attached to all clusters you create.

Finally, ensure that your Spark cluster has at least Spark 2.1 and Scala 2.11.
You can use MMLSpark in both your Scala and PySpark notebooks. To get started with our example notebooks, import the following Databricks archive:

`https://mmlspark.blob.core.windows.net/dbcs/MMLSpark%20Examples%20v0.18.0.dbc`

### Docker

To try out MMLSpark on a Python (or Conda) installation you can get Spark
installed via pip with `pip install pyspark`. You can then use `pyspark` as in
the above example, or from python:

```python
import pyspark
spark = pyspark.sql.SparkSession.builder.appName("MyApp") \
            .config("spark.jars.packages", "com.microsoft.ml.spark:mmlspark_2.11:0.18.0") \
            .getOrCreate()
import mmlspark
```
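Rather than setting `spark.jars.packages` programmatically in every session, the same package can be pinned once for all sessions (a sketch, assuming you manage settings through Spark's `conf/spark-defaults.conf` file):

```
spark.jars.packages    com.microsoft.ml.spark:mmlspark_2.11:0.18.0
```

With this in place, a plain `pyspark` or `spark-shell` launch picks up MMLSpark without any extra flags.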
<img title="Script action submission" src="http://i.imgur.com/oQcS0R2.png" align="right" />
### SBT
If you are building a Spark application in Scala, add the following lines to
your `build.sbt`:

```scala
libraryDependencies += "com.microsoft.ml.spark" %% "mmlspark" % "0.18.0"
```
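Note that the `%%` operator appends your project's Scala binary version to the artifact name (yielding `mmlspark_2.11`), so the build must target Scala 2.11.x for the dependency to resolve. A minimal `build.sbt` sketch (the 2.11.12 patch version is illustrative):

```scala
// scalaVersion must be a 2.11.x release so %% resolves mmlspark_2.11
scalaVersion := "2.11.12"
libraryDependencies += "com.microsoft.ml.spark" %% "mmlspark" % "0.18.0"
```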
### Building from source