@@ -7,32 +7,32 @@ In order to build this package, you need to build and install `cook jobclient` f
 git clone https://github.com/twosigma/Cook.git
 cd Cook/jobclient
 mvn package
-mvn org.apache.maven.plugins:maven-install-plugin:2.5.2:install-file -Dfile=target/cook-jobclient-0.1.0.jar -DpomFile=pom.xml
+mvn org.apache.maven.plugins:maven-install-plugin:2.5.2:install-file \
+    -Dfile=target/cook-jobclient-0.1.2-snapshot.jar \
+    -DpomFile=pom.xml
 ```
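For context, the `install-file` goal above places the jar in the local Maven repository (`~/.m2/repository`) using the standard `groupId/artifactId/version` layout, which is where the subsequent Spark build resolves it from. A minimal sketch of that layout; the groupId `com.twosigma` is an assumption, not taken from this patch:

```shell
# Compose the local-repo path that install-file targets.
# Layout: <repo>/<groupId, dots as slashes>/<artifact>/<version>/<artifact>-<version>.jar
m2_jar_path() {
    repo="$1"; group="$2"; artifact="$3"; version="$4"
    echo "${repo}/$(echo "${group}" | tr . /)/${artifact}/${version}/${artifact}-${version}.jar"
}

# com.twosigma is a guess for illustration -- check the jobclient pom.xml.
m2_jar_path "${HOME}/.m2/repository" com.twosigma cook-jobclient 0.1.2-snapshot
```

If the Spark build later fails to resolve `cook-jobclient`, checking whether this path exists is a quick way to tell whether the `install-file` step ran.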
 
-Now, we are ready to build the Spark distribution as follows.
-
+Now, we are ready to build the Spark distribution as follows. If you are on
+Java 7, you may need to increase the heap size available to Maven; on Java 8,
+you can skip the following step.
 ```
-# Install package to local m2 repository
-build/mvn install -DskipTests=true -Dscala-2.11 -Phadoop-2.6 -Dhadoop.version=2.6.0-cdh5.4.4jco
-
-# Build jar for release without hive support
-./make-distribution.sh --tgz --skip-java-test --scala-version 2.11 -Phadoop-2.6 -Dhadoop.version=2.6.0-cdh5.4.4jco
-
-# Build jar for release with hive support
-./make-distribution.sh --tgz --skip-java-test --scala-version 2.11 -Phive -Phive-thriftserver -Phadoop-2.6 -Dhadoop.version=2.6.0-cdh5.4.4jco
+export MAVEN_OPTS="-Xmx4g -XX:MaxPermSize=1024M -XX:ReservedCodeCacheSize=1024m"
+```
+Then, we can run
+```
+./dev/make-distribution.sh --tgz --name hadoop-provided-scala2.11 -Dscala-2.11 -Phadoop-2.6,hadoop-provided,hive -DskipTests
 ```
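The Java 7 caveat above can be made mechanical: `-XX:MaxPermSize` only exists on pre-Java 8 JVMs (PermGen was removed in Java 8), so the export is only useful there. A hedged sketch; the helper name and version strings are illustrative, not part of the build scripts:

```shell
# Decide whether the MAVEN_OPTS bump is needed, keyed off a `java -version`
# style string. PermGen (MaxPermSize) only exists on Java 7 and earlier.
needs_maven_opts() {
    case "$1" in
        1.[0-7].*) return 0 ;;  # Java 7 or earlier: raise heap and PermGen
        *)         return 1 ;;  # Java 8+: skip the export
    esac
}

if needs_maven_opts "1.7.0_80"; then
    export MAVEN_OPTS="-Xmx4g -XX:MaxPermSize=1024M -XX:ReservedCodeCacheSize=1024m"
fi
```

On Java 8 the JVM still accepts `-XX:MaxPermSize` but warns that the option is ignored, so leaving the export set is harmless noise rather than an error.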
 
 The tarball will be created with the hadoop version and scala version
 embedded in the tarball name. Additionally, we use `git describe
 --tags` to create the spark version, rather than just taking what's in
 the pom.xml files. This way, we get a tarball name that looks like
 
-spark-1.6.1-31-g9dc4df0-bin-hadoop2.6.0-cdh5.4.4jco-scala2.10.tgz
+spark-2.0.2-31-g9dc4df0-bin-hadoop-provided-scala2.11.tgz
 
 rather than
 
-spark-1.6.1-bin-2.6.0-cdh5.4.4jco.tgz
+spark-2.0.2-bin-hadoop-provided-scala2.11.tgz
 
 and thus we can manage multiple internal releases on the same upstream
 version, and also manage our scala version dependencies appropriately.
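The `tag-count-hash` shape that `git describe --tags` contributes to the tarball name is easy to see in a throwaway repository. A self-contained sketch (the `v2.0.2` tag and commit messages are made up; presumably the leading `v` is stripped when composing the tarball name):

```shell
# Demonstrate the tag-count-hash output of `git describe --tags` in a
# scratch repository: nearest tag, commits since it, abbreviated hash.
tmp="$(mktemp -d)"
cd "${tmp}"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "tagged release"
git tag v2.0.2
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "one commit past the tag"
git describe --tags   # v2.0.2-1-g<abbreviated hash>
```

When `HEAD` is exactly on a tag, `git describe --tags` prints just the tag, which is why two internal builds from different commits on the same upstream release still get distinct names.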