
stefanvanwouw/puppet-spark

Puppet module for Spark (0.9.0)

Puppet module to install Spark (0.9.0) on your Hadoop cluster.

No Debian packages are available for Spark, and the pre-compiled Spark releases are not compatible with CDH 4.4.0. I therefore built Spark 0.9.0 (incubating) from source and included the entire dist directory in the Puppet module.

To deploy another version of Spark, compile it with the following commands (this example builds the older Spark 0.8.0):

wget https://github.com/apache/incubator-spark/archive/v0.8.0-incubating.tar.gz
tar xvf v0.8.0-incubating.tar.gz
cd incubator-spark-0.8.0-incubating/
./make-distribution.sh --hadoop 2.0.0-cdh4.4.0
cp conf/log4j.properties.template dist/conf/log4j.properties

# Replace the standard distribution with the one you just compiled:
rm -rf /etc/puppet/modules/spark/files/spark
cp -r dist /etc/puppet/modules/spark/files/spark
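
The build steps above can be wrapped in a small shell function so that the Spark and Hadoop versions become parameters. This is only a sketch: the function names (spark_tarball_url, spark_source_dir, build_spark_dist) are hypothetical and not part of the module, and it assumes other releases follow the same tarball and directory naming as the 0.8.0 example, which may not hold for every version.

```shell
# Hypothetical helpers, not part of the module.
# Assumption: releases follow the v<version>-incubating naming used above.
spark_tarball_url() {
    printf 'https://github.com/apache/incubator-spark/archive/v%s-incubating.tar.gz\n' "$1"
}

spark_source_dir() {
    printf 'incubator-spark-%s-incubating\n' "$1"
}

# build_spark_dist <spark version> <hadoop version> <absolute module files dir>
build_spark_dist() {
    url=$(spark_tarball_url "$1")
    src=$(spark_source_dir "$1")
    wget "$url" &&
    tar xf "${url##*/}" &&
    (
        # Run the build in a subshell so the caller's working directory is kept.
        cd "$src" &&
        ./make-distribution.sh --hadoop "$2" &&
        cp conf/log4j.properties.template dist/conf/log4j.properties &&
        # Replace the distribution shipped with the module.
        rm -rf "$3/spark" &&
        cp -r dist "$3/spark"
    )
}
```

With these definitions loaded, `build_spark_dist 0.8.0 2.0.0-cdh4.4.0 /etc/puppet/modules/spark/files` should perform the same steps as the commands above. The third argument needs to be an absolute path, because the copy runs from inside the source directory.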

Note: Spark 0.8.0 does not compile with YARN support enabled against the YARN version shipped with CDH 4.4.0.

Dependencies not made explicit in the module itself:

  • Oracle Java 6 (Java 7 for Spark 0.9.0+) must be installed on all nodes (a requirement of Spark).
  • Apache HDFS must be installed (the CDH4 version provided by https://github.com/wikimedia/puppet-cdh4 ).
  • The OS should be Ubuntu or Debian, because of the package dependencies.
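
Since the module does not enforce these dependencies, a quick check on each node before applying it may save some debugging. The following is a minimal sketch, not part of the module: the function names java_major and preflight are hypothetical, it assumes that `java` and `hadoop` are on the PATH when installed, and it only checks the Java version number, not whether the JVM is Oracle's.

```shell
# Hypothetical helper: extract the major version from a Java version string,
# e.g. "1.7.0_51" -> 7 (old 1.x scheme) or "11.0.2" -> 11.
java_major() {
    case "$1" in
        1.*) v="${1#1.}"; printf '%s\n' "${v%%.*}" ;;
        *)   printf '%s\n' "${1%%.*}" ;;
    esac
}

# preflight <minimum java major version>, e.g. "preflight 7" for Spark 0.9.0
preflight() {
    status=0
    # Debian and Ubuntu both ship /etc/debian_version.
    [ -f /etc/debian_version ] || { echo "OS is not Debian/Ubuntu" >&2; status=1; }
    if command -v java >/dev/null 2>&1; then
        # "java -version" prints to stderr; the version is the quoted string.
        found=$(java -version 2>&1 | awk -F'"' 'NR==1 {print $2}')
        major=$(java_major "$found")
        [ "${major:-0}" -ge "$1" ] 2>/dev/null ||
            { echo "Java $1 or newer required, found: ${found:-unknown}" >&2; status=1; }
    else
        echo "java not found on PATH" >&2; status=1
    fi
    # Assumption: an installed Hadoop/HDFS puts the hadoop command on the PATH.
    command -v hadoop >/dev/null 2>&1 || { echo "hadoop not found on PATH" >&2; status=1; }
    return "$status"
}
```

Running `preflight 7` on a node returns a non-zero status and prints one line per missing dependency.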

Usage:

On the master node:

class {'spark::master':
    worker_mem => 'worker memory e.g. 60g',
    require => [
        Class['your::class::that::ensures::java::is::installed'], 
        Class['cdh4::hadoop']
    ],
}

On the worker nodes:

class {'spark::worker':
    master => $master_fqdn,
    memory => 'worker memory e.g. 60g',
    require => [
        Class['your::class::that::ensures::java::is::installed'], 
        Class['cdh4::hadoop']
    ],
}
