Building bigger platforms with scalding

P. Oscar Boykin edited this page Jun 25, 2013 · 9 revisions

Please add to this page; only a sketch is here now.

You can implement methods on ordinary objects or classes that do Scalding computations; the code does not have to live in the body of a Job.

Using the Typed-API outside of a Job constructor

This is the recommended approach because you can see the types going in and out, and the compiler can help you get it right.

To do: talk about importing TDsl._ and taking (implicit flow: FlowDef, mode: Mode) in any function that reads or writes a source.
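As a rough illustration of this pattern, here is a minimal sketch of a typed word count split into a pure transformation and a function that does the I/O. The object and method names (TypedWordCount, wordCounts, run) are hypothetical, and the exact TypedPipe methods available depend on your Scalding version, so treat this as an outline rather than the definitive API.

```scala
import cascading.flow.FlowDef
import com.twitter.scalding._
import com.twitter.scalding.TDsl._

// Hypothetical helper object; none of these names come from Scalding itself.
object TypedWordCount {

  // Pure transformation: no source is read or written here,
  // so no implicit FlowDef or Mode is required.
  def wordCounts(lines: TypedPipe[String]): TypedPipe[(String, Long)] =
    lines
      .flatMap { line => line.split("""\s+""") }
      .groupBy { word => word }
      .size
      .toTypedPipe

  // Reads and writes sources, so it takes the implicit FlowDef and Mode.
  def run(input: String, output: String)
         (implicit flowDef: FlowDef, mode: Mode): Unit = {
    val lines = TypedPipe.from(TextLine(input))
    wordCounts(lines).write(TypedTsv[(String, Long)](output))
  }
}
```

Keeping the transformation free of implicits makes the input and output types visible in the signature, which is where the compiler can help.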

Using the Fields-API outside of a Job constructor

This is a bit challenging because you have to be careful about which fields you leave in the Pipe, and there is little help from the compiler.

To do: talk about importing Dsl._ and taking (implicit flow: FlowDef, mode: Mode) in any function that reads or writes a source.
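A minimal sketch of the same idea with the Fields API follows. The object and method names are hypothetical, and the field names ('line, 'word, 'count) are chosen for illustration; the explicit project is there only to make the outgoing fields obvious, since the compiler will not check them.

```scala
import cascading.flow.FlowDef
import cascading.pipe.Pipe
import com.twitter.scalding._
import com.twitter.scalding.Dsl._

// Hypothetical helper object; none of these names come from Scalding itself.
object FieldsWordCount {

  // Reads and writes sources, so it takes the implicit FlowDef and Mode.
  def run(input: String, output: String)
         (implicit flowDef: FlowDef, mode: Mode): Pipe =
    TextLine(input).read
      .flatMap('line -> 'word) { line: String => line.split("""\s+""") }
      .groupBy('word) { _.size('count) }
      // State the fields that leave this function explicitly.
      .project('word, 'count)
      .write(Tsv(output))
}
```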

Customizing Job execution

Mention specialized Job examples (CascadeJob for instance).
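As one illustration, a CascadeJob runs several Jobs together as a Cascading cascade. The sketch below assumes that CascadeJob exposes an abstract jobs member returning a Seq[Job]; the three class names and the "intermediate" argument are hypothetical.

```scala
import com.twitter.scalding._

// Hypothetical first step: count words into an intermediate file.
class CountStep(args: Args) extends Job(args) {
  TextLine(args("input")).read
    .flatMap('line -> 'word) { line: String => line.split("""\s+""") }
    .groupBy('word) { _.size('count) }
    .write(Tsv(args("intermediate")))
}

// Hypothetical second step: keep only words seen more than once.
class FilterStep(args: Args) extends Job(args) {
  Tsv(args("intermediate"), ('word, 'count)).read
    .filter('count) { count: Long => count > 1L }
    .write(Tsv(args("output")))
}

// Assumption: CascadeJob declares `def jobs: Seq[Job]`.
class CountThenFilter(args: Args) extends CascadeJob(args) {
  override lazy val jobs: Seq[Job] =
    Seq(new CountStep(args), new FilterStep(args))
}
```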

Using scalding outside of a com.twitter.scalding.Job (or Tool)

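Do what you would with Cascading directly: create a Mode and a FlowDef, let your function populate the FlowDef, then connect and run the flow:

        import cascading.flow.FlowDef
        import com.twitter.scalding._
        import org.apache.hadoop.mapred.JobConf

        // Assumes the Hdfs(strict, conf) constructor; check your Scalding version.
        implicit val mode: Mode = Hdfs(true, new JobConf())
        implicit val flowDef: FlowDef = new FlowDef
        flowDef.setName(jobName) // jobName: whatever name you want for the flow
        val result = myFunctionThatTakesFlowDefAndMode(flowDef, mode)
        // Properties for the flow connector (inside a Job, Job.config supplies these).
        val config = Map[AnyRef, AnyRef]()
        // Now we have a populated flowDef, time to let Cascading do its thing:
        mode.newFlowConnector(config).connect(flowDef).complete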
