
Commit 2d02d9f

Mark Hatton committed: Add brief tutorial to README.md

1 parent 2dfc07f · commit 2d02d9f

2 files changed: +65 / -6 lines

README.md

Lines changed: 60 additions & 3 deletions
@@ -1,4 +1,61 @@

The three deleted lines are the old placeholder ("AWS DataPipeline DSL for Scala" as a plain heading, plus "More here soon..."); the new README content follows:

# AWS DataPipeline DSL for Scala

A Scala domain-specific language and toolkit to help you build and maintain AWS DataPipeline definitions.

This tool aims to ease the burden of maintaining a large suite of AWS DataPipelines. At Shazam, we use this tool to define our data pipelines in Scala code and avoid the boilerplate and maintenance headache of managing tens or hundreds of JSON pipeline configuration files.

Benefits:

- Write and maintain Scala code instead of JSON configuration
- Use the DSL's `>>` syntax to clearly express dependencies between your pipeline's activities (see the sketch after this list)
- Share code/configuration between your pipeline definitions
- Never write `dependsOn` or `precondition` again; this library manages all ids and object references for you
- Add your own wrapper around this library to predefine your most commonly-used data pipeline objects
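
To illustrate the `>>` syntax mentioned in the list above, here is a minimal sketch that chains two activities. The object name, activity names and commands are hypothetical, and the exact placement and semantics of `>>` (assumed here to mean that the right-hand activity runs after the left-hand one) are inferred from this README rather than confirmed against library documentation:

```
object EtlPipeline {

  import datapipeline.dsl._

  // Two illustrative activities, mirroring the ShellCommandActivity usage
  // shown in the tutorial below; the names and commands are placeholders.
  val extract = ShellCommandActivity(
    name = "Extract",
    workerGroup = "my-task-runner",
    Command("./extract.sh")
  )

  val transform = ShellCommandActivity(
    name = "Transform",
    workerGroup = "my-task-runner",
    Command("./transform.sh")
  )

  val pipeline =
    AwsDataPipeline(name = "EtlPipeline")
      .withSchedule(
        frequency = Daily,
        startDateTimeIso = "2018-01-01T00:00:00"
      )
      .withActivities(
        // Assumed: `transform` depends on `extract`, with the library
        // generating the underlying `dependsOn` references and ids for you.
        extract >> transform
      )

}
```
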
## Tutorial

Build the compiler using `sbt`:

```
$ sbt assembly
```

Create a "Hello World" AWS Data Pipeline definition in a Scala file:

```
object HelloWorldPipeline {

  import datapipeline.dsl._

  val pipeline =
    AwsDataPipeline(name = "HelloWorldPipeline")
      .withSchedule(
        frequency = Daily,
        startDateTimeIso = "2018-01-01T00:00:00"
      )
      .withActivities(
        ShellCommandActivity(
          name = "Echo Hello World",
          workerGroup = "my-task-runner",
          Command("echo 'Hello AWS Data Pipeline World!'")
        )
      )

}
```

Use the compiler to produce JSON from your Scala definition:

```
$ java -jar target/scala-2.12/datapipeline-compiler.jar HelloWorldPipeline HelloWorldPipeline.scala
Writing pipeline definition to: ./HelloWorldPipeline.json
```

The output JSON file contains your pipeline definition, ready to deploy to AWS.
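
As a rough guide to that deployment step, the sketch below uses the standard AWS CLI Data Pipeline commands. This is an illustrative workflow rather than part of this project: it assumes the generated JSON is in the format accepted by `put-pipeline-definition`, and `<pipeline-id>` stands for the id returned by `create-pipeline`:

```
$ aws datapipeline create-pipeline --name HelloWorldPipeline --unique-id hello-world-pipeline
$ aws datapipeline put-pipeline-definition --pipeline-id <pipeline-id> --pipeline-definition file://HelloWorldPipeline.json
$ aws datapipeline activate-pipeline --pipeline-id <pipeline-id>
```
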
## Supported AWS DataPipeline Objects

For details see [Supported Objects](Supported%20Objects.md).

## License

This tool is licensed under [APL 2.0].

src/main/scala/datapipeline/compiler/AwsDataPipelineCompiler.scala

Lines changed: 5 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -62,12 +62,14 @@ object AwsDataPipelineCompilerHelpers {

    if (!clazz.getDeclaredFields.map(_.getName).contains(PipelineField)) fail(
      s"""Error: The class $className does not have a field named '$PipelineField'.
-        |Your pipeline definition singleton should include a field named 'pipeline' of type datapipeline.dsl.PipelineBuilder,
+        |Your pipeline definition singleton should include a field named '$PipelineField' of type datapipeline.dsl.PipelineBuilder,
         |e.g.:
         |
-        |object DataPipeline {
+        |object MyDataPipeline {
         |
-        | val pipeline = datapipeline.dsl.AwsDataPipeline(name = "my-pipeline", ...)
+        | import datapipeline.dsl._
+        |
+        | val $PipelineField = AwsDataPipeline(name = "MyDataPipeline", ...)
         |
         |}
      """.stripMargin
