Docu2 #19

Closed
wants to merge 39 commits into from
d096eb2
Fix RFCs link
Sep 1, 2021
d2e1586
Update RFCs and fix link
Sep 1, 2021
ae93b11
Make homepage feature headers links
Sep 1, 2021
b431315
Fix homepage links
Sep 1, 2021
1a75764
Update homepage feature svgs
Sep 1, 2021
3651614
Add homepage feature svgs
Sep 1, 2021
ff63056
Add About Us page
Sep 1, 2021
ac86fc2
Update About Us, and fix links
Sep 1, 2021
fc53923
Fix footer links
Sep 1, 2021
e4fd327
Resolve merge conflicts in codecs.md
Licenser Sep 2, 2021
928f587
Update connectivity.md
Sep 2, 2021
72e3416
Update install.md
Sep 2, 2021
5e0e444
Update scripting.md
Sep 2, 2021
9628ecb
Update specialize.md
Sep 2, 2021
33aafa4
Update header in rfc template
Licenser Sep 2, 2021
243a298
Add title for rfc template
Licenser Sep 2, 2021
28f76d1
Add h1 header to RFC file
Sep 2, 2021
ed97e11
Delete duplicate files
Sep 3, 2021
e91ac76
Organise page files
Sep 3, 2021
11906ce
Update codecs.md
Sep 3, 2021
153521c
Update connectivity.md
Sep 3, 2021
1027a22
Update connectivity.md
Sep 3, 2021
f6d41aa
Update getting-started.md
Sep 3, 2021
7770bb6
Update getting-started.md
Sep 3, 2021
7763c99
Update install.md
Sep 3, 2021
e4691f8
Update scripting.md
Sep 3, 2021
8d36ca7
Update scripting.md
Sep 3, 2021
50a64b5
Update specialize.md
Sep 3, 2021
13b1764
Update 0002-pipeline-state-mechanism.md
Sep 3, 2021
b9c3292
Update 0003-linked-transports.md
Sep 3, 2021
370a648
Update 0004-sliding-window-mechanism.md
Sep 3, 2021
b3f8868
Update 0005-circuit-breaker-mechanism.md
Sep 3, 2021
ef8de65
Update 0006-plugin-development-kit.md
Sep 3, 2021
27e831e
Update 0007-pipeline-optimizations.md
Sep 3, 2021
cc4f33a
Update 0008-onramp-postgres.md
Sep 3, 2021
d93fd69
Update 0009-ramp-interface.md
Sep 3, 2021
5fb5a5b
Update 0010-modularity.md
Sep 3, 2021
5bfcac1
Update 0011-string-interpolation.md
Sep 3, 2021
1d72d26
Update 0012-correlation.md
Sep 3, 2021
Fix homepage links
Signed-off-by: skoech <sharonkoech5147@gmail.com>
Signed-off-by: Heinz N. Gies <heinz@licenser.net>
skoech authored and Licenser committed Sep 2, 2021
commit b4313153137dd4401e2c5f2f1ba23f062bddeedd
75 changes: 75 additions & 0 deletions src/pages/codecs.md
@@ -0,0 +1,75 @@
---
title: Codecs
description: Understanding data. De- and encode data from wire formats.
hide_table_of_contents: false
---

### Concept

Tremor connects to external systems using connectors.

Connectors that integrate Tremor with upstream systems, from which Tremor typically ingests data, are called `Onramps`.

Connectors that integrate Tremor with downstream systems, to which Tremor typically publishes or contributes data, are called `Offramps`.

`Onramps` and `Offramps` use `codecs` to transform the external wire form of connected system participants into a structured internal value type Tremor understands semantically.

Tremor's internal type system is JSON-like.

`Onramps` and `Offramps` support `preprocessors` and `postprocessors`. External data ingested into Tremor via `Onramps` can be pre-processed through multiple transformers before a codec is applied to convert the data into Tremor-internal form. Preprocessors are configured as a chain of transformations. Postprocessors are applied to values leaving Tremor after a codec transforms them from Tremor-internal form to wire form. Postprocessors are likewise configured as a chain of transformations.
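
The ingest and egress paths described above can be sketched in a few lines of Python. This is an illustrative model only, not Tremor's implementation: the function names (`split_lines`, `ingest`, `egress`) are hypothetical, and Python's `json` and `base64` modules merely stand in for a `json` codec, a `lines`-style preprocessor and a `base64`-style postprocessor.

```python
import base64
import json


def split_lines(chunk: bytes) -> list:
    """Preprocessor stand-in: split a raw chunk on newlines.

    Empty frames (such as the one after a trailing newline) are dropped.
    """
    return [frame for frame in chunk.split(b"\n") if frame]


def ingest(chunk: bytes) -> list:
    """Onramp side: run the preprocessor first, then decode each frame with the codec."""
    return [json.loads(frame) for frame in split_lines(chunk)]


def egress(value) -> bytes:
    """Offramp side: encode with the codec first, then run the postprocessor."""
    wire = json.dumps(value, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(wire)


events = ingest(b'{"a": 1}\n{"b": [true, null]}\n')
encoded = egress(events[0])
```

Note the ordering: on the way in, preprocessing happens before decoding; on the way out, encoding happens before postprocessing.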

Codecs are conceptually similar to [extractors](https://docs.tremor.rs/tremor-script/#extractors), but differ in their application. Codecs are applied to external data as it is ingested by or egressed from a running Tremor process.
Extractors, on the other hand, are Tremor-internal and convert data from and to Tremor's internal value type.

### Data Format

Tremor's internal data representation is JSON-like. The supported value types are:

* String (UTF-8 encoded)
* Numeric (float, integer)
* Boolean
* Null
* Array
* Record (string keys)
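
As a rough illustration of the value types listed above, the following Python sketch checks whether a value uses only JSON-like types. The helper name `is_json_like` is hypothetical, and the mapping onto Python types (`dict` for records, `list` for arrays) is an assumption made for the example.

```python
def is_json_like(value) -> bool:
    """Return True if the value uses only the JSON-like types listed above."""
    if value is None or isinstance(value, (str, bool, int, float)):
        return True
    if isinstance(value, list):
        return all(is_json_like(item) for item in value)
    if isinstance(value, dict):
        # Records must have string keys.
        return all(
            isinstance(key, str) and is_json_like(item)
            for key, item in value.items()
        )
    return False


event = {
    "host": "web-1",
    "latency_ms": 12.5,
    "tags": ["a", "b"],
    "ok": True,
    "error": None,
}
```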

### Codecs

Tremor neither requires nor validates schemas and works with schemaless or unstructured data. Validation can be performed with the tremor-script language. `Onramps`, `Offramps`, and other components of Tremor may, however, require or expect conformance with schemas.

Consult the documentation of each component for correct usage.

Tremor supports the encoding and decoding of the following formats:

* [json](https://docs.tremor.rs/artefacts/codecs#json)
* [msgpack](https://docs.tremor.rs/artefacts/codecs#msgpack)
* [influx](https://docs.tremor.rs/artefacts/codecs#influx)
* [binflux](https://docs.tremor.rs/artefacts/codecs#binflux)- (binary representation of the influx wire protocol).
* [statsd](https://docs.tremor.rs/artefacts/codecs#statsd)
* [yaml](https://docs.tremor.rs/artefacts/codecs#yaml)
* [string](https://docs.tremor.rs/artefacts/codecs#string)- any valid UTF-8 string sequence.

<h3 class="section-head" id="h-concept"><a href="#h-codecs"></a>Pre- and Postprocessors</h3>

Tremor supports the following preprocessing transformations in `Onramp` configurations:

* [lines](https://docs.tremor.rs/artefacts/preprocessors/#lines)- split by newline.
* [lines-null](https://docs.tremor.rs/artefacts/preprocessors/#lines-null)- split by null byte.
* [lines-pipe](https://docs.tremor.rs/artefacts/preprocessors/#lines-pipe)- split by `|`.
* [base64](https://docs.tremor.rs/artefacts/preprocessors/#base64)- base64 decoding.
* [decompress](https://docs.tremor.rs/artefacts/preprocessors/#decompress)- auto detecting decompress.
* [gzip](https://docs.tremor.rs/artefacts/preprocessors/#gzip)- gzip decompress.
* [zlib](https://docs.tremor.rs/artefacts/preprocessors/#zlib)- zlib decompress.
* [xz](https://docs.tremor.rs/artefacts/preprocessors/#xz)- xz decompress.
* [snappy](https://docs.tremor.rs/artefacts/preprocessors/#snappy)- snappy decompress.
* [lz4](https://docs.tremor.rs/artefacts/preprocessors/#lz4)- lz4 decompress.
* [gelf-chunking](https://docs.tremor.rs/artefacts/preprocessors/#gelf-chunking)- GELF chunking support.
* [remove-empty](https://docs.tremor.rs/artefacts/preprocessors/#remove-empty)- remove empty (zero-length) messages.
* [length-prefixed](https://docs.tremor.rs/artefacts/preprocessors/#length-prefixerd)- length prefixed splitting for streams.
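
To make length-prefixed splitting concrete, here is a minimal Python sketch of the general technique. The header layout used here (an 8-byte big-endian unsigned length) is an assumption chosen for the example, and `frame` and `split_frames` are hypothetical helper names; consult the preprocessor documentation for the actual wire layout.

```python
import struct

# Assumed header for this sketch: 8-byte big-endian unsigned length.
HEADER = struct.Struct(">Q")


def frame(payload: bytes) -> bytes:
    """Prefix a payload with its length."""
    return HEADER.pack(len(payload)) + payload


def split_frames(buffer: bytes):
    """Return (complete_messages, leftover_bytes) for a stream buffer."""
    messages = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, offset)
        start = offset + HEADER.size
        if len(buffer) - start < length:
            break  # the message has not been fully received yet
        messages.append(buffer[start:start + length])
        offset = start + length
    return messages, buffer[offset:]


# Two complete messages followed by a truncated third one.
stream = frame(b"hello") + frame(b"") + frame(b"world")[:-2]
messages, rest = split_frames(stream)
```

The leftover bytes are kept so that they can be prepended to the next chunk read from the stream.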

Tremor supports the following postprocessing transformations in `Offramp` configurations:

* [lines](https://docs.tremor.rs/artefacts/postprocessors/#lines)
* [base64](https://docs.tremor.rs/artefacts/postprocessors/#base64)
* [length-prefixed](https://docs.tremor.rs/artefacts/postprocessors/#length-prefixerd)
* [compression](https://docs.tremor.rs/artefacts/postprocessors/#compression)
56 changes: 56 additions & 0 deletions src/pages/connectivity.md
@@ -0,0 +1,56 @@
---
title: Connectivity
description: Talking to other systems- Connecting different systems is an integral part of Tremor.
hide_table_of_contents: false
---

### Concept

In order to provide a general-purpose event processing facility to a broad base
of applications, Tremor separates processing from connectivity and distribution.

Tremor further separates the syntax of external formats from the implied value type semantics that are useful for filtering, processing, transforming, aggregating or otherwise deriving synthetic events from streams of data ingested by Tremor processes.

As Tremor is primarily an event-processing system, we refer to connections to external systems that are logically upstream of Tremor as [`Onramps`](#h-onramps).

We refer to connections to external systems that are logically downstream of Tremor as [`Offramps`](#h-offramps).

For example, the Kafka onramp subscribes to topics in a Kafka cluster and consumes event data from those topics; the Kafka offramp publishes to topics in a Kafka cluster and contributes event data to topics.

Application logic in Tremor can be connected to multiple onramps and/or offramps originating from or contributing to disparate systems. A simple passthrough application could bridge a Kafka system with WebSockets. It could preserve or transform the external wire-form. It could filter and partition event data using content-based routing.
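
Content-based routing, as mentioned above, simply means choosing a destination by inspecting the event itself. The Python sketch below illustrates the idea; the routing rules, the output names (`alerts`, `metrics`, `default`) and the helper names are all hypothetical and are not part of Tremor.

```python
def route(event: dict) -> str:
    """Pick an output name from the event's content (hypothetical rules)."""
    if event.get("level") == "error":
        return "alerts"
    if "metric" in event:
        return "metrics"
    return "default"


def partition(events) -> dict:
    """Group events by the output that `route` selects for them."""
    outputs = {}
    for event in events:
        outputs.setdefault(route(event), []).append(event)
    return outputs


routed = partition([
    {"level": "error", "msg": "disk full"},
    {"metric": "cpu", "value": 0.93},
    {"msg": "hello"},
])
```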

The application logic is always based on Tremor's internal representation of the data, never on the external wire-format. This is by design.

Tremor has built-in support for capturing metrics on data ingested and distributed, and for communicating back-pressure events to application logic so that quality-of-service can be tuned adaptively.

### Onramps

Tremor supports a number of stable general purpose onramps:

* [Kafka](https://docs.tremor.rs/artefacts/onramps/#kafka)
* [TCP](https://docs.tremor.rs/artefacts/onramps/#TCP)
* [UDP](https://docs.tremor.rs/artefacts/onramps/#udp)
* [WS](https://docs.tremor.rs/artefacts/onramps/#WS)
* [File](https://docs.tremor.rs/artefacts/onramps/#File)- reads a single file.
* [Metronome](https://docs.tremor.rs/artefacts/onramps/#metronome)- periodic tick events.
* [Crononome](https://docs.tremor.rs/artefacts/onramps/#crononome)- cron based tick events.
* [Blaster](https://docs.tremor.rs/artefacts/onramps/#blaster)- benchmarking onramp.

And some early-access evolving production-grade onramps:

* [REST](https://docs.tremor.rs/artefacts/onramps/#REST)

### Offramps

Tremor supports a number of stable general purpose offramps:

* [File](https://docs.tremor.rs/artefacts/offramps/#File)
* [Kafka](https://docs.tremor.rs/artefacts/offramps/#Kafka)
* [REST](https://docs.tremor.rs/artefacts/offramps/#REST)
* [TCP](https://docs.tremor.rs/artefacts/offramps/#TCP)
* [UDP](https://docs.tremor.rs/artefacts/offramps/#UDP)
* [WS](https://docs.tremor.rs/artefacts/offramps/#WS)
* [BlackHole](https://docs.tremor.rs/artefacts/offramps/#blackhole)- benchmarking offramp.
* [elastic](https://docs.tremor.rs/artefacts/offramps/#elastic)- Elasticsearch client.
* [debug](https://docs.tremor.rs/artefacts/offramps/#debug)- Tremor-internal use for debugging.
* [stdout](https://docs.tremor.rs/artefacts/offramps/#stdout)
109 changes: 109 additions & 0 deletions src/pages/install.md
@@ -0,0 +1,109 @@
---
title: Quick Developer Install
description: Notes about Tremor installation for developers.
hide_table_of_contents: false
---

<h3 class="section-head" id="h-platforms"><a href="#h-platforms"></a>Supported Platforms</h3>

Select the operating system you are developing on.

<nav class="tabs" data-component="tabs">
<ul>
<li class="active">
<a href="#os-macosx">Mac OS X</a>
</li>
<li>
<a href="#os-linux">Linux</a>
</li>
<li>
<a href="#os-windows">Windows</a>
</li>
</ul>
</nav>

<div id="os-macosx">
<table class="bordered striped">
<tr><th class="w20">Type</th><th>Is Supported?</th></tr>
<tr><td>IDE Support</td><td>Yes. <a href="https://macvim-dev.github.io/macvim/">Macvim</a> or <a href="https://code.visualstudio.com">Visual Studio Code</a></td></tr>
<tr><td>Development</td><td>Yes</td></tr>
<tr><td>Production</td><td>No</td></tr>
</table>
</div>

<div id="os-linux">
<table class="bordered striped">
<tr><th class="w20">Type</th><th>Is Supported?</th></tr>
<tr><td>IDE Support</td><td>Yes. Vim ( out of the box ) or <a href="https://code.visualstudio.com">Visual Studio Code</a></td></tr>
<tr><td>Development</td><td>Yes</td></tr>
<tr><td>Production</td><td>Yes</td></tr>
</table>
</div>

<div id="os-windows">
<table class="bordered striped">
<tr><th class="w20">Type</th><th>Is Supported?</th></tr>
<tr><td>IDE Support</td><td>Yes</td></tr>
<tr><td>Development</td><td>Yes</td></tr>
<tr><td>Production</td><td>Accepting contributions</td></tr>
</table>
</div>

<h3 class="section-head" id="h-ide"><a href="#h-ide"></a>Set Up an IDE / Editor</h3>
<nav class="tabs" data-component="tabs">
<ul>
<li class="active">
<a href="#ide-vim">VIM</a>
</li>
<li>
<a href="#ide-vscode">Visual Studio Code</a>
</li>
<li>
<a href="#ide-other">Other</a>
</li>
</ul>
</nav>


<div id="ide-vim">
Follow the instructions in the <a href="https://github.com/tremor-rs/tremor-vim">tremor-vim</a> Git repository;
ensure your `.vimrc` is updated, and that you have the <a href="https://github.com/dense-analysis/ale">vim ALE</a> asynchronous
lint engine.

<pre>
cd $HOME/.vim/bundle
git clone https://github.com/tremor-rs/tremor-vim.git
</pre>

</div>

<div id="ide-vscode">
Follow the instructions in the <a href="https://github.com/tremor-rs/tremor-vscode">tremor-vscode</a> Git repository.
</div>

<div id="ide-other">
We are accepting contributions to support other IDEs.
</div>

<h3 class="section-head" id="h-trill"><a href="#h-trill"></a>Set Up the Tremor Language Server</h3>

Clone the tremor-language-server Git repository, build and install the server, and place the binary on your path.

> ```bash
> cd $HOME/git
> git clone https://github.com/tremor-rs/tremor-language-server.git
> cd tremor-language-server
> cargo build --release
> export PATH="$(pwd)/target/release:$PATH" # PATH entries are directories, not binaries
> ```

<h3 class="section-head" id="h-runtime"><a href="#h-runtime"></a>Set Up the Tremor Runtime</h3>

> ```bash
> cd $HOME/git
> git clone https://github.com/tremor-rs/tremor-runtime.git
> cd tremor-runtime
> cargo build --release --all # go get a nice cup of tea
> ```

For more details on building Tremor, please refer to the [Tremor development docs](https://docs.tremor.rs/development/quick-start/).
128 changes: 128 additions & 0 deletions src/pages/scripting.md
@@ -0,0 +1,128 @@
---
title: Scripting
description: Tremor applications- Tremor's application logic is scriptable.
hide_table_of_contents: false
---

### Concept

Tremor supports data processing through a directed acyclic graph-based pipeline or workflow. Pipelines can be configured via a YAML syntax or via a structured query language.

Pipelines are a graph of operations through which events are routed depth-first.
Operations in Tremor pipelines are pluggable and extensible.

For applications or algorithms that process one event at a time, such as data cleansing, enrichment, normalisation, validation and transformation, an ETL-focused scripting language can be used to program the application logic.

Quality-of-service behaviour such as batching, bucketing and flushing can be configured into pipelines, and data can be shared between operators through metadata exposed to the scripting language.
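
As a simple illustration of batching with a final flush, consider the Python sketch below. It models only count-based batching; the `batch` generator is a hypothetical stand-in and does not reflect the configuration or behaviour of any specific Tremor operator.

```python
def batch(events, count):
    """Yield lists of up to `count` events; flush the remainder at the end."""
    pending = []
    for event in events:
        pending.append(event)
        if len(pending) == count:
            yield pending
            pending = []
    if pending:
        # Flush whatever is left once the input is exhausted.
        yield pending


batches = list(batch(range(5), count=2))
```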

The Tremor query language replaces the YAML pipeline format with a more intuitive and easier-to-program SQL-like language. The query language adds support for processing windows of events over time to support near-real-time grouping and aggregation.

For applications or algorithms that process events over time, such as those calculating summary statistics, aggregating or projecting alternate views, or performing other complex data processing and routing logic, the Tremor query language is a better fit.
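
To illustrate what grouping and aggregating over windows of time means, the Python sketch below computes a per-key mean over fixed-size (tumbling) windows. It is a batch-style model for explanation only: the `(timestamp, key, value)` event shape and the `tumbling_mean` helper are assumptions for the example and say nothing about how Tremor implements windows.

```python
from collections import defaultdict


def tumbling_mean(events, window_secs):
    """Group (timestamp, key, value) events into fixed windows and average per key."""
    sums = defaultdict(lambda: [0.0, 0])  # (window_start, key) -> [total, count]
    for ts, key, value in events:
        window_start = int(ts // window_secs) * window_secs
        bucket = sums[(window_start, key)]
        bucket[0] += value
        bucket[1] += 1
    return {group: total / count for group, (total, count) in sums.items()}


events = [(0, "a", 1.0), (4, "a", 3.0), (9, "b", 10.0), (12, "a", 5.0)]
result = tumbling_mean(events, window_secs=10)
```

Each result key is a `(window_start, key)` pair, so the same key can appear once per window.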

The query language embeds the scripting language, allowing data-flow or query-oriented logic to co-exist with ETL-oriented logic.

Both the query and scripting languages are evolving as Tremor is applied to broader production use cases.

### Tremor Script

The scripting language supports JSON-like values. A valid JSON value is a valid tremor-script value.

Tremor Script adds an expression language that supports unary, binary, comparison and predicate operations with higher-level expressions supporting `match` expressions, `for` comprehensions and `patch` and `merge` expressions.

Features relatively unique to tremor-script are structural pattern matching and the recognition of and ability to extract data from microformats typically embedded in event data.

[Structural pattern matching](https://docs.tremor.rs/tremor-script/#match) allows patterns over arbitrarily nested values to be concisely declared with an intuitive syntax.

[Micro-format extractors](https://docs.tremor.rs/tremor-script/#extractors) allow embedded data that conforms to orthogonal formats, such as regular expressions in strings or date/time variants, to be conditionally transformed to Tremor-internal form and extracted upon matching.

```tremor
define script extract # define the script that parses our Apache logs
script
  match {"raw": event} of # we use the dissect extractor to parse the Apache log
    case r = %{ raw ~= dissect|%{ip} %{} %{} [%{timestamp}] "%{method} %{path} %{proto}" %{code:int} %{cost:int}\\n| }
      => r.raw # this first case is hit if the log includes an execution time (cost) for the request
    case r = %{ raw ~= dissect|%{ip} %{} %{} [%{timestamp}] "%{method} %{path} %{proto}" %{code:int} %{}\\n| }
      => r.raw # the second case is hit if the log does not include an execution time (cost) for the request
    default => emit => "bad"
  end
end;
```
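
For readers unfamiliar with extractors, the same two-case idea can be approximated in Python with regular expressions. This is an analogy rather than a translation: the patterns, the `extract` function and the sample log lines are assumptions made for the sketch, and Python's `re` module stands in for the `dissect` extractor.

```python
import re

# Fields shared by both cases: client IP, timestamp, request line, status code.
PREFIX = (
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" (?P<code>\d+) '
)
WITH_COST = re.compile(PREFIX + r'(?P<cost>\d+)$')  # numeric execution time
WITHOUT_COST = re.compile(PREFIX + r'\S+$')         # last field is ignored


def extract(raw: str):
    """Try each pattern in order; return the extracted record or "bad"."""
    for pattern in (WITH_COST, WITHOUT_COST):
        found = pattern.match(raw)
        if found:
            record = found.groupdict()
            record["code"] = int(record["code"])
            if "cost" in record:
                record["cost"] = int(record["cost"])
            return record
    return "bad"


with_cost = extract('127.0.0.1 - - [01/Sep/2021:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 42')
without_cost = extract('127.0.0.1 - - [01/Sep/2021:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 -')
rejected = extract("not an access log line")
```

As in the script above, the order of the cases matters: the more specific pattern is tried first, and anything that matches neither falls through to the default.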

The full documentation [of the language](https://docs.tremor.rs/tremor-script) and its [standard library](https://docs.tremor.rs/tremor-script/functions) can be found in the [Docs](https://docs.tremor.rs).

### Tremor Query

Tremor Query builds on [Tremor Script](#h-script), and extends Tremor's capability to not only define scripts but also turn pipeline configuration into development rather than YAML wrestling. In addition to describing pipelines, Tremor Query adds the ability to group and aggregate events.

<nav class="tabs" data-component="tabs">
<ul>
<li class="active">
<a href="#before">Before (YAML)</a>
</li>
<li>
<a href="#after">After (Tremor Script)</a>
</li>
<li>
<a href="#logstash">Logstash</a>
</li>
</ul>
</nav>

<div id="before">

The YAML-based description is unwieldy and easy to get wrong.

```yaml
pipeline:
  - id: main
    interface:
      inputs:
        - in
      outputs:
        - out
    nodes:
      - id: runtime
        op: runtime::tremor
        config:
          script: |
            match {"message": event} of
              case r = %{ message ~= grok|%{IPORHOST:clientip}·%{USER:ident}·%{USER:auth}·[%{HTTPDATE:timestamp}]·"%{WORD:verb}·%{DATA:request}·HTTP/%{NUMBER:httpversion}"·%{NUMBER:response:int}·(?:-\|%{NUMBER:bytes:int})·%{QS:referrer}·%{QS:agent}| } => r.message
              default => drop
            end
    links:
      in: [runtime]
      runtime: [out]
```

</div>

<div id="after">

In trickle script, the configuration becomes a query description based on a `select` statement to transform the data, and a `having` clause to filter events we do not wish to keep.

```trickle
select
  match {"message": event} of
    case r = %{ message ~= grok|%{IPORHOST:clientip}·%{USER:ident}·%{USER:auth}·[%{HTTPDATE:timestamp}]·"%{WORD:verb}·%{DATA:request}·HTTP/%{NUMBER:httpversion}"·%{NUMBER:response:int}·(?:-\|%{NUMBER:bytes:int})·%{QS:referrer}·%{QS:agent}| } => r.message
    default => null
  end
from in into out
having event != null
```

</div>

<div id="logstash">

```logstash
filter {
  grok {
    match => {
      "message" => '%{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{DATA:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response:int} (?:-|%{NUMBER:bytes:int}) %{QS:referrer} %{QS:agent}'
    }
  }
}
```

</div>

The full documentation [of the language](https://docs.tremor.rs/tremor-query), the [special operators](https://docs.tremor.rs/artefacts/operators), and [aggregation functions](https://docs.tremor.rs/tremor-query/functions) can be found in the [docs](https://docs.tremor.rs).
27 changes: 27 additions & 0 deletions src/pages/specialize.md
@@ -0,0 +1,27 @@
---
title: Operators
description: Operators specialise Tremor pipelines; allow for highly custom behaviour.
hide_table_of_contents: false
---

### Concept

Some behaviour is either so performance-critical or so specialised that it can't or shouldn't be expressed using [Tremor Script](https://tremor.rs/getting-started/scripting/#h-script).

The solution to this is custom operators. Unlike Tremor Script, which is interpreted at run time, custom operators are written in [Rust](https://rust-lang.org) and can take advantage of the Rust ecosystem and natively compiled performance.

### Operators

Currently Tremor supports the following operators:

* [runtime::tremor](https://docs.tremor.rs/artefacts/operators#runtimetremor)
* [grouper::bucket](https://docs.tremor.rs/artefacts/operators#grouperbucket)
* [generic::backpressure](https://docs.tremor.rs/artefacts/operators#generic::backpressure)
* [generic::batch](https://docs.tremor.rs/artefacts/operators#generic::batch)

Some special operators also exist:

* [passthrough](https://docs.tremor.rs/artefacts/operators#passthrough)- internal use.
* [debug::history](https://docs.tremor.rs/artefacts/operators#debughistory)- development.

Additional information can be found in the [docs](https://docs.tremor.rs/artefacts/).