Update 0.5 Reference Docs #43

Merged
merged 6 commits into from
Aug 20, 2024

@mbroecheler mbroecheler commented Aug 18, 2024

  • Update cli
  • Add logger, log
  • Add tests
  • Add connectors
  • Add deployment profiles

The `primary-key` specifies the column or list of columns that uniquely identifies a single record in the table. Note that when the table is a changelog or CDC stream for an entity table, the primary key should uniquely identify each record in the stream, not each record in the underlying table. For example, if you consume a CDC stream for a `Customer` entity table with primary key `customerid`, the primary key for the resulting CDC stream should include the timestamp of the change, e.g. `[customerid, lastUpdated]`.
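
To make the CDC point concrete, here is a minimal Python sketch (not DataSQRL code). The `duplicate_keys` helper and the sample records are hypothetical; only the column names `customerid` and `lastUpdated` come from the text above. It checks whether a candidate primary key uniquely identifies every record in a stream.

```python
from collections import Counter

def duplicate_keys(records, key_columns):
    """Return the key values that occur on more than one record."""
    counts = Counter(tuple(r[c] for c in key_columns) for r in records)
    return [key for key, n in counts.items() if n > 1]

# Hypothetical CDC stream: customer 1 was changed twice, so it appears twice.
cdc_stream = [
    {"customerid": 1, "lastUpdated": "2024-08-18T10:00:00Z", "name": "Ann"},
    {"customerid": 1, "lastUpdated": "2024-08-19T09:30:00Z", "name": "Anne"},
    {"customerid": 2, "lastUpdated": "2024-08-18T11:15:00Z", "name": "Bob"},
]

# The entity key alone is not unique within the stream.
print(duplicate_keys(cdc_stream, ["customerid"]))                 # [(1,)]
# Adding the change timestamp makes every stream record unique.
print(duplicate_keys(cdc_stream, ["customerid", "lastUpdated"]))  # []
```

The entity key repeats once per change, which is why the stream's key needs the change timestamp as well.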

The `timestamp` field specifies the (single) timestamp column of a source stream, which holds the event time of each stream record. `watermark-millis` defines how many milliseconds late events/records may arrive and still be processed consistently. Set this to `1` if events are perfectly ordered in time and to `0` if the timestamp is monotonically increasing (i.e. it's perfectly ordered and no two events have the same timestamp). <br />
Alternatively, you can use processing time for event processing by removing the `watermark-millis` field and adding the processing time as metadata (see below). Processing time is taken from the system clock of the machine processing the data, not from the timestamp of the record. We highly recommend you use event time and not processing time for consistent, reproducible results. <br />
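
The difference between `watermark-millis` values of `0` and `1` can be illustrated with a small Python sketch. It is a simplified model and rests on assumptions: the watermark trails the largest timestamp seen so far by `watermark_millis`, and a record counts as late when its timestamp is at or below the watermark. The `late_records` function is hypothetical and is not how DataSQRL or the stream engine implements watermarks.

```python
def late_records(timestamps_ms, watermark_millis):
    """Return timestamps that arrive at or below the current watermark.

    Simplified model: watermark = largest timestamp seen so far - watermark_millis.
    """
    late = []
    max_seen = None
    for ts in timestamps_ms:
        if max_seen is not None and ts <= max_seen - watermark_millis:
            late.append(ts)
        if max_seen is None or ts > max_seen:
            max_seen = ts
    return late

# Two events share a timestamp: late with 0, tolerated with 1.
print(late_records([1000, 1000, 1001], 0))  # [1000]
print(late_records([1000, 1000, 1001], 1))  # []

# An event arriving 3 ms out of order needs a larger allowance.
print(late_records([1000, 1005, 1002], 2))  # [1002]
print(late_records([1000, 1005, 1002], 5))  # []
```

Under this model, `0` only works when timestamps strictly increase, while `1` also tolerates ties, matching the guidance above.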

@henneberger henneberger Aug 20, 2024


<br /> ?


### Connector Configuration

The connector configuration specifies how the stream engine connects to the source or sink and how it reads or writes the data. The connector configuration is specific to the configured stream processing engine that DataSQRL compiles to and the section of the configuration is named after the engine. In the example above, the connector configuration is for the `flink` engine.

really needs a comma after the 'to', hard to read when it ends with a preposition.

The connector configuration is specific to the configured stream processing engine that DataSQRL compiles to, and the section of the configuration is named after the engine.

@mbroecheler mbroecheler marked this pull request as ready for review August 20, 2024 22:37
@mbroecheler mbroecheler merged commit 31e54ac into main Aug 20, 2024
2 checks passed
@mbroecheler mbroecheler deleted the refdocs branch August 20, 2024 22:37