Connector migrations and redesign #512

Merged
merged 7 commits into from
Jan 26, 2024

@mwylde mwylde commented Jan 26, 2024

This PR redesigns the connector system and migrates almost all connectors to Arrow.

Operator/connector redesign

Currently, all operator infra and the operators themselves are implemented in arroyo-worker. Meanwhile, connector definitions, configs, and testing logic (which often duplicates code in the connector implementation) are in arroyo-connectors.
To instantiate operators/connectors, we have a giant match block in engine.rs.

This was (sort of) necessary back in the macro days, but with traits we can be more flexible.

There are a few things that are undesirable about this situation:

  • Connector logic is split between arroyo-connectors and arroyo-worker, and some of it is duplicated (like the logic to construct a redis or kafka client).
  • Adding a connector requires changing core arroyo code, so a connector can't, for example, be added in a separate crate.
  • We have to maintain a map of all connector and operator names in order to construct them.

To address these issues this PR reorganizes the code as follows:

  • Adds a new crate, arroyo-operator, which contains the common operator infra (ArrowOperator, SourceOperator, BaseOperator, etc.)
  • arroyo-connectors now depends on arroyo-operator, and all connector implementations have moved into that crate
  • The ArrowOperatorConstructor was reworked so that it can be used as a trait object, which gives us a unified way to construct operators without needing the big match block
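As a rough illustration of the last point, here is a minimal sketch of how a trait-object constructor can replace a central match block. Every name in it (`Operator`, `OperatorConstructor`, `Registry`, the Kafka and Redis stand-ins, and the string-based config) is hypothetical and chosen only for the example; none of it is the actual arroyo-operator API, and the real `ArrowOperatorConstructor` signature may differ substantially.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a constructed operator (not Arroyo's real trait).
trait Operator {
    fn name(&self) -> String;
}

// Hypothetical object-safe constructor trait: because it has no generic
// methods and does not return `Self`, it can be stored as `Box<dyn ...>`.
trait OperatorConstructor {
    fn construct(&self, config: &str) -> Result<Box<dyn Operator>, String>;
}

// Two illustrative connectors; each owns its own construction logic.
struct KafkaSource {
    topic: String,
}

impl Operator for KafkaSource {
    fn name(&self) -> String {
        format!("kafka:{}", self.topic)
    }
}

struct KafkaConstructor;

impl OperatorConstructor for KafkaConstructor {
    fn construct(&self, config: &str) -> Result<Box<dyn Operator>, String> {
        if config.is_empty() {
            return Err("kafka: missing topic".to_string());
        }
        Ok(Box::new(KafkaSource { topic: config.to_string() }))
    }
}

struct RedisSink {
    address: String,
}

impl Operator for RedisSink {
    fn name(&self) -> String {
        format!("redis:{}", self.address)
    }
}

struct RedisConstructor;

impl OperatorConstructor for RedisConstructor {
    fn construct(&self, config: &str) -> Result<Box<dyn Operator>, String> {
        if config.is_empty() {
            return Err("redis: missing address".to_string());
        }
        Ok(Box::new(RedisSink { address: config.to_string() }))
    }
}

// Hypothetical registry: the engine looks constructors up by name and never
// needs to know the concrete connector types.
struct Registry {
    constructors: HashMap<String, Box<dyn OperatorConstructor>>,
}

impl Registry {
    fn new() -> Self {
        Registry { constructors: HashMap::new() }
    }

    fn register(&mut self, name: &str, constructor: Box<dyn OperatorConstructor>) {
        self.constructors.insert(name.to_string(), constructor);
    }

    // One uniform code path replaces a match arm per connector.
    fn construct(&self, name: &str, config: &str) -> Result<Box<dyn Operator>, String> {
        self.constructors
            .get(name)
            .ok_or_else(|| format!("unknown connector: {}", name))?
            .construct(config)
    }
}
```

In this sketch, a caller would build a `Registry`, call `register("kafka", Box::new(KafkaConstructor))` for each connector, and then call `construct` with a connector name and its config. The point of the design is that the construction path is the same for every connector, so a new connector only has to implement the trait and register itself.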

Longer term, we can look into using a crate like https://github.com/dtolnay/linkme to allow connectors to be registered at compile time, making it easy for advanced users to add custom connectors just by adding a new crate to the build.

Connector migrations

All connectors have now been migrated to Arrow with the exception of the FileSystem/Delta sink and nexmark. These were excluded because they require significant work to migrate.

@mwylde mwylde enabled auto-merge (squash) January 26, 2024 20:57
@mwylde mwylde merged commit d2e6cea into dev Jan 26, 2024
8 checks passed