
[Proposal] Streaming execution support roadmap #4285

Open
@metesynnada

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Adding streaming support to DataFusion, so that queries can execute continuously on unbounded datasets, is a frequent topic of discussion. Streaming is also an item on the roadmap, is discussed in Ballista #30, and is part of the general desiderata.

In the recent past, there have been some attempts and PoC implementations to explore how this could be done. Some examples are:

We would like to use this issue to coordinate a fresh re-think and a disciplined push toward achieving the streaming support goals and making progress on the roadmap.

Describe the solution you'd like
We have a proposal-stage roadmap that details how streaming support can be achieved as a sequence/collection of multiple tasks. You can find our proposal here.

Within this proposal, you can find design discussions, code snippets, and individual task/issue descriptions paving the way for full support.

We experimented with many candidate approaches and built a few PoC implementations as we went through the design process. Still, this is a huge topic, and we are sure there are subtleties, perspectives, and challenges we have missed.

Looking forward to hearing the community’s thoughts on this proposal. Thanks!

Describe alternatives you've considered
We studied @hntd187's valuable contributions in #1544 for a Kafka provider.

Additional context
If this proposal is found to be a sensible path forward, we are happy to turn it into an epic GitHub issue and start tracking the progress through that. We are also happy to take on the implementation work of a significant number of the steps/tasks in this proposal.
