Support Hudi `DeltaStreamer` compatible feature #8724

LittleWat · 2023-10-05T12:51:30Z

Feature Request / Improvement

Hi! Currently, we are evaluating Iceberg & Hudi and both tools are great and provide similar features.

One thing we noticed is that Hudi DeltaStreamer makes it easy to ingest data, and it would be great if Iceberg supported a similar feature.

Thank you!

Query engine

None

amogh-jahagirdar · 2023-10-06T06:43:33Z

I'm personally not very familiar with Hudi DeltaStreamer, but for Kafka ingest into Iceberg tables, @bryanck is working on contributing a Kafka sink connector to the project in #8701; check out this initial PR. Other than Kafka ingestion, are there any other specific features you were looking for?

LittleWat · 2023-10-07T02:05:29Z

Kafka Connect is cool! DeltaStreamer also supports ingestion from distributed file systems. With it we can, for example, ingest raw Avro/CSV/JSON data from S3 into Hudi.

pvary · 2023-10-07T06:07:43Z

We use Flink for the ingestion.
Flink supports a wide range of sources, and the Iceberg FlinkSink enables you to write them to Iceberg tables.

LittleWat · 2023-10-07T09:27:10Z

Yes, Flink is great, but we still need to write some code for ingestion, right? Hudi DeltaStreamer is a kind of no-code solution, and I think it would make ingesting data easier.

pvary · 2023-10-09T11:16:08Z

> Yes, Flink is great but still we need to write some code for ingestion, right..?

Yes, you need to write the Flink job's code. If your goal is just a simple dump, that could be overkill, but if you need to do any transformation, you can do it easily in your code.

github-actions · 2024-09-22T00:16:26Z

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

github-actions · 2024-10-07T00:15:35Z

This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'.

github-actions bot added the stale label Sep 22, 2024

github-actions bot closed this as not planned Oct 7, 2024
