Support Hudi DeltaStreamer compatible feature #8724

Closed
LittleWat opened this issue Oct 5, 2023 · 7 comments

@LittleWat

Feature Request / Improvement

Hi! Currently, we are evaluating Iceberg & Hudi and both tools are great and provide similar features.

One thing we noticed is that Hudi Deltastreamer makes it easy to ingest data, and it would be great if Iceberg supported a similar feature.

Thank you!

Query engine

None

@amogh-jahagirdar
Contributor

amogh-jahagirdar commented Oct 6, 2023

I'm personally not very familiar with Hudi Deltastreamer, but for Kafka ingest into Iceberg tables, @bryanck is working on contributing a Kafka sink connector to the project; check out the initial PR, #8701. Other than Kafka ingestion, are there any other specific features you were looking for?

@LittleWat
Author

LittleWat commented Oct 7, 2023

Kafka Connect is cool! Deltastreamer also supports ingestion from a distributed file system. Using this, we can, for example, ingest raw Avro/CSV/JSON data in S3 into Hudi.
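
As a rough illustration of what file-based ingestion involves, here is a minimal sketch in plain Python. It is not Hudi or Iceberg API: `read_records` and `ingest_directory` are hypothetical helpers, a local directory stands in for the S3 prefix, and a Python list stands in for the table writer. Avro is left out because it needs a third-party library.

```python
import csv
import json
from pathlib import Path
from typing import Iterator

SUPPORTED_SUFFIXES = (".csv", ".json", ".jsonl")


def read_records(path: Path) -> Iterator[dict]:
    """Yield one dict per record from a CSV or JSON Lines file (hypothetical helper)."""
    suffix = path.suffix.lower()
    if suffix == ".csv":
        # CSV values are read as strings; no schema inference is attempted here.
        with path.open(newline="", encoding="utf-8") as f:
            yield from csv.DictReader(f)
    elif suffix in (".json", ".jsonl"):
        # Assumes one JSON object per line; blank lines are skipped.
        with path.open(encoding="utf-8") as f:
            for line in f:
                if line.strip():
                    yield json.loads(line)
    else:
        raise ValueError(f"unsupported format: {suffix}")


def ingest_directory(source_dir: Path, sink: list) -> int:
    """Append every record found in source_dir to sink and return the record count.

    `sink` is a stand-in for a real table writer.
    """
    count = 0
    for path in sorted(source_dir.iterdir()):
        if path.suffix.lower() in SUPPORTED_SUFFIXES:
            for record in read_records(path):
                sink.append(record)
                count += 1
    return count
```

A real ingestion tool would also track which files were already processed and commit writes atomically; the sketch only shows the format dispatch.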

@pvary
Contributor

pvary commented Oct 7, 2023

We use Flink for the ingestion.
Flink supports a wide range of sources, and the Iceberg FlinkSink enables you to write them to Iceberg tables.

@LittleWat
Author

Yes, Flink is great but still we need to write some code for ingestion, right..? Hudi Deltastreamer is a kind of no-code solution, and I think a similar tool would make it easier to ingest data.

@pvary
Contributor

pvary commented Oct 9, 2023

Yes, Flink is great but still we need to write some code for ingestion, right..?

Yes, you need to write the Flink job's code. If your goal is just a simple dump, then it could be overkill, but if you need to do any transformation, you can do it easily in your code.
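
The trade-off between a simple dump and a transformation can be sketched in plain Python. This is only an illustration under stated assumptions, not Flink or Iceberg API: `run_ingestion` and `normalize` are hypothetical names, and a list stands in for the table sink.

```python
from typing import Callable, Iterable, Optional

Record = dict


def run_ingestion(
    source: Iterable[Record],
    sink: list,
    transform: Optional[Callable[[Record], Optional[Record]]] = None,
) -> int:
    """Copy records from source to sink, optionally transforming each one.

    With transform=None this is the "simple dump" case. A transform may
    return None to drop a record. Returns the number of records written.
    """
    written = 0
    for record in source:
        if transform is not None:
            record = transform(record)
            if record is None:
                continue
        sink.append(record)
        written += 1
    return written


def normalize(record: Record) -> Optional[Record]:
    """Example transformation: drop rows without an amount, cast the rest to float."""
    if record.get("amount") is None:
        return None
    return {**record, "amount": float(record["amount"])}
```

For a plain dump, `run_ingestion(rows, sink)` needs no custom logic, which is the case a no-code tool covers well. Passing `normalize` as the third argument is where hand-written job code earns its keep.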


This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

@github-actions github-actions bot added the stale label Sep 22, 2024

github-actions bot commented Oct 7, 2024

This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

@github-actions github-actions bot closed this as not planned Oct 7, 2024