[HUDI-7357] Introduce generic StorageConfiguration #10586

Merged: 2 commits, Feb 14, 2024

@yihua yihua commented Jan 30, 2024

Change Logs

This PR introduces the generic StorageConfiguration, which stores the configuration used for I/O with HoodieStorage. Because reinitializing Hadoop's Configuration instance incurs overhead, the HadoopStorageConfiguration implementation wraps the existing instance instead of re-creating it. This change will enable us to remove our dependency on Hadoop's Configuration class. Once integrated, places that use Configuration will use StorageConfiguration instead, and the StorageConfiguration will be passed around to instantiate HoodieStorage (unless Hadoop-based readers need the Configuration instance).

This is part of the effort to provide Hudi storage abstraction and decouple hudi-common from hadoop dependencies. For reference, the single big-change PR can be found here: #10360.
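
For readers who want a concrete picture of the wrapping approach described above, here is a minimal sketch. It is not the code in this PR: the method names (`unwrap`, `set`, `getString`) are assumptions chosen for illustration, and `FakeHadoopConf` is a hypothetical stand-in for Hadoop's `Configuration` so that the example compiles without Hadoop on the classpath.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Illustrative sketch only: signatures are assumptions, not the PR's actual API.
class StorageConfigurationSketch {

  /** Generic view of configuration; T is the wrapped, implementation-specific type. */
  abstract static class StorageConfiguration<T> {
    /** Returns the wrapped instance for callers that still need the concrete type. */
    abstract T unwrap();

    abstract void set(String key, String value);

    abstract Optional<String> getString(String key);

    /** Convenience lookup that falls back to a default when the key is absent. */
    String getString(String key, String defaultValue) {
      return getString(key).orElse(defaultValue);
    }
  }

  /** Hypothetical stand-in for Hadoop's Configuration. */
  static class FakeHadoopConf {
    private final Map<String, String> props = new HashMap<>();

    void set(String key, String value) {
      props.put(key, value);
    }

    String get(String key) {
      return props.get(key);
    }
  }

  /** Holds a reference to an existing instance instead of re-creating it. */
  static class HadoopStorageConfiguration extends StorageConfiguration<FakeHadoopConf> {
    private final FakeHadoopConf conf;

    HadoopStorageConfiguration(FakeHadoopConf conf) {
      this.conf = conf;
    }

    @Override
    FakeHadoopConf unwrap() {
      return conf;
    }

    @Override
    void set(String key, String value) {
      conf.set(key, value);
    }

    @Override
    Optional<String> getString(String key) {
      return Optional.ofNullable(conf.get(key));
    }
  }

  public static void main(String[] args) {
    FakeHadoopConf hadoopConf = new FakeHadoopConf();
    hadoopConf.set("fs.defaultFS", "file:///");

    // Generic callers depend only on StorageConfiguration.
    StorageConfiguration<FakeHadoopConf> storageConf =
        new HadoopStorageConfiguration(hadoopConf);
    System.out.println(storageConf.getString("fs.defaultFS", "unset"));

    // Hadoop-based readers can still get the very same instance back.
    System.out.println(storageConf.unwrap() == hadoopConf);
  }
}
```

In this sketch the wrapper keeps a reference rather than a copy, so writes made through it are visible on the wrapped instance and no reinitialization cost is paid.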

Impact

No impact, given that this PR does not include the integration.

Risk level

none

Documentation Update

N/A

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

@vinothchandar vinothchandar self-assigned this Jan 30, 2024
@apache apache deleted a comment from hudi-bot Jan 30, 2024

@vinothchandar vinothchandar left a comment

LGTM overall. Minor comments/cautionary callouts

@yihua yihua force-pushed the HUDI-7357-introduce-generic-conf branch from e6a99b7 to ce36bbb Compare February 14, 2024 00:57
@hudi-bot

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

yihua commented Feb 14, 2024

CI is green.

@yihua yihua merged commit 1f7e0f6 into apache:master Feb 14, 2024
31 checks passed
yihua added a commit that referenced this pull request Feb 27, 2024
This commit introduces the generic `StorageConfiguration` to store configuration for I/O with `HoodieStorage`. Given there's overhead of reinitializing Hadoop's `Configuration` instance, the approach is to wrap the instance in the `HadoopStorageConfiguration` implementation.  This change will enable us to remove our dependency on Hadoop's `Configuration` class.  When integrated, places using `Configuration` will be replaced by `StorageConfiguration` and the `StorageConfiguration` will be passed around for instantiating `HoodieStorage` (unless Hadoop-based readers need the `Configuration` instance).
Projects
Status: ✅ Done
4 participants