-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[FLINK-31631][FileSystems] Upgrade GCS connector to 2.2.11. #22281
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I'd appreciate getting some guidance on anything more required in the licenses directory. According to the Licensing guide, there is supposed to be a license file per bundled dependency in the shaded jar, which would be ~70 files. However, the existing directory contains only 8 files. |
From a Flink POV, a bundle is considered the artifact that's being produced. In your case, you're only touching |
The CI fails on the licensing check, see https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=47622&view=logs&j=bea52777-eaf8-5663-8482-18fbc3630e81&t=d6e79740-7cf7-5407-2e69-ca34c9be0efb&l=25671 for details |
Thank you, @MartijnVisser . I've updated the pull request with additional NOTICE entries, and I've removed the previous changes in the licenses sub-directory. |
@cnauroth Do you think that should be included in the documentation? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall, just wondering if some of the new features/changes should also be documented. Leaving it open for the final comments before merging it
Thanks, @MartijnVisser ! It looks like the established documentation follows a pattern of mentioning how to configure the connector, but not listing all available configuration properties. Instead, it links to the connector's documentation for the full list of properties. I think that's a good strategy, because it avoids extra maintenance for Flink if anything changes about the connector configuration. However, a potential improvement would be for the documentation to use version-specific links to the connector documentation. Right now, it links to the master branch. This could cause confusion, because the master branch might contain code with new configuration properties that haven't made it into an official release yet. I'd propose updating these 2 links to the specific 2.2.11 release (and the same for the content.zh files): https://github.com/apache/flink/blob/master/docs/content/docs/deployment/filesystems/gcs.md?plain=1#L58 LMK if you agree, and I can update the PR. |
That depends on the URL that you use. E.g. https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/filesystems/gcs/ points to So I think you can safely include the improvements to the |
I pushed up the documentation changes. Thank you! |
@cnauroth Thanks for the PR, LGTM |
@MartijnVisser , thanks for the review and commit! |
there seems to be a conflict with guava with the upgraded gcs-sdk version.
|
@cnauroth Any thoughts on this? |
@yigress , thank you for reporting this. The stack trace appears to be in the checkpointing code path, which uses the GCS client directly (not the Hadoop The patch upgraded from google-cloud-storage 2.2.3 (transitive dependency to Guava 31.0.1-jre) to google-cloud-storage 2.15.0 (transitive dependency to Guava 31.1-jre). Both the GCS client and the Long-term, it would be great to find a way to re-route this checkpointing code path through the |
Hi @cnauroth - There are multiple code paths and API's in Flink which uses different google libraries and we should make them consistent. For instance
The ideal scenario will be that everything goes via the hadoop-gcs connector. |
Hello @singhravidutt . I heard that you might have a patch already in progress to fix this Guava dependency conflict. If so, would you please open a new JIRA issue and send the pull request? Thank you. |
There is now a patch available here at #23469. |
will it fix this issue? |
The |
What is the purpose of the change
Upgrade GCS connector to 2.2.11.
Brief change log
Upgrade the GCS Connector bundled in the Flink distro from version 2.2.3 to 2.2.11. The new release contains multiple bug fixes and enhancements discussed in the Release Notes. Notable changes include:
Verifying this change
This change is already covered by existing tests. These tests pass with the new version.
Additionally, I built a full distro and successfully ran a job that reads and writes to a GCS bucket.
The documentation changes were tested by starting up the local server and clicking through the updated links.
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: noDocumentation