Skip to content

Remove Bouncy Castle dependency usage from PemUtils #1318

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
May 20, 2025

Conversation

exceptionfactory
Copy link
Contributor

This pull request removes use of the Bouncy Castle Provider library for the PemUtils class for parsing public and private key files.

The Bouncy Castle library includes a large number of cryptographic capabilities, some of which are duplicative of current capabilities in recent versions of Java. Parsing PEM-encoded files is the only direct use of Bouncy Castle in Polaris based on imported classes, so replacing this usage with an alternative solution removes the need for a significant direct dependency.

The PemUtils class already uses standard Java Security components for parsing decoded public keys and private key material, so the changes are scoped to replacing the initial PEM content parsing and Base64 decoding.

The changes include a new PemUtilsTest class that exercises parsing RSA public key and private key files.

The PemUtils class usage is limited to JWT asymmetric token support, narrowing the scope of changes to this specific token signing implementation.

adutra
adutra previously approved these changes Apr 7, 2025
@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Apr 7, 2025
@@ -46,9 +45,38 @@ private static byte[] parsePEMFile(Path pemPath) throws IOException {
throw new FileNotFoundException(
String.format("The file '%s' doesn't exist.", pemPath.toAbsolutePath()));
}
try (PemReader reader = new PemReader(Files.newBufferedReader(pemPath, UTF_8))) {
PemObject pemObject = reader.readPemObject();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One general remark: the original readPemObject() method, in principle, only reads the first object in the PEM file:

https://downloads.bouncycastle.org/java/docs/bcprov-jdk18on-javadoc/org/bouncycastle/util/io/pem/PemReader.html#readPemObject()

With your changes, what happens when the PEM file contains more than one object? And what if it doesn't contain any? It might be good to add tests for these cases as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback and noting the behavior of readPemObject(). I adjusted the new implementation to stop reading when finding a footer line, and added unit tests for that scenario, as well as an empty file, which throws an IOException.

Copy link
Member

@snazy snazy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The LICENSE/NOTICE changes need to be corrected.

@@ -2050,12 +2050,6 @@ License (from POM): Apache License 2.0 - https://www.apache.org/licenses/LICENSE

--------------------------------------------------------------------------------

Group: org.bouncycastle Name: bcprov-jdk18on Version: 1.80
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bouncycastle/bc-prov is still a dependency of polaris-quarkus-server (see dependencies of the runtimeClasspath Gradle configuration), so the removals in LICENSE/NOTICE are wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for noting the runtime transitive dependency on bcprov-jdk18on, I reverted the license and notice changes.

@github-project-automation github-project-automation bot moved this from Ready to merge to PRs In Progress in Basic Kanban Board Apr 7, 2025
@exceptionfactory
Copy link
Contributor Author

Thanks for the feedback @adutra and @snazy, I pushed updates to address your comments thus far.

dimas-b
dimas-b previously approved these changes Apr 7, 2025
Copy link
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍 Thanks, @exceptionfactory !

@exceptionfactory
Copy link
Contributor Author

Thanks for the feedback @dimas-b!

Do you have any additional comments @snazy or @adutra following the updates?

@adutra
Copy link
Contributor

adutra commented Apr 18, 2025

I'm good here; @snazy do you have further remarks?

Copy link
Member

@snazy snazy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

License file concerns are addressed, so I'm okay.

- Added PEM format parsing in PemUtils
- Added unit test for PemUtils
- Removed Bouncy Castle Provider dependency from service common module
- Removed Bouncy Castle Provider dependency from quarkus service module
- Removed Bouncy Castle references from LICENSE and NOTICE files
@exceptionfactory exceptionfactory force-pushed the remove-bouncycastle-pem-1 branch from 76049b3 to 29525ec Compare May 20, 2025 13:51
@exceptionfactory
Copy link
Contributor Author

Thanks for the updated review @snazy. I force-pushed an update to resolve recent merge conflicts. Please let me know if there is anything else that needs to be addressed.

@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board May 20, 2025
@snazy snazy merged commit 5332e4f into apache:main May 20, 2025
6 checks passed
@github-project-automation github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board May 20, 2025
@snazy
Copy link
Member

snazy commented May 20, 2025

Thanks @exceptionfactory

snazy added a commit to snazy/polaris that referenced this pull request May 22, 2025
* main: Update docker.io/prom/prometheus Docker tag to v3.4.0 (apache#1602)

* Site: Update production configuration page (apache#1606)

* main: Update dependency com.google.cloud:google-cloud-storage-bom to v2.52.3 (apache#1623)

* main: Update dependency boto3 to v1.38.19 (apache#1622)

* Remove Bouncy Castle dependency usage from PemUtils (apache#1318)

- Added PEM format parsing in PemUtils
- Added unit test for PemUtils for empty file and multiple PEM objects
- Removed Bouncy Castle Provider dependency from service common module
- Removed Bouncy Castle Provider dependency from quarkus service module

* Site: Add a page for policy management (apache#1600)

* [Policy Store | Management Spec] Add policy privileges to spec and update admin service impl (apache#1529)

This PR adds new policy related privileges to polaris-management-api.yml and update PolarisAdminService to allow granting new privileges

* Spec: Add SigV4 Auth Support for Catalog Federation (apache#1506)

* Spec changes for SigV4 Auth Support for Catalog Federation

* Extract service identity info as a nested object

* nit: fix admin tool log level and comments (apache#1626)

The previous WARNING log levels seems to work, but WARN
aligns better with standard Quarkus log levels.

Fixes apache#1612

* Doc: switch to use iceberg-aws-bundle jar (apache#1609)

* main: Update dependency org.mockito:mockito-core to v5.18.0 (apache#1630)

* main: Update dependency boto3 to v1.38.20 (apache#1631)

* Require explicit user-consent to enable HadoopFileIO (apache#1532)

Using `HadoopFileIO` in Polaris can enable "hidden features" that users are likely not aware of. This change requires users to manually update the configuration to be able to use `HadoopFileIO` in way that highlights the consequences of enabling it.

This PR updates Polaris in multiple ways:
* The default of `SUPPORTED_CATALOG_STORAGE_TYPES` is changed to not include the `FILE` storage type.
* Respect the `ALLOW_SPECIFYING_FILE_IO_IMPL` configuration on namespaces, tables and views to prevent setting an `io-impl` value for anything but one of the configured, supported storage-types.
* Unify validation code in a new class `IcebergPropertiesValidation`.
* Using `FILE` or `HadoopFileIO` now _also_ requires the explicit configuration `ALLOW_INSECURE_STORAGE_TYPES_ACCEPTING_SECURITY_RISKS=true`.
* Added production readiness checks that trigger when `ALLOW_INSECURE_STORAGE_TYPES_ACCEPTING_SECURITY_RISKS` is `true` or `SUPPORTED_CATALOG_STORAGE_TYPES` contains `FILE` (defaults and per-realm).
* The two new readiness checks are considered _severe_. Severe readiness-errors prevent the server from starting up - unless the user explicitly configured `polaris.readiness.ignore-security-issues=true`.

Log messages and configuration options explicitly use "clear" phrases highlighting the consequences.

With these changes it is intentionally extremely difficult to start Polaris with HadoopFileIO. People who work around all these safety nets must have realized that what they are doing.

A lot of the test code relies on `FILE`/`HadoopFileIO`, those tests got all the configurations to let those tests continue to work as they are, bypassing the added security safeguards.

---------

Co-authored-by: Dmitri Bourlatchkov <dmitri.bourlatchkov@dremio.com>

---------

Co-authored-by: Mend Renovate <bot@renovateapp.com>
Co-authored-by: Yufei Gu <yufei@apache.org>
Co-authored-by: David Handermann <exceptionfactory@apache.org>
Co-authored-by: Honah (Jonas) J. <honahx@apache.org>
Co-authored-by: Rulin Xing <xjdkcsq3@gmail.com>
Co-authored-by: Dmitri Bourlatchkov <dmitri.bourlatchkov@dremio.com>
Co-authored-by: MonkeyCanCode <yongzheng0809@gmail.com>
adnanhemani pushed a commit to adnanhemani/polaris that referenced this pull request May 22, 2025
- Added PEM format parsing in PemUtils
- Added unit test for PemUtils for empty file and multiple PEM objects
- Removed Bouncy Castle Provider dependency from service common module
- Removed Bouncy Castle Provider dependency from quarkus service module
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants