Skip to content

Reload key and certs in SockerAppender reconnector #2767

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 22 commits into from
Sep 6, 2024
Merged

Conversation

MichaelMorrisEst
Copy link
Contributor

This PR aims to address LOG4J2-2988 by reloading the keystore and truststore from file when getting the SslSocketFactory during a reconnection attempt, as per suggestion in second paragraph of comment: https://issues.apache.org/jira/browse/LOG4J2-2988?focusedCommentId=17260491&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17260491

Signed-off-by: MichaelMorris <michael.morris@est.tech>
Copy link

github-actions bot commented Jul 24, 2024

Job Requested goals Build Tool Version Build Outcome Build Scan®
build-macos-latest clean install 3.9.8 Build Scan PUBLISHED
build-ubuntu-latest clean install 3.9.8 Build Scan PUBLISHED
build-windows-latest clean install 3.9.8 Build Scan PUBLISHED
Generated by gradle/develocity-actions

@vy
Copy link
Member

vy commented Jul 26, 2024

I'm not in favor of repurposing monitorInterval to periodically reload SSL key and trust stores, because

  • This convolutes the meaning of monitorInterval
  • Polling interval of Log4j configuration file for changes != polling interval of a particular external resource for changes

@MichaelMorrisEst, @ppkarwasz, what do you think of the following instead?

  1. Add the reloadInterval property (of type Duration) to SslConfiguration
  2. Use Configuration#getScheduler().schedule() periodically update SslConfiguration#sslContext

@ppkarwasz
Copy link
Contributor

Reloading the cryptographic material alone does not solve the problem in LOG4J2-2988: SocketAppender will not close a TLS socket if the SslConfiguration changes, even after a reconfiguration event.

In fact two SocketAppenders that use the same host and port will share the same SocketManager. In order to establish a new TLS socket, we must:

  1. Include some characteristic of SslConfiguration in the name of the SocketManager (e.g. the issuer and serial number of the client certificate),
  2. Reload the TLS socket. The easiest way to do it is to trigger a Log4j Core reconfiguration.
  1. Add the reloadInterval property (of type Duration) to SslConfiguration

I would have used monitorInterval if there was a way to check the last modification time of a KeyStore.
However key stores are not necessarily file-based, so reloadInterval sounds good to me.

2. Use `Configuration#getScheduler().schedule()` periodically update `SslConfiguration#sslContext`

I think the scheduler could just call LoggerContext.reconfigure() if the serial number of the certificate in the KeyStore changed.

@vy
Copy link
Member

vy commented Jul 27, 2024

For one, I don't think it is correct to keep the state (i.e., SSLContext) in configuration DTOs. I think the state should have rather been implemented as a part of SslSocketManager. But I guess it is too late for that given SslConfiguration#getSslContext() et al. As a result, we need to manage the state in several places: SslConfiguration, SslSocketManager, SocketAppender, etc.

@ppkarwasz, in the light of your feedback, I revised my proposal as follows:

  1. Introduce getId() to SslConfiguration and use it in the name provided to getManager() in SslSocketManager. This will ensure SslSocketManager will evict old connections upon a reconfiguration attempt with a different SslConfiguration#getId() value.
  2. Introduce reloadInterval to SslConfiguration and use it with Configuration#getScheduler().schedule() in SocketAppender to periodically call Configuration#getLoggerContext().reconfigure() – iff, reloadInterval is provided and protocol is SSL.

Reload the TLS socket. The easiest way to do it is to trigger a Log4j Core reconfiguration.

@ppkarwasz, is it so? Can't we instead create a new DelegatingOutputStreamManager extending from SslSocketManager that allows gracefully swapping (i.e., start the new, set the new, stop the old) the underlying SslSocketManager with another one?

@ppkarwasz
Copy link
Contributor

@ppkarwasz, in the light of your feedback, I revised my proposal as follows:

1. Introduce `getId()` to `SslConfiguration` and use it in the `name` provided to `getManager()` in `SslSocketManager`. This will ensure `SslSocketManager` will evict old connections upon a reconfiguration attempt with a different `SslConfiguration#getId()` value.

Sounds good to me. We can generate the id based on the issuers and serial numbers of the certificates in the two key stores.

2. Introduce `reloadInterval` to `SslConfiguration` and use it with `Configuration#getScheduler().schedule()` in `SocketAppender` to periodically call `Configuration#getLoggerContext().reconfigure()` – iff, `reloadInterval` is provided and `protocol` is SSL.

Do we really need to poll for configuration changes? Apache Tomcat does not poll the key stores for changes (see Can Tomcat reload its SSL certificate without being restarted?) forcing users to either restart the server when the server certificate changes or to use JMX.

Since the OP of LOG4J2-2988 only asks to reload the keystores after a reconfiguration, I think we don't need to introduce reloadInterval.

I am not strongly opinionated about this, but I believe that touching the configuration file after the keystore has been modified might be enough.

Reload the TLS socket. The easiest way to do it is to trigger a Log4j Core reconfiguration.

@ppkarwasz, is it so? Can't we instead create a new DelegatingOutputStreamManager extending from SslSocketManager that allows gracefully swapping (i.e., start the new, set the new, stop the old) the underlying SslSocketManager with another one?

TcpSocketManager already has a Reconnector thread that asynchronously reconnects the channel. I guess we could use Reconnector to establish a new socket after the contents of the key stores changes.

I admit, I have mixed feelings about this:

  • on one hand we put a lot of effort to have a reliable reconfiguration process. This process dictates half of Log4j Core architecture. We also discourage users from modifying the configuration in code.
  • on the other hand we have more and more components that modify their configuration at runtime, e.g. MutableThreadContextMapFilter.

@vy
Copy link
Member

vy commented Jul 30, 2024

LOG4J2-2988 only asks to reload the keystores after a reconfiguration

@ppkarwasz, you've have a point. Given this, aren't we practically done after introducing SslConfiguration#getId() (obtained using the issuers and serial numbers of the certificates in the two key stores) and incorporating it into the name of the manager?

@ppkarwasz
Copy link
Contributor

LOG4J2-2988 only asks to reload the keystores after a reconfiguration

@ppkarwasz, you've have a point. Given this, aren't we practically done after introducing SslConfiguration#getId() (obtained using the issuers and serial numbers of the certificates in the two key stores) and incorporating it into the name of the manager?

Yes, this would be my preferred approach.

@vy
Copy link
Member

vy commented Jul 30, 2024

@MichaelMorrisEst, would you be interested in updating your PR with the following changes?

  1. Introduce the new SslConfiguration#getId() method returning an identifier String obtained using the issuers and serial numbers of the certificates in the two key stores
  2. In SslSocketManager, incorporate the new SslConfiguration#getId() while determining the name passed to getManager()

@MichaelMorrisEst
Copy link
Contributor Author

Thanks @vy and @ppkarwasz for the review and the great discussion. I will update the PR with the suggested approach

@ppkarwasz ppkarwasz added this to the 2.25.0 milestone Aug 16, 2024
Signed-off-by: MichaelMorris <michael.morris@est.tech>
@vy
Copy link
Member

vy commented Aug 26, 2024

@MichaelMorrisEst, we have noticed your changes, thanks! We are busy with getting the 2.24.0 out. I will take care of this PR after reviewing changes that should go into 2.24.0.

@vy
Copy link
Member

vy commented Aug 29, 2024

@MichaelMorrisEst, we discussed how to proceed this with @ppkarwasz and we would like to carry out some further changes regarding how the SSL-related resources are generated. Would you mind giving myself and @ppkarwasz write access to your fork? So that we can push our changes to your branch and stick to this PR effectively.

vy added 2 commits August 30, 2024 11:54
- Revamp SSL key store generation script
- Remove redundant SSL certificates, key stores, etc.
- Improve SSL tests
@MichaelMorrisEst
Copy link
Contributor Author

No problem, you should have received notification, any problems let me know

@vy vy self-assigned this Sep 3, 2024
@vy vy requested a review from ppkarwasz September 3, 2024 11:30
@vy vy added the appenders Affects one or more Appender plugins label Sep 3, 2024
@vy vy added the bug Incorrect, unexpected, or unintended behavior of existing code label Sep 3, 2024
Copy link
Contributor

@ppkarwasz ppkarwasz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! 💯

I think this PR needs a little bit of massaging to remove most of the complexity of SslConfiguration. We should consider only two cases:

  • the user provides valid configuration values for all parameters,
  • one configuration parameter is invalid. In this case we refuse to connect via TLS.

@vy vy merged commit 51e34a3 into apache:2.x Sep 6, 2024
6 checks passed
@vy
Copy link
Member

vy commented Sep 6, 2024

@MichaelMorrisEst, this fix will be delivered in 2.25.0, FYI. Thanks so much for your patience and contribution.

@ppkarwasz
Copy link
Contributor

Remember to port it to main.

@ppkarwasz ppkarwasz deleted the log4j2-2988 branch September 6, 2024 19:24
vy pushed a commit that referenced this pull request Sep 9, 2024
Co-authored-by: Michael Morris <michael.morris@est.tech>
@vy vy added the configuration Affects the configuration system in a general way label Oct 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
appenders Affects one or more Appender plugins bug Incorrect, unexpected, or unintended behavior of existing code configuration Affects the configuration system in a general way
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants