Conversation

@marcschier
Collaborator

@marcschier marcschier commented Oct 17, 2025

Proposed changes

The following changes are contained in this pull request:

  • Allow setting ReturnDiagnostics on the session factory so that the diagnostic settings also apply to "Create". They are set to SymbolicIdAndText by default.
  • Better logging in unit tests: output now also goes to progress, including exception traces, and console lines are prefixed with SERVER or TEST so that each log entry can be attributed to the server or client side.
  • GDS client stability improvements so that the client reliably reconnects between calls when the connection is lost.
  • Do not dispose the application certificate when it is updated from the GDS client, so that existing sessions do not crash with CryptographicException. This needs more improvement later on because it is rather hacky today.
  • Remove the certificate cache in CertificateFactory (as per discussion and agreement), which allowed re-enabling the older UserIdentity API.
  • Logging improvements here and there to better analyze issues in unit tests.

Related Issues

  • Fixes #

Types of changes

What types of changes does your code introduce?
Put an x in the boxes that apply. You can also fill these out after creating the PR.

  • Bugfix (non-breaking change which fixes an issue)
  • Enhancement (non-breaking change which adds functionality)
  • Test enhancement (non-breaking change to increase test coverage)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected, requires version increase of Nuget packages)
  • Documentation Update (if none of the other choices apply)

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

  • I have read the CONTRIBUTING doc.
  • I have signed the CLA.
  • I ran tests locally with my changes, all passed.
  • I fixed all failing tests in the CI pipelines.
  • I fixed all introduced issues with CodeQL and LGTM.
  • I have added tests that prove my fix is effective or that my feature works and increased code coverage.
  • I have added necessary documentation (if appropriate).
  • Any dependent changes have been merged and published in downstream modules.

Further comments

If this is a relatively large or complex change, kick off the discussion by explaining why you chose the solution you did and what alternatives you considered, etc...

@marcschier marcschier requested a review from Copilot October 17, 2025 08:51
Contributor

Copilot AI left a comment

Pull Request Overview

Stabilizes GDS/server-related tests and enhances observability by routing previous Console output through structured logging, introducing diagnostics mask configuration, and refactoring telemetry/session creation. Key changes include: (1) pervasive replacement of direct certificate factory usage and constructor overload removals, (2) added diagnostics/telemetry plumbing (ReturnDiagnostics masks, activity source changes), and (3) retry / connection loop refactors plus logging improvements.

Reviewed Changes

Copilot reviewed 75 out of 75 changed files in this pull request and generated 12 comments.

Show a summary per file
| File | Description |
| --- | --- |
| Tests/Opc.Ua.Server.Tests/* | Removed explicit telemetry injections; rely on default NUnitTelemetryContext and updated activity source usage. |
| Tests/Opc.Ua.Gds.Tests/* | Logging replaces Console writes; GDS test startup simplified; constructors adjusted for new telemetry patterns. |
| Tests/Opc.Ua.Client.Tests/* | Added diagnostics masks to session creation; improved logging; modified retry logic in fixtures. |
| Tests/Opc.Ua.Client.ComplexTypes.Tests/* | MockResolver no longer needs telemetry; updated dictionary loading with cancellation token. |
| Tests/Common/Logging.cs | Reworked NUnit and BenchmarkDotNet logger providers; simplified creation API and added exception formatting helpers. |
| Stack/Opc.Ua.Core/* | Substantial refactors: certificate parsing now uses X509CertificateLoader, ServiceResult / Exception formatting changes, removal of ActivitySource static, added diagnostics handling, updated cryptography & error helpers. |
| Libraries/Opc.Ua.* | Added logging, diagnostics mask plumbing, retry logic refactors, disposal guards, certificate & identity handling changes, and improved error logging. |
| Applications/Quickstarts.* | Updated to new UserIdentity constructors (removed telemetry parameter) and logging changes. |
| Directory.Packages.props | Minor package version bumps. |
| .editorconfig | Expanded analyzer configuration and documentation / formatting comments. |

Comments suppressed due to low confidence (8)

Libraries/Opc.Ua.Gds.Server.Common/CertificateGroup.cs:525

  • The CA certificate is wrapped in a using statement then stored (via intermediate assignments not shown) and later returned from Certificates[certificateType]; disposing it here leaves a disposed instance in the cache. Remove the using and manage disposal only for truly temporary instances (or clone before disposing).
            using X509Certificate2 certificate = TryGetECCCurve(certificateType, out ECCurve curve)
                ? builder.SetECCurve(curve).CreateForECDsa()
                : builder
                    .SetHashAlgorithm(
                        X509Utils.GetRSAHashAlgorithmName(Configuration.CACertificateHashSize))
                    .SetRSAKeySize(Configuration.CACertificateKeySize)
                    .CreateForRSA();
#else
            using X509Certificate2 certificate = builder
                .SetHashAlgorithm(
                    X509Utils.GetRSAHashAlgorithmName(Configuration.CACertificateHashSize))
                .SetRSAKeySize(Configuration.CACertificateKeySize)
                .CreateForRSA();
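
The ownership problem described in this suppressed comment can be reproduced without any certificate code. The following is a minimal sketch under stated assumptions: `Resource`, `CacheDemo`, and both method names are hypothetical stand-ins and are not taken from CertificateGroup.cs. It only illustrates why a `using` declaration on an object that is also stored in a cache leaves a disposed instance behind.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a disposable such as X509Certificate2.
public sealed class Resource : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose() => IsDisposed = true;
}

// Hypothetical helper; the real code stores certificates per certificate type.
public static class CacheDemo
{
    // Problematic pattern: the using declaration disposes the instance when
    // the method exits, although the cache still holds a reference to it.
    public static Resource StoreWithUsing(Dictionary<string, Resource> cache, string key)
    {
        using Resource resource = new Resource();
        cache[key] = resource;
        return cache[key];
    }

    // Suggested pattern: no using; the cache owns the instance and is
    // responsible for disposing it when the entry is replaced or removed.
    public static Resource StoreWithoutUsing(Dictionary<string, Resource> cache, string key)
    {
        Resource resource = new Resource();
        cache[key] = resource;
        return resource;
    }
}
```

With this sketch, the instance returned by `StoreWithUsing` is already disposed by the time the caller sees it, while the one from `StoreWithoutUsing` stays usable until the cache owner disposes it.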

@codecov

codecov bot commented Oct 17, 2025

Codecov Report

❌ Patch coverage is 63.15315% with 409 lines in your changes missing coverage. Please review.
✅ Project coverage is 57.85%. Comparing base (ca43b64) to head (eb50e2a).
⚠️ Report is 1 commits behind head on master.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...a.Server/Configuration/ConfigurationNodeManager.cs | 66.66% | 55 Missing and 7 partials ⚠️ |
| ...Gds.Client.Common/ServerPushConfigurationClient.cs | 72.22% | 47 Missing and 13 partials ⚠️ |
| ...Security/Certificates/DirectoryCertificateStore.cs | 68.32% | 43 Missing and 8 partials ⚠️ |
| ...a.Gds.Client.Common/GlobalDiscoveryServerClient.cs | 79.19% | 29 Missing and 7 partials ⚠️ |
| Libraries/Opc.Ua.Client/Session/Session.cs | 17.50% | 33 Missing ⚠️ |
| ...k/Opc.Ua.Core/Stack/Tcp/UaSCBinaryClientChannel.cs | 0.00% | 18 Missing ⚠️ |
| ...Ua.Client/Session/Factory/DefaultSessionFactory.cs | 26.08% | 17 Missing ⚠️ |
| Stack/Opc.Ua.Core/Types/Utils/ServiceResult.cs | 55.26% | 17 Missing ⚠️ |
| Stack/Opc.Ua.Core/Stack/Tcp/TcpListenerChannel.cs | 28.57% | 15 Missing ⚠️ |
| Stack/Opc.Ua.Core/Stack/Server/EndpointBase.cs | 46.15% | 13 Missing and 1 partial ⚠️ |

... and 27 more

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3271      +/-   ##
==========================================
+ Coverage   57.81%   57.85%   +0.03%     
==========================================
  Files         365      365              
  Lines       79423    79723     +300     
  Branches    13870    13852      -18     
==========================================
+ Hits        45920    46123     +203     
- Misses      29338    29459     +121     
+ Partials     4165     4141      -24     

@marcschier marcschier marked this pull request as ready for review October 18, 2025 11:51
Contributor

@salihgoncu salihgoncu left a comment

Beyond the comments I posted, there are a lot of places where lock() statements are used that are not guaranteed to be called from synchronous methods only. These also need to be reviewed and preferably made async-aware.
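
For illustration only, here is a minimal sketch of what "async-aware" could look like, assuming `SemaphoreSlim` is chosen as the replacement for `lock()`. The type `AsyncGuardedCounter` and its members are hypothetical and do not appear in this PR; the point is that a `lock` body cannot contain `await`, whereas a semaphore with an initial count of 1 can guard a critical section that awaits.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical example type; not part of the PR.
public sealed class AsyncGuardedCounter : IDisposable
{
    // Initial count 1 and maximum count 1 make the semaphore behave like a mutex.
    private readonly SemaphoreSlim m_lock = new SemaphoreSlim(1, 1);
    private int m_value;

    public async Task<int> IncrementAsync(CancellationToken ct = default)
    {
        // Asynchronously wait for exclusive access instead of blocking a thread.
        await m_lock.WaitAsync(ct).ConfigureAwait(false);
        try
        {
            // Awaiting inside the critical section is allowed here;
            // the compiler rejects await inside a lock statement.
            await Task.Delay(1, ct).ConfigureAwait(false);
            return ++m_value;
        }
        finally
        {
            // Always release, even if the awaited work throws or is cancelled.
            m_lock.Release();
        }
    }

    public void Dispose() => m_lock.Dispose();
}
```

One trade-off of this approach is that `SemaphoreSlim` is not reentrant, so code that relies on re-entering the same `lock` from the same thread would need restructuring before such a conversion.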

@romanett
Contributor

@marcschier Sorry to bother you again. This PR is starting to touch many areas. Can you split these two out into separate, stable PRs? We can merge them immediately, IMO:

  • GDS Client Updates
  • Client / Test Logging improvements

I think those are significant enough that they should be in the history as their own commits.

@marcschier
Collaborator Author

@romanett - I will not have time to split this once more. It was already split from the switch statements (which had nothing to do with the test stability). This PR addresses the flakiness we see (GDS, certificates) and allows debugging of the issues in the pipelines (logging). Splitting it up again would take me too much time, because the tests on a partial change would still be so flaky that they would need to be restarted numerous times.

Contributor

@salihgoncu salihgoncu left a comment

This iteration looks much better. I'm fine with it if nobody has any objections.

{
    if (session == Session)
    {
        Session.Dispose();
Contributor

I think we should attempt a disconnect here.
Actually it would probably be better to extract the Dispose(&Disconnect) out from the lock/release block, like we do in DisconnectAsync() above.
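
A rough sketch of the pattern suggested here, assuming the lock in question is a `SemaphoreSlim` and that the session can be treated as a plain `IDisposable`. `SessionHolder`, `TrackingDisposable`, and `ReleaseAsync` are hypothetical names, not the client's actual members: the reference is detached while holding the lock, and the potentially slow dispose (or disconnect) runs only after the lock has been released.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a session object.
public sealed class TrackingDisposable : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose() => IsDisposed = true;
}

// Hypothetical holder; illustrates detaching under the lock, disposing outside.
public sealed class SessionHolder
{
    private readonly SemaphoreSlim m_lock = new SemaphoreSlim(1, 1);
    private IDisposable? m_session;

    public SessionHolder(IDisposable session) => m_session = session;

    public bool HasSession => m_session != null;

    public async Task ReleaseAsync(IDisposable session, CancellationToken ct = default)
    {
        IDisposable? toDispose = null;
        await m_lock.WaitAsync(ct).ConfigureAwait(false);
        try
        {
            // Only detach if the caller refers to the currently held session.
            if (ReferenceEquals(session, m_session))
            {
                toDispose = m_session;
                m_session = null;
            }
        }
        finally
        {
            m_lock.Release();
        }

        // Cleanup happens after the lock is released, so other callers
        // are not blocked while the old session is torn down.
        toDispose?.Dispose();
    }
}
```

Whether the extra step is worthwhile depends on how long cleanup can take in practice, and on how much time the calling callback can afford to spend.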

Collaborator Author

At this point the session is likely already defunct, and since this is a free-threaded callback, the less time we spend here the better. Let's leave it like this for now, IMO.

{
    if (Session != null)
    {
        Session.Dispose();
Contributor

Let's gracefully disconnect the session before disposing.

Collaborator Author

I think it is fine to just close the channel, as the session is a) defunct and being reconnected, or b) the customer has already called DisconnectAsync. Anything in between can be handled with some non-graceful behavior, IMO. The code would become a bit unwieldy if we allowed disconnect during connect, but in general the usability of these client classes leaves something to be desired (properties, connect, disconnect, etc.).

@marcschier marcschier merged commit ff0095c into master Oct 20, 2025
79 of 80 checks passed
@marcschier marcschier deleted the teststability2 branch October 20, 2025 16:02

5 participants