Document limits on static and dynamic linking for HPE NonStop platforms. #19952

Closed
rsbeckerca wants to merge 1 commit into openssl:openssl-3.1 from rsbeckerca:fix-19951

@rsbeckerca
Contributor

@rsbeckerca rsbeckerca commented Dec 21, 2022

Documentation is necessary because combining static linking with dynamic loading of OpenSSL causes a SIGSEGV during atexit() processing on the platform.

Fixes: #19951

Signed-off-by: Randall S. Becker <randall.becker@nexbridge.ca>

Checklist
  • documentation is added or updated

@t8m t8m added branch: master Applies to master branch approval: review pending This pull request needs review by a committer approval: otc review pending triaged: documentation The issue/pr deals with documentation (errors) branch: 3.0 Applies to openssl-3.0 branch branch: 3.1 Applies to openssl-3.1 (EOL) tests: exempted The PR is exempt from requirements for testing labels Dec 22, 2022
NOTES-NONSTOP.md Outdated
Comment on lines 53 to 56
libraries and dynamically load OpenSSL DLLs concurrently. If this is done,
there is a high probability of encountering a SIGSEGV condition relating to
`atexit()` processing when the DLL is unloaded and when the program terminates.
This limitation applies to all OpenSSL DLL components.
Member

I'd replace "DLL" with "shared library", as DLL is a rather Windows-specific acronym.

Contributor Author

Will make that change imminently.

@t8m
Member

t8m commented Dec 22, 2022

This should be merged to master, 3.0, and 3.1 branches.

Documentation is necessary as static and dynamic linking cause SIGSEGV
during atexit() processing on the platform.

Fixes: 19951

Signed-off-by: Randall S. Becker <randall.becker@nexbridge.ca>
@rsbeckerca
Contributor Author

> This should be merged to master, 3.0, and 3.1 branches.

Do you need separate PRs for that?

@t8m
Member

t8m commented Dec 22, 2022

> > This should be merged to master, 3.0, and 3.1 branches.
>
> Do you need separate PRs for that?

No, I assume this will cherry-pick cleanly.

@rsbeckerca
Contributor Author

> > > This should be merged to master, 3.0, and 3.1 branches.
> >
> > Do you need separate PRs for that?
>
> No, I assume this will cherry-pick cleanly.

It should. I have not mucked with that file since early 3.0.

@mattcaswell mattcaswell added approval: done This pull request has the required number of approvals and removed approval: review pending This pull request needs review by a committer labels Jan 3, 2023
@bernd-edlinger
Member

I think this description is incomplete and does not reflect the reason why
90-test_shlibload.t has been disabled on NonStop:

plan skip_all => "Test is disabled on NonStop" if config('target') =~ m|^nonstop|;

As far as I understand the issue, a crash will also happen if the application
does NOT link to libcrypto.so itself but loads a library that links to libcrypto.so.

@openssl-machine
Collaborator

24 hours has passed since 'approval: done' was set, but as this PR has been updated in that time the label 'approval: ready to merge' is not being automatically set. Please review the updates and set the label manually.

@t8m
Member

t8m commented Jan 4, 2023

Before merging I am waiting on @rsbeckerca to respond to Bernd's comment above.

@levitte
Member

levitte commented Jan 4, 2023

Note that this is potentially not a NonStop-only problem...

@rsbeckerca
Contributor Author

> I think this description is incomplete and does not reflect the reason why 90-test_shlibload.t has been disabled on NonStop:
>
> plan skip_all => "Test is disabled on NonStop" if config('target') =~ m|^nonstop|;
>
> As far as I understand the issue, a crash will also happen if the application
> does NOT link to libcrypto.so itself but loads a library that links to libcrypto.so.

The ordinals for entry points in DLLs on NonStop do not line up with those produced by the Linux linker, which is why this test failed in the 3.0.0 series. The requirement to have a consistent set of ordinals seems to be architecture-specific and, in my opinion, unnecessary. I can document this in a separate PR, if requested, but I think 90-test_shlibload has nothing to do with this specific issue.

@bernd-edlinger
Member

Hmm, I don't understand at all.
This test uses a stand-alone executable that dynamically loads libcrypto.so,
gets the addresses of several exported functions by name, and calls them.
It is specifically designed to check whether the atexit callback is called,
depending on whether OPENSSL_init_crypto(OPENSSL_INIT_NO_ATEXIT, NULL) has been
called.
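
For readers unfamiliar with the test, the pattern described above (load libcrypto at run time, look up an entry point by name, optionally suppress the atexit handler) might be sketched as follows. This is a minimal illustration and not the code of shlibloadtest: the helper name `load_and_init_crypto` is hypothetical, the flag value is an assumption copied from `<openssl/crypto.h>` (real code should include that header), and the settings argument is typed as `const void *` only to keep the sketch free of OpenSSL headers.

```c
#include <dlfcn.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Assumption: value of OPENSSL_INIT_NO_ATEXIT as defined in
 * <openssl/crypto.h>; redefined here only to keep the sketch header-free.
 */
#define SKETCH_OPENSSL_INIT_NO_ATEXIT 0x00080000L

/* Mirrors: int OPENSSL_init_crypto(uint64_t opts, const OPENSSL_INIT_SETTINGS *) */
typedef int (*init_crypto_fn)(uint64_t opts, const void *settings);

/*
 * Hypothetical helper: load the library at `path`, resolve
 * OPENSSL_init_crypto by name and call it.
 * Returns 1 on success, 0 if initialisation failed, and -1 if the
 * library or the symbol could not be found. On success or init failure,
 * *handle_out holds the dlopen() handle; otherwise it is set to NULL.
 */
int load_and_init_crypto(const char *path, int no_atexit, void **handle_out)
{
    void *handle;
    init_crypto_fn init;
    int ok;

    *handle_out = NULL;
    handle = dlopen(path, RTLD_NOW | RTLD_GLOBAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    init = (init_crypto_fn)dlsym(handle, "OPENSSL_init_crypto");
    if (init == NULL) {
        dlclose(handle);
        return -1;
    }

    /* With the flag set, OpenSSL does not register its atexit() cleanup. */
    ok = init(no_atexit ? SKETCH_OPENSSL_INIT_NO_ATEXIT : 0, NULL);
    *handle_out = handle;
    return ok ? 1 : 0;
}
```

A caller would pass something like `"libcrypto.so"` as `path` and later `dlclose()` the returned handle. Whether unloading is safe when the atexit handler was registered is exactly what this discussion is about.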

@rsbeckerca
Contributor Author

> Hmm, I don't understand at all. This test uses a stand-alone executable that dynamically loads libcrypto.so, gets the addresses of several exported functions by name, and calls them. It is specifically designed to check whether the atexit callback is called, depending on whether OPENSSL_init_crypto(OPENSSL_INIT_NO_ATEXIT, NULL) has been called.

The following is what happens now when I enable the test, and what happened when I reported this originally:

```
90-test_shlibload.t ..
# The results of this test will end up in test-runs/test_shlibload
1..10
../../util/wrap.pl ../../test/shlibloadtest -crypto_first libcrypto.so libssl.so atexit-cryptofirst.txt => 139
not ok 1 - running shlibloadtest -crypto_first atexit-cryptofirst.txt

#   Failed test 'running shlibloadtest -crypto_first atexit-cryptofirst.txt'
#   at test/recipes/90-test_shlibload.t line 34.
readline() on closed filehandle $fh at test/recipes/90-test_shlibload.t line 68.
not ok 2

#   Failed test at test/recipes/90-test_shlibload.t line 36.
../../util/wrap.pl ../../test/shlibloadtest -ssl_first libcrypto.so libssl.so atexit-sslfirst.txt => 139
not ok 3 - running shlibloadtest -ssl_first atexit-sslfirst.txt

#   Failed test 'running shlibloadtest -ssl_first atexit-sslfirst.txt'
#   at test/recipes/90-test_shlibload.t line 40.
readline() on closed filehandle $fh at test/recipes/90-test_shlibload.t line 68.
not ok 4

#   Failed test at test/recipes/90-test_shlibload.t line 42.
../../util/wrap.pl ../../test/shlibloadtest -just_crypto libcrypto.so libssl.so atexit-justcrypto.txt => 139
not ok 5 - running shlibloadtest -just_crypto atexit-justcrypto.txt

#   Failed test 'running shlibloadtest -just_crypto atexit-justcrypto.txt'
#   at test/recipes/90-test_shlibload.t line 46.
readline() on closed filehandle $fh at test/recipes/90-test_shlibload.t line 68.
not ok 6

#   Failed test at test/recipes/90-test_shlibload.t line 48.
DSO_dsobyaddr() failed
../../util/wrap.pl ../../test/shlibloadtest -dso_ref libcrypto.so libssl.so atexit-dsoref.txt => 139
not ok 7 - running shlibloadtest -dso_ref atexit-dsoref.txt

#   Failed test 'running shlibloadtest -dso_ref atexit-dsoref.txt'
#   at test/recipes/90-test_shlibload.t line 52.
readline() on closed filehandle $fh at test/recipes/90-test_shlibload.t line 68.
not ok 8

#   Failed test at test/recipes/90-test_shlibload.t line 54.
../../util/wrap.pl ../../test/shlibloadtest -no_atexit libcrypto.so libssl.so atexit-noatexit.txt => 0
ok 9 - running shlibloadtest -no_atexit atexit-noatexit.txt
readline() on closed filehandle $fh at test/recipes/90-test_shlibload.t line 68.
ok 10
# Looks like you failed 8 tests of 10.
Dubious, test returned 8 (wstat 2048, 0x800)
Failed 8/10 subtests

Test Summary Report
-------------------
90-test_shlibload.t (Wstat: 2048 Tests: 10 Failed: 8)
  Failed tests:  1-8
  Non-zero exit status: 8
Files=1, Tests=10,  1 wallclock secs ( 0.01 usr  0.00 sys +  0.58 cusr  0.01 csys =  0.61 CPU)
Result: FAIL
```

@bernd-edlinger
Member

bernd-edlinger commented Jan 4, 2023

Okay, now this is what happens 4 times:

../../util/wrap.pl ../../test/shlibloadtest -crypto_first libcrypto.so libssl.so atexit-cryptofirst.txt => 139

139 is not a normal result code of shlibloadtest, which returns 0 for success or 1 for a "normal" error.

#   at test/recipes/90-test_shlibload.t line 34.
readline() on closed filehandle $fh at test/recipes/90-test_shlibload.t line 68.

The atexit callback was not called when expected.

But 2 test cases pass:

../../util/wrap.pl ../../test/shlibloadtest -no_atexit libcrypto.so libssl.so atexit-noatexit.txt => 0
ok 9 - running shlibloadtest -no_atexit atexit-noatexit.txt
readline() on closed filehandle $fh at test/recipes/90-test_shlibload.t line 68.
ok 10

Exactly the one test case with -no_atexit does not crash.
Note that the readline() on closed filehandle message is a glitch that is probably
normal and does not make the test fail, because the test expectation is that the
atexit handler is not called and therefore the result file does not exist.

So to me this looks like exactly the same issue.

@rsbeckerca
Contributor Author

> Okay, now this is what happens 4 times:
>
> ../../util/wrap.pl ../../test/shlibloadtest -crypto_first libcrypto.so libssl.so atexit-cryptofirst.txt => 139
>
> 139 is not a normal result code of shlibloadtest, which returns 0 for success or 1 for a "normal" error.
>
> #   at test/recipes/90-test_shlibload.t line 34.
> readline() on closed filehandle $fh at test/recipes/90-test_shlibload.t line 68.
>
> The atexit callback was not called when expected.
>
> But 2 test cases pass:
>
> ../../util/wrap.pl ../../test/shlibloadtest -no_atexit libcrypto.so libssl.so atexit-noatexit.txt => 0
> ok 9 - running shlibloadtest -no_atexit atexit-noatexit.txt
> readline() on closed filehandle $fh at test/recipes/90-test_shlibload.t line 68.
> ok 10
>
> Exactly the one test case with -no_atexit does not crash. Note that the readline() on closed filehandle message is a glitch that is probably normal and does not make the test fail, because the test expectation is that the atexit handler is not called and therefore the result file does not exist.
>
> So to me this looks like exactly the same issue.

The 139 is typically a SIGSEGV. I guess it is the same issue in that case. Do you want the two tests mentioned in the updated documentation as a result?
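
The status values that appear in the log can be decoded with simple arithmetic. The sketch below assumes the test wrapper reports shell-style statuses (128 plus the signal number for a process killed by a signal) and that the wait status word carries the exit code in bits 8 to 15, as is typical on POSIX systems. Both are assumptions about the platform, and the function names are hypothetical.

```c
#include <signal.h>

/*
 * Shell-style status: values above 128 conventionally mean
 * "terminated by signal (status - 128)". Returns 0 when the status
 * does not look like a signal death.
 */
int signal_from_shell_status(int status)
{
    return status > 128 ? status - 128 : 0;
}

/* Returns 1 if the shell-style status corresponds to SIGSEGV. */
int status_is_sigsegv(int status)
{
    return signal_from_shell_status(status) == SIGSEGV;
}

/* Extracts the exit code from a wait()-style status word (bits 8..15). */
int exit_code_from_wstat(int wstat)
{
    return (wstat >> 8) & 0xff;
}
```

On a system where SIGSEGV is signal 11, `signal_from_shell_status(139)` yields 11, and `exit_code_from_wstat(2048)` yields the 8 reported as "test returned 8 (wstat 2048, 0x800)".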

@bernd-edlinger
Member

I have no idea how to properly describe this mess :-(
Can you check whether this fixes the SIGSEGV in shlibloadtest?

diff --git a/test/simpledynamic.c b/test/simpledynamic.c
index 2cced8c..56fbaf2 100644
--- a/test/simpledynamic.c
+++ b/test/simpledynamic.c
@@ -22,6 +22,7 @@ int sd_load(const char *filename, SD *lib, int type)
     if (filename[strlen(filename) - 1] == ')')
         dl_flags |= RTLD_MEMBER;
 #endif
+    (void) dlopen(filename, dl_flags);
     *lib = dlopen(filename, dl_flags);
     return *lib == NULL ? 0 : 1;
 }
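
The effect of the extra `dlopen()` call in the diff above can be illustrated with a small sketch. On systems where the dynamic loader keeps a reference count per loaded object (documented for Linux, and only an assumption here for NonStop), a second successful `dlopen()` means a single `dlclose()` no longer unloads the library. The helper name `symbol_survives_one_close` is hypothetical and is not part of the OpenSSL test suite.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Hypothetical helper: open `filename` twice, close it once, and check
 * that `symbol` can still be resolved through the remaining handle.
 * Returns 1 if the symbol is still found, 0 if it is not, and -1 if the
 * library could not be loaded.
 */
int symbol_survives_one_close(const char *filename, const char *symbol)
{
    int flags = RTLD_NOW | RTLD_GLOBAL;
    void *pin = dlopen(filename, flags);  /* first reference */
    void *lib = dlopen(filename, flags);  /* second reference */
    int found;

    if (pin == NULL || lib == NULL) {
        if (pin != NULL)
            dlclose(pin);
        if (lib != NULL)
            dlclose(lib);
        return -1;
    }

    dlclose(lib);  /* drops one reference; the other keeps the library mapped */
    found = dlsym(pin, symbol) != NULL;

    /* The diff above never releases its extra reference; this sketch does. */
    dlclose(pin);
    return found;
}
```

If the SIGSEGV disappears with the extra reference in place, that would suggest the crash is tied to the library being unloaded while its atexit handler is still registered.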

@t8m
Member

t8m commented Jan 10, 2023

Anyone who wants to put a hold on this PR has until tomorrow to do so. Otherwise I'll merge it, as it is better than nothing.

@t8m
Member

t8m commented Feb 8, 2023

Merged to master, 3.1, and 3.0 branches. Thank you for your contribution.

Anybody is welcome to submit a PR that improves the text further.

@t8m t8m closed this Feb 8, 2023
openssl-machine pushed a commit that referenced this pull request Feb 8, 2023
Documentation is necessary as static and dynamic linking cause SIGSEGV
during atexit() processing on the platform.

Fixes: 19951

Signed-off-by: Randall S. Becker <randall.becker@nexbridge.ca>

Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from #19952)
openssl-machine pushed a commit that referenced this pull request Feb 8, 2023
Documentation is necessary as static and dynamic linking cause SIGSEGV
during atexit() processing on the platform.

Fixes: 19951

Signed-off-by: Randall S. Becker <randall.becker@nexbridge.ca>

Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from #19952)

(cherry picked from commit e80518d)
openssl-machine pushed a commit that referenced this pull request Feb 8, 2023
Documentation is necessary as static and dynamic linking cause SIGSEGV
during atexit() processing on the platform.

Fixes: 19951

Signed-off-by: Randall S. Becker <randall.becker@nexbridge.ca>

Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from #19952)

(cherry picked from commit e80518d)