Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected cast of True to 0 in ctypes on macOS ARM #107844

Open
jaraco opened this issue Aug 10, 2023 · 8 comments
Open

Unexpected cast of True to 0 in ctypes on macOS ARM #107844

jaraco opened this issue Aug 10, 2023 · 8 comments
Labels
OS-mac topic-ctypes type-bug An unexpected behavior, bug, or error

Comments

@jaraco
Copy link
Member

jaraco commented Aug 10, 2023

In openssl/openssl#21709, I reported an issue with OpenSSL where I was passing encrypt=True to a C function in OpenSSL but getting an error that implied that encrypt=False. On all Pythons and platforms except macOS for ARM (including Windows ARM), the encrypt value is treated as 1 and the routine works as expected. On macOS ARM, however, the behavior is different and the True value is received by the C API as 0.

This issue occurs only when the function signature is underspecified. That is, when the argtypes has not specified the parameter.

I haven't yet distilled a minimal repro for Python, but I did distill a minimal repro for OpenSSL in this gist.

What I would expect instead:

  • If accepted as a parameter, True should resolve to 1 (or at least non-zero).
  • Behavior should be consistent across platforms.
  • Either argtypes must be fully specified (and error if parameters are passed for underspecified argtypes) or calls should provide reasonable default transformations for unspecified arguments.
@jaraco jaraco added the type-bug An unexpected behavior, bug, or error label Aug 10, 2023
@jaraco jaraco removed the topic-SSL label Aug 10, 2023
@jaraco
Copy link
Member Author

jaraco commented Aug 10, 2023

I've unflagged topic-SSL, because the use of OpenSSL here is most likely incidental to the issue and because it doesn't involve Python's use of SSL.

@ronaldoussoren
Copy link
Contributor

The gist doesn't give an error for me if I change lib_path to "/Library/Frameworks/Python.framework/Versions/3.12/lib/libcrypto.3.dylib" with Python 3.12rc1 installed using the python.org installer.

MacOS/arm64 has different calling conventions for regular and variadic functions as described here, which in general requires defining argtypes for the fixed arguments. That doesn't appear to be an issue with this gist though.

@ned-deily
Copy link
Member

Perhaps a different version of libffi, i.e.not using the macOS system libffi?

@jaraco
Copy link
Member Author

jaraco commented Aug 10, 2023

I may have caused some confusion with the gist. Version 1 of the gist fails. version 2 works around the failure and runs without fail.

@jaraco
Copy link
Member Author

jaraco commented Aug 10, 2023

I've got Python 3.12b4 with libcrypto.1.1.dylib and I'm able to replicate the failure with that library.

@ronaldoussoren
Copy link
Contributor

I can reproduce the issue with version 1 of the gist, this gives the following error:

Error finalizing cipher: error:1C80006B:Provider routines::wrong final block length

Version 1 contains:

encrypt = True
lib.EVP_CipherInit_ex.argtypes = (
    ctypes.POINTER(Cipher),
    ctypes.POINTER(CipherType),
    ctypes.c_void_p,
    ctypes.c_char_p,
    ctypes.c_char_p,
)

res = lib.EVP_CipherInit_ex(ctx, cipher.contents, engine, key, iv, encrypt)

This calls EVP_CipherInit_ex with one argument more than declared in argtypes. This triggers passing the last argument using the calling conventions for variadic functions.

BTW. I have never looked into what the difference in calling convention entails, other than being told by an Apple engineer that the calling convention is different for the variadic parts of a call.

I'm afraid there is no bug here, this is expected behaviour.

It might be possible to add an attribute that marks a callable as having variadic arguments, but that would have to default to the current behaviour for backward compatibility reasons. Because of this such a flag would just complicate the interface with few to no advantages.

@jaraco
Copy link
Member Author

jaraco commented Aug 11, 2023

This calls EVP_CipherInit_ex with one argument more than declared in argtypes. This triggers passing the last argument using the calling conventions for variadic functions.

Oh, interesting. Thanks for the insight. In this case, the function isn't variadic:

int EVP_CipherInit_ex(EVP_CIPHER_CTX *ctx, const EVP_CIPHER *type,
                      ENGINE *impl, const unsigned char *key, const unsigned char *iv, int enc);

The reason that only some of the args were defined was because the signature is nearly identical across the "init" functions, (presumably) varying only by the enc arg. Looking more deeply into the signatures, they vary by more than just the enc arg. The Encrypt and Decrypt variants don't take an ENGINE * either, so those signatures are just broken.

Closing the loop, it seems that the reason the function works on every other platform is because the calling convention is the same, so the unspecified argument would get passed normally except on Apple's ARM. I would have never thought to consult the section on variadic functions, because (a) I authored this code 23 years ago and (b) I'm not working with variadic functions.

It might be possible to add an attribute that marks a callable as having variadic arguments, but that would have to default to the current behaviour for backward compatibility reasons. Because of this such a flag would just complicate the interface with few to no advantages.

I'm following, but I think there may still be opportunity for improvement.

Why not consider breaking compatibility and resolving the ambiguity:

  • In Python (next), add an attribute marking a callable as having variadic arguments (or maybe a marker like argtypes = [ctypes.Variable(ctypes.char_p)]). Detect when variadic is inferred and warn about its use.
  • In Python (next+2), only use the variadic convention when specified and otherwise pass extra parameters using the normal convention or fail hard when calling with more arguments than specified.

This approach would make the calling convention more visible and safer and avoid this foot gun, one which could easily manifest as a security issue. It would immediately raise awareness of the issue and link to a resource that affected users could use to avoid the mistake.

@ronaldoussoren
Copy link
Contributor

This calls EVP_CipherInit_ex with one argument more than declared in argtypes. This triggers passing the last argument using the calling conventions for variadic functions.

Oh, interesting. Thanks for the insight. In this case, the function isn't variadic:

I know, but it the call is deduced to be variadic because the number of actual arguments is larger than the number of declared arguments.

At the time this logic was added I didn't consider anyone leaving off arguments from argtypes other than for variadic functions. That's not something I'd ever do, and I'd likely flag such code in reviews as incorrect. That said, I'm at best a very light user of ctypes.

[...]

Closing the loop, it seems that the reason the function works on every other platform is because the calling convention is the same, so the unspecified argument would get passed normally except on Apple's ARM. I would have never thought to consult the section on variadic functions, because (a) I authored this code 23 years ago and (b) I'm not working with variadic functions.

The ctypes documentation in general could use some love, it currently is a weird mix of reference documentation and a tutorial on how to use it. I've added the section about variadic arguments to match style, but definitely wouldn't claim that section is easily discoverable.

BTW. This likely affects all arm platforms with hard float. I've done some spelunking through libffi's history and ended up at this patch: https://patches.linaro.org/project/libffi-discuss/patch/20110222154022.GA29862@davesworkthinkpad/. Libffi's GitHub contains an indirect reference to this patch ("Add David Gilbert's variadic function call support").

It might be possible to add an attribute that marks a callable as having variadic arguments, but that would have to default to the current behaviour for backward compatibility reasons. Because of this such a flag would just complicate the interface with few to no advantages.

I'm following, but I think there may still be opportunity for improvement.

Why not consider breaking compatibility and resolving the ambiguity:

  • In Python (next), add an attribute marking a callable as having variadic arguments (or maybe a marker like argtypes = [ctypes.Variable(ctypes.char_p)]). Detect when variadic is inferred and warn about its use.
  • In Python (next+2), only use the variadic convention when specified and otherwise pass extra parameters using the normal convention or fail hard when calling with more arguments than specified.

This approach would make the calling convention more visible and safer and avoid this foot gun, one which could easily manifest as a security issue. It would immediately raise awareness of the issue and link to a resource that affected users could use to avoid the mistake.

I've thought about this some more, and I agree. Not just because of this particular issue, but also because we currently have no way to call variadic functions where some of the variadic arguments need explicit annotations in argtypes.

There's two ways to do this:

  1. Add a marker for the last regular or first variadic argument to argtypes (e.g. your proposal)
  2. Add a new attribute that marks the index of the first variadic argument

I currently prefer the first option, but haven't really thought about the implications of the various options.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
OS-mac topic-ctypes type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

5 participants