-
Notifications
You must be signed in to change notification settings - Fork 30.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Foreign Function Interface (FFI) implementation #46905
base: main
Are you sure you want to change the base?
Conversation
Review requested:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should disable it by default when using the Permission model, right?
2a74523
to
5e33eeb
Compare
The actual approach looks neat and the changes look like the correct approach to do this. Still I have to ask - is this common enough to be in core and not a userland module? |
With desktop applications built on Chromium + Node.js frameworks (Electron, nw.js, etc) becoming the common standard, having a means to interact with native libraries without needing to reach for third party modules would be a big improvement. |
Calling into native code is common enough, given the myriad native modules on npm. The primary motivation for FFI, at least from my perspective, is the ability to call into native code without the usage of an additional C++ binding layer. This comes up particularly with pre-built dynamic libraries. For example, the official Oracle DB driver requires that a closed-source dynamic library be installed. This means that (currently) an additional C++ addon must be either compiled or prebuilt. With FFI built into core, the close-source dynamic library can be used without a C++ addon. For another fun example,
The real win here is less of a need for native addons. |
This is a common problem integrating applications with Steamworks (the API for Steam). The library cannot be publicly distributed and the popular native addon for it is now unmaintained and broken. With FFI, this wouldn't be an issue for a lot of people. |
Hello from Deno world! Nice to see FFI in Node as well: At least I've been completely taken by it and have spent significant amounts of time both thinking about and contributing to it on Deno side. Few things that I would bring up from that experience:
I implemented FFI callbacks for Deno about a year ago, I think? The runtime backends are of course quite different so in that sense I doubt there's much that I could "teach" you. One thing that does spring to mind about that is, however, again regarding V8 Fast API: The Fast API has a requirement that the calls may not call back into V8 / JavaScript. This means that for FFI API declarations (if you take Fast API into use), you'll need to have some sort of flag to opt out of Fast API. For Deno that's This opt out needs to happen in sometimes surprising places. eg. I've been playing with creating C++ lambdas from JavaScript. When passing a lambda into an API, any copies made of the lambda will callback, so APIs that pass a lambda or that may cause a lambda (created by me) to be passed on the native side need the I hope some of this may be useful to you! If you want to know more about how Deno's FFI works, I've written a bunch about it here. Feel free to ping me or send me an email or something if you have any direct questions you would want to ask! And if I am speaking out of turn, my apologies: I absolutely mean no offense. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be a good idea to make FfiSignature
an ObjectWrap
that could be wrapped around the function created in getNativeFunction
that way the destructor could be used to clean up the mallocs for cif
and argv
.
src/node_ffi.cc
Outdated
libraries[fname] = lib; | ||
bool success = lib->Open(); | ||
if (!success) { | ||
// TODO(bengl) what now? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Throw and cleanup libraries[fname]
?
src/node_ffi.cc
Outdated
void GetBufferPointer(const v8::FunctionCallbackInfo<Value>& args) { | ||
CHECK(args[0]->IsArrayBuffer()); | ||
void* data = args[0].As<ArrayBuffer>()->Data(); | ||
|
||
ReturnBigInt(args.GetIsolate(), data); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this not happen purely on the JS side? Seems to me like it would be faster to avoid the native barrier hop here if possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None of the Buffer types have pointer addresses available on the JS side.
We could add them to Buffer
, and special case those? Also, there's definitely an opportunity to cache the address per ArrayBuffer, so that calls to other ArrayBuffer views on the same underlying ArrayBuffer can be handled in JS.
@aapoalas Thanks for all the info and ideas! Especially your notes on callbacks, and this set of notes.
Yes, before I take this PR out of Draft mode, I'll be ensuring that it's covered by the permission model. That being said, I don't see this as being any more dangerous (from a security perspective) than being able to load native addons.
One of the goals here is to enable pointer math. I'd prefer not having to require multiple buffers in order to pass multiple pointers into a function. For example, in a test file, I have this: Lines 75 to 81 in 5e33eeb
I could enable this with a wrapper class, and get a new The biggest issue with this, right now, is that my implementation stuffs all arguments and a slot for the return value into a Buffer (which is re-used for each call). I don't know how I would do that if pointers were
Hmm, it sounds like this might be because of implementation details. It might not actually apply here?
I don't think it would make sense to only support strings. Pointers (or Buffers, if taking that approach) would stuff need to be supported. Strings could also be cached. |
No problem, I hope it's helpful!
Yeah, it's not opening more doors really, but it is much easier to do nasty stuff with FFI since it becomes possible to keep most of the work inside JS.
Deno only in the most recent minor version update changed from pointer numbers to pointer objects. With that change I added a few APIs: Creating a pointer from a number (dangerous!), offsetting a pointer, and getting the numeric value of a pointer. These of course exist to enable pointer arithmetic when needed. Only today I changed one private repo to use the new API and admittedly assigning pointers to buffers became harder, one might even say tedious. Still, it wasn't a terribly common use case, at least in that code base. In essence, I'd argue that the performance that Fast API offers is more important :) but that can of course be construed to be "sour grapes" :D
I'm no security expert, but isn't pointer spoofing one relatively common attack? Any number always being a pointer does feel a bit scary to me in that sense :) From an ergonomics point of view it's also a bit unfortunate that there is no difference between a pointer and a number. Side note: ARM might have pointer cryptography turned on, in which case pointer arithmetics would need to prepare for number OR BigInt values.
Ah okay, yeah I hadn't realized you're doing this in a totally different way. Deno passes the parameters "traditionally".
Yeah, since you're just passing a buffer, we're on very different paths :) It's a very intriguing choice. Due to Deno's reliance on V8 Fast API it wouldn't make sense to us, but without that and JIT compiling the native side glue code it definitely makes sense. Nice stuff!
I'm not sure how strings could be cached in a reasonable way, since you'd want to avoid leaking memory and thus would need to bind the encoded string's buffer into the lifetime of the string itself, somehow, but strings cannot be used as keys in WeakMaps. So, it would then make sense to just encode a string once and return the buffer to the user, after which they can manage the lifetime. This then just becomes essentially the same as TextDecoder, I think. |
Actually, how should this work? Should it just be enabled or disabled, like with worker threads or child processes? Since the filename parameter to |
I think this should be a separate permission (similar to child process and worker) with its own flag. When the permission system is enabled, access to FFI should be restricted unless the |
Exactly. The safest option is to disable it by default when using |
@RafaelGSS In the most recent rev, I have it behaving exactly as the worker and child process permissions do (that is, it's only disabled if an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest including a test for the new CLI argument (--allow-ffi
). See:
test-permission-deny-worker-threads.js
test-permission-deny-worker-threads-cli.js
test-permission-deny-allow-worker-cli.js
doc/api/cli.md
Outdated
@@ -547,6 +589,7 @@ following permissions are restricted: | |||
\[`--allow-fs-read`]\[],\[`allow-fs-write`]\[] flags | |||
* Child Process - manageable through \[`--allow-child-process`]\[] flag | |||
* Worker Threads - manageable through \[`--allow-worker`]\[] flag | |||
* FFI - manageable through \[`--allow-ffo`]\[] flag |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* FFI - manageable through \[`--allow-ffo`]\[] flag | |
* FFI - manageable through \[`--allow-ffi`]\[] flag |
@bengl I've just seen the patch. All good. |
More of a practical matter: I suggest removing unused big files from libffi (like the ChangeLog files) to avoid bloating the git tree and the source tarballs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few suggestions to simplify things slightly, but they're minor nits. Other than that, LGTM.
src/node_ffi.cc
Outdated
|
||
CHECK(args[0]->IsBigInt()); | ||
binding::DLib* lib = | ||
(binding::DLib*)args[0].As<BigInt>()->Uint64Value(&lossless); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need the lossless argument if you aren't going to read it. Perhaps we should be reading it though and throwing if it's lossy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In all the cases here where we're calling Uint8Value()
, it's always with a pointer retrieved from this binding (not even exposed to the user). I think we're probably fine just getting rid of that argument everywhere in this file.
src/node_ffi.cc
Outdated
static bool lossless = false; | ||
|
||
#define ReturnBigInt(isolate, ptr) \ | ||
args.GetReturnValue().Set(BigInt::NewFromUnsigned(isolate, (uint64_t)ptr)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calls to this seem to use args.GetIsolate()
and env->isolate()
interchangeably, but it always needs args
present anyway. Could probably just drop the isolate parameter and make it always use args.GetIsolate()
internally.
src/node_ffi.cc
Outdated
bool isNull = true; | ||
std::string fname = ""; | ||
if (!args[0]->IsNull()) { | ||
CHECK(args[0]->IsString()); | ||
node::Utf8Value filename(env->isolate(), args[0]); | ||
fname = filename.ToString(); | ||
isNull = false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could simplify that to something like:
std::string fname = "";
CHECK(args[0]->IsNull() || args[0]->IsString());
if (args[0]->IsString()) {
fname = *node::Utf8Value(env->isolate(), args[0]);
}
Then you can check fname.empty()
rather than needing isNull
.
related art (from Java): https://openjdk.org/projects/panama/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we target Node.js >= 21?
I don't see why this should be a semver major change. Backporting to v20 would be a great win. |
``` | ||
|
||
```cjs | ||
const ffi = require('ffi'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be node:ffi
, and we need to ensure that the node:
prefix is required so that this can be backported.
|
||
<!--introduced_in=REPLACEME--> | ||
|
||
> Stability: 1 - Experimental |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
> Stability: 1 - Experimental | |
> Stability: 1.1 - Active development |
Looks like this didn't make 21. Let me know if there's anything I can help with from the libffi side. I expect to make a new release in the next few weeks. |
} = require('internal/errors').codes; | ||
|
||
// [ sigPtr | fnPtr | rvalue | avaluesPointers ] | ||
const callBuffer = new ArrayBuffer(4096); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-blocking nit: likely worth making 4096
a constant.
'signed long int': 'long', | ||
'unsigned long': 'ulong', | ||
'unsigned long int': 'ulong', | ||
// 'long long': , |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A comment here about why these are commented out would be good
if (lib !== null && typeof lib !== 'string') { | ||
throw new ERR_INVALID_ARG_TYPE('library', ['string', 'null'], lib); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use validateString()
helper
const libCache = {}; | ||
|
||
const POINTER_SIZE = sizes['char*']; | ||
const NULL = Symbol('null'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const NULL = Symbol('null'); | |
const kNULL = Symbol('null'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! A few nits:
setCallBuffer, | ||
getBufferPointer: getBufferPointerInternal, | ||
FfiSignature, | ||
makeCall, | ||
getSymbol, | ||
getLibrary, | ||
types, | ||
sizes, | ||
charIsSigned, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: consider sorting the entries in ASCII order
if (type.includes('*')) return typesToFfiTypes.pointer; | ||
if (type.endsWith('_t')) return type.replace(/_t$/, ''); | ||
if (type.includes('long long')) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (type.includes('*')) return typesToFfiTypes.pointer; | |
if (type.endsWith('_t')) return type.replace(/_t$/, ''); | |
if (type.includes('long long')) { | |
if (StringPrototypeIncludes(type, '*')) return typesToFfiTypes.pointer; | |
if (StringPrototypeEndsWith(type, '_t')) return StringPrototypeReplaceAll(type, '_t', ''); | |
if (StringPrototypeIncludes(type, 'long long')) { |
if (type.endsWith('_t')) return type.replace(/_t$/, ''); | ||
if (type.includes('long long')) { | ||
const size = sizes[type]; | ||
const signed = type.includes('unsigned'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const signed = type.includes('unsigned'); | |
const signed = StringPrototypeIncludes(type, 'unsigned'); |
lib = NULL; | ||
} | ||
const getReturnVal = getReader(ret); | ||
const argWriters = args.map(getWriter); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const argWriters = args.map(getWriter); | |
const argWriters = ArrayPrototypeMap(args, getWriter); |
const getReturnVal = getReader(ret); | ||
const argWriters = args.map(getWriter); | ||
ret = normalizeTypeToFfiTypes(ret); | ||
args = args.map(normalizeTypeToFfiTypes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
args = args.map(normalizeTypeToFfiTypes); | |
args = ArrayPrototypeMap(args, normalizeTypeToFfiTypes); |
throw new ERR_FFI_SYMBOL_NOT_FOUND(funcName, lib === null ? 'null' : lib); | ||
} | ||
|
||
const sig = new FfiSignature(funcPtr, types[ret], args.map((n) => types[n])); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const sig = new FfiSignature(funcPtr, types[ret], args.map((n) => types[n])); | |
const sig = new FfiSignature(funcPtr, types[ret], ArrayPrototypeMap(args, (n) => types[n])); |
This module is only available under the `node:` scheme. The following will not | ||
work: | ||
|
||
```mjs | ||
import ffi from 'ffi'; | ||
``` | ||
|
||
```cjs | ||
const ffi = require('ffi'); | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need this, and we should probably remove it from the node:test
docs as well.
@@ -0,0 +1,35 @@ | |||
'use strict'; | |||
const common = require('../common'); | |||
const assert = require('assert'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using node:
prefix to be consistant
const assert = require('assert'); | |
const assert = require('node:assert'); |
This issue/PR was marked as stalled, it will be automatically closed in 30 days. If it should remain open, please leave a comment explaining why it should remain open. |
Closing this because it has stalled. Feel free to reopen if this issue/PR is still relevant, or to ping the collaborator who labelled it stalled if you have any questions. |
@aduh95 @GeoffreyBooth @jasnell did this fall off the radar? A bot closing this PR seems like the wrong thing to happen here. |
@bengl we can add this behind a compile flag so it can be released as initial implementation and iterated on. It would be amazing to see it land |
@bengl do you think you will get back to this? If not, would you mind if someone else looked into it? |
Hey folks! Yes, it's been almost 2 years now, sorry about that. Life happened. I don't think I'll reasonably have time to tackle this one further. If others want to give it a go, please feel free! I'm happy to get on a Zoom call to discuss this with anyone, at length. For those wondering, the primary thing blocking this (apart from my time capacity) is that it depends on That's fine, but for the fact that Recently at NodeConfEU 2024, @marco-ippolito suggested (as indicated in the comment on Nov 2, 2024) to me that this could land behind a flag on just a few major platforms as experimental. I think that might work with the following caveats:
And all that said, it's totally possible that the approach I took in this PR is not the best. I modelled it after my attempt at a fast userland FFI library. The are other angles that could be taken here, like embedding a compiler, pre-compiling a bunch of common function signatures, using a different FFI library, or even (mad science hat on) leveraging V8's JIT compilation somehow. |
Here's a first pass at implementing a foreign function interface (FFI). It's roughly based on my attempt at building an FFI library in userland. That implementation uses
dyncall
, which doesn't support all the platforms that Node.js does, so I've usedlibffi
instead.As an example, this works on my PR branch.
Here is the new doc page for this.
Limitations
Open Questions
libffi
be updated/maintained?char *
arguments?