-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime: provide Pinner API for object pinning #46787
Comments
From what I can tell from the documentation for the new Edit: Wait, that doesn't make sense. Could you just use // void doSomethingWithAPointer(int *a);
import "C"
func main() {
v := C.int(3)
h := cgo.NewHandle(&v)
doSomethingWithAPointer(&v) // Safe because the handle exists for that pointer.
h.Delete()
} Alternatively, if that's not feasible, what about a method on // Pointer returns a C pointer that points to the underlying value of the handle
// and is valid for the life of the handle.
func (h Handle) Pointer() C.uintptr_t Disclaimer: I'm not familiar enough with the internals of either the Go garbage collector or Cgo to know if either of these even make sense. |
@DeedleFake As you pointed out yourself, the |
An big advantage of the current cgo mechanisms, including It's hard to say more without a specific API to discuss. If you suggested one, my apologies for missing it. |
@ianlancetaylor thanks for taking the time to answer.
I agree, that is an advantage. However, with go routines it's trivial to fire-and-forget thousands of such function calls, that never return.
I didn't describe a specific API, that's true. I hoped that this could be developed here together once we agreed on the requirements. One of the requirements that I mentioned was, that the pinning happens only for the current scope. That implies automatic unpinning when the scope is left. Sorry that I didn't make that clear enough. So, to rephrase more compactly, the requirements would be:
As stated above, I didn't want to suggest a specific API, but characteristics of it. In the end it could be a function like func ReadFileIntoBufferArray(f *os.File, bufferArray [][]byte) int {
numberOfBuffers := len(bufferArray)
iovec := make([]C.struct_iovec, numberOfBuffers)
for i := range iovec {
bufferPtr := unsafe.Pointer(&bufferArray[i][0])
runtime.PtrEscapes(bufferPtr) // <- pins the pointer and makes it known to escape to C
iovec[i].iov_base = bufferPtr
iovec[i].iov_len = C.size_t(len(bufferArray[i]))
}
n := C.readv(C.int(f.Fd()), &iovec[0], C.int(numberOfBuffers))
// ^^^ cgocheck doesn't complain, because Go pointers in iovec are pinned
return int(n) // <- all pinned pointers in iovec are unpinned
} As long as the GC is not moving, Regarding footguns I'm pretty sure, that the workarounds, that have to be used at the moment to solve these problems, will cause more "programs to silently behave badly" than the potential abuse of a proper pinning method. |
Drawing from package runtime
// Pin prevents the object to which p points from being relocated until
// the returned PointerPin either is unpinned or becomes unreachable.
func Pin[T any](p *T) PointerPin
type PointerPin struct {…}
func (p PointerPin) Unpin() {} Then the example might look like: func ReadFileIntoBufferArray(f *os.File, bufferArray [][]byte) int {
numberOfBuffers := len(bufferArray)
iovec := make([]C.struct_iovec, numberOfBuffers)
for i := range iovec {
bufferPtr := unsafe.Pointer(&bufferArray[i][0])
defer runtime.Pin(bufferPtr).Unpin()
iovec[i].iov_base = bufferPtr
iovec[i].iov_len = C.size_t(len(bufferArray[i]))
}
n := C.readv(C.int(f.Fd()), &iovec[0], C.int(numberOfBuffers))
return int(n)
} A |
@ansiwen when you write "automatic unpinning when the current scope is left (the current function returns)" the current scope you refer to is the scope of the Go function correct? In your example that would be @bcmills version also looks very natural flowing to me, and in that version it's clear that the pointer would be pinned until the defer at the end of |
@phlogistonjohn Yes, exactly.
Yes, I also would prefer @bcmills version from a user's perspective, because it's more explicit and it's basically the same API that we use with PtrGuard. I just don't know enough about the implications on the implementation side and effects on the Go internals, so I don't know what API would be more feasible. My proposal is about providing an official way to solve the described problem. I really don't care so much about the "form", that is how exactly the API looks like. Whatever works best with the current Go and Cgo implementation. 😊 |
@bcmills I guess, an argument @ianlancetaylor might bring up against your API proposal is, that it would allow to store the |
So, if you want to enforce the unpinning, the only strict RAII pattern in Go that I could come up with is using a scoped constructor like this API: package runtime
// Pinner is the context for pinning pointers with Pin()
// can't be copied or constructed outside a Pinner scope
type Pinner struct {…}
// Pin prevents the object to which p points from being relocated until
// Pinner becomes invalid.
func (Pinner) Pin(p unsafe.Pointer) {...}
func WithPinner(func(Pinner)) {...} which would be used like this: func ReadFileIntoBufferArray(f *os.File, bufferArray [][]byte) int {
numberOfBuffers := len(bufferArray)
iovec := make([]C.struct_iovec, numberOfBuffers)
var n C.ssize_t
runtime.WithPinner(func (pinner runtime.Pinner) {
for i := range iovec {
bufferPtr := unsafe.Pointer(&bufferArray[i][0])
pinner.Pin(bufferPtr)
iovec[i].iov_base = bufferPtr
iovec[i].iov_len = C.size_t(len(bufferArray[i]))
}
n = C.readv(C.int(f.Fd()), &iovec[0], C.int(numberOfBuffers))
}) // <- All pinned pointers are released here and pinner is invalidated (in case it's copied out of scope).
return int(n)
} I personally would prefer a thinner API, where either it must be explicitly unpinned, like in the proposal of @bcmills, or - even better - the pinning implicitly creates a defer for the scope in which the pinning function has been called from. Given, that this will be implemented in the runtime package, I guess there are tricks and magic that can be used there. |
@ansiwen Even with the Personally, as long as the |
The "keeping-around" can easily be prevented by one pointer indirection that get's invalidated when the scope is left. You can have a look at my implementation of PtrGuard that even has test case for exactly the case of a scope escaping variable.
Yeah, I agree, as I wrote before, I'm totally fine with both. It's just something I came up with to address @ianlancetaylor's concerns. I also think that the risks are "manageable", there are all kinds of other risks when dealing with runtime and/or unsafe packages after all. |
I think that the API proposed by @bcmills is the most useful one. Although there is a risk of forgetting to unpin a pointer, once Go gets a moving garby collector, for certain low level uses, certain blocks of memory will have to stay pinned for the duration of the program. Certainly for system calls in Linux, such as for the frame buffers. In other words, Pin and Unpin are also useful without cgo. |
Hi @rsc, any updates on this issue recently? I noticed it has been several days after the 2021-08-04's review meeting minutes. |
The compiler/runtime team has been talking a bit about this but don't have any clear suggestions yet. The big problem with pinning is that if we ever want a moving garbage collector in the future, pins will make it much more complex. That's why we've avoided it so far. /cc @aclements |
@rsc But my point in the description was, that we have pinning already when C functions are called with Go pointers or when the |
@rsc I would say the converse is also true. If you are going to implement a moving garbage collector without support for pinning, that will make it much more complex to use Go for certain direct operating calls without cgo, e.g. on Linux. |
Unbounded pinning has the potential to be significantly worse than bounded pinning. If people accidentally or intentionally leave many pointers pinned, that can fragment the spaces that the GC uses, and make it very hard for a moving GC to make any progress at all. This can in principle happen with cgo today, but it is unlikely that many programs pass a bunch of pointers to a cgo function that never returns. When programmers control the pinning themselves, bugs are more likely. If the bug is in some imported third party library, the effect will be strange garbage collection behavior for the overall program. This will be hard to understand and hard to diagnose, and it will be hard to find the root cause. (One likely effect will be a set of tools similar to the memory profiler that track pinned pointers.) It's also worth noting that we don't have a moving garbage collector today, so any problems that pinned pointers may introduce for a moving garbage collector will not be seen today. So if we ever do introduce a moving garbage collector, we will have a flag day of hard-to-diagnose garbage collection problems. This will make it that much harder to ever change the garbage collector in practice. So I do not think the current situation is nearly as complex as the situation would be if we add unbounded pinning. This doesn't mean that we shouldn't add unbounded pinning. But I think that it does mean that the argument for it has to be something other than "we can already pin pointers today." |
@ianlancetaylor That is fair enough. But then it seems to me the best way ahead is to put this issue on hold until we can implement a prototype moving garbage collector. There is always a workaround if there is no pinning available and that is to manually allocate memory directly from the OS so the GC doesn't know about it. It is not ideal but it can work. |
Yeah, one workaround that is missing from the discussion is hiding the C api allocation concerns, e.g. iovec could be implemented like:
Or in other words, from the problem statement, it's unclear why it's required to use |
Let's separate the two questions "pinning yes/no" and "pinning bounded/unbounded". pinning yes/noI also proposed
Both provide a similar behaviour as the pinning bounded/unbounded
For me personally the first question is more important. Bounded or unbounded, I think the existing and required ways of pinning should be made less hacky in their usage.
The |
In some manner, yes.
A GC that is based on moving pointers is not the same as a GC that does not move pointers. A GC based on moving pointers may be completely blocked by a pinned pointer, whereas for a non-moving GC a pinned pointer is just the same as a live pointer.
Same answer. Again, all I am saying is that arguments based on "we already support pinned pointers, so it's OK to add more" are not good arguments. We need different arguments. |
How would we deal with the
I'm afraid that would badly impact the GC latency or something else if it is true. Please consider the disk i/o syscall that may block a very long time. |
Since you agreed that the pinning is required in the answer before, I don't understand how such an implementation could be used in Go.
I don't think "add more" is the right wording. It's more about exposing the pinning in a better way. And these are not arguments for doing it, but arguments against the supposed risks of doing it. The argument for doing it should be clear by now: give people a zero-copy way to use APIs like In your answers you skipped the first part about the bounded pinning. If you have the time to comment on these too, I would be very interested. 😊 |
The current system for pinning pointers doesn't permit pointers to be pinned indefinitely, if we discount the unusual case of a C function that does not return. I agree that other systems that somehow ensure that pointers can't be pinned indefinitely are better. (I don't think that an implicit |
@aclements thanks for your answer. Maybe there is a misunderstanding? My suggestion would have identical efficiency, IIUC, because both our proposals would use the bitmap for the "isPinned()" question. The difference would only be, if the pin counters are stored in the specials list, or in another data structure like a map, which is only accessed by Pin/Unpin functions. So efficiency regarding the gocheck code would be the same for bitmap+specials and bitmap+map. Right?
My proposal would be exactly the same, only that it stores the counter in a map, not in the specials list. If there is a good reason, I can do the specials-list approach, but I though a map-based solution would be less error prone.
That sounds good.
Right.
Yeah, I agree. So the last open question is then: bitmap+(counter in mutex+map) or bitmap+(counter in specials-list). As I said before, I think mutex+map is less error prone and the performance difference for Pin/Unpin should not matter. But I'm happy to be convinced otherwise and go for the specials-list. Thanks |
So, I went ahead and implemented multi-pinning with both approaches, a version with a global mutexed map, and then a version based on specials. Both seem to work fine, so I pushed the specials-based version, because that apeared to be @aclements preference. I still have several questions about some code which I added as comments in the latest patch. PTAL: https://go-review.googlesource.com/c/go/+/367296 |
You mentioned being unable to use |
No, a |
Here are some benchmarks. Certainly they are not representative for real-world-code, but there are some differences. As expected the specials implementation has a higher baseline, but becomes more efficient, when there are multi-pinnings, especially parallel ones. Overall the difference is not big, and for the "common" use cases the map solution is even a bit better. However, the specials solution could be further tuned for the common cases. specials:
map:
|
Updated version beats the maps version in almost all scenarios: specials (improved):
|
Hey, sorry for the delay. I'm looking at your CL now but I'm having difficulty verifying that CL against the set of semantics that was actually accepted as part of this proposal. This is mostly because I'm trying to piece it together from this GitHub issue, which isn't going super well. 😅 If they are written down and I just missed it, please point me at it! In the meantime, here's what the CL's semantics are, AFAICT:
Does this sound right? Am I missing anything? Thanks. EDIT: I updated this on 2023-05-17 to clarify a couple points. |
Change https://go.dev/cl/496193 mentions this issue: |
The new Pinner API's implementation imposes some partial-orders that are safe but previously did not exist between a mspanSpecial, mheapSpecial, and mheap. Fix that up in the lock ranking. For #46787. Change-Id: I51cc8f7f069240caeb44d749bed43515634f4814 Reviewed-on: https://go-review.googlesource.com/c/go/+/496193 Run-TryBot: Michael Knyszek <mknyszek@google.com> Reviewed-by: David Chase <drchase@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Auto-Submit: Michael Knyszek <mknyszek@google.com>
Change https://go.dev/cl/497615 mentions this issue: |
This change caches the *pinner on the P to pool it and reduce the chance that a new allocation is made. It also makes the *pinner no longer drop its refs array on unpin, also to avoid reallocating. The Pinner benchmark results before and after this CL are attached at the bottom of the commit message. Note that these results are biased toward the current change because of the last two benchmark changes. Reusing the pinner in the benchmark itself achieves similar performance before this change. The benchmark results thus basically just confirm that this change does cache the inner pinner in a useful way. Using the previous benchmarks there's actually a slight regression from the extra check in the cache, however the long pole is still setPinned itself. name old time/op new time/op delta PinnerPinUnpinBatch-8 42.2µs ± 2% 41.5µs ± 1% ~ (p=0.056 n=5+5) PinnerPinUnpinBatchDouble-8 367µs ± 1% 350µs ± 1% -4.67% (p=0.008 n=5+5) PinnerPinUnpinBatchTiny-8 108µs ± 0% 102µs ± 1% -6.22% (p=0.008 n=5+5) PinnerPinUnpin-8 592ns ± 8% 40ns ± 1% -93.29% (p=0.008 n=5+5) PinnerPinUnpinTiny-8 693ns ± 9% 39ns ± 1% -94.31% (p=0.008 n=5+5) PinnerPinUnpinDouble-8 843ns ± 5% 124ns ± 3% -85.24% (p=0.008 n=5+5) PinnerPinUnpinParallel-8 1.11µs ± 5% 0.00µs ± 0% -99.55% (p=0.008 n=5+5) PinnerPinUnpinParallelTiny-8 1.12µs ± 8% 0.00µs ± 1% -99.55% (p=0.008 n=5+5) PinnerPinUnpinParallelDouble-8 1.79µs ± 4% 0.58µs ± 6% -67.36% (p=0.008 n=5+5) PinnerIsPinnedOnPinned-8 5.78ns ± 0% 5.80ns ± 1% ~ (p=0.548 n=5+5) PinnerIsPinnedOnUnpinned-8 4.99ns ± 1% 4.98ns ± 0% ~ (p=0.841 n=5+5) PinnerIsPinnedOnPinnedParallel-8 0.71ns ± 0% 0.71ns ± 0% ~ (p=0.175 n=5+5) PinnerIsPinnedOnUnpinnedParallel-8 0.67ns ± 1% 0.66ns ± 0% ~ (p=0.167 n=5+5) name old alloc/op new alloc/op delta PinnerPinUnpinBatch-8 20.1kB ± 0% 20.0kB ± 0% -0.32% (p=0.008 n=5+5) PinnerPinUnpinBatchDouble-8 52.7kB ± 0% 52.7kB ± 0% -0.12% (p=0.008 n=5+5) PinnerPinUnpinBatchTiny-8 20.1kB ± 0% 20.0kB ± 0% -0.32% (p=0.008 n=5+5) PinnerPinUnpin-8 64.0B ± 0% 0.0B -100.00% (p=0.008 n=5+5) PinnerPinUnpinTiny-8 64.0B ± 0% 0.0B -100.00% (p=0.008 n=5+5) PinnerPinUnpinDouble-8 64.0B ± 0% 0.0B -100.00% (p=0.008 n=5+5) PinnerPinUnpinParallel-8 64.0B ± 0% 0.0B -100.00% (p=0.008 n=5+5) PinnerPinUnpinParallelTiny-8 64.0B ± 0% 0.0B -100.00% (p=0.008 n=5+5) PinnerPinUnpinParallelDouble-8 64.0B ± 0% 0.0B -100.00% (p=0.008 n=5+5) PinnerIsPinnedOnPinned-8 0.00B 0.00B ~ (all equal) PinnerIsPinnedOnUnpinned-8 0.00B 0.00B ~ (all equal) PinnerIsPinnedOnPinnedParallel-8 0.00B 0.00B ~ (all equal) PinnerIsPinnedOnUnpinnedParallel-8 0.00B 0.00B ~ (all equal) name old allocs/op new allocs/op delta PinnerPinUnpinBatch-8 9.00 ± 0% 8.00 ± 0% -11.11% (p=0.008 n=5+5) PinnerPinUnpinBatchDouble-8 11.0 ± 0% 10.0 ± 0% -9.09% (p=0.008 n=5+5) PinnerPinUnpinBatchTiny-8 9.00 ± 0% 8.00 ± 0% -11.11% (p=0.008 n=5+5) PinnerPinUnpin-8 1.00 ± 0% 0.00 -100.00% (p=0.008 n=5+5) PinnerPinUnpinTiny-8 1.00 ± 0% 0.00 -100.00% (p=0.008 n=5+5) PinnerPinUnpinDouble-8 1.00 ± 0% 0.00 -100.00% (p=0.008 n=5+5) PinnerPinUnpinParallel-8 1.00 ± 0% 0.00 -100.00% (p=0.008 n=5+5) PinnerPinUnpinParallelTiny-8 1.00 ± 0% 0.00 -100.00% (p=0.008 n=5+5) PinnerPinUnpinParallelDouble-8 1.00 ± 0% 0.00 -100.00% (p=0.008 n=5+5) PinnerIsPinnedOnPinned-8 0.00 0.00 ~ (all equal) PinnerIsPinnedOnUnpinned-8 0.00 0.00 ~ (all equal) PinnerIsPinnedOnPinnedParallel-8 0.00 0.00 ~ (all equal) PinnerIsPinnedOnUnpinnedParallel-8 0.00 0.00 ~ (all equal) For #46787. Change-Id: I0cdfad77b189c425868944a4faeff3d5b97417b9 Reviewed-on: https://go-review.googlesource.com/c/go/+/497615 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Michael Knyszek <mknyszek@google.com> Reviewed-by: Ansiwen <ansiwen@gmail.com> TryBot-Result: Gopher Robot <gobot@golang.org> Auto-Submit: Michael Knyszek <mknyszek@google.com>
Change https://go.dev/cl/498116 mentions this issue: |
With the introduction of runtime.Pinner, we need to update the cgo pointer passing rules to accomodate the new functionality. These rule changes are easier to describe if the rest of the pointer passing rules are described in terms of pinning as well (Go memory is implicitly pinned when a pointer to it is passed to a C function, and implicitly unpinned when that function returns). For #46787. Change-Id: I263f03412bc9165f19c9ada72fb005ed0483a8ee Reviewed-on: https://go-review.googlesource.com/c/go/+/498116 Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Auto-Submit: Michael Knyszek <mknyszek@google.com>
Change https://go.dev/cl/527702 mentions this issue: |
With the introduction of runtime.Pinner, returning a pointer to a pinned struct that then points to an unpinned Go pointer is correctly caught. However, the error message remained as "cgo result has Go pointer", which should be updated to acknowledge that Go pointers to pinned memory are allowed. This also updates the comments for cgoCheckArg and cgoCheckResult to similarly clarify. Updates #46787 Change-Id: I147bb09e87dfb70a24d6d43e4cf84e8bcc2aff48 GitHub-Last-Rev: 706facb GitHub-Pull-Request: #62606 Reviewed-on: https://go-review.googlesource.com/c/go/+/527702 Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Heschi Kreinick <heschi@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com>
Update, 2021-10-20: the latest proposal is the API in #46787 (comment).
Problem Statement
The pointer passing rules state:
and
There are C APIs, most notably the
iovec
based ones for vectored I/O which expect an array of structs that describe buffers to read to or write from. The naive approach would be to allocate both the array and the buffers withC.malloc()
and then either work on the C buffers directly or copy the content to Go buffers. In the case of Go bindings for a C API, which is assumably the most common use case for Cgo, the users of the bindings shouldn't have to deal with C types, which means that all data has to be copied into Go allocated buffers. This of course impairs the performance, especially for larger buffers. Therefore it would be desirable to have a safe possibility to let the C API write directly into the Go buffers. This, however, is not possible becauseObviously, what is missing is a safe way to pin an arbitrary number of Go pointers in order to store them in C memory or in passed-to-C Go memory for the duration of a C call.
Workarounds
Break the rules and store the Go pointer in C memory
(click)
with something like
but
GODEBUG=cgocheck=2
would catch that.However, you can circumvent cgocheck=2 with this casting trick:
This might work, as long as the GC is not moving the pointers, which might be a fact as of now, but is not guaranteed.
Break the rules and hide the Go pointer in Go memory
(click)
with something like
Again: This might work, as long as the GC is not moving the pointers.
GODEBUG=cgocheck=2
wouldn't complain about this.Break the rules and temporarily disable
cgocheck
(click)
If hiding the Go pointer as a uintptr like in the last workaround is not possible, passing Go memory that contains Go pointers usually bails out because of the default
cgocheck=1
setting. It is possible to disable temporarilycgocheck
during a C call, which can especially useful, when the pointer have been "pinned" with one of the later workarounds. For example the_cgoCheckPtr()
function, that is used in the generated Cgo code, can be shadowed in the local scope, which disables the check for the following C calls in the scope:Maybe slightly more robust, is to export the runtime.dbgvars list:
Use a C function to store the Go pointer in C memory
(click)
The rules allow that a C function stores a Go pointer in C memory for the duration of the call. So, for each Go pointer a C function can be called in a Go routine, that stores the Go pointer in C memory and then calls a Go function callback that waits for a release signal. After the release signal is received, the Go callback returns to the C function, the C function clears the C memory from the Go pointer, and returns as well, finishing the Go routine.
This approach fully complies with the rules, but is quite expensive, because each Go routine that calls a C function creates a new thread, that means one thread per stored Go pointer.
Use the
//go:uintptrescapes
compiler directive(click)
//go:uintptrescapes
is a compiler directive thatSo, similar to the workaround before, a Go function with this directive can be called in a Go routine, which simply waits for a release signal. When the signal is received, the function returns and sets the pointer free.
This seems already almost like a proper solution, so that I implemented a package with this approach, that allows to
Pin()
a Go pointer andPoke()
it into C memory: PtrGuardBut there are still caveats. The compiler and the runtime (cgocheck=2) don't seem to know about which pointers are protected by the directive, because they still don't allow to pass Go memory containing these Go pointers to a C function, or to store the pointers in C memory. Therefore the two first workarounds are additionally necessary. Also there is the small overhead for the Go routine and the release signalling.
Proposal
It would make Cgo a lot more usable for C APIs with more complex pointer handling like
iovec
, if there would be a programmatic way to provide what//go:uintptrescapes
provides already through the backdoor. There should be a possibility to pin an arbitrary amount of Go pointers in the current scope, so that they are allowed to be stored in C memory or be contained in Go memory that is passed to a C function within this scope, for example with aruntime.PtrEscapes()
function. It's cumbersome, that it's required to abuse Go routines, channels and casting tricks in order provide bindings to such C APIs. As long as the Go GC is not moving pointers, it could be a trivial implementation, but it would encapsulate this knowledge and would give users a guarantee.I know from the other issues and discussions around this topic that it's seen as dangerous if it is possible to pin an arbitrary amount of pointers. But
//go:uintptrescapes
functions, therefore it is also possible to pin arbitrary amount of Go pointers already.Related issues: #32115, #40431
/cc @ianlancetaylor @rsc @seebs
edit: the first workaround had an incorrect statement.
edit 2: add workarounds for disabling cgocheck
The text was updated successfully, but these errors were encountered: