Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix CanvasItem notification thread guard #80752

Merged
merged 1 commit into from
Oct 3, 2023

Conversation

bitsawer
Copy link
Member

Seems like simple fix, most other nodes like Node2D and Node3D also use ERR_THREAD_GUARD instead of ERR_MAIN_THREAD_GUARD in their notification callback. Looks like CanvasItem should also use ERR_THREAD_GUARD unless there is some specific reason why CanvasItem can't be used with subthread groups.

@bitsawer bitsawer added this to the 4.2 milestone Aug 18, 2023
@bitsawer bitsawer requested a review from a team as a code owner August 18, 2023 10:39
@akien-mga
Copy link
Member

It looks to me like this was done like this on purpose, not as a typo. See 0a9f72d where both Control and CanvasItem were made to use ERR_MAIN_THREAD_GUARD, with exceptions made explicit:

@@ -363,6 +379,7 @@ void CanvasItem::_window_visibility_changed() {
 }
 
 void CanvasItem::queue_redraw() {
+       ERR_THREAD_GUARD; // Calling from thread is safe.
        if (!is_inside_tree()) {
                return;
        }

So this should be discussed with @reduz.

@akien-mga akien-mga requested a review from reduz August 18, 2023 10:44
@RandomShaper
Copy link
Member

Given that functions like Node::add_child() use ERR_THREAD_GUARD, it seems that the design involves the ability to deal with the scene tree from threaded processing. It should be fine as long as the nodes involved that need to update something belong in the same group.

Therefore, my strong guess is that ERR_THREAD_GUARD is be fine here. Still, I'd like to be very safe and hear @reduz here.

@RandomShaper
Copy link
Member

Maybe this is a problem of granularity. CanvasItem::_notification() deals with stuff meant for the main thread (like visibility) and other that can be used from threads (like scene tree management). The safe thing was to use the most restrictive thread guard, but a possible better approach would be to use different kinds of guards depending on the specific notification; e.g., use ERR_THREAD_GUARD for NOTIFICATION_ENTER/EXIT_TREE, and ERR_MAIN_THREAD_GUARD for NOTIFICATION_VISIBILITY_CHANGED.

@reduz
Copy link
Member

reduz commented Sep 8, 2023

@akien-mga @bitsawer This is intended, GUI code (control derived) can't use thread groups. I wrote and explained this in detail in the guidelines regarding thread groups (they should probably be made into documentation I guess).
As such this PR should be closed.

@RandomShaper The guard you are mentioned in Node::add_child seems to be a mistake, the actual check is the line above. Same as in remove_child and a few others there. I think I added those before ERR_MAIN_THREAD_GUARD existed and forgot, so should probably be cleaned up.

@reduz
Copy link
Member

reduz commented Sep 8, 2023

Maybe the get_configuration_warning should be updated in Control to detect a thread group has been assigned and return an error.

@bitsawer
Copy link
Member Author

bitsawer commented Sep 8, 2023

I'm just making sure, but as Node2D inherits from CanvasItem, doesn't the current behavior basically mean that CanvasItem, Node2D or anything that inherits from them can't be used with thread groups? Currently, attaching a script with _process() causes a errors when attached to Node2D but not with Node3D if used with subthread groups.

I'm also wondering because Node2D has less strict _notification() thread guard (ERR_THREAD_GUARD) than its superclass CanvasItem (which uses ERR_MAIN_THREAD_GUARD), which feels weird. Control has ERR_MAIN_THREAD_GUARD which does make sense.

@reduz
Copy link
Member

reduz commented Sep 27, 2023

I am sorry, I misread the fix. This is perfect.

@RandomShaper
Copy link
Member

RandomShaper commented Sep 27, 2023

Are you sure that's the right type of guard for every possible notification? Quoting myself:

[...] a possible better approach would be to use different kinds of guards depending on the specific notification; e.g., use ERR_THREAD_GUARD for NOTIFICATION_ENTER/EXIT_TREE, and ERR_MAIN_THREAD_GUARD for NOTIFICATION_VISIBILITY_CHANGED.

Just to be sure.

Elaborating: CanvasItem::set_visibility() has ERR_MAIN_THREAD_GUARD, which is an indicator that dealing with visibility is not supported on threads. Therefore, NOTIFICATION_VISIBILITY_CHANGED shouldn't use a less restrictive guard.

@akien-mga
Copy link
Member

Reply from production chat:

<reduz> what pedro suggests makes sense but should probably done more fine grained in another PR I think
<Akien> Well I think bitsawer could do it if there's consensus that it's relevant. If Pedro is right, this PR might introduce issues where e.g. NOTIFICATION_VISIBLE is no longer safe for Controls.

So your call guys :)

@bitsawer
Copy link
Member Author

I think this PR is still worth merging, because currently using any 2D nodes with scripts using _process() is not really possible at all with thread groups (or at least easily), even if you know all threading rules. This PR at least makes it possible to do something until a better solution is implemented.

I also think that the best long-term solution will be to analyze each case and add more fine-grained access checks as already mentioned by RandomShaper. Ideally, the _notification() callbacks might not even need any access guard macros at the base level, each data access could in theory have its own check to make sure they have optimal granularity. Of course, this might be a pain to implement in practice, so some generalization might be unavoidable.

@RandomShaper
Copy link
Member

I quite don't see why not apply what I'm suggesting right in this PR and so have both the functionality and safety benefits. For me, that'd be the ground on top of which we could build something better. That said, I wouldn't oppose to it being merged, but I may do a follow-up PR with my take... 😀

@bitsawer
Copy link
Member Author

Oh, I kind of missed that you were suggesting some additions to this PR, makes sense if you want to add something here. I think a more extensive follow up to add or tweak thread guards with some more thought is a good idea anyway, but I can add stuff here if you want to, feel free to hold my hand here.

I guess you for example suggested ERR_THREAD_GUARD for CanvasItem NOTIFICATION_ENTER/EXIT_TREE, but doesn't basically all scene tree related manipulation require ERR_MAIN_THREAD_GUARD? I'm not all that familiar with this system yet, so you might be better at figuring out the level of checks needed.

@RandomShaper
Copy link
Member

Scene tree manipulation methods (e.g., Node::add_child()) are guarded by ERR_THREAD_GUARD, so they are not meant to be used during threaded processing (by design; maybe technically "overcomable," but would surely make the implementation more difficult). The same holds for visibility manipulation. That's why I think this kind of guard should be used for the corresponding notifications, while one allowing threaded processing should be used in those notifications that can be received as a result of legal operations during threaded processing.

However, I've just realized there's more to it. Neither CanvasItem nor Node2D have non-read-only methods that can be used during threaded processing. As a result, we'd indeed put a ERR_THREAD_GUARD in their notification callback and call it a day. That would of course bring back the issue you found. The problem is that they can be getting notifications that are harmless during threaded processing since they are just ignoring them. Because of that, my current take is to use ERR_THREAD_GUARD on each case of the switch, but not in the scope of the function to let other notifications flow without interruption. If another class down in the hierarchy happens to handle one of those, it's up to that class to decide the safety level and therefore decorate the relevant case with the appropriate guard.

@akien-mga
Copy link
Member

So what's the consensus? The merge window for this kind of "risky" change for 4.2 will soon close as we enter the beta phase and try to play it safe.

@RandomShaper
Copy link
Member

Summarizing my take:

My current take is to use ERR_THREAD_GUARD on each case of the switch, but not in the scope of the function, to let other notifications flow without interruption, plus moving to a per-switch scheme as well in any other existing notification callbacks using thread guards.

Do @bitsawer and @reduz agree?

@bitsawer
Copy link
Member Author

bitsawer commented Oct 2, 2023

Adding ERR_THREAD_GUARD to each CanvasItem::_notification() switch case and removing the current ERR_MAIN_THREAD_GUARD from the base function scope makes sense to me. We can probably fine tune them later if we need to.

Should I do the same with Node2D::_notification() or should we leave it alone for now? It currently matches Node3D::_notification() as they both have ERR_THREAD_GUARD in function base scope.

I can go through all existing _notification() callbacks that have existing thread guards and do the same. Fortunately, there doesn't seem to be too many of them yet, looks its basically only CanvasItem, Node2D and Node3D that currently have these thread checks in their notification callback.

@RandomShaper
Copy link
Member

I can go through all existing _notification() callbacks that have existing thread guards and do the same. Fortunately, there doesn't seem to be too many of them yet, looks its basically only CanvasItem, Node2D and Node3D that currently have these thread checks in their notification callback.

Oh, you've already done. You're my hero!

But I think the right kind of guard would be ERR_MAIN_THREAD_GUARD since all the scene tree management and visibility changing functionality is not meant to be dealt with from threads. NOTIFICATION_TRANSFORM_CHANGED is the only one that I believe warrants ERR_THREAD_GUARD.

@bitsawer
Copy link
Member Author

bitsawer commented Oct 2, 2023

Updated most guards to use ERR_MAIN_THREAD_GUARD and pushed changes, except for Node3D NOTIFICATION_TRANSFORM_CHANGED. I played a bit safe not to change the existing logic much, but I guess we might as well go all in. What do you think about CanvasItem NOTIFICATION_VISIBILITY_CHANGED, in theory ERR_THREAD_GUARD might also be enough for it?

edit:

Slightly off topic, but I think currently some guards might also trigger for example when loading scenes using ResourceLoader.load_threaded_request(). This might be tricky if it creates or uses thread you can't control and call Thread.set_thread_safety_checks_enabled(false), for example in #81472. I wonder we should also call set_thread_safety_checks_enabled(false) in resource loading threads, or shuold we add some logic to handle these cases...

@RandomShaper
Copy link
Member

I think it's perfect now.

What do you think about CanvasItem NOTIFICATION_VISIBILITY_CHANGED, in theory ERR_THREAD_GUARD might also be enough for it?

I think so. CanvasItem::set_visibility is guarded by ERR_MAIN_THREAD_GUARD, which hints that visibility is not meant to be dealt with during threaded processing. It would be hard to enable such a thing, and it's probably not worth it. After all, if you really need to touch some node's visibility during threaded processing you can always use call/set_thread_safe() for it to happen when it's safe afterwards.

@akien-mga akien-mga merged commit 3a990e3 into godotengine:master Oct 3, 2023
15 checks passed
@akien-mga
Copy link
Member

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Node thread group: _notification can only be accessed from main thread error spam
4 participants