Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bring autoload, project and static variable access speeds inline with script/function #8234

Open
elvisish opened this issue Oct 23, 2023 · 6 comments

Comments

@elvisish
Copy link

Describe the project you are working on

Benchmarking tool.

Describe the problem or limitation you are having in your project

The speed of accessing variables via autoloads, project settings and static variables is significantly slower than script or function variables, it would be beneficial if performance could be improved or documented that these methods of variable access were inherently slower.

Describe the feature / enhancement and how it helps to overcome the problem or limitation

I would imagine GDScript 2.0 would make it easier to improve speed of variable access.

Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams

I wrote a simple benchmarker to test the speed to access and change variables in different ways:
ElvisishBenchmarker-main.zip
https://github.com/elvisish/ElvisishBenchmarker

If this enhancement will not be used often, can it be worked around with a few lines of script?

I think it would be engine work.

Is there a reason why this should be core and not an add-on in the asset library?

I think it would be engine work.

@dalexeev
Copy link
Member

Test
@tool
extends EditorScript

static var static_var := 0
var member_var := 0

func _run() -> void:
    const N := 10_000_000

    var local_var := 0

    var start := Time.get_ticks_msec()
    for _i in N:
        static_var += 1
    prints("Static:", Time.get_ticks_msec() - start)

    start = Time.get_ticks_msec()
    for _i in N:
        member_var += 1
    prints("Member:", Time.get_ticks_msec() - start)

    start = Time.get_ticks_msec()
    for _i in N:
        local_var += 1
    prints("Local:", Time.get_ticks_msec() - start)
Static: 474
Member: 183
Local: 184

Accessing static variables is slower because additional opcodes are used to fetch from / store to a stack cell:

5: line 7:     static_var += 1
7: get_static_variable stack(5) = script(test.gd)["static_var"]
11: validated operator stack(4) = stack(5) + const(1)
16: assign stack(5) = stack(4)
19: set_static_variable script(test.gd)["static_var"] = stack(5)

23: line 8:    member_var += 1
25: validated operator stack(5) = member(member_var) + const(1)
30: assign member(member_var) = stack(5)

33: line 9:    local_var += 1
35: validated operator stack(4) = stack(3) + const(1)
40: assign stack(3) = stack(4)

See godotengine/godot#77129 (comment).

@Mickeon
Copy link

Mickeon commented Oct 24, 2023

Oh wow. I though the difference was going to be a bunch of milliseconds, but the difference here is staggering. The proposal could benefit from having some benchmark numbers upfront

@dalexeev
Copy link
Member

This is not a very representative test. If you measure read and write separately, the difference is less stunning. In real projects, this is not as critical as it seems.

Test write
@tool
extends EditorScript

static var static_var := 0
var member_var := 0

func _run() -> void:
    const N := 10_000_000

    var local_var := 0

    var start := Time.get_ticks_msec()
    for i in N:
        static_var = i
    prints("Static:", Time.get_ticks_msec() - start)

    start = Time.get_ticks_msec()
    for i in N:
        member_var = i
    prints("Member:", Time.get_ticks_msec() - start)

    start = Time.get_ticks_msec()
    for i in N:
        local_var = i
    prints("Local:", Time.get_ticks_msec() - start)
Static: 293
Member: 142
Local: 141
Test read
@tool
extends EditorScript

static var static_var := 0
var member_var := 0

func _run() -> void:
    const N := 10_000_000

    var local_var := 0
    var temp := 0

    var start := Time.get_ticks_msec()
    for i in N:
        temp = static_var
    prints("Static:", Time.get_ticks_msec() - start)

    start = Time.get_ticks_msec()
    for i in N:
        temp = member_var
    prints("Member:", Time.get_ticks_msec() - start)

    start = Time.get_ticks_msec()
    for i in N:
        temp = local_var
    prints("Local:", Time.get_ticks_msec() - start)
Static: 280
Member: 145
Local: 146

Compared to calling a GDScript function, this is a relatively cheap operation. For example, if in the test from my previous comment we replace += 1 with += get_one(), then we will get:

Test with GDScript function call
@tool
extends EditorScript

static var static_var := 0
var member_var := 0

func get_one() -> int:
    return 1

func _run() -> void:
    const N := 10_000_000

    var local_var := 0

    var start := Time.get_ticks_msec()
    for _i in N:
        static_var += get_one()
    prints("Static:", Time.get_ticks_msec() - start)

    start = Time.get_ticks_msec()
    for _i in N:
        member_var += get_one()
    prints("Member:", Time.get_ticks_msec() - start)

    start = Time.get_ticks_msec()
    for _i in N:
        local_var += get_one()
    prints("Local:", Time.get_ticks_msec() - start)
Static: 1504
Member: 1231
Local: 1225

I have a few ideas on how to replace separated opcodes with addressing mode for static variables, but I'm afraid that this will slow down access to non-static and local variables. It is probably better to leave static variables less optimized, since they are usually used less frequently than others.

@elvisish
Copy link
Author

This is not a very representative test.

It's a matched read-and-write test, while nobody would be doing a million of these per frame (as this test is) some users may use ProjectSettings or Autoloads to store variables that would require checking every frame, such as a mouse sensitivity multiplier or even a walk speed, and there would be no reason to use a ProjectSettings global var, Autoload var or even static variable over a function or script variable for this as it would offer virtually no benefit (other than a minor convenience of not having to set a walk speed or mouse sensitivity setting on the player or the enemies each time a level is loaded) compared to definite worse performance.

There is often talk of premature optimisation thrown around for these kinds of tests and comparisons, but really it just encourages good groundwork practices (such as using typed variables and functions in GDScript 2.0 and caching node references). This could easily be considered premature optimisation but it's actually just good sense, and while none of this is going to make or break performance, why leave all optimising to the end of development when you can stack minor performance increases along the way.

If these alternative ways of getting and setting variables can't be improved, it should probably be documented that the other ways are technically faster and how they should be used (infrequently to reduce code reuse rather than frequently to reduce writing extra code).

@Calinou
Copy link
Member

Calinou commented Oct 26, 2023

Project settings should definitely not be read every frame. It's relatively slow and there is no type safety (it also relies on string keys). The engine tries to avoid this pattern as much as possible, preferring the use of a settings_changed signal to update an internal variable on changes.

Reading autoload variables every frame is fine, but they don't have to be static in this case.

@elvisish
Copy link
Author

elvisish commented Oct 26, 2023

Project settings should definitely not be read every frame. It's relatively slow and there is no type safety (it also relies on string keys). The engine tries to avoid this pattern as much as possible, preferring the use of a settings_changed signal to update an internal variable on changes.

That's good to know, I've just never seen it documented anywhere (I've actually seen a couple of assets that directly read settings like jump_height, walk_speed, mouse_sens, etc from project settings, which was one of the reasons I made this benchmarker).

Reading autoload variables every frame is fine, but they don't have to be static in this case.

But reading autoload variables every frame are still slower than script/function vars, it would always make more sense to cache them locally from the autoload to a secondary variable on the same script.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants