Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose local RenderingDevice creation to RenderingServer #42130

Merged
merged 1 commit into from
Sep 23, 2020

Conversation

clayjohn
Copy link
Member

Marked as draft because I don't think that this is the way that @reduz wants to expose RenderingDevice. Also I am not sure why this works, I feel like it shouldnt.

Anyway, I am opening this PR in case others want to play around with compute shaders from GDscript.

Here is a minimal example of a compute shader.

in a GDScript _ready() function

var shader_file = load("res://compute.glsl")
var shader_bytecode = shader_file.get_bytecode()
var rd = RenderingServer.create_local_rendering_device()
var shader = rd.shader_create(shader_bytecode)
var pipeline = rd.compute_pipeline_create(shader)

var storage_buffer = rd.storage_buffer_create(64)
var u = RDUniform.new()
u.type = RenderingDevice.UNIFORM_TYPE_STORAGE_BUFFER
u.binding = 0 
u.add_id(storage_buffer)
var uniform_set = rd.uniform_set_create([u], shader, 0)
	
var compute_list = rd.compute_list_begin()
rd.compute_list_bind_compute_pipeline(compute_list, pipeline)
rd.compute_list_bind_uniform_set(compute_list, uniform_set, 0)
rd.compute_list_dispatch(compute_list, 8, 1, 1)
rd.compute_list_end()

In a file named "compute.glsl"

#[compute]

#version 450

layout(local_size_x = 8) in;

layout(set = 0, binding = 0, std430) restrict buffer ColorBuffer {
  float data[];
}
color_buffer;

void main() { 
    color_buffer.data[gl_GlobalInvocationID.x] = 1.0; 
}

@clayjohn clayjohn requested a review from reduz September 17, 2020 04:16
@clayjohn clayjohn added this to the 4.0 milestone Sep 17, 2020
@clayjohn clayjohn added bug and removed discussion labels Sep 17, 2020
@clayjohn clayjohn marked this pull request as ready for review September 17, 2020 19:04
@akien-mga akien-mga merged commit 7b3759a into godotengine:master Sep 23, 2020
@akien-mga
Copy link
Member

Thanks!

@clayjohn clayjohn deleted the RenderingDeviceHack branch February 12, 2021 16:34
@ChristopheClaustre
Copy link
Contributor

ChristopheClaustre commented Jul 12, 2021

Hi fellow devs.

Maybe I am missing something but I think this is broken right now (don't know if I should answer here or open an issue).

I added this after your _ready() example:

var data = rd.buffer_get_data(storage_buffer)
print("size:" + str(data.size()))
print("data[0]:" + str(data[0]))
print("data[63]:" + str(data[63]))

Output:

size:64
data[0]:0
data[63]:0

If I debug, all the 64 elements of the PackedByteArray are filled with zeros.
I tried to change the size of storage_buffer and it did change the size of my PackedByteArray but still filled with zeros.

@clayjohn
Copy link
Member Author

@ChristopheClaustre

I can see 2 potential issues:

  1. The GPU may not have finished writing the data before you read it back. You may need to add some idle time in between when you submit the GPU command and read it. Remember, GPU and CPU are async, so you can't count on a result being available immediately
  2. GDScript has received a lot of changes and I'm not sure what the implicit byte -> float conversion looks like. If this is the problem, you can try using integers in the shader to see if the conversion goes smoother.

@ChristopheClaustre
Copy link
Contributor

For the GPU and CPU async, I have the code on my PC and the rd.compute_list_end() seams to wait for GPU (here).

I will try to wait before fetching results and to use integers.
I will keep you informed.

@ChristopheClaustre
Copy link
Contributor

Some news : It works !

First, I was having trouble with the godotscript byte decoding API (reading the C++ directly was simpler).
Then, you were right about timing, waiting one frame before fetching seams mandatory. I suppose that Godot submit all the orders to Vulkan at the end of the frame and so I can't get my result earlier (what about the barrier ?). If I am right, I feel it must not work that way for compute shader but Godot must have its reasons (alpha ?).

Thank you for your time @clayjohn.

@clayjohn
Copy link
Member Author

@ChristopheClaustre Great to hear!

Indeed, it is still difficult to deal with bytes directly in GDScript, I think there have been some improvements recently, but we will need to do more work to make it easy.

Regarding the frame timing. The GPU and CPU run out of sync with one another. This means that the CPU code will continue running without waiting for the GPU code to finish. This is one of the benefits of using compute shaders, but it means you need to be careful about when you retrieve your data from the GPU.

In the future, we will be adding documentation to make this clear, and we may add more to the API to do things like force the CPU to stall until data is returned.

@tito91
Copy link

tito91 commented Sep 12, 2021

Hello!
I'm trying to figure out why my output is wrong while running this example. I'm using most current sources. Only changes I have made was adapting to API changes and replacing get_bytecode() with get_spirv() and shader_create with shader_create_from_spirv. I've also changed u.type with u.uniform_type. Now the problem is, my output of 64 bytes is filled with repeating pattern being four values: [0, 0, 128, 63]. When debugging C++, I can see the same exact values on C++ side, so it's not any conversion issue, I guess. What are my options now, can I use RenderDoc on exported project to debug this further?

@clayjohn
Copy link
Member Author

@tito91 it sounds like you need to convert groups of 4 bytes into a float! One way to do that is use the function to_float32_array on your packedbytearray

@tito91
Copy link

tito91 commented Sep 13, 2021

@clayjohn Oh wow, that's a bummer. It works like a charm now. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants