Feature Request: Provide machine readable API definitions with SDL3 #6337
Comments
Conceptually this is fine with me, as long as it doesn't decrease readability of the headers by end users. If it does, then I would suggest a separate API definition file that's machine readable. Can you give a sample of what a small header like SDL_sensor.h might look like? |
Just a heads-up: I'm working on that; it just happens that I'm at a conference right now. I will definitely post results next week. |
GNOME's GObject-Introspection is in the same general space as this, and GNOME-adjacent libraries use it to generate bindings, either at compile-time for compiled languages (Vala, C++, Rust) or at runtime for dynamic languages (Python, JavaScript, Perl). SDL probably can't usefully use GObject-Introspection directly, because GObject-Introspection is designed for GLib's object model, but it's worth looking at GObject-Introspection and seeing what sort of information they needed in order to autogenerate the bindings. It uses magic comments containing annotations; the most important one is usually Another very useful annotation is whether a
In GObject-Introspection, this distinction would be something like:
|
My proposal wouldn't go that far, and in particular wouldn't use C as the ground truth for the data. I can hopefully finish my example later today, as the above code doesn't contain even remotely enough data to generate nice Zig or C# code. Ownership transfer is a good point, though! |
One insight from GNOME which might be equally useful in SDL is that the most convenient API/ABI for C is not necessarily convenient for bindings. A reasonable number of API entry points in GLib/GTK end up having two versions: one that is convenient for C programmers and marked as not visible to bindings (for example using varargs), and one that is convenient for binding programmers but de-emphasized for C programmers (for example always using an (array,length) pair even if that's not the most natural C representation). Usually one of them calls the other internally, or they both call into a common internal implementation. |
@slouken: I created an example here: https://github.com/MasterQ32/SDL3-Api-Generator-Example It implements the minimal stuff needed to render the Sensor API to both Zig and C. The generated code is not yet at the level I want, but it's pretty close. One thing that's still missing is the ability to abstract something like function macros, which are not part of the linked API but of the compiled-in API. One cool thing that is possible: Important note: I chose Lua for the implementation just because it allows for a quick-and-dirty implementation. For an official API generator, I'd probably move to C, as that lets us remove dependencies.
That is true. I think we can model something like that.
This one is actually a pretty cool idea. Your comments aren't incorporated yet into the API generator/data format, but it should not be that hard. Array lengths are also a pretty cool annotation, would allow Zig users to use slices ( |
Maybe my binding generators are of interest to the discussion, outlined here: https://floooh.github.io/2020/08/23/sokol-bindgen.html
TL;DR: I'm running my C headers through clang ast-dump, parsing the resulting JSON output into a reduced 'intermediate JSON', and then generating language bindings from this (now automated via Github Actions: https://github.com/floooh/sokol/actions/runs/3122773475).
Depending on the target language I'm injecting special cases (e.g. helper functions like this: https://github.com/floooh/sokol-zig/blob/680d37ebcde09794e66380ff30867ca3dafb9f2f/src/sokol/gfx.zig#L4-L26). I think it's important to allow the final code generator to support special treatment for specific declarations; for instance, printf()-like functions with variable argument lists usually can't be mapped directly to the target language. For such 'complicated cases' I don't attempt to find a generic solution, but simply inject a manually written function (in some cases not even calling the original function, but 'emulating' it in the target language; for instance here's such a 'formatted print replacement': https://github.com/floooh/sokol-zig/blob/680d37ebcde09794e66380ff30867ca3dafb9f2f/src/sokol/debugtext.zig#L29-L50).
Clang ast-dump works ok for my case, because I can control the input C APIs (there's a blurb about "binding friendly APIs" in the blog post). The ast-dump output format isn't guaranteed to remain fixed, but so far (for just parsing declarations) it hasn't changed. A more robust solution is probably a "proper" tool based on libclang. In any case, here's all the python for the binding generation: https://github.com/floooh/sokol/blob/master/bindgen/ Start at
PS: the most 'interesting' problem seems to be "how to deal with strings". The currently supported languages can all consume zero-terminated C strings directly, and all language specific 'structs' directly map to their C counterparts (e.g. they are 'memory-layout-compatible').
For other languages this will be more tricky and may require a proper 'marshalling layer' between the target language and the C APIs. Hope this makes sense :) |
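The "reduced intermediate JSON" step described above can be sketched in Python. This is only an illustration, not the sokol bindgen code: the sample AST fragment is hand-written to mimic the general shape of clang's JSON AST dump (`kind`, `name`, `type.qualType`, `inner`), and the helper name `reduce_ast` and the output layout are assumptions made for this example.

```python
import json

# Hand-written fragment that mimics the shape of clang's JSON AST dump.
# Real output contains many more fields; this is an assumption-laden sample.
SAMPLE_AST = {
    "kind": "TranslationUnitDecl",
    "inner": [
        {
            "kind": "FunctionDecl",
            "name": "SDL_GetDisplays",
            "type": {"qualType": "SDL_DisplayID *(int *)"},
            "inner": [
                {"kind": "ParmVarDecl", "name": "count",
                 "type": {"qualType": "int *"}},
            ],
        },
        {"kind": "TypedefDecl", "name": "SDL_DisplayID",
         "type": {"qualType": "Uint32"}},
    ],
}


def reduce_ast(ast):
    """Keep only function declarations with their return and parameter types."""
    functions = []
    for node in ast.get("inner", []):
        if node.get("kind") != "FunctionDecl":
            continue
        # A function's qualType looks like "<return type>(<parameter types>)".
        # Splitting at the first "(" is naive and would break for functions
        # that return function pointers.
        return_type = node["type"]["qualType"].split("(", 1)[0].strip()
        params = [
            {"name": child.get("name", ""), "type": child["type"]["qualType"]}
            for child in node.get("inner", [])
            if child.get("kind") == "ParmVarDecl"
        ]
        functions.append(
            {"name": node["name"], "return": return_type, "params": params}
        )
    return functions


if __name__ == "__main__":
    print(json.dumps(reduce_ast(SAMPLE_AST), indent=2))
```

Note that the reduced form still carries only what C expresses: nothing here says whether `count` is an out-parameter or who frees the returned pointer.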
On the subject of strings, SDL2# ended up doing its own UTF8 marshaling: https://github.com/flibitijibibo/SDL2-CS/blob/master/src/SDL2.cs Aside from that we're pretty faithful to the original API, and it wouldn't be hard to annotate what type of string marshaling is necessary. Having a way to generate this would be nice to have, and after 10 years of maintaining SDL2# by hand I think we have enough information to automate this. |
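As a rough idea of what UTF-8 string marshaling means for a binding, here is a small Python `ctypes` sketch. It is not taken from SDL2#; the helper names `to_c_string` and `from_c_string` are hypothetical, and a generated binding would pick one of them based on a per-parameter string annotation.

```python
import ctypes
from typing import Optional


def to_c_string(text: str) -> bytes:
    """Encode a Python string as UTF-8 for a `const char *` parameter.

    ctypes adds the terminating NUL when bytes are passed as c_char_p.
    """
    return text.encode("utf-8")


def from_c_string(pointer: ctypes.c_char_p) -> Optional[str]:
    """Decode a NUL-terminated UTF-8 string, mapping NULL to None."""
    raw = pointer.value  # bytes up to the first NUL, or None for NULL
    return None if raw is None else raw.decode("utf-8")


if __name__ == "__main__":
    round_trip = from_c_string(ctypes.c_char_p(to_c_string("héllo")))
    print(round_trip)
```

This sketch ignores ownership: whether the caller must free a returned string is a separate piece of information the API description would have to carry.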
Speaking up as a Rust user of SDL2, and as someone that's made both hand-written and generator-written Rust bindings for SDL2 and GL, all of this is basically a good idea. I don't have too much to add at the moment in terms of what would help from a Rust perspective. The one thing would be that I'd like if function arguments in the machine readable definition always used integers of fixed sizes, rather than C's default numeric types that vary by platform. However, if this can't be done it's still basically fine. |
While I've not used the Rust bindings much, would it make sense to tweak SDL's API to make it more directly map to Rust? e.g., the Rust bindings make up the concept of a "Canvas" in SDL_Renderer, in order to have something with the right lifetime. (As well as things like a These of course aren't documented in SDL (other than the Rust bindings docs), and won't appear in any other SDL tutorials, etc. If we can find a closer match between SDL and what Rust needs, so the SDL bindings don't feel so much like a different library in places, I think that'd be much more pleasant to deal with on both sides. |
I actually have my own separate crates called |
Jumping in here as I have experimented with this problem from a different angle with C#, with some major pains and then some minor success. I have crossed friendly paths with @floooh for generating bindings in C# for I have documented all my knowledge / findings in the README and other documentation over at https://github.com/bottlenoselabs/c2cs. Any constructive corrections or call-outs are extremely welcome. I am probably on Mount Stupid. My auto-generated bindings for SDL can be found here: https://github.com/bottlenoselabs/SDL-cs. There are challenges with the SDL API which make automatic bindgen not so "friendly" when it comes to C#. I am free to discuss this in more detail, which is probably the most value I can bring to this discussion. I use the EDIT: |
The problem with this approach is that C sadly doesn't convey even remotely enough information to generate good APIs from. That's why I'm proposing a (not yet specified, but extensible) format to document all requirements of an API. For example |
I don't know Zig at all but could google that By the way: my LuaJIT SDL binding is at https://github.com/sonoro1234/LuaJIT-SDL2 |
Yes, that is correct. This conveys basically the following information:
Whereas
and
This means, we can translate a |
@MasterQ32 I agree with you; I have encountered this problem, and so have the Silk.NET folks and many others. There appears to be a need for some form of annotations which can be used to direct bindgen more accurately. Like @smcv mentioned earlier, the use of magic comments is one possible solution. This has advantages and disadvantages. What I have noticed in experimentation is that However, the path I'm choosing to go down myself is neither. I decided to just accept that C does not expose enough information. Instead of trying to add more information to C code (via magic comments or attributes), I'm using auxiliary code to direct bindgen using a plugin mechanism. This works well for my use case because I don't have control over For example, the pattern of |
Dear imgui apparently just released something like this, probably has a lot of work for C++ wrangling but still might be good for the other aspects of metadata generation: https://github.com/dearimgui/dear_bindings |
gendynapi parses all the SDL headers to generate the DYNAPI files, and it's been very easy to add a JSON dump of the whole SDL API, which can be useful for generating bindings. I know this is the inverse of using a "unique source" that generates the headers, but at least it can help to generate the "unique source" from all the headers, if that approach should be chosen. |
Yeah, this seems like a reasonable approach, we generate an API description from the header that can be marked up with more detail by people who are implementing language bindings. |
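One way to picture "generate a description from the header, then mark it up" is an overlay merge. The sketch below is hypothetical: the dictionary layout, the annotation keys (`return_ownership`, `direction`, `optional`) and the function `apply_annotations` are assumptions for illustration, not an existing SDL tool or format.

```python
import copy

# Hypothetical output of a header scan: only what C itself expresses.
GENERATED = {
    "SDL_GetDisplays": {
        "return": "SDL_DisplayID *",
        "params": [{"name": "count", "type": "int *"}],
    },
}

# Hypothetical hand-written markup maintained by binding authors.
ANNOTATIONS = {
    "SDL_GetDisplays": {
        "return_ownership": "caller_frees",
        "params": {"count": {"direction": "out", "optional": True}},
    },
}


def apply_annotations(generated, annotations):
    """Return a copy of the generated description enriched with markup."""
    merged = copy.deepcopy(generated)
    for func_name, extra in annotations.items():
        if func_name not in merged:
            continue  # markup for an unknown function is ignored in this sketch
        func = merged[func_name]
        for key, value in extra.items():
            if key == "params":
                # Parameter markup is matched to parameters by name.
                for param in func["params"]:
                    param.update(value.get(param["name"], {}))
            else:
                func[key] = value
    return merged


if __name__ == "__main__":
    print(apply_annotations(GENERATED, ANNOTATIONS))
```

Keeping the markup in a separate structure means the header scan can be re-run at any time without losing the hand-written details.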
At this point also it might be worth adding code to handle APIs that have been removed, or at least add a checklist that someone can check. It won't matter once we've finalized the ABI, but it might be useful now. |
the common lisp binding generation relies on c2ffi. i'm not sure how c2ffi relates to clang ast-dump, and what justifies its existence (because i don't know much about ast-dump). |
A very plain XML file might be best, like GL and Vulkan do. |
my strategy is that i have the generated API in one package. it only deals with the basics, like string conversions/encoding, error return codes thrown as exceptions, etc. whatever can be done based on the info formally encoded in the C model. then i have another package that is built on top of the generated one, and contains hand written "lispy" constructs that may use the full power of the host language. |
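The two-layer split described above can be sketched in Python. Everything here is a stand-in: `_native_init` and `_native_quit` fake the C library so the example runs without SDL, and the names `generated_init`, `generated_quit`, `session` and `BindingError` are hypothetical.

```python
from contextlib import contextmanager

NATIVE_CALLS = []  # records the fake native calls made by this sketch


class BindingError(Exception):
    """Raised by the generated layer when a native call reports failure."""


def _native_init(flags):
    """Stand-in for the C function: 0 on success, negative on failure."""
    NATIVE_CALLS.append(("init", flags))
    return 0 if flags else -1


def _native_quit():
    """Stand-in for the C cleanup function."""
    NATIVE_CALLS.append(("quit",))


# Generated layer: mechanical translation; error codes become exceptions.
def generated_init(flags):
    if _native_init(flags) < 0:
        raise BindingError("init failed")


def generated_quit():
    _native_quit()


# Hand-written layer: idiomatic constructs built on the generated one.
@contextmanager
def session(flags):
    generated_init(flags)
    try:
        yield
    finally:
        generated_quit()


if __name__ == "__main__":
    with session(1):
        pass
    print(NATIVE_CALLS)
```

Only the first layer needs the machine-readable description; the second layer is where each language community adds its own conventions.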
FTR, this is a related feature request: #2059 (typedef for error return codes). |
I really don't think it would, especially for the pixel format stuff. Also, for languages which don't just dump everything into one module, i.e. has clean separation of concerns, this isn't going to work. |
Even if the language can manage dumping it all into one file, having multi-thousand line source files makes rendered views of the file, such as github's source viewer, crawl and chug, particularly on mobile devices like phones. Just for being able to look something up it's nicer to keep files to, say, 1000 lines or less. |
Look at how I organised SDLAda, not events.events, that's a problem due to the package visibility rules, but the rest. |
Last year, I started a new tool called The tool is meant to model native APIs, and can be found here: https://github.com/MasterQ32/apigen I slowly started working on an SDL2 port into apigen here: apigen is also meant to be able to generate a JSON dump of the API information, so it's easily ingestible by other tools as well. My goal is also to eventually support versioned items, so you can have functions that were introduced in |
@madebr, just wanted to remind that gendynapi.py
... and it is required to work correctly since it creates the internal SDL dynapi files. This JSON file could be re-used to generate bindings or wiki. Of course gendynapi.py needs some evolution for that. (btw, a "wiki -> source code" notification, like automatic PR creation would be also a good thing).
Just tested, (it also requires touch |
In SDLAda, I define C-compatible types with valid ranges where possible; this adds extra type checking at the Ada language level. I would prefer that all data types have ranges specified in addition to sizes, so that the information can be used in languages that have that facility (admittedly not many). |
I believe it's considered "compatible" for a later version of SDL3 to add more values to an enum. E.g. 3.2 has an enum of 5 values; in 3.4 the enum might have a 6th value added. |
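For bindings, this means a generated enum type has to tolerate values it does not know yet. A minimal Python sketch, with a hypothetical enum (`ExampleState` is not an SDL type) and a hypothetical helper `decode_enum`:

```python
import enum


class ExampleState(enum.IntEnum):
    """Hypothetical enum as generated against an older header version."""
    UNKNOWN = 0
    IDLE = 1
    ACTIVE = 2


def decode_enum(enum_cls, value):
    """Return the matching member, or the raw integer for unknown values.

    A newer library version may return values the binding was not
    generated with, so an unknown value must not be treated as an error.
    """
    try:
        return enum_cls(value)
    except ValueError:
        return value


if __name__ == "__main__":
    print(decode_enum(ExampleState, 1))
    print(decode_enum(ExampleState, 3))
```

Languages with closed enums need an equivalent escape hatch, which is something an API description could flag per enum.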
While exploring what an SDL machine-readable API might look like, I created this. Are there any patterns that I forgot about, that are hard to express in bindings, that are a lot of work, or that are missing from the documentation altogether? As an example of extra information, |
The xml format looks really interesting. I think it would be beneficial to generate C headers from the xml format, so it can be compared to the actual headers.
A common pitfall of generating bindings for memory-safe languages is knowing if the returned pointer should be How would you handle something like https://wiki.libsdl.org/SDL3/SDL_GetDisplays? It returns a buffer and a count of elements. A python one would probably look like this (this code should be generated from the bindings XML):
def SDL_GetDisplays() -> list[SDL_DisplayID] | None:
    count = int()
    pointer = SDL_GetDisplays(count) # this calls the native function
    if pointer == NULL:
        return None
    ret = copy_to_new_list_or_whatever(pointer, count)
    SDL_free(pointer)
    return ret
(Please note that I have no idea how bindings in python work, but the above code should give you the general idea.) |
My simple python generator is currently very dumb, and requires you to use Python code to call SDL_GetDisplays without any wrapping:
Uint32 = ctypes.c_uint32
SDL_DisplayID = Uint32
# Get a list of currently connected displays.
SDL_GetDisplays = SDL_LIBRARY.SDL_GetDisplays
SDL_GetDisplays.restype = ctypes.POINTER(SDL_DisplayID)
SDL_GetDisplays.argtypes = [ctypes.POINTER(ctypes.c_int32)]
display_count = ctypes.c_int32()
displays = SDL_GetDisplays(ctypes.pointer(display_count))
for i in range(display_count.value):
    print(f"display [{i: 2d}] {displays[i]} name=\"{SDL_GetDisplayName(displays[i]).decode()}\"")
SDL_free(displays)
Are you asking for something similar to Microsoft's SAL?
extern DECLSPEC _Ret_writes_z_(*count) SDL_DisplayID *SDLCALL SDL_GetDisplays(_Out_opt_ int *count); |
My solution with
/// Get a list of currently connected displays.
/// Returns a 0 terminated array of display instance IDs which should be freed
/// with `SDL_free`, or `null` on error; call `SDL_GetError` for more details.
fn SDL_GetDisplays(
    /// a pointer filled in with the number of displays returned
    count: *c_int,
) ?[*:null]SDL_DisplayID;
Which models both the information that the result might be I can't model allocation information yet, but that might be possible to do something like
The type information can then be used to return something like an object with RAII for freeing, length and indexer in C++ for example. |
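The same idea (an owning object with length, indexing and deterministic freeing) can be sketched in Python with `ctypes`. The class `OwnedArray` is hypothetical; the free function is passed in so the example can run without SDL, whereas a generated binding would presumably pass the library's real deallocator.

```python
import ctypes


class OwnedArray:
    """Owns a C array returned by a native call and frees it exactly once."""

    def __init__(self, pointer, length, free_func):
        self._pointer = pointer
        self._length = length
        self._free = free_func
        self._closed = False

    def __len__(self):
        return self._length

    def __getitem__(self, index):
        if self._closed:
            raise ValueError("array was already freed")
        if not 0 <= index < self._length:
            raise IndexError(index)
        return self._pointer[index]

    def close(self):
        """Free the underlying buffer; repeated calls are harmless."""
        if not self._closed:
            self._closed = True
            self._free(self._pointer)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, traceback):
        self.close()


if __name__ == "__main__":
    # Fake "native" buffer so the sketch runs without any library loaded.
    backing = (ctypes.c_uint32 * 3)(10, 20, 30)
    pointer = ctypes.cast(backing, ctypes.POINTER(ctypes.c_uint32))
    freed = []
    with OwnedArray(pointer, 3, freed.append) as displays:
        print(len(displays), displays[1])
    print(len(freed))
```

The wrapper can only be generated if the description states both where the length comes from and that the caller owns the result.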
Honestly, some sort of json/toml format would be best, where types can be specified with ranges (for strongly typed languages):
interface = "SDL"
[[types]]
# <typename> = C type
init_flags = "int"
bitset = true # Values are used as bitsets.
[[values]]
INIT_TIMER = 0x0000_0001 # Name created by <interface>_<value>
# ...
[[functions]]
name = "Init" # Name created by <interface>_<name>
return = "int"
[[parameters]]
# <name> = <type>
flags = "init_flags"
Something like this, and have one per interface. Using a machine readable syntax means it can be generated and read easily by any language. In Ada, I separate out
interface = "SDL"
subinterface = "Video" |
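To show how the `<interface>_<name>` naming rule could drive C header generation, here is a small Python sketch. The description is written as a Python dict rather than parsed from TOML, its nesting is an assumption loosely based on the example above, and `emit_c` and `c_type_of` are hypothetical helpers.

```python
# Hypothetical in-memory form of one interface description.
DESCRIPTION = {
    "interface": "SDL",
    "types": {
        "init_flags": {
            "c_type": "int",
            "bitset": True,
            "values": {"INIT_TIMER": 0x00000001},
        },
    },
    "functions": [
        {"name": "Init", "return": "int", "parameters": {"flags": "init_flags"}},
    ],
}


def c_type_of(description, type_name):
    """Resolve a described type to its C type; unknown names pass through."""
    entry = description["types"].get(type_name)
    return entry["c_type"] if entry else type_name


def emit_c(description):
    """Emit C declarations, prefixing every name with the interface name."""
    prefix = description["interface"]
    lines = []
    for type_info in description["types"].values():
        for value_name, value in type_info.get("values", {}).items():
            lines.append(f"#define {prefix}_{value_name} 0x{value:08X}")
    for func in description["functions"]:
        params = ", ".join(
            f"{c_type_of(description, type_name)} {param_name}"
            for param_name, type_name in func["parameters"].items()
        )
        return_type = c_type_of(description, func["return"])
        lines.append(f"{return_type} {prefix}_{func['name']}({params or 'void'});")
    return lines


if __name__ == "__main__":
    print("\n".join(emit_c(DESCRIPTION)))
```

A generator for a strongly typed language would keep `init_flags` as its own type instead of flattening it to `int`, which is exactly the information a plain C header loses.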
But this should not just dump the pointer types, as they can be complex and the last thing we want is to force people to parse this stuff, i.e. |
It seems SDL is already using bits of SAL. Most of it is limited to These annotations get dumped by
{
    "comment": "",
    "header": "SDL_stdinc.h",
    "name": "SDL_wcslcpy",
    "parameter": [
        "SDL_OUT_Z_CAP (SDL_OUT_Z_CAP(maxlen) wchar_t *dst *REWRITE_NAME)wchar_t *dst",
        "const wchar_t *REWRITE_NAME",
        "size_t REWRITE_NAME"
    ],
    "parameter_name": [
        "param_name_not_specified",
        "src",
        "maxlen"
    ],
    "retval": "size_t"
}, |
@Susko3 That looks worse than C to parse. |
@Susko3 I suggest that maybe we could just fix the prototype, so that they look standard. About the readable API definitions:
those are my suggestions, but please double-check with @slouken |
This SAL stuff just looks like EXTRA COMPLICATIONS. The aim here should be that the SDL3 C headers can ALSO be generated from the description, and you really want something EASY to parse.
|
As stated by other folks, C does not provide enough information (especially regarding pointers) to make this possible. |
I think the most workable plan is to create a separate API definition file (XML? something else?) that lives in src/dynapi that contains more robust annotations for the functions. gendynapi.py could even warn if the file is missing annotations for any new SDL API function. |
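The suggested warning boils down to comparing two sets of names. A minimal Python sketch follows; `check_annotations` is a hypothetical helper and does not reflect how gendynapi.py is actually structured.

```python
def check_annotations(header_functions, annotations):
    """Return warning messages for mismatches between headers and markup.

    header_functions: names found by scanning the C headers.
    annotations: mapping from function name to its hand-written markup.
    """
    header_set = set(header_functions)
    annotated_set = set(annotations)
    messages = [
        f"warning: {name} has no annotations"
        for name in sorted(header_set - annotated_set)
    ]
    # Markup left behind for functions that no longer exist is reported too.
    messages += [
        f"warning: annotation for {name} matches no header function"
        for name in sorted(annotated_set - header_set)
    ]
    return messages


if __name__ == "__main__":
    found = ["SDL_GetDisplays", "SDL_GetDisplayName"]
    markup = {"SDL_GetDisplays": {}, "SDL_OldFunction": {}}
    for message in check_annotations(found, markup):
        print(message)
```

The second kind of message also covers APIs that were removed while the ABI is still in flux.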
God! Not XML, it's a pain to parse. Just look at Khronos' mess. |
I've written a parser for the GL and VK xml files more than once. It's really not bad at all. You can write it like once during an afternoon and then it'll just work until the shape of the xml itself updates. The only thing that makes it annoying is that GL and VK stick raw C code into parts of the xml, so anyone not using C has to try harder to interpret that part for their own language. As long as SDL doesn't try to put chunks of raw C into the xml it would be fine. |
If you have an XML library where you DON'T have to write a state machine parser around it, then fine, it's easy. Otherwise, it's not. Having had to do that for khronos' shit. Yes, I am correct there, they broke the number one rule in language agnostic IDL's, they embedded C macros and headers inside that xml. |
I just wrote the state machine of the tree into the call stack. Regardless of those details, we do seem to agree on how an XML version, if made, would need to be done to make it easy to use: no C code embedded in the XML. |
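To illustrate what "no C code embedded in the XML" could look like, here is a sketch using Python's standard `xml.etree.ElementTree`. The element and attribute names (`function`, `return`, `param`, `pointer`, `ownership`, ...) are invented for this example and are not a proposed SDL schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical description: every fact is an attribute, no C snippets.
SAMPLE_XML = """<api name="SDL">
  <function name="SDL_GetDisplays">
    <return type="SDL_DisplayID" pointer="many" nullable="true"
            ownership="caller_frees"/>
    <param name="count" type="int" pointer="one" direction="out"/>
  </function>
</api>"""


def load_functions(xml_text):
    """Read all functions into plain dictionaries keyed by function name."""
    root = ET.fromstring(xml_text)
    functions = {}
    for func in root.findall("function"):
        functions[func.get("name")] = {
            "return": dict(func.find("return").attrib),
            "params": [dict(param.attrib) for param in func.findall("param")],
        }
    return functions


if __name__ == "__main__":
    print(load_functions(SAMPLE_XML))
```

Because types are split into a base name plus attributes, no consumer has to parse a C declarator to find out what a pointer means.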
I would rather it was json, then it can be read in easily as a DB which can be queried easily. |
Json is the obvious one i think, godot does it that way for reference. |
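Treating a JSON dump as a small queryable database might look like the following Python sketch. The field names (`header`, `name`, `parameter`, `retval`) follow the dump shown earlier in this thread; the entries themselves are hand-written samples, and the two query helpers are hypothetical.

```python
import json

# Hand-written sample entries using the field names of the dump above.
API_DUMP = json.loads("""
[
  {"header": "SDL_video.h", "name": "SDL_GetDisplays",
   "parameter": ["int *count"], "retval": "SDL_DisplayID *"},
  {"header": "SDL_video.h", "name": "SDL_GetDisplayName",
   "parameter": ["SDL_DisplayID displayID"], "retval": "const char *"},
  {"header": "SDL_stdinc.h", "name": "SDL_free",
   "parameter": ["void *mem"], "retval": "void"}
]
""")


def functions_in_header(dump, header):
    """Names of all functions declared in the given header."""
    return [entry["name"] for entry in dump if entry["header"] == header]


def functions_returning_pointers(dump):
    """Names of functions whose return type ends in a pointer star."""
    return [
        entry["name"]
        for entry in dump
        if entry["retval"].rstrip().endswith("*")
    ]


if __name__ == "__main__":
    print(functions_in_header(API_DUMP, "SDL_video.h"))
    print(functions_returning_pointers(API_DUMP))
```

The second query is a string check on a C type, which hints at why structured type information would be preferable to raw declarator strings.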
Heya!
I’m the author of SDL.zig, an attempt to create a Zig binding for SDL2.
As auto-translating the headers does not convey enough information about the expected types, a lot of APIs are hand-adjusted to actually fit the intent of the SDL API. One example would be:
SDL_Color* colors
has to be translated to
colors: [*]SDL_Color
(pointer to many), and not
colors: *SDL_Color
(pointer to one).
Now with the beginning of SDL3 development:
Is the SDL project open to providing a machine-readable abstract definition of the SDL APIs that allows precise generation of C headers, Zig bindings and possibly other languages (C#, Rust, Nim, …), so there’s only one authoritative source for the APIs that conveys enough information to satisfy all target languages?
Regards
PS.:
I'm willing to spend time and effort on this, and am also happy to write both the generator and the definitions.
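The pointer-to-one versus pointer-to-many distinction is the kind of fact a generator needs as explicit data. A minimal Python sketch of emitting Zig parameter types from such data; the `kind` values and the helpers `zig_pointer` and `zig_param` are hypothetical, while `*T`, `[*]T`, `[*c]T` and the `?` optional prefix are regular Zig type syntax.

```python
# Maps an annotated pointer kind to the Zig pointer prefix.
ZIG_POINTER_PREFIX = {
    "one": "*",     # pointer to exactly one item
    "many": "[*]",  # pointer to an unknown number of items
    "c": "[*c]",    # fallback when the description says nothing
}


def zig_pointer(base_type, kind, nullable=False):
    """Build a Zig pointer type from annotated pointer information."""
    zig_type = ZIG_POINTER_PREFIX[kind] + base_type
    return "?" + zig_type if nullable else zig_type


def zig_param(name, base_type, kind, nullable=False):
    """Render a single Zig function parameter."""
    return f"{name}: {zig_pointer(base_type, kind, nullable)}"


if __name__ == "__main__":
    print(zig_param("colors", "SDL_Color", "many"))
    print(zig_param("colors", "SDL_Color", "one"))
```

Without the annotation, a header translator can only fall back to the ambiguous C pointer form, which is what makes hand-adjustment necessary today.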