Feature Request: Provide machine readable API definitions with SDL3 #6337

Open
ikskuh opened this issue Oct 5, 2022 · 66 comments
@ikskuh

ikskuh commented Oct 5, 2022

Heya!

I’m the author of SDL.zig, an attempt to create a Zig binding for SDL2.

As auto-translating the headers does not convey enough information about the expected types, a lot of APIs are hand-adjusted to actually fit the intent of the SDL api. One example would be: SDL_Color* colors has to be translated to colors: [*]SDL_Color (pointer to many), and not colors: *SDL_Color (pointer to one).

Now with the beginning of SDL3 development:
Is the SDL project open to providing a machine-readable, abstract definition of the SDL APIs that allows precise generation of C headers, Zig bindings, and possibly bindings for other languages (C#, Rust, Nim, …), so that there is only one authoritative source for the APIs that conveys enough information to satisfy all target languages?
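
To make the pointer-to-one versus pointer-to-many distinction concrete, here is a minimal Python sketch of a definition that carries this extra bit of information and renders it for C and Zig. The `Param` record and its `cardinality` field are hypothetical names chosen for illustration; they are not part of any proposed format.

```python
from dataclasses import dataclass


@dataclass
class Param:
    name: str
    base_type: str    # element type, e.g. "SDL_Color"
    cardinality: str  # hypothetical annotation: "one" or "many"


def to_c(p: Param) -> str:
    # C cannot express the difference: both cases are a plain pointer.
    return f"{p.base_type} *{p.name}"


def to_zig(p: Param) -> str:
    # Zig distinguishes single-item (*T) from many-item ([*]T) pointers.
    prefix = {"one": "*", "many": "[*]"}[p.cardinality]
    return f"{p.name}: {prefix}{p.base_type}"


colors = Param("colors", "SDL_Color", "many")
print(to_c(colors))
print(to_zig(colors))
```

The same record produces an identical C declaration for both cardinalities, which is exactly the information loss described above.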

Regards

- xq

PS.:
I'm willing to spend time and effort on this, and I'm also happy to write both the generator and the definitions.

slouken added this to the 3.0 milestone Oct 5, 2022
@slouken
Collaborator

slouken commented Oct 5, 2022

Conceptually this is fine with me, as long as it doesn't decrease readability of the headers by end users. If it does, then I would suggest a separate API definition file that's machine readable.

Can you give a sample of what a small header like SDL_sensor.h might look like?

@ikskuh
Author

ikskuh commented Oct 10, 2022

Can you give a sample of what a small header like SDL_sensor.h might look like?

Just a heads up: I'm working on that, it just happens that I'm at a conference right now. I will definitely post results next week.

@smcv
Contributor

smcv commented Oct 12, 2022

GNOME's GObject-Introspection is in the same general space as this, and GNOME-adjacent libraries use it to generate bindings, either at compile-time for compiled languages (Vala, C++, Rust) or at runtime for dynamic languages (Python, JavaScript, Perl).

SDL probably can't usefully use GObject-Introspection directly, because GObject-Introspection is designed for GLib's object model, but it's worth looking at GObject-Introspection and seeing what sort of information they needed in order to autogenerate the bindings. It uses magic comments containing annotations; the most important one is usually transfer, which marks whether ownership is transferred between caller and callee.

Another very useful annotation is whether a char * is UTF-8 (like in GTK widgets), the OS's unspecified string encoding (like in Unix filenames and environment variables), or binary data (like in memmove()).

One example would be: SDL_Color* colors has to be translated to colors: [*]SDL_Color (pointer to many), and not colors: *SDL_Color (pointer to one).

In GObject-Introspection, this distinction would be something like:

/**
 * @colors: (array length=n_colors): the palette
 *
 * Set a palette of variable size that is passed as a pointer to (the first element of) an array.
 */
void Example_SetPalette(Picture *self, SDL_Color *colors, size_t n_colors);

/**
 * @colors: (array fixed-size=16): pointer to exactly 16 colors
 *
 * Set a palette of fixed size that is passed as a pointer to (the first element of) an array.
 */
void Example_SetVgaPalette(Picture *self, SDL_Color *colors);

/**
 * @which: an index within the palette
 * @color: (in) (transfer none): the color
 *
 * Change one member of the palette by copying the given color, which is passed by reference.
 */
void Example_SetPaletteEntry(Picture *self, int which, SDL_Color *color);

/**
 * @which: an index within the palette
 * @color: (out caller-allocates): the color
 *
 * Get one member of the palette and store it by overwriting the contents of a struct that is passed by reference.
 */
void Example_GetColorByIndex(Picture *self, int which, SDL_Color *color_out);
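
As a rough illustration of how a binding generator could consume such magic comments, the following Python sketch pulls the parameter name, the parenthesized annotations, and the description out of each `@param` line. It is a simplified reading of the comment style shown above, not the actual GObject-Introspection scanner, and `parse_param_line` is a hypothetical helper.

```python
import re

# "* @name: (annotation) (annotation): description"
PARAM_RE = re.compile(r"^\s*\*\s*@(\w+):\s*((?:\([^)]*\)\s*)*)(?::\s*)?(.*)$")


def parse_param_line(line):
    """Return the parsed parameter line, or None if it is not one."""
    m = PARAM_RE.match(line)
    if m is None:
        return None
    name, raw_annotations, description = m.groups()
    annotations = re.findall(r"\(([^)]*)\)", raw_annotations)
    return {"name": name, "annotations": annotations, "description": description}


COMMENT = """/**
 * @which: an index within the palette
 * @color: (in) (transfer none): the color
 *
 * Change one member of the palette by copying the given color.
 */"""

params = [p for p in map(parse_param_line, COMMENT.splitlines()) if p]
for p in params:
    print(p)
```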

@ikskuh
Author

ikskuh commented Oct 12, 2022

My proposal wouldn't go that far, and in particular wouldn't use C as the ground truth for the data. Hopefully I can finish my example later today, as the above code doesn't contain even remotely enough data to generate nice Zig or C# code. Ownership transfer is a good point, though!

@smcv
Contributor

smcv commented Oct 12, 2022

One insight from GNOME which might be equally useful in SDL is that the most convenient API/ABI for C is not necessarily convenient for bindings. A reasonable number of API entry points in GLib/GTK end up having two versions: one that is convenient for C programmers and marked as not visible to bindings (for example using varargs), and one that is convenient for binding programmers but de-emphasized for C programmers (for example always using an (array,length) pair even if that's not the most natural C representation). Usually one of them calls the other internally, or they both call into a common internal implementation.

@ikskuh
Author

ikskuh commented Oct 12, 2022

@slouken: I created an example here: https://github.com/MasterQ32/SDL3-Api-Generator-Example

It implements the minimal stuff to render the Sensor API to both Zig and C. The generated code is not yet at the level that I want to generate, but it's pretty close.

One thing that's still missing is the ability to abstract something like function macros, which are not part of the linked API, but of the compiled-in API.

One cool thing that is possible:
The api generator can later parse the documentation comments and translate them into the language specific documentation format, which means everyone will get nice code comments in their IDE

Important note: I chose Lua for the implementation just because it allows for a quick-and-dirty implementation. For an official API generator, I'd probably move to C, as that lets us remove dependencies.

@smcv:

One insight from GNOME which might be equally useful in SDL is that the most convenient API/ABI for C is not necessarily convenient for bindings.

That is true. I think we can model something like that.

Another very useful annotation is whether a char * is UTF-8 (like in GTK widgets), the OS's unspecified string encoding (like in Unix filenames and environment variables), or binary data (like in memmove()).

This one is actually a pretty cool idea. Your comments aren't incorporated yet into the API generator/data format, but it should not be that hard. Array lengths are also a pretty cool annotation, would allow Zig users to use slices ([]T, a pointer + length type) in the exposed API, and the C api is hidden from the user.

@floooh

floooh commented Oct 14, 2022

Maybe my binding generators are of interest to the discussion, outlined here:

https://floooh.github.io/2020/08/23/sokol-bindgen.html

TL;DR: I'm running my C headers through clang ast-dump, parsing the resulting JSON output into a reduced 'intermediate JSON', and then generating language bindings from this (now automated via GitHub Actions: https://github.com/floooh/sokol/actions/runs/3122773475).

Depending on the target language I'm injecting special cases (e.g. helper functions like this: https://github.com/floooh/sokol-zig/blob/680d37ebcde09794e66380ff30867ca3dafb9f2f/src/sokol/gfx.zig#L4-L26). I think it's important that the final code generator can apply special treatment to specific declarations; for instance, printf()-like functions with variable argument lists usually can't be mapped directly to the target language. For such 'complicated cases' I don't attempt to find a generic solution, but simply inject a manually written function (in some cases not even calling the original function, but 'emulating' it in the target language; for instance, here's such a 'formatted print replacement': https://github.com/floooh/sokol-zig/blob/680d37ebcde09794e66380ff30867ca3dafb9f2f/src/sokol/debugtext.zig#L29-L50 ).

Clang ast-dump works ok for my case, because I can control the input C APIs (there's a blurb about "binding friendly APIs" in the blog post). The ast-dump output format isn't guaranteed to remain fixed, but so far (for just parsing declarations) it hasn't changed.

A more robust solution is probably a "proper" tool based on libclang.

In any case, here's all the python for the binding generation:

https://github.com/floooh/sokol/blob/master/bindgen/

Start at gen_all.py, then look at gen_ir.py (takes the verbose output of clang ast-dump and turns it into a much simplified JSON), and then gen_zig.py, gen_nim.py and gen_odin.py which take the intermediate JSON and generate the bindings.
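
The reduction step can be sketched in a few lines of Python. This is not the code from gen_ir.py; it is a simplified illustration that walks a hand-written sample shaped like clang's JSON AST dump (nodes with `kind`, `name`, `type.qualType` and `inner`) and keeps only prefixed function declarations. The `Example_` names and the output layout are assumptions made for this example.

```python
import json

# Hand-written sample in the shape of clang's JSON AST dump (heavily trimmed).
SAMPLE_AST = json.loads("""
{
  "kind": "TranslationUnitDecl",
  "inner": [
    {"kind": "TypedefDecl", "name": "size_t", "type": {"qualType": "unsigned long"}},
    {"kind": "FunctionDecl", "name": "Example_GetName",
     "type": {"qualType": "const char *(Example_Item *)"},
     "inner": [
       {"kind": "ParmVarDecl", "name": "item", "type": {"qualType": "Example_Item *"}}
     ]}
  ]
}
""")


def reduce_ast(ast, prefix):
    """Keep only prefixed function declarations, in a much smaller form."""
    decls = []
    for node in ast.get("inner", []):
        if node.get("kind") != "FunctionDecl":
            continue
        if not node.get("name", "").startswith(prefix):
            continue
        qual = node["type"]["qualType"]
        decls.append({
            "kind": "func",
            "name": node["name"],
            # Text before the first "(" is the return type; functions that
            # return function pointers would need real parsing.
            "return": qual[: qual.index("(")].strip(),
            "params": [
                {"name": p.get("name", ""), "type": p["type"]["qualType"]}
                for p in node.get("inner", [])
                if p.get("kind") == "ParmVarDecl"
            ],
        })
    return decls


print(json.dumps(reduce_ast(SAMPLE_AST, "Example_"), indent=2))
```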

PS: the most 'interesting' problem seems to be "how to deal with strings". The currently supported languages can all consume zero-terminated C strings directly, and all language-specific 'structs' map directly to their C counterparts (e.g. they are 'memory-layout-compatible'). For other languages this will be more tricky and may require a proper 'marshalling layer' between the target language and the C APIs.

Hope this makes sense :)

@flibitijibibo
Collaborator

On the subject of strings, SDL2# ended up doing its own UTF8 marshaling:

https://github.com/flibitijibibo/SDL2-CS/blob/master/src/SDL2.cs

Aside from that we're pretty faithful to the original API, and it wouldn't be hard to annotate what type of string marshaling is necessary. Having a way to generate this would be nice to have, and after 10 years of maintaining SDL2# by hand I think we have enough information to automate this.

@Lokathor
Contributor

Speaking up as a Rust user of SDL2, and as someone that's made both hand-written and generator-written Rust bindings for SDL2 and GL, all of this is basically a good idea.

I don't have too much to add at the moment in terms of what would help from a Rust perspective. The one thing would be that I'd like it if function arguments in the machine readable definition always used integers of fixed sizes, rather than C's default numeric types that vary by platform. However, if this can't be done it's still basically fine.

@sulix
Contributor

sulix commented Nov 22, 2022

While I've not used the Rust bindings much, would it make sense to tweak SDL's API to make it more directly map to Rust?

e.g., the Rust bindings make up the concept of a "Canvas" in SDL_Renderer, in order to have something with the right lifetime. (As well as things like a TextureCreator?)

These of course aren't documented in SDL (other than the Rust bindings docs), and won't appear in any other SDL tutorials, etc. If we can find a closer match between SDL and what Rust needs, so the SDL bindings don't feel so much like a different library in places, I think that'd be much more pleasant to deal with on both sides.

@Lokathor
Contributor

I actually have my own separate crates called fermium (raw bindings) and beryllium (rust-friendly wrappers). I've never looked too closely at what the sdl2 crate is doing or what any of their internal logic for stuff is.

@lithiumtoast

lithiumtoast commented Nov 23, 2022

Jumping in here as I have experimented with this problem from a different angle with C# with some major pains and then some minor success. I have crossed friendly paths with @floooh for generating bindings in C# for sokol using libclang.

I have documented all my knowledge and findings in the README and other documentation over at https://github.com/bottlenoselabs/c2cs. Any constructive corrections or call-outs are extremely welcome. I am probably on Mount Stupid.

My auto-generated bindings for SDL can be found here: https://github.com/bottlenoselabs/SDL-cs. There are challenges with the SDL API which make automatic bindgen not so "friendly" when it comes to C#. I am free to discuss this in more detail, which is probably the most value I can bring to this discussion.

I use the c2cs tool I created to automatically generate the C# bindings for FNA C dependencies for my fork of FNA called Katabasis; I sponsor @flibitijibibo. The purpose of this fork is to expand my own curiosity for the XNA/MonoGame APIs in a way that is organic and makes sense (I have a strong love-hate relationship with Microsoft).

EDIT:
I forgot to mention what's interesting about my solution is that I use libclang to extract a minimal necessary .json Abstract Syntax Tree for purposes of generating C# code. Technically speaking this .json file could also be used to generate code for Python or other languages but I have not experimented down this path due to my limited time.

@ikskuh
Author

ikskuh commented Nov 23, 2022

I forgot to mention what's interesting about my solution is that I use libclang to extract a minimal necessary .json Abstract Syntax Tree for purposes of generating C# code. Technically speaking this .json file could also be used to generate code for Python or other languages but I have not experimented down this path due to my limited time.

The problem with this approach is that C sadly doesn't convey even remotely enough information to generate good APIs from. That's why i'm proposing a (not yet specified, but extensible) format to document all requirements to an API. For example char * foo in C doesn't say if i can pass NULL or not. It also doesn't say if the pointer is NUL terminated or if it expects only a single char or a fixed number of them. If we can express this information in a file and generate the code from there, we can create way better bindings for most languages (Consider C# ref Point vs Point[] in marshalling)

@sonoro1234

sonoro1234 commented Nov 23, 2022

One example would be: SDL_Color* colors has to be translated to colors: [*]SDL_Color (pointer to many), and not colors: *SDL_Color (pointer to one).

I don't know Zig at all, but from googling it seems that colors: [*c]SDL_Color can be used for automated translation (although it is as unsafe as C code is).

By the way: my LuaJIT SDL binding in https://github.com/sonoro1234/LuaJIT-SDL2

@ikskuh
Author

ikskuh commented Nov 23, 2022

I don't know Zig at all, but from googling it seems that colors: [*c]SDL_Color can be used for automated translation (although it is as unsafe as C code is).

Yes, that is correct. This conveys basically the following information:

  • This pointer can optionally be NULL
  • This pointer can point to a single item
  • This pointer can point to many items
  • This pointer can point to a sequence terminated by a NUL element

Whereas *SDL_Color conveys this information:

  • The pointer cannot be NULL
  • The pointer points to a single element

and [*]SDL_Color conveys:

  • The pointer cannot be NULL
  • The pointer points to an unknown/externally defined number of elements, ranging from 0 ... (limit-1)

This means we can translate a *SDL_Color to a C# ref SDL_Color or out SDL_Color parameter, whereas [*]SDL_Color can be translated to SDL_Color[], at least in a marshalling context.
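
The mapping described above can be sketched as a small Python function that classifies a Zig pointer type by its prefix and picks a C# parameter form. The `IntPtr` fallback for nullable and `[*c]` pointers is an assumption of this sketch, as are the helper names; a real generator would also have to choose between `ref` and `out`.

```python
def describe_pointer(zig_type):
    """Classify a Zig pointer type string by its prefix."""
    nullable = zig_type.startswith("?")
    body = zig_type[1:] if nullable else zig_type
    if body.startswith("[*c]"):
        # C pointer: may be NULL, and the element count is unknown.
        return {"elem": body[4:], "nullable": True, "many": None}
    if body.startswith("[*]"):
        return {"elem": body[3:], "nullable": nullable, "many": True}
    if body.startswith("*"):
        return {"elem": body[1:], "nullable": nullable, "many": False}
    raise ValueError(f"not a pointer type: {zig_type}")


def to_csharp(zig_type, name):
    info = describe_pointer(zig_type)
    if info["many"] is True:
        return f"{info['elem']}[] {name}"
    if info["many"] is False and not info["nullable"]:
        return f"ref {info['elem']} {name}"
    # Assumption: fall back to a raw pointer when nothing more is known.
    return f"IntPtr {name}"


for t in ("*SDL_Color", "[*]SDL_Color", "[*c]SDL_Color"):
    print(t, "->", to_csharp(t, "colors"))
```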

@lithiumtoast

lithiumtoast commented Nov 23, 2022

@MasterQ32 I agree with you; I have encountered this problem, and so have the Silk.NET folks and many others. There appears to be a need for some form of annotations which can be used to direct bindgen more accurately.

Like @smcv mentioned earlier, the use of magic comments is one possible solution. This has advantages and disadvantages.

What I have noticed in experimentation is that libclang exposes getting any Clang attributes for a cursor. Another path forward is to direct bindgen using Clang attributes.

However, the path I'm choosing to go down myself is neither. I decided to just accept that C just does not expose enough information. Instead of trying to add more information to C code (via magic comments or attributes), I'm using auxiliary code to direct bindgen using a plugin mechanism. This works well for my use case because I don't have control over SDL, or sokol, or flecs, etc.

For example, the pattern of SDL_Color* being an array; that can be transformed appropriately to C# via auxiliary code in the form of a plugin. In your other example, of ref Point vs Point[], this pattern would also be handled by auxiliary code in the form of a plugin. Side note: using Point[] would probably not be the best idea and Span<Point> would probably be a better fit; something which I already do for fixed buffers.

@flibitijibibo
Collaborator

Dear imgui apparently just released something like this, probably has a lot of work for C++ wrangling but still might be good for the other aspects of metadata generation: https://github.com/dearimgui/dear_bindings

@1bsyl
Contributor

1bsyl commented Dec 7, 2022

gendynapi parses all the SDL headers to generate the DYNAPI files.
I've tried a rewrite in Python to fix some bugs and make improvements (#6783).

It has been very easy to add a JSON dump of the whole SDL API, which can be useful for generating bindings.
Of course, the extra tags you would need for allocation/pointers are missing,
but these should be easy to parse once they are added and specified.

I know this is the inverse of using a "unique source" that generates the headers. But at least it can help to generate the "unique source" from all the headers, if that approach is chosen.

@slouken
Collaborator

slouken commented Dec 9, 2022

Yeah, this seems like a reasonable approach, we generate an API description from the header that can be marked up with more detail by people who are implementing language bindings.

@slouken
Collaborator

slouken commented Dec 9, 2022

At this point also it might be worth adding code to handle APIs that have been removed, or at least add a checklist that someone can check. It won't matter once we've finalized the ABI, but it might be useful now.

@attila-lendvai

The ast-dump output format isn't guaranteed to remain fixed, but so far (for just parsing declarations) it hasn't changed.

A more robust solution is probably a "proper" tool based on libclang.

the common lisp binding generation relies on c2ffi.

i'm not sure how c2ffi relates to clang ast-dump, and what justifies its existence (because i don't know much about ast-dump).

@Lokathor
Contributor

A very plain XML file might be best, like GL and Vulkan do.

@attila-lendvai

The problem with this approach is that C sadly doesn't convey even remotely enough information to generate good APIs from.

my strategy is that i have the generated API in one package. it only deals with the basics, like string conversions/encoding, error return codes thrown as exceptions, etc. whatever can be done based on the info formally encoded in the C model.

then i have another package that is built on top of the generated one, and contains hand written "lispy" constructs that may use the full power of the host language.

@attila-lendvai

attila-lendvai commented Dec 24, 2022

FTR, this is a related feature request: #2059 (typedef for error return codes).

@Lucretia

Lucretia commented Mar 19, 2024

@madebr's proposal may make that unnecessary however, so maybe that's worth pursuing?

I really don't think it would, especially for the pixel format stuff.

Also, for languages which don't just dump everything into one module, i.e. which have a clean separation of concerns, this isn't going to work.

@Lokathor
Contributor

Even if the language can manage dumping it all into one file, having multi-thousand line source files makes rendered views of the file, such as github's source viewer, crawl and chug, particularly on mobile devices like phones. Just for being able to look something up it's nicer to keep files to, say, 1000 lines or less.

@Lucretia

Lucretia commented Mar 19, 2024

Look at how I organised SDLAda, not events.events, that's a problem due to the package visibility rules, but the rest.

@ikskuh
Author

ikskuh commented Mar 20, 2024

Last year, I started a new tool called apigen, which originated from this issue, and totally forgot about the issue itself.

The tool is meant to model native APIs, and can be found here: https://github.com/MasterQ32/apigen
It was designed to be vendored with projects like SDL2

I slowly started working on an SDL2 port into apigen here:
https://github.com/MasterQ32/fakerootz/tree/main/api

apigen is meant to be able to also generate a JSON dump of the API information, so it's easily ingestible by other tools as well

My goal is also to eventually support versioned items, so you can have functions that were introduced in 2.0.1 and removed in 3.1.5 or something like that.
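
Versioned items of this kind could be filtered with a few lines of Python. This is only a sketch of the idea, not apigen's design: the `introduced` and `removed` keys and the `Example_` function names are hypothetical, and `removed` is treated as the first version in which the item no longer exists.

```python
def parse_version(text):
    """Turn "3.1.5" into (3, 1, 5) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def available_in(item, version):
    v = parse_version(version)
    if v < parse_version(item["introduced"]):
        return False
    removed = item.get("removed")
    return removed is None or v < parse_version(removed)


ITEMS = [
    {"name": "Example_OldCall", "introduced": "2.0.1", "removed": "3.1.5"},
    {"name": "Example_NewCall", "introduced": "3.0.0"},
]

for target in ("2.0.1", "3.1.4", "3.1.5"):
    names = [i["name"] for i in ITEMS if available_in(i, target)]
    print(target, names)
```

Comparing tuples rather than strings matters here: as strings, "2.10.0" would sort before "2.9.0".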

@1bsyl
Contributor

1bsyl commented Mar 20, 2024

I've been thinking about the following approach:

* Create a dumbed down C parser that pre-defines only `__SDLAPIC__`

* Add appropriate `#ifdef`'ery to our headers such that the parser only sees typedefs, function declarations and documentation

* Let the parser output something parsable (json/xml/yaml)

* generators can then spit out language-specific bindings (C#/Java/rust)

* as a stretch goal, the parser can extract the documentation (they are comments on the same line, or the lines before). That way we can greatly simplify/improve `wikiheaders.pl`

@madebr, just wanted to remind that gendynapi.py

  • parses public headers, extract function names / parameter names and types / comments
  • but doesn't parse structs or other things (could be extended probably)
  • can create a .json output (--dump)

... and it is required to work correctly since it creates the internal SDL dynapi files.

This JSON file could be re-used to generate bindings or wiki. Of course gendynapi.py needs some evolution for that.

(btw, a "wiki -> source code" notification, like automatic PR creation would be also a good thing).

An SDL IDL is an interesting idea, as it would solve the same issues and be more flexible in the long run (documentation wise).

Testing the viability of my proposal. Running

mkdir -p /tmp/dummy
touch /tmp/dummy/endian.h /tmp/dummy/inttypes.h /tmp/dummy/stdarg.h /tmp/dummy/stdint.h /tmp/string.h /tmp/wchar.h
cpp -undef -nostdinc -E -P include/SDL3/SDL.h -D__SDLAPIC__ -I include -I /tmp/dummy >/tmp/sdl_naked.h
cpp -undef -nostdinc -E -P include/SDL3/SDL.h -D__SDLAPIC__ -I include -I /tmp/dummy -dM>/tmp/sdl_macros.h

and looking at /tmp/sdl_naked.h and /tmp/sdl_macros.h, I think it contains parse-able data. Adding -C preserves documentation.

Just tested (it also requires touch /tmp/dummy/wchar.h and touch /tmp/dummy/string.h on my side).
Not sure how it makes things easier. I mean, I see this strips a lot of things,
but as long as we have our SDLCALL functions, structs, enums and typedefs, we should be fine?
And if not, maybe we should change our public API (like no #define SDL_CONSTANT).
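
To give a feel for how parse-able such stripped output might be, here is a hedged Python sketch that extracts function declarations with a regular expression. It assumes the `SDLCALL` and `DECLSPEC` markers survive preprocessing as plain tokens, which depends on how the headers are guarded; the sample text and `Example_` names are invented for illustration, and function-pointer parameters would need a real parser.

```python
import re

DECL_RE = re.compile(
    r"extern\s+(?P<prefix>[^;()]*?)\s*SDLCALL\s+(?P<name>\w+)\s*"
    r"\((?P<params>[^)]*)\)\s*;"
)

# Tokens that carry no type information (assumed spellings).
NOISE = {"DECLSPEC", "SDL_DECLSPEC"}


def extract_functions(text):
    functions = []
    for m in DECL_RE.finditer(text):
        ret = " ".join(t for t in m.group("prefix").split() if t not in NOISE)
        params = [p.strip() for p in m.group("params").split(",")]
        params = [p for p in params if p and p != "void"]
        functions.append({"name": m.group("name"), "retval": ret, "parameter": params})
    return functions


SAMPLE = """
typedef unsigned int Example_Flags;
extern DECLSPEC int SDLCALL Example_Init(Example_Flags flags);
extern DECLSPEC const char *SDLCALL Example_GetName(int index);
extern DECLSPEC void SDLCALL Example_Quit(void);
"""

for fn in extract_functions(SAMPLE):
    print(fn)
```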

@Lucretia

Lucretia commented Mar 20, 2024

Speaking up as a Rust user of SDL2, and as someone that's made both hand-written and generator-written Rust bindings for SDL2 and GL, all of this is basically a good idea.

I don't have too much to add at the moment in terms of what would help from a Rust perspective. The one thing would be that I'd like it if function arguments in the machine readable definition always used integers of fixed sizes, rather than C's default numeric types that vary by platform. However, if this can't be done it's still basically fine.

In SDLAda, I define C-compatible types with valid ranges where possible; this adds extra type checking at the Ada language level. I would prefer that all data types have ranges specified in addition to sizes, so that this information can be used in languages that have that facility (admittedly not many).

@Lokathor
Contributor

I believe it's considered "compatible" for a later version of SDL3 to add more values to an enum. E.g.: 3.2 has an enum of 5 values; in 3.4 the enum might have a 6th value added.

@madebr
Contributor

madebr commented Mar 21, 2024

While exploring what an SDL machine-readable API might look like, I created this.
The gist contains a manually crafted incomplete API in XML format, including a XML schema to formally verify it.
To check whether the API is enough to get working bindings, it also includes a crude python bindings generator, that generates this.

Are there any patterns that I forgot about, are hard to express in bindings, are a lot of work, or are missing from the documentation at all?

As an example of extra information,
I added an error item to the return type, so bindings know what value indicates an error state. (It's immature right now, and not used by the Python binding.)
I also added information about ownership, to appease Rust's borrow checker.
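
One way a generated binding could use such an error item is to wrap each native call so that the declared error value is turned into an exception. The sketch below is an illustration of that idea, not the gist's generator; `raise_on`, `BindingError` and `example_open` are hypothetical names, and `example_open` is a pure-Python stand-in for a native function.

```python
import functools


class BindingError(RuntimeError):
    """Raised when a wrapped call returns its declared error value."""


def raise_on(error_value):
    def decorate(native):
        @functools.wraps(native)
        def wrapper(*args):
            result = native(*args)
            if result == error_value:
                raise BindingError(f"{native.__name__} reported an error")
            return result
        return wrapper
    return decorate


@raise_on(error_value=-1)
def example_open(index):
    # Stand-in for a native function that returns -1 on failure.
    return index if index >= 0 else -1


print(example_open(2))
```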

@Susko3
Contributor

Susko3 commented Mar 21, 2024

The xml format looks really interesting. I think it would be beneficial to generate C headers from the xml format, so it can be compared to the actual headers.

Are there any patterns that I forgot about, are hard to express in bindings, are a lot of work, or are missing from the documentation at all?

A common pitfall of generating bindings for memory-safe languages is knowing if the returned pointer should be SDL_freed or not. This should be mentioned in the documentation of each function. And you may be able to infer it from const qualifiers on returned pointers (const char * vs char *)

How would you handle something like https://wiki.libsdl.org/SDL3/SDL_GetDisplays? It returns a buffer and a count of elements.
I would expect all high quality binding libraries to have friendly overloads of that function.

A python one would probably look like this: (this code should be generated from the bindings XML)

def SDL_GetDisplays() -> list[SDL_DisplayID] | None:
    count = int()
    pointer = SDL_GetDisplays(count) # this calls the native function
    if pointer == NULL:
        return None
    ret = copy_to_new_list_or_whatever(pointer, count)
    SDL_free(pointer)
    return ret

(Please note that I have no idea how bindings in python work, but the above code should give you the general idea.)

@madebr
Contributor

madebr commented Mar 21, 2024

My simple python generator is currently very dumb, and requires you to use ctypes to arrange the arguments.

Python code to call SDL_GetDisplays without any wrapping
Uint32 = ctypes.c_uint32

SDL_DisplayID = Uint32

# Get a list of currently connected displays.
SDL_GetDisplays = SDL_LIBRARY.SDL_GetDisplays
SDL_GetDisplays.restype = ctypes.POINTER(SDL_DisplayID)
SDL_GetDisplays.argtypes = [ctypes.POINTER(ctypes.c_int32)]

display_count = ctypes.c_int32()
displays = SDL_GetDisplays(ctypes.pointer(display_count))
for i in range(display_count.value):
    print(f"display [{i: 2d}] {displays[i]} name=\"{SDL_GetDisplayName(displays[i]).decode()}\"")
SDL_free(displays)

Are you asking for something similar to Microsoft's SAL?
Adding these to the SDL headers would make the headers safer to use, but less readable.
SDL_GetDisplays would become:

extern DECLSPEC _Ret_writes_z_(*count) SDL_DisplayID *SDLCALL SDL_GetDisplays(_Out_opt_ int *count);

@ikskuh
Author

ikskuh commented Mar 21, 2024

How would you handle something like https://wiki.libsdl.org/SDL3/SDL_GetDisplays? It returns a buffer and a count of elements. I would expect all high quality binding libraries to have friendly overloads of that function.

My solution with apigen would express that like this:

/// Get a list of currently connected displays.
/// Returns a 0 terminated array of display instance IDs which should be freed
/// with `SDL_free`, or `null` on error; call `SDL_GetError` for more details.
fn SDL_GetDisplays(
    /// a pointer filled in with the number of displays returned
    count: *c_int,
) ?[*:null]SDL_DisplayID;

Which models both the information that the result might be null, and also is a pointer to a null-terminated sequence of mutable SDL_DisplayID values, while count is a non-optional pointer to a C ABI integer value.

I can't model allocation information yet, but it might be possible with something like ?[*:null] dtor(SDL_free) SDL_DisplayID.

I would expect all high quality binding libraries to have friendly overloads of that function.

The type information can then be used to return something like an object with RAII for freeing, length and indexer in C++ for example.

@Lucretia

Lucretia commented Mar 27, 2024

Honestly, some sort of json/toml format would be best, where types can be specified with ranges (for strongly typed languages):

interface = "SDL"

[[types]]
  # <typename> = C type
  init_flags = "int"
    bitset = true    # Values are used as bitsets.
    [[values]]
      INIT_TIMER = 0x0000_0001   # Name created by <interface>_<value>
      # ...

[[functions]]
  name = "Init"    # Name created by <interface>_<name>
  return = "int"
  [[parameters]]
    # <name> = <type>
    flags = "init_flags"

Something like this, and have one per interface. Using a machine readable syntax means it can be generated and read easily by any language. In Ada, I separate out SDL.Video and other sub parts like Surfaces and Textures into their own packages, this could also be done but on generating your modules, you can combine if you want. i.e.

interface = "SDL"
subinterface = "Video"

@Lucretia

init_flags = "int"

But this should not just dump the pointer types, as they can be complex and the last thing we want is to force people to parse this stuff, i.e. int* const *.

@Susko3
Contributor

Susko3 commented Mar 28, 2024

My simple python generator is currently very dumb, and requires you to use ctypes to arrange the arguments.
Python code to call SDL_GetDisplays without any wrapping

Are you asking for something similar to Microsoft's SAL? Adding these to the SDL headers would make the headers safer to use, but less readable. SDL_GetDisplays would become:

extern DECLSPEC _Ret_writes_z_(*count) SDL_DisplayID *SDLCALL SDL_GetDisplays(_Out_opt_ int *count);

It seems SDL is already using bits of SAL. Most of it is limited to SDL_stdinc.h, but SDL_PRINTF_FORMAT_STRING (defined as _Printf_format_string_ on msvc) is also used in other headers.

These annotations get dumped by gendynapi.py, but they end up kinda weird:

    {
        "comment": "",
        "header": "SDL_stdinc.h",
        "name": "SDL_wcslcpy",
        "parameter": [
            "SDL_OUT_Z_CAP (SDL_OUT_Z_CAP(maxlen) wchar_t *dst *REWRITE_NAME)wchar_t *dst",
            "const wchar_t *REWRITE_NAME",
            "size_t REWRITE_NAME"
        ],
        "parameter_name": [
            "param_name_not_specified",
            "src",
            "maxlen"
        ],
        "retval": "size_t"
    },

@Lucretia

@Susko3 That looks worse than C to parse.

@1bsyl
Contributor

1bsyl commented Mar 29, 2024

@Susko3
Yes, gendynapi.py isn't correctly parsing some prototypes. I wrote it, and I saw there were these kinds of strange macros around a few functions. I didn't know that was Microsoft SAL or similar.
Since those dynapi entries were already written, I didn't try to fix the parser or the prototypes.

I suggest that maybe we could just fix the prototypes, so that they look standard.

About the readable API definitions:

  • Maybe we could look at the SAL and restrict to a very small subset of features that are useful for SDL bindings.
    (like :
    • the_return_value_should_not_be_freed (or the opposite, depending on the default)
    • this_array_goes_with_this_len
    • dont_produce_a_binding_for_this_function
    • what's more ? )
  • translate this subset of descriptions into something easily parseable. The easiest, to me, seems to be something that would go at the end of the function comment, like some tags: SDL_TAG_FREE_RETVAL, SDL_TAG_DONT_BIND, SDL_TAG_ARRAY(array_name, array_len)
  • we could also, as suggested by smcv, add extra C SDL functions that are better suited to being bound (and hide some others). See his message above.

those are my suggestions, but please double-check with @slouken
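
Tags placed at the end of a function comment would be straightforward to pick up. The following Python sketch uses the tag names suggested above and assumes a simple `NAME` or `NAME(arg, arg)` syntax; that syntax and the `parse_tags` helper are assumptions for illustration only.

```python
import re

TAG_RE = re.compile(r"\b(SDL_TAG_[A-Z_]+)(?:\(([^)]*)\))?")


def parse_tags(comment):
    """Map each tag found in a doc comment to its list of arguments."""
    tags = {}
    for name, args in TAG_RE.findall(comment):
        tags[name] = [a.strip() for a in args.split(",")] if args else []
    return tags


COMMENT = """
Get a list of currently connected displays.

SDL_TAG_FREE_RETVAL SDL_TAG_ARRAY(retval, count)
"""

print(parse_tags(COMMENT))
```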

@Lucretia

Lucretia commented Mar 29, 2024

This SAL stuff just looks like EXTRA COMPLICATIONS. The aim here should be so that the SDL3 C headers can ALSO be generated from* the description and you really want something EASY to parse.

* Also, not including header stuff like Khronos did, making an absolute mess of the XML; that stuff should be in the C generator.

@Odex64

Odex64 commented Jul 26, 2024

As stated by other folks, C does not provide enough information (especially regarding pointers) to make this possible.
A possible solution is to annotate functions (see #9907) in order to know more about them, but I'm not sure if it will ever happen - another workaround would be to assert each function that takes pointers and "guess" additional information whether the function executed correctly or not.

@slouken
Collaborator

slouken commented Jul 26, 2024

I think the most workable plan is to create a separate API definition file (XML? something else?) that lives in src/dynapi that contains more robust annotations for the functions. gendynapi.py could even warn if the file is missing annotations for any new SDL API function.
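
The warning step could look roughly like the Python sketch below: it merges hand-written annotations into entries shaped like the gendynapi.py dump shown earlier (`name`, `retval`, `parameter`) and reports functions that have none. The annotation keys and the `Example_` names are hypothetical, and this is not code from gendynapi.py.

```python
def merge_annotations(dumped, annotations):
    """Attach annotations to dumped functions; collect names without any."""
    merged, missing = [], []
    for fn in dumped:
        extra = annotations.get(fn["name"])
        if extra is None:
            missing.append(fn["name"])
            extra = {}
        merged.append({**fn, "annotations": extra})
    return merged, sorted(missing)


DUMPED = [
    {"name": "Example_GetItems", "retval": "int *", "parameter": ["int *REWRITE_NAME"]},
    {"name": "Example_NewCall", "retval": "int", "parameter": []},
]
ANNOTATIONS = {
    "Example_GetItems": {"free_retval": True, "array_len": "count"},
}

merged, missing = merge_annotations(DUMPED, ANNOTATIONS)
for name in missing:
    print(f"warning: no annotations for {name}")
```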

@Lucretia

God! Not XML, it's a pain to parse. Just look at Khronos' mess.

@Lokathor
Contributor

I've written a parser for the GL and VK xml files more than once. It's really not bad at all. You can write it like once during an afternoon and then it'll just work until the shape of the xml itself updates. The only thing that makes it annoying is that GL and VK stick raw C code into parts of the xml, so anyone not using C has to try harder to interpret that part for their own language. As long as SDL doesn't try to put chunks of raw C into the xml it would be fine.

@Lucretia

If you have an XML library where you DON'T have to write a state machine parser around it, then fine, it's easy. Otherwise, it's not. Having had to do that for khronos' shit. Yes, I am correct there, they broke the number one rule in language agnostic IDL's, they embedded C macros and headers inside that xml.

@Lokathor
Contributor

I just wrote the state machine of the tree into the call stack.

Regardless of those details, we do seem to agree on how an XML version, if made, would need to be done to make it easy to use: no C code embedded in the XML.
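
To illustrate the point about keeping C out of the XML, here is a small Python sketch that reads a purely declarative description with the standard library's ElementTree. The element and attribute names are invented for this example and do not come from any existing SDL, GL or Vulkan schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema: every fact is an attribute, no raw C fragments.
API_XML = """
<api name="SDL">
  <function name="Example_GetItems">
    <return type="uint32" pointer="many" nullable="true" free="SDL_free"/>
    <param name="count" type="int" pointer="one" direction="out"/>
  </function>
</api>
"""


def load_functions(xml_text):
    root = ET.fromstring(xml_text)
    functions = []
    for fn in root.iter("function"):
        ret = fn.find("return")
        functions.append({
            "name": fn.get("name"),
            "return": dict(ret.attrib),
            "params": [dict(p.attrib) for p in fn.findall("param")],
        })
    return functions


for fn in load_functions(API_XML):
    print(fn["name"], fn["return"], fn["params"])
```

Because nothing in the file has to be interpreted as C, a consumer in any language only needs a generic XML reader.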

@Lucretia

I would rather it was json, then it can be read in easily as a DB which can be queried easily.

@crystalthoughts

crystalthoughts commented Aug 30, 2024

JSON is the obvious one, I think; Godot does it that way, for reference.
Using the AST output as a base and manually clarifying ambiguous parts is probably the best way to get to v1?
