Giovanni Bajo edited this page Oct 14, 2024 · 10 revisions

Preview branch

The preview branch is called preview and can be downloaded from GitHub: https://github.com/DragonMinded/libdragon/tree/preview.

This is where development happens, so you can go fully bleeding edge. Notice that there is no guarantee that the APIs will be stable: they can be broken at any time and even removed, before finally landing on trunk. If this does not worry you, feel free to experiment with it.

Features

OpenGL support

Libdragon preview contains a full OpenGL 1.1 implementation, with some extensions and additions (including VBOs, which are formally part of OpenGL 1.5). The feature set is also similar to OpenGL ES 1.x.

Read the OpenGL on N64 documentation for more details. Have a look also at the gldemo example: https://github.com/DragonMinded/libdragon/blob/preview/examples/gldemo/gldemo.c

Note that OpenGL 1.1 is still the "old style" or "legacy" way of OpenGL programming. If you are familiar with modern OpenGL you’ll know that it is all about programmable GLSL shaders which unfortunately cannot be implemented on N64 hardware. Therefore we chose version 1.1 because it fits the N64’s capabilities best. It is based on the “fixed function” pipeline, which implements Gouraud shading. The feature set is very similar to what libultra’s official 3D pipeline offers.
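
To make the term concrete: Gouraud shading computes a color per vertex and interpolates it linearly across the triangle. The sketch below is plain C for illustration only, not code taken from libdragon or from any OpenGL implementation; the `rgb_t` type and the `gouraud_interpolate` function are hypothetical names, and the weights are assumed to be barycentric coordinates that sum to 1.

```c
#include <stdint.h>

/* Hypothetical type for illustration; not part of libdragon or OpenGL. */
typedef struct { uint8_t r, g, b; } rgb_t;

/* Blend one 8-bit channel with three weights, clamping and rounding to nearest. */
static uint8_t blend_channel(uint8_t a, uint8_t b, uint8_t c,
                             float w0, float w1, float w2)
{
    float v = w0 * a + w1 * b + w2 * c;
    if (v < 0.0f)   v = 0.0f;
    if (v > 255.0f) v = 255.0f;
    return (uint8_t)(v + 0.5f);
}

/* Color of a point inside a triangle, given the three vertex colors and
 * the point's barycentric weights (assumed to sum to 1). */
rgb_t gouraud_interpolate(rgb_t c0, rgb_t c1, rgb_t c2,
                          float w0, float w1, float w2)
{
    rgb_t out;
    out.r = blend_channel(c0.r, c1.r, c2.r, w0, w1, w2);
    out.g = blend_channel(c0.g, c1.g, c2.g, w0, w1, w2);
    out.b = blend_channel(c0.b, c1.b, c2.b, w0, w1, w2);
    return out;
}
```

A point that coincides with a vertex gets exactly that vertex's color, while the centroid of a red/green/blue triangle gets an even mix of the three.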

The implementation is based on rdpq and features a full RSP T&L pipeline which is the first open source 3D RSP ucode to be released. Moreover, a full CPU pipeline is also available, and is automatically switched to whenever the current GL state is not supported by the RSP pipeline.

Sausage64 is a third party project to handle sausage-link animations that supports libdragon. Besides this, we are currently lacking tools to import meshes, materials, etc.

You can find lots of resources for old style OpenGL programming:

MPEG1 video player

This video player is able to play back raw MPEG-1 video streams. The player is based on pl_mpeg, but most of the decoding pipeline (everything after entropy decoding) has been replaced with custom RSP ucode. This includes motion compensation, IDCT, dequantization / scaling / oddification, and residual calculation. The final step, YUV to RGB conversion, is handled through rdpq (so it is performed by the RDP).
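
For readers unfamiliar with the last step, this is roughly what a YUV to RGB conversion computes for each pixel. The sketch is plain C and is not how libdragon does it (there the RDP performs the conversion in hardware); it assumes BT.601 limited-range YCbCr, which is the usual convention for MPEG-1 material, and the names `rgb888_t` and `yuv_to_rgb` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical output type for illustration. */
typedef struct { uint8_t r, g, b; } rgb888_t;

/* Clamp to [0, 255] and round to the nearest integer. */
static uint8_t clamp_round(float v)
{
    if (v <= 0.0f)   return 0;
    if (v >= 255.0f) return 255;
    return (uint8_t)(v + 0.5f);
}

/* BT.601 limited-range conversion (assumption): Y in [16, 235],
 * Cb/Cr centered on 128. */
rgb888_t yuv_to_rgb(uint8_t y, uint8_t cb, uint8_t cr)
{
    float yf = 1.164f * (float)((int)y - 16);
    float u  = (float)((int)cb - 128);
    float v  = (float)((int)cr - 128);

    rgb888_t out;
    out.r = clamp_round(yf + 1.596f * v);
    out.g = clamp_round(yf - 0.392f * u - 0.813f * v);
    out.b = clamp_round(yf + 2.017f * u);
    return out;
}
```

With neutral chroma (Cb = Cr = 128) the result is a gray level: Y = 16 maps to black and Y = 235 maps to white.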

The player is extremely efficient. It is able to decode videos at up to 2 Mbit/s at 320x240 and 20-25 FPS, or around 1.5 Mbit/s at a somewhat higher resolution. The player performed very well in a cross-platform video competition held by SegaXTreme in 2022, outperforming many same-generation consoles. You can read more about the player's performance in this forum post, which also includes links to sample ROMs.

This player outperforms the one in Resident Evil 2 by a wide margin, because most of the decoding is performed on the RSP, which is an extremely efficient chip for pixel processing and video decoding. In RE2, most of the decoding was performed on the CPU and only the YUV conversion was done on the RSP (a step that, by the way, the RDP is even faster at). That was enough for the low-resolution, low-bitrate, low-framerate videos that RE2 had to use because of space constraints, but it would not have been enough for a stand-alone FMV player.

You can have a look at the videoplayer example to see how to run the player. Notice that the API is subject to change, as it will probably need to be redesigned before landing on trunk.

NOTE: normally, MPEG video files come in the MPEG container format that muxes audio/video; the video player in libdragon for now only handles video-only container-less streams that are normally distributed with extension .M1V. ffmpeg can convert from .mpeg to .m1v.

Screenshots of a ROM playing the Big Buck Bunny video with libdragon MPEG-1 player.


Dynamic libraries (DSO)

It is now possible to create dynamic libraries (sometimes called "overlays") and load them at runtime. This reduces memory consumption, because code that does not need to be resident at all times can be loaded only when required. For instance, each actor could be compiled into its own dynamic library, and loaded / unloaded depending on the game area the player is in.

Dynamic libraries have the .dso extension, and can be loaded using the standard POSIX API dlopen(). Functions in the dynamic library can be accessed via dlsym() and then called through a normal function pointer. Moreover:

  • All public (non-static) symbols in a dynamic library are accessible from the main binary via dlsym(). In ELF terms, all public symbols are visible.
  • Dynamic library code can transparently call symbols in the main binary, such as libdragon itself, newlib, engine code, etc. No special provision is required: it is sufficient to just call the functions or access the variables. This makes it extremely easy to split existing code into a dynamic library, as basically no code changes are required.
  • Dynamic libraries can also reference symbols in other dynamic libraries, assuming those were loaded first. For instance, if a.dso references a function in b.dso, it is necessary to load b.dso first, and only later load a.dso, otherwise an assert is hit. It is the responsibility of the programmer to ensure that dynamic libraries are loaded in strict dependency order.
  • DSO files can also be compressed, via asset compression just like any other data file, and they are automatically decompressed at load time.
  • The crash inspector features a new page that lists all loaded dynamic libraries, together with their current memory addresses.
  • Stack traces and symbols are correctly resolved also across dynamic library boundaries.

Two examples are provided to show how to use dynamic libraries. Have a look at https://github.com/DragonMinded/libdragon/tree/preview/examples/overlays.
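
As a rough illustration of the dlopen() / dlsym() flow described above, here is a minimal sketch written against the generic POSIX interface. The function name `run_dso_entry`, the `actor_entry_t` signature and the flag combination are assumptions made for this example; check libdragon's own dlfcn.h and the overlays examples for the flags and error behavior it actually provides (as noted above, some failures hit an assert instead of returning an error). dlclose() is the POSIX counterpart for unloading and is assumed to be available.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical signature of the function exported by the library. */
typedef void (*actor_entry_t)(void);

/* Load a dynamic library, call one of its functions, then unload it.
 * Returns 0 on success, -1 if the library or the symbol was not found. */
int run_dso_entry(const char *path, const char *symbol)
{
    /* Generic POSIX flags (assumption): resolve symbols immediately and
     * keep them private to this handle. */
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        return -1;
    }

    /* POSIX guarantees that the pointer returned by dlsym() can be
     * converted to a function pointer. */
    actor_entry_t entry = (actor_entry_t)dlsym(handle, symbol);
    int result = -1;
    if (entry != NULL) {
        entry();
        result = 0;
    }

    dlclose(handle);
    return result;
}
```

On the console the call might look like `run_dso_entry("rom:/actor.dso", "actor_update")`, where both the path and the symbol name are hypothetical.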

Fast math primitives for floating point 3D calculations

Libdragon includes some math primitives like sinf, cosf, atan2f, fmodf, and others, designed for a typical old-skool 3D game programming scenario, where floating point calculations are often performed in a -ffast-math context and do not need the 1 ULP precision that the standard library provides. Moreover, most floating point numbers involved in 3D calculations are eventually converted to fixed point to go into the RSP for T&L, so spending time to obtain 1 ULP precision is unnecessary.

The provided versions are much faster than their standard library counterparts because of these shortcuts. To avoid compatibility problems they do not replace the standard functions, but can be invoked explicitly using the fm_ prefix. See the documentation in fmath.h for more information.
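
To give an idea of the kind of trade-off involved, the sketch below approximates sine with Bhaskara I's rational formula, which stays within roughly 0.002 of the true value while using only a few multiplications and one division. This is not the algorithm used by libdragon's fm_ functions (see fmath.h for those); `approx_sinf` and `APPROX_PI` are hypothetical names chosen for this example.

```c
#define APPROX_PI 3.14159265f

/* Bhaskara I's approximation, valid for x in [0, pi]:
 * sin(x) ~= 16 x (pi - x) / (5 pi^2 - 4 x (pi - x)) */
static float bhaskara_sin(float x)
{
    float p = x * (APPROX_PI - x);
    return (16.0f * p) / (5.0f * APPROX_PI * APPROX_PI - 4.0f * p);
}

/* Low-precision sine for any moderate angle in radians. */
float approx_sinf(float x)
{
    const float two_pi = 2.0f * APPROX_PI;

    /* Range reduction to [0, 2*pi]; integer truncation is good enough
     * for the moderate angles typical of game code. */
    x -= (float)(int)(x / two_pi) * two_pi;
    if (x < 0.0f) {
        x += two_pi;
    }

    /* The second half of the period mirrors the first with the sign flipped. */
    return (x <= APPROX_PI) ? bhaskara_sin(x) : -bhaskara_sin(x - APPROX_PI);
}
```

An error of this size is far above 1 ULP, but it is harmless once the result is quantized to fixed point for the RSP.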

VADPCM audio compression

The audio mixer library now supports VADPCM compression. In particular, the wav64 format has been enhanced to also support optional VADPCM compression, for both mono and stereo files. VADPCM is a special variant of ADPCM that has been designed to be fast to decompress on the RSP. VADPCM is very light on the RSP, and provides very good audio quality, with a compression factor of 3.5:1 for 16-bit files.
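
The compression factor can be checked with a little arithmetic. The sketch below assumes the standard N64 VADPCM frame layout, in which each frame stores 16 samples in 9 bytes, and it counts only the sample payload (no file header or codebook). The function names are hypothetical and the exact layout inside a .wav64 file is not taken from the libdragon sources.

```c
#include <stddef.h>

enum {
    VADPCM_FRAME_SAMPLES = 16, /* samples encoded per frame (assumption) */
    VADPCM_FRAME_BYTES   = 9   /* bytes per encoded frame (assumption) */
};

/* Size of the compressed sample payload; partial frames are rounded up. */
size_t vadpcm_payload_bytes(size_t samples_per_channel, size_t channels)
{
    size_t frames = (samples_per_channel + VADPCM_FRAME_SAMPLES - 1)
                    / VADPCM_FRAME_SAMPLES;
    return frames * VADPCM_FRAME_BYTES * channels;
}

/* Size of the same audio stored as uncompressed 16-bit PCM. */
size_t pcm16_bytes(size_t samples_per_channel, size_t channels)
{
    return samples_per_channel * 2 * channels;
}
```

One second of mono audio at 32000 Hz takes 64000 bytes as 16-bit PCM and 18000 bytes as VADPCM payload, a ratio of about 3.56:1, in line with the figure quoted above.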

The format was designed by Nintendo but not documented; thanks to the work of Vanadium in Skelly64, there is now a clean room, open source implementation of both the compressor and the decompressor in C, which we integrated into libdragon and rewrote as RSP ucode for optimization.

Compression is performed by audioconv64 during the conversion to .wav64. It is now active by default, as it gives a very good balance between resource usage and quality. It is controlled by the new --wav-compress option, so it can be disabled if needed. The runtime code is mostly unaffected: you can call wav64_open and wav64_play just like with uncompressed files. If the file is compressed, though, you must call the new wav64_close to dispose of it when it is no longer needed.

Opus audio compression

The audio mixer library now supports Opus compression. Just like VADPCM, you can use audioconv64 to compress a WAV file using the option --wav-compress 3 to select the Opus compressed format. The implementation is based on the reference libopus library, with a custom RSP ucode to accelerate playback. You can read more about Opus support here: https://github.com/DragonMinded/libdragon/wiki/Opus-decompression

Kernel and multithreading

Libdragon now features an experimental, optional kernel for multithreaded programming. You can have a look at src/kernel, and in particular kernel.h, for an overview of the APIs.

The kernel currently features most basic multithreading primitives:

  • Preemptive threads, with hard priority
  • Support for both joinable threads and detached threads
  • Mutexes (both reentrant and normal)
  • Condition variables
  • Semaphores and queues
  • C11 atomic variables (atomic_int, etc.)
  • C11 thread-local storage (thread_local keyword)
  • Stack overflow detection via canaries
  • Newlib is now built with threading support, so it correctly protects its global state (e.g. the heap allocator) with mutexes

You are free to experiment with threading at this point, as basic support seems stable. The main issue is that libdragon itself is not thread safe as it's never been designed with multi-threading in mind. Work is ongoing to add proper threading support to it. A main hurdle is adapting rspq and thus all RSP-based libraries.
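
The C11 features in the list above are standard, so they can be shown without relying on any kernel-specific API. The sketch below uses only `<stdatomic.h>` and the C11 thread-local storage class; the function and variable names are hypothetical, and thread creation is deliberately left out because the kernel's function names are not covered here (see kernel.h for those).

```c
#include <stdatomic.h>

/* Shared by all threads: updated with an atomic read-modify-write, so
 * concurrent increments are never lost. */
static atomic_int jobs_done;

/* One independent copy per thread. _Thread_local is the C11 keyword;
 * <threads.h> provides the thread_local spelling for it. */
static _Thread_local int jobs_done_by_this_thread;

/* Called by a worker thread each time it completes a job. */
void job_finished(void)
{
    atomic_fetch_add(&jobs_done, 1);
    jobs_done_by_this_thread++;
}

/* Total across all threads. */
int total_jobs_done(void)
{
    return atomic_load(&jobs_done);
}

/* Count for the calling thread only. */
int local_jobs_done(void)
{
    return jobs_done_by_this_thread;
}
```

With a single thread both counters advance together; with several threads, the total is the sum of the per-thread counts.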

Frame limiter

Libdragon now supports a sophisticated FPS limiter which also works with non-integer divisors of the TV refresh rate. For instance, it is now possible to limit a game to 40 FPS and get a smooth frame-limited result. This is done via the new API display_set_fps_limit().
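
To see why a non-integer ratio is workable at all, consider the simple accumulator scheme sketched below, which decides at each vertical blank whether a new frame should be shown. This is only an illustration of the idea and not libdragon's actual algorithm; `frame_pacer_t` and its functions are hypothetical names, and refresh rates are simplified to whole numbers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical pacing state for illustration. */
typedef struct {
    int refresh_hz;   /* display refresh rate, e.g. 60 */
    int target_fps;   /* desired frame rate, e.g. 40 */
    int accumulator;  /* fractional progress toward the next frame */
} frame_pacer_t;

void pacer_init(frame_pacer_t *p, int refresh_hz, int target_fps)
{
    assert(refresh_hz > 0 && target_fps > 0 && target_fps <= refresh_hz);
    p->refresh_hz = refresh_hz;
    p->target_fps = target_fps;
    p->accumulator = 0;
}

/* Call once per vertical blank. Returns true when a new frame should be
 * presented, spreading target_fps frames evenly over refresh_hz vblanks. */
bool pacer_on_vblank(frame_pacer_t *p)
{
    p->accumulator += p->target_fps;
    if (p->accumulator >= p->refresh_hz) {
        p->accumulator -= p->refresh_hz;
        return true;
    }
    return false;
}
```

At 60 Hz with a 40 FPS target, this presents a new frame on two out of every three vblanks, for exactly 40 frames per second.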

In addition to this, there is also a new API to estimate the delta time for which the next frame should be calculated (display_get_delta_time()). It uses a Kalman filter over the times required by previous frames to be displayed, which provides a more stable (smoother) estimate compared to just measuring CPU time between frames in the main loop.
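
For readers who have not met one before, a scalar Kalman filter fits in a few lines. The sketch below is a generic textbook version, not the filter implemented in libdragon; the type name, function names and noise parameters are all hypothetical and would need tuning for real frame times.

```c
/* Hypothetical one-dimensional Kalman filter state. */
typedef struct {
    float estimate;           /* current smoothed value */
    float variance;           /* uncertainty of the estimate */
    float process_noise;      /* how fast the true value may drift */
    float measurement_noise;  /* how noisy each measurement is */
} kalman1d_t;

void kalman1d_init(kalman1d_t *k, float first_sample,
                   float process_noise, float measurement_noise)
{
    k->estimate = first_sample;
    k->variance = measurement_noise;
    k->process_noise = process_noise;
    k->measurement_noise = measurement_noise;
}

/* Feed one measured frame time and get back the smoothed estimate. */
float kalman1d_update(kalman1d_t *k, float measured)
{
    /* Predict: the value is assumed constant, only uncertainty grows. */
    k->variance += k->process_noise;

    /* Correct: move toward the measurement in proportion to the gain. */
    float gain = k->variance / (k->variance + k->measurement_noise);
    k->estimate += gain * (measured - k->estimate);
    k->variance *= (1.0f - gain);
    return k->estimate;
}
```

A steady stream of identical frame times leaves the estimate unchanged, while a single slow frame only moves it part of the way toward the outlier, which is what makes the resulting delta time smoother than a raw measurement.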

The FPS calculation (display_get_fps()) has also been updated to match the same algorithm; compared to the previous one, it reacts more quickly to sudden frame changes, and is also more stable in its output.

Added several iQue APIs

  • Added interrupt callback support for iQue-specific interrupts
  • Added some SKC APIs (Secure Kernel Calls)
  • Added support for NAND flash controller (read/write with ECC support)
  • Added ATB support, for mapping NAND blocks into the virtual PI bus
  • Added a complete BBFS implementation for accessing the flash filesystem, both reading and writing. Access can be performed with standard C/POSIX APIs using the virtual filesystem "bbfs:/".
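
Because BBFS is exposed through the standard C file APIs, ordinary stdio code should work on it unchanged. The sketch below is a minimal example using only standard calls; the helper name `file_size_bytes` and the file name in the usage note are hypothetical, and it assumes the filesystem has already been initialized by whatever setup call libdragon requires.

```c
#include <stdio.h>

/* Returns the size of a file in bytes, or -1 if it cannot be opened
 * or measured. Works with any path the C library can open. */
long file_size_bytes(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        return -1;
    }

    long size = -1;
    if (fseek(f, 0, SEEK_END) == 0) {
        size = ftell(f);
    }

    fclose(f);
    return size;
}
```

On an iQue Player this might be called as `file_size_bytes("bbfs:/save.bin")`, with the file name being purely illustrative.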