Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RealtimeSanitizer to the CMake build system #7764

Open
wants to merge 19 commits into
base: master
Choose a base branch
from

Conversation

sakertooth
Copy link
Contributor

See title. More info on this code sanitizer and how to use it can be found here.

@sakertooth sakertooth marked this pull request as draft March 11, 2025 10:53
@sakertooth sakertooth marked this pull request as ready for review March 13, 2025 20:33
@sakertooth sakertooth added the needs testing This pull request needs more testing label Mar 15, 2025
Clang gives a warning (-Wperf-constraint-implies-noexcept) if noexcept is not added along with attributes like [[clang::nonblocking]]
@JohannesLorenz JohannesLorenz changed the title Add RealtimeSantizer to the CMake build system Add RealtimeSanitizer to the CMake build system Mar 16, 2025
@JohannesLorenz JohannesLorenz self-requested a review March 16, 2025 18:17
Copy link
Contributor

@JohannesLorenz JohannesLorenz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is generally good.

Of course, this cannot fix the 1000s of expected findings and thus cannot add a CI.

Reminder that it might make sense to update the wiki (and then, maybe update submodule doc/wiki inside this PR?)

@sakertooth
Copy link
Contributor Author

CI broke. Apparently you can't update the wiki directly within the PR, so I'll stash my recent changes and apply them to the wiki after merge.

@messmerd
Copy link
Member

Could you also put LMMS_NONBLOCKING on these methods?

  • Effect::processImpl
  • Effect::processBypassedImpl
  • Instrument::play
  • Instrument::playNote

These are audio processing methods that run on an audio thread and should ideally be real-time safe

@sakertooth
Copy link
Contributor Author

Could you also put LMMS_NONBLOCKING on these methods?

  • Effect::processImpl

  • Effect::processBypassedImpl

  • Instrument::play

  • Instrument::playNote

These are audio processing methods that run on an audio thread and should ideally be real-time safe

The realtime attribute should be recursive. If it wasn't, there's no telling how many places we would need to put it, and as changes come along we would have to make sure that attribute is where it is all the time.

That's why I believe the Clang documentation states that you put the attribute at the entry points.

@sakertooth
Copy link
Contributor Author

I put the attribute on the audio device callbacks, the main render function, and the worker thread run function, so it should be applied where necessary, even on thread boundaries.

@sakertooth
Copy link
Contributor Author

I'll actually apply the macro to the FIFO writer thread instead of the main render function actually. The clang documentation expects it should be placed where threads with realtime execution start executing only.

@messmerd
Copy link
Member

@sakertooth

The realtime attribute should be recursive. If it wasn't, there's no telling how many places we would need to put it, and as changes come along we would have to make sure that attribute is where it is all the time.

The plugins are built separately though, so the compiler will never see the [[clang::nonblocking]] attribute when compiling them. And plugin developers should know that those methods they are expected to implement should be real-time safe.

@sakertooth
Copy link
Contributor Author

@sakertooth

The realtime attribute should be recursive. If it wasn't, there's no telling how many places we would need to put it, and as changes come along we would have to make sure that attribute is where it is all the time.

The plugins are built separately though, so the compiler will never see the [[clang::nonblocking]] attribute when compiling them. And plugin developers should know that those methods they are expected to implement should be real-time safe.

Could you test this for me real quick? I don't have Clang 20 at the moment, but I'm pretty sure it applies to the plugins since the code will we are still executing is plugin code through the entry points.

And I think plugin implementers should know about real time safety from the get-go. If we put it there, then why not everywhere else? I understand the benefit of knowing what functions should be real time safe, but there is a huge chance we will miss functions that are supposed to be running in real time.

If we do this, then I find it's only right to apply it uniformly, which isn't really sustainable.

@messmerd
Copy link
Member

Could you test this for me real quick? I don't have Clang 20 at the moment, but I'm pretty sure it applies to the plugins since the code will we are still executing is plugin code through the entry points.

I don't have Clang 20 either currently, but if the [[clang::nonblocking]] attribute is not visible to the plugin translation units, there is no way the compiler could know what is supposed to be real-time safe while building the plugin DLL/SO, unless the compiler stores that information in the object files or static library that the plugins are linked against or some other compiler magic, but I wouldn't count on it.

And I think plugin implementers should know about real time safety from the get-go.

The entire point of using [[clang::nonblocking]] is so that the real-time requirements are explicitly stated and can be enforced. If we thought it was good enough that real-time safety was something people "should know about", this PR would not exist.

If we put it there, then why not everywhere else?

The methods I mentioned are the main audio processing methods for all the plugins. There are few places better suited to add LMMS_NONBLOCKING than there.

And importantly, the Effect and Instrument classes are meant to form the basis of LMMS's plugin API. It is a public API used by plugin developers, not a hidden implementation detail. And in any good API, you must very clearly communicate how methods are expected to be used, and leverage the type system as much as you can to make it difficult to use incorrectly. The [[clang::nonblocking]] attribute gives us the opportunity for the first time to not just specify but also enforce that real-time requirement in our plugin API, so it would make no sense to not use it there.

The current plugin API is in very bad shape due to people treating plugins and the plugin API like they're part of LMMS core when they are supposed to be separate entities. This has made it difficult to tell where a plugin stops and the LMMS core begins which makes it hard to even call them "plugins". Ideally, a plugin developer should not need to read even a single line of code from LMMS core or know anything about it in order to write an LMMS plugin, because the plugin API should tell them everything they need to know. The fact that this is very much not the case is a massive failure IMO. Let's not make it worse.

@sakertooth
Copy link
Contributor Author

I don't have Clang 20 either currently, but if the [[clang::nonblocking]] attribute is not visible to the plugin translation units, there is no way the compiler could know what is supposed to be real-time safe while building the plugin DLL/SO, unless the compiler stores that information in the object files or static library that the plugins are linked against or some other compiler magic, but I wouldn't count on it.

RTsan doesn't need to know AFAICT, it makes those decisions based on what functions are called in the call stack. We call the plugin functions on worker threads, so RTsan should definitely pick it up. I think for this situation, we need testing to make sure, otherwise it is a debate on theoreticals.

From Clang themselves:

At run-time, if RTSan detects a call to malloc, free, pthread_mutex_lock, or anything else known to have a non-deterministic execution time in a function marked [[clang::nonblocking]] it raises an error.


The entire point of using [[clang::nonblocking]] is so that the real-time requirements are explicitly stated and can be enforced. If we thought it was good enough that real-time safety was something people "should know about", this PR would not exist.

It serves two purposes: one is to raise an error on a violation of real-time safety when executing a function, and yes it also effectively serves as an indicator that this is a real-time function, but the main point is to catch real-time violations more easily. That's always been the point. The PR would still exist to serve that main purpose.

However, I do not agree with putting [[clang::nonblocking]] in only those specific spots you listed. That decision does not seem like a cohesive one and completely ignores everything else that should be marked for real-time safety. For the same reasons you want LMMS_NONBLOCKING to be applied for those functions, the other ones as well (like Song::processNextBuffer, Sample::play, tons of them..) are perfect candidates given that reasoning.

And importantly, the Effect and Instrument classes are meant to form the basis of LMMS's plugin API. It is a public API used by plugin developers, not a hidden implementation detail. And in any good API, you must very clearly communicate how methods are expected to be used, and leverage the type system as much as you can to make it difficult to use incorrectly. The [[clang::nonblocking]] attribute gives us the opportunity for the first time to not just specify but also enforce that real-time requirement in our plugin API, so it would make no sense to not use it there.

But you have not given a good reason why only those functions. As I've mentioned, it is only fair to respect the public API everywhere else in the code, but this isn't sustainable long term unless we support a more general interface (i.e., specifying it in an abstract class that everyone else inherits such details from). What I see happening is a mixture of the use of LMMS_NONBLOCKING in some places, and other places, not so much. I don't like the idea of that inconsistency.

The current plugin API is in very bad shape due to people treating plugins and the plugin API like they're part of LMMS core when they are supposed to be separate entities. This has made it difficult to tell where a plugin stops and the LMMS core begins which makes it hard to even call them "plugins". Ideally, a plugin developer should not need to read even a single line of code from LMMS core or know anything about it in order to write an LMMS plugin, because the plugin API should tell them everything they need to know. The fact that this is very much not the case is a massive failure IMO. Let's not make it worse.

I agree. I just don't think I agree with the over-emphasis on applying it in one specific spot in the code where it has no functional effect but to signal to developers that the function is real-time safe. The space that involves applying it to only the specific entry-points of real-time execution (where this PR puts them) is limited and manageable. However, for places that we want to mark as real-time safe? The surface area for where it should be applied increases to a point where I don't even see it being sustainable long term as the APIs change.


I wont have time to work on LMMS anymore, so if you want the changes you can add them to the branch. There's no point in debating over this. Note that I do think it is important and useful to signal that the function is real-time safe, but having it just for the plugin API for that reason in particular isn't "complete" as an idea. It will most likely lead to inconsistent habits and result in that macro being in places where it shouldn't be out of feeling (e.g. functions that are called by other real-time functions that aren't even in the public API for a certain class).

@sakertooth sakertooth added the needs takeover The original author is looking for someone else to continue this pull request label Mar 31, 2025
@messmerd
Copy link
Member

messmerd commented Apr 1, 2025

RTsan doesn't need to know AFAICT, it makes those decisions based on what functions are called in the call stack. We call the plugin functions on worker threads, so RTsan should definitely pick it up.

That might be true when compiling the LMMS core, but when compiling a plugin, the compiler has no way of knowing that somewhere up the call stack the plugin's process method will be called by a function marked [[clang::nonblocking]] because that information does not exist anywhere in the plugin's source files or included headers. It probably doesn't even know that the process method will be called at all. The only way for the compiler to learn that information is if it stored it somehow when compiling the LMMS core and can check it at link time, but I would not count on this.

However, I do not agree with putting [[clang::nonblocking]] in only those specific spots you listed. That decision does not seem like a cohesive one and completely ignores everything else that should be marked for real-time safety.

I'm not advocating putting it everywhere haphazardly. Only places where it is needed. The process methods are by far the most important methods of a plugin and (to my knowledge) the only real-time methods that plugins implement, which makes them an obvious choice to apply the attribute.

For the same reasons you want LMMS_NONBLOCKING to be applied for those functions, the other ones as well (like Song::processNextBuffer, Sample::play, tons of them..) are perfect candidates given that reasoning.

None of those examples are part of LMMS's public API though. A plugin developer does not need to know they exist.

But you have not given a good reason why only those functions. As I've mentioned, it is only fair to respect the public API everywhere else in the code, but this isn't sustainable long term unless we support a more general interface (i.e., specifying it in an abstract class that everyone else inherits such details from).

  • It's a public API, and in any good public API, all requirements need to be clearly communicated to the people using it (plugin developers).
  • There is no "public API everywhere else in the code". The plugin API is the only public API that LMMS has (except maybe the src/common/ stuff), so applying the nonblocking attributes there is not incohesive or random. All other APIs are internal to the LMMS application.
  • RtSan static analysis should be enabled for plugins. Without marking the real-time methods with [[clang::nonblocking]], the compiler will likely be unable to infer that the process methods are supposed to be real-time safe when trying to compile a plugin.

I just don't think I agree with the over-emphasis on applying it in one specific spot in the code where it has no functional effect but to signal to developers that the function is real-time safe. The space that involves applying it to only the specific entry-points of real-time execution (where this PR puts them) is limited and manageable.

Like I said, the plugins and LMMS are two separate entities. The plugin process methods are the entry-points of real-time execution for plugins.

I wont have time to work on LMMS anymore, so if you want the changes you can add them to the branch.

I will be gone for a while too (going on vacation for a couple weeks starting next week). You've done a good job on this PR so far and I'd like to test it out once I have the time, though it may be a bit.

@sakertooth
Copy link
Contributor Author

RtSan static analysis should be enabled for plugins. Without marking the real-time methods with [[clang::nonblocking]], the compiler will likely be unable to infer that the process methods are supposed to be real-time safe when trying to compile a plugin.

Apologies for the misunderstanding, I was under the assumption that RTSan only had use at runtime, not compile-time. I think I am now in agreement with putting it there.

From Clang:

RTSan performs its analysis at run-time but shares the [[clang::nonblocking]] attribute with the Function Effect Analysis system, which operates at compile-time to detect potential real-time safety violations. For comprehensive detection of real-time safety issues, it is recommended to use both systems together.

It's very evident that I did not read this full paragraph, or skimmed over it.

None of those examples are part of LMMS's public API though. A plugin developer does not need to know they exist.

I was thinking of the other public:, apologies for the misunderstanding.

Like I said, the plugins and LMMS are two separate entities. The plugin process methods are the entry-points of real-time execution for plugins.

I incorrectly assumed RTSan was only useful at run-time, but I still think if LMMS were to run normally, RTSan would catch violations from within plugins at run-time. I don't necessarily see how they RTsan wouldn't catch the violations at run-time any other way unless the process methods were called elsewhere, maybe in a distinct program, but I could be wrong/don't see this happening anytime soon.

It also does not seem like the static analysis goes through all real-time function paths for any given function marked nonblocking. Instead, it will just flag the ones called in the function not marked nonblocking as a problem. So maybe the goal is to eventually apply it to all real-time function paths after we make them real-time, but it would still stand that runtime analysis is better at catching violations than static analysis.

So, if I understand this correctly, the main problem mentioned is that RTSan's static analysis won't have any impact regarding plugins since they are compiled separately. The confusing part there is that I can say that for many real-time functions that weren't marked real-time that are compiled and linked into lmmsobjs. So the only real benefit I see is that since the plugin API is meant to be a bit more separate from all of LMMS, the macro mainly serves to communicate to developers that the function is supposed to run in real-time.

Overall though, I am a bit confused about the whole thing, so I'll have to postpone this a couple months late if anything.

You've done a good job on this PR

Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs takeover The original author is looking for someone else to continue this pull request needs testing This pull request needs more testing
Projects
Status: In progress
Development

Successfully merging this pull request may close these issues.

3 participants