Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Copy channelhandle by value instead of pointers #4672

Merged
merged 2 commits into from
Feb 14, 2022

Conversation

ywwg
Copy link
Member

@ywwg ywwg commented Feb 12, 2022

Store channelhandles in messages by value instead of pointers. This fixes issues with pointers to deleted channelhandles being accessed, causing crashes, mostly with QT6.

Converting to values means we can no longer trivially initialize the union/struct by memsetting all of the subtypes, because two of the struts are no longer trivial.

Instead, we memset what we think is the largest datatype. SetEffectChainParameters includes a bool, an enum, and a double, so it's the largest one.

@ywwg ywwg marked this pull request as draft February 12, 2022 15:48
@ywwg
Copy link
Member Author

ywwg commented Feb 12, 2022

hm ok I guess this doesn't work because now the struct is non-trivial:

Error: /home/runner/work/mixxx/mixxx/src/engine/effects/message.h:43:48: error: ‘void* memset(void*, int, size_t)’ clearing an object of non-trivial type ‘struct EffectsRequest::<unnamed union>::<unnamed>’; use assignment or value-initialization instead [-Werror=class-memaccess]
   43 | #define CLEAR_STRUCT(x) memset(&x, 0, sizeof(x));

it builds on my machine though.

@uklotzde
Copy link
Contributor

This failure was expected. There is a reason why pointers are used.

@ywwg ywwg changed the title [PROOF OF CONCEPT NOT FOR MERGE] Copy channelhandle by value instead of pointers [proof of concept fix] Copy channelhandle by value instead of pointers Feb 12, 2022
@ywwg
Copy link
Member Author

ywwg commented Feb 12, 2022

the commit that made the change doesn't give a lot of hints: b2a28bd

@ywwg
Copy link
Member Author

ywwg commented Feb 12, 2022

probably we used memset because: https://stackoverflow.com/questions/11812555/how-to-zero-initialize-an-union

so therefore we just need to figure out which is the biggest struct item and zero that out.

@@ -97,10 +95,10 @@ struct EffectsRequest {
} RemoveEffectChain;
struct {
EffectStatesMapArray* pEffectStatesMapArray;
const ChannelHandle* pChannelHandle;
ChannelHandle channelHandle;
Copy link
Contributor

@uklotzde uklotzde Feb 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was expecting that ChannelHandle with its custom constructor is not usable in this union. Removing the memset seems to work, though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. Unfortunately that means the data isn't guaranteed to be zeroed out. According to what I've read online, as long as we memset that largest possible struct in the union, that will zero out all the data. I believe that SetEffectChainParameters is the largest one since it has three items, one of which is a double.

Another option would be to remove the default constructor and force callers to construct messages with specific types in mind -- then we'd know for sure nobody's accessing uninitialized data

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems really brittle and easy to break when refactoring.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree -- for now we think it's just better to state that we don't guarantee initialization and callers are responsible for it. We can clean this up with the followup rewrite using variant or whatever

src/engine/effects/engineeffectchain.h Outdated Show resolved Hide resolved
@ywwg
Copy link
Member Author

ywwg commented Feb 13, 2022

I believe as long as we call memset on the largest struct item, that will be sufficient for zeroing out the data

@ywwg
Copy link
Member Author

ywwg commented Feb 13, 2022

I'm going to squash this down since there are some bad intermediate commits

…ixes issues with pointers to deleted channelhandles being accessed, causing crashes, mostly with QT6.

Converting to values means we can no longer trivially initialize the union/struct by memsetting all of the subtypes, because two of the struts are no longer trivial.

Instead, we memset what we think is the largest datatype. SetEffectChainParameters includes a bool, an enum, and a double, so it's the largest one.
@ywwg ywwg force-pushed the channelhandle-memory branch from 4def2d5 to d3603b8 Compare February 13, 2022 01:55
@ywwg ywwg changed the title [proof of concept fix] Copy channelhandle by value instead of pointers Copy channelhandle by value instead of pointers Feb 13, 2022
@ywwg ywwg marked this pull request as ready for review February 13, 2022 01:56
@uklotzde uklotzde dismissed their stale review February 13, 2022 16:57

Pre-review

@daschuer
Copy link
Member

While this is already a good way forward which is already mergable, I have not yet understand how it avoids the bad_allic exception. Do we have an evidence that this fixes the issue?

#undef CLEAR_STRUCT
// zero out the struct with the largest size to ensure the entire union memory is set to
// zero.
memset(&SetEffectChainParameters, 0, sizeof(SetEffectChainParameters));
Copy link
Member

@daschuer daschuer Feb 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we name the anonymous union here and then clear the memory by sizeof(union Payload)?
All other ideas I have in mind are breaking during refectoring.

However, I actually think that a zeroed memory is in the same way dangerous as a ununutalized memory.
We must not use it in both cases. It looks like the rest of the control structures take already care of it. So if zeroing has an effect, it will only obfusticate an issue we have anyway.

Thats why I would prefere to not initialize the memory. It makes more sense to assert, the validity of the control structures and introduce constructors, that take care of them in one call and cannot fail. We should verify the code that the union is never accessed without checking the type before.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no that's not how it works unfortunately. the union itself does not have a sizeof -- only the members therein.

If I rewrite the constructor to enforce a type being specified at construction time, will this PR get merged? I don't want to do any more work on this if it's going to be controversial -- not everyone agrees we should go with this fix. (@Be-ing ?)

Copy link
Member

@daschuer daschuer Feb 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we give the union a name, you can access the sizeof() it. But I still believether is no need to zero out the memory.

I also don't mind to commit that as an intermediate fix.

@ywwg
Copy link
Member Author

ywwg commented Feb 14, 2022

While this is already a good way forward which is already mergable, I have not yet understand how it avoids the bad_allic exception. Do we have an evidence that this fixes the issue?

Yes. I traced the bad alloc to the following:

  • at https://github.com/mixxxdj/mixxx/blob/main/src/effects/effectchain.cpp#L397 we assign the pointer in a message by taking the address of the channelhandle member.
  • When that pointer is later accessed for use in resizing the channelhandle map, somewhere along the way the original thing being pointed to has been deleted / moved.
  • The memory is invalid, and we request a humongous iSize, causing a bad alloc

We don't know exactly why the channelhandle pointer has gone invalid, but I don't think we need to know that to fix the bug.

After this change, even running the UnEjectTest many times does not crash, whereas it crashed about 50% of the time before

@daschuer
Copy link
Member

How about just removing the initializing?

@ywwg
Copy link
Member Author

ywwg commented Feb 14, 2022

I think you're right, zeroing out data won't help if we're not initializing correctly anyway so it doesn't really buy us anything.

Add comment to make clear callers are responsible for initializing the values correctly.
Double-checked call points to make sure all creations were correct.
Removed unused struct parameter that I found during this process.
@ywwg
Copy link
Member Author

ywwg commented Feb 14, 2022

hah! found an unused parameter by checking the call sites.

Copy link
Member

@daschuer daschuer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. Merge?

@ywwg ywwg requested a review from Be-ing February 14, 2022 17:17
@ywwg
Copy link
Member Author

ywwg commented Feb 14, 2022

I'd like to give @Be-ing a chance to chime in -- we plan to rewrite this subsystem so this is just a stop-gap fix

@Be-ing
Copy link
Contributor

Be-ing commented Feb 14, 2022

Let's not let the perfect be the enemy of the good. Getting rid of the horribly unsafe union would be a good next step. The current situation is that we have a crash, so going from crash to no crash is good even if the code isn't perfect.

@daschuer daschuer merged commit 69bdba5 into mixxxdj:main Feb 14, 2022
@ywwg ywwg mentioned this pull request Feb 14, 2022
@daschuer
Copy link
Member

@ywwg Maybe it also fixes the issue described here:
#4487
Can you have a look if this is likely?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants