
inquiry into mutex locking changes in Device and Queue #520

Closed
@diablodale

Description


I'm unsure of some mutex locking changes in commit 55066c5

dai::DeviceBase::closed changed from atomic<bool> to pair of std::mutex + bool

The two places I find these variables used, DeviceBase::isClosed() and DeviceBase::close(), correctly implement the mutex+bool pair.
My concern is that there is no known case motivating this change and no functional bug.
The change adds code complexity, CPU cost, and memory use where none is needed.
And finally, it makes a substantial behavior change: it introduces blocking where the original code had none.
Multiple threads calling these functions will now block on the mutex rather than return immediately.
Since this is a blocking behavior change, it is a risky change for no known need.

When I refactored this code months ago, I had my head around the locking needs of the multiple threads running. At that time I felt the atomic<bool> was the better choice for simplicity and its non-blocking behavior. It didn't matter which thread closed a device; it only mattered that one thread ran the close code. I acknowledge that I can be wrong and write bugs. But that isn't the case here as there is no known bug and no imagined case.

int DeviceBase::getXLinkChunkSize() {
    checkClosed();

    return pimpl->rpcClient->call("getXLinkChunkSize").as<int>();
}

checkClosed() there is useless.
It calls isClosed(), which locks the mutex and returns the value of closed.
If it is true, checkClosed() throws, which prevents the call to pimpl->rpcClient->call().
But the check of the value and the later use are disconnected; nothing stops close() from running in between, so the test is useless.

| Time | Thread 1 | Thread 2 |
|------|----------|----------|
| 1 | DeviceBase::getXLinkChunkSize() | |
| 2 | checkClosed() returns false | |
| 3 | …os pauses thread… | DeviceBase::close() |
| 4 | …os pauses thread… | std::unique_lock<std::mutex> lock(closedMtx); |
| 5 | …os pauses thread… | if(!closed) { |
| 6 | …os pauses thread… | closeImpl(); |
| 7 | …os pauses thread… | closed = true; |
| 8 | pimpl->rpcClient->call("getXLinkChunkSize").as<int>(); | |
| 9 | call(…) | |
| 10 | 💥 or something fails/bad/errors/etc | |

Changing the behavior of the closed/closedMtx to be blocking doesn't help any of the cases that I can readily find. Instead, the above commit causes mutex locking and blocking of threads in the multithreaded world of depthai+xlink sending/receiving packets and making RPC calls. I think this change should be reverted.
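For contrast, the only way a mutex would actually close that gap is to hold it across the RPC call itself, which serializes all callers against close(). This is a hypothetical sketch (simplified names, not a proposal), showing how much larger that change would be than what commit 55066c5 does:

```cpp
#include <mutex>
#include <stdexcept>

// Hypothetical sketch: holding the lock across the call closes the gap,
// at the cost of serializing every RPC against close().
struct GuardedDevice {
    std::mutex closedMtx;
    bool closed = false;

    int rpcCall() { return 42; }  // stands in for pimpl->rpcClient->call(...)

    int getXLinkChunkSize() {
        std::unique_lock<std::mutex> lock(closedMtx);  // held across the call
        if(closed) throw std::runtime_error("device closed");
        return rpcCall();  // close() cannot interleave between check and call
    }

    void close() {
        std::unique_lock<std::mutex> lock(closedMtx);
        closed = true;
    }
};
```

The commit does not do this; it locks only inside isClosed(), so it pays the blocking cost without buying the protection.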

dai::XLinkConnection::closed changed from atomic<bool> to pair of std::mutex + bool

Same everything as above: no known need, more complexity, and a behavior change to blocking.
XLinkConnection::isClosed() is now blocking; therefore XLinkConnection::checkClosed() is now blocking.
Surprisingly, I didn't find any code calling XLinkConnection::checkClosed(), so I don't have a code example.
I also think this change should be reverted.

dai::LockingQueue::destructed changed from std::atomic<bool> to pair of std::mutex + bool plus redundant if() tests

Same everything as above: no known need, more complexity, and a (less concerning) behavior change to blocking.
I intentionally removed the if(destructed) and you agreed in my commit fc4b314.
Comments start at #366 (comment)
I don't now have the entirety of that codebase in my head. I would need time to recreate (ugh) my knowledge.
But I trust the decision we both made in Feb...unless you have a specific bug or imagined scenario. 🤔
