@Odysseus1710
Contributor

On QNX 7.1 (aarch64) I ran into an issue where my app is silently killed by SIGPIPE when it tries to connect to a host that is reachable but has no MQTT broker running.

The cause is that connect() and the subsequent select() calls within BaseSocket::connect do not report any error, which results in a write attempt on a socket that is not connected.

With an additional check for pending socket errors, BaseSocket::connect fails early, preventing any further action (e.g. the SSL handshake) on the unconnected socket.

int err;
socklen_t len = sizeof(err);
getsockopt(socket, SOL_SOCKET, SO_ERROR, &err, &len);
if (err != 0) return -6;
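
For readers who want to try the check in isolation, here is a minimal sketch of the same idea as a standalone function. The name `pendingSocketError` is hypothetical and not part of the library; unlike the snippet above, this version initialises `err` and also checks the return value of `getsockopt` itself.

```cpp
#include <cerrno>
#include <sys/socket.h>

// Hypothetical helper (not part of the library): returns 0 when the socket
// has no pending error, otherwise an errno-style code describing the failure.
int pendingSocketError(int fd)
{
    int err = 0;                  // initialised so a failed call cannot leave garbage
    socklen_t len = sizeof(err);
    if (::getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return errno;             // getsockopt itself failed (e.g. EBADF)
    return err;                   // 0, or e.g. ECONNREFUSED after a failed connect
}
```

A caller would invoke it once the socket becomes writable after a non-blocking connect() and treat any non-zero result as a failed connection.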
Owner

The return code -6 is already used above; it would be better to use -8 here, as I think it's not used. The goal of using different values is error reporting: to know where it failed (since each function has its own value). It's like a poor man's exception type. For example, if the connectWith method returns -7, you know that the connect failed because the socket never became writable.
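
To illustrate the "one value per failure point" idea, here is a small sketch that turns such a code into a readable message. The function `describeConnectError` is hypothetical, and it only covers the values mentioned in this conversation (0, -3, -4, -7 and the proposed -8); every other value is reported as unknown rather than guessed.

```cpp
#include <string>

// Hypothetical helper: maps the return codes mentioned in this thread to text.
// Codes not discussed here are deliberately reported as unknown.
std::string describeConnectError(int code)
{
    switch (code)
    {
    case 0:  return "connected";
    case -3: return "could not restore blocking mode (fcntl)";
    case -4: return "could not set send/receive timeouts (setsockopt)";
    case -7: return "socket never became writable";
    case -8: return "socket reported a pending error (SO_ERROR)";
    default: return "unknown failure point (" + std::to_string(code) + ")";
    }
}
```

With distinct values, a bare number in a bug report is enough to locate the failing step without a debugger.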

I'm not sure I understand the issue here. The select method is a plain wrapper around select, so getsockopt(SO_ERROR) should report the same outcome as select does. So maybe the code could be simplified to:

            if (select(false, true) > 0)
            {
                // Restore blocking behavior here
                if (::fcntl(socket, F_SETFL, socketFlags) != 0) return -3;
                // And set timeouts for both recv and send
                if (::setsockopt(socket, SOL_SOCKET, SO_RCVTIMEO, &timeoutMs, sizeof(timeoutMs)) < 0) return -4;
                if (::setsockopt(socket, SOL_SOCKET, SO_SNDTIMEO, &timeoutMs, sizeof(timeoutMs)) < 0) return -4;
                // Ok, done!
                return 0;
            }

Let me know if it fixes the issue for you, and I'll commit the change.

Contributor Author

OK, I see. I thought the return values mirrored MQTTv5::ErrCodes.

The problem is that, in the described scenario (broker not running), the select wrapper returns 1 (success) but SO_ERROR reports error code 111 (ECONNREFUSED on Linux): the file descriptor is still valid, but the connection failed. This is also reproducible on Ubuntu, for example.
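
The behaviour described above can be reproduced with a short sketch like the following. It is a standalone approximation and does not use the library's BaseSocket; the names `ProbeResult` and `probeRefusedConnect` are hypothetical. It assumes a loopback interface is available and reserves a port by binding a socket without listening on it, so that the connection attempt is refused. Some kernels may refuse a loopback connect immediately instead of returning EINPROGRESS, so both paths are handled.

```cpp
#include <arpa/inet.h>
#include <cerrno>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

struct ProbeResult
{
    int selectResult; // value returned by select(), or -1 if select() was not reached
    int socketError;  // errno-style code for the connection attempt, 0 if it connected
};

// Hypothetical reproduction: non-blocking connect to a loopback port that is
// bound but not listening, so the connection is refused.
ProbeResult probeRefusedConnect()
{
    ProbeResult result = { -1, 0 };

    // Reserve a port without listening on it.
    sockaddr_in addr = {};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0; // let the kernel pick a free port
    socklen_t addrLen = sizeof(addr);
    int holder = ::socket(AF_INET, SOCK_STREAM, 0);
    if (holder < 0) { result.socketError = errno; return result; }
    if (::bind(holder, reinterpret_cast<sockaddr *>(&addr), sizeof(addr)) != 0
        || ::getsockname(holder, reinterpret_cast<sockaddr *>(&addr), &addrLen) != 0)
    {
        result.socketError = errno;
        ::close(holder);
        return result;
    }

    int fd = ::socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) { result.socketError = errno; ::close(holder); return result; }
    ::fcntl(fd, F_SETFL, ::fcntl(fd, F_GETFL, 0) | O_NONBLOCK);

    if (::connect(fd, reinterpret_cast<sockaddr *>(&addr), sizeof(addr)) != 0)
    {
        if (errno != EINPROGRESS)
        {
            result.socketError = errno; // refused immediately, select() not needed
        }
        else
        {
            // Wait until the socket is reported writable, as a connect wrapper would.
            fd_set writable;
            FD_ZERO(&writable);
            FD_SET(fd, &writable);
            timeval tv = { 2, 0 };
            result.selectResult = ::select(fd + 1, nullptr, &writable, nullptr, &tv);

            // select() alone does not say whether the connect succeeded.
            int err = 0;
            socklen_t len = sizeof(err);
            if (::getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0) err = errno;
            result.socketError = err;
        }
    }

    ::close(fd);
    ::close(holder);
    return result;
}
```

When the select() path is taken, the point of interest is that `selectResult` is positive while `socketError` is non-zero, which is why the writability check alone is not sufficient.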

On most systems the subsequent write will simply fail, but some stricter POSIX systems (like QNX) will kill the app if SIGPIPE is not caught or ignored.
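
As a complementary safeguard on the application side, SIGPIPE can be ignored so that a write on a dead socket returns an error instead of terminating the process. The sketch below is only an illustration: `ignoreSigpipe` and `writeToClosedPeer` are hypothetical names, and the demonstration uses a local socketpair with a closed peer as a stand-in for the unconnected TCP socket discussed here.

```cpp
#include <cerrno>
#include <signal.h>
#include <sys/socket.h>
#include <unistd.h>

// Ignore SIGPIPE process-wide so that writing to a dead socket fails with
// EPIPE instead of terminating the process. Returns true on success.
bool ignoreSigpipe()
{
    struct sigaction action = {};
    action.sa_handler = SIG_IGN;
    sigemptyset(&action.sa_mask);
    return ::sigaction(SIGPIPE, &action, nullptr) == 0;
}

// Hypothetical demonstration: write to a socket whose peer has been closed and
// return the resulting errno (0 if the write unexpectedly succeeded).
// Call ignoreSigpipe() first, otherwise the default action kills the process.
int writeToClosedPeer()
{
    int pair[2];
    if (::socketpair(AF_UNIX, SOCK_STREAM, 0, pair) != 0) return errno;
    ::close(pair[1]);                 // the peer goes away
    const char byte = 'x';
    int result = 0;
    if (::write(pair[0], &byte, 1) < 0) result = errno;
    ::close(pair[0]);
    return result;
}
```

Ignoring the signal affects the whole process, so it complements rather than replaces the SO_ERROR check: the check stops the write from being attempted at all. Where the platform provides it, passing MSG_NOSIGNAL to send() is a per-call alternative.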

Owner

OK, it makes sense to add the change then. Can you change the return value to -8 (it's not used in any code path), so that when a user reports the error I can figure out where it comes from? Thank you once again for your PR.

@Odysseus1710 changed the title from "Fix:" to "Fix: Prevent write on un-connected socket" Aug 1, 2025
@X-Ryl669 X-Ryl669 merged commit 0d86093 into X-Ryl669:master Aug 1, 2025
@X-Ryl669
Owner

X-Ryl669 commented Aug 1, 2025

Thanks for your change.
