Skip to content

Conversation

@hiddenalpha
Copy link

@hiddenalpha hiddenalpha commented Jul 13, 2021

  • Replace 'select' by 'poll' in readBytes to fix segfaults when fd >= 1024
  • Log a few error cases which we concealed in previous code.
  • Fix some concurrency issues.

HINT: I did not add recompiled libs as I only could provide/test the linux lib
anyway. Not sure how this gets handled in this git repo. Let me know.

Edit: (@tresf)

- Replace 'select' by 'poll' to fix segfaults when fd >= 1024
- Log a few error cases which we concealed in old code.
- Fix some concurrency issues.

(wip 4a6ff6c)
@hiddenalpha hiddenalpha changed the title Use poll to fix segfaults with high fd numbers Use poll in readBytes to fix segfaults with high fd numbers Jul 13, 2021
@tresf
Copy link

tresf commented Jul 14, 2021

Thanks. I'd also ask you to read the code and suggestions here: #78. Note, we're short on volunteers here to help merge, so I do apologize for any delay before this hits our master branch.

@hiddenalpha
Copy link
Author

Thx for the hint 👍 Seems as we need a few more tweaks then. Will give it a try.

@hiddenalpha hiddenalpha force-pushed the bugfix-usePollToOvercomeSegfaultsOnFdGt1024 branch from b23270e to ddc359c Compare July 15, 2021 14:11
- Reword some comments.
- Drop logging.
- Use 'select' as fallbck for OsX. But still allow to override using
  -DHAVE_POLL=1 gcc switch.
- Revert accidentally checked-in libjssc.so.

(wip 7e4e60e)
@hiddenalpha hiddenalpha force-pushed the bugfix-usePollToOvercomeSegfaultsOnFdGt1024 branch from ddc359c to ec9d759 Compare July 15, 2021 14:25
@hiddenalpha hiddenalpha requested a review from tresf July 22, 2021 12:33
Copy link

@graynk graynk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this part still needed? It seems like in this implementation we wait for data before returning in native code (either by poll or by select, even on Mac OS), so this sleep can be removed, if I understand everything correctly

//Need to sleep some time
try {
Thread.sleep(0, 100);
}
catch (Exception ex) {
//Do nothing
}

@tresf
Copy link

tresf commented Jul 28, 2021

Maybe I should open individual issues to document all my findings (currently only visible here and there in my scratch-book) so we could discuss what we want do about them.

Fine by me. @GMKennedy and I will be making a new release this week, so I'd like to get this merged in asap if possible. For the record @GMKennedy did compile and test the code on MacOS and Linux with no observable regressions.

@tresf
Copy link

tresf commented Aug 2, 2021

@hiddenalpha, Just checking in on the Sleep(...) comments. I'd like to have a decision on this so we can make this part of the next release.

Copy link
Collaborator

@GMKennedy GMKennedy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code changes implement 'synchronized' function (ie monitor lock) which replaces thread sleep function for handling of contention/concurrency. SerialPort.java has been updated accordingly.

@tresf
Copy link

tresf commented Aug 4, 2021

AFAIK, @GMKennedy's speaking specifically about this branch: https://github.com/java-native/jssc/commits/Poll-Update

We'll add to this PR in the next day and merge.

@tresf tresf merged commit 2c09d72 into java-native:master Aug 4, 2021
@hiddenalpha
Copy link
Author

@hiddenalpha, Just checking in on the Sleep(...) comments. I'd like to have a decision on this so we can make this part of the next release.

Sorry for not responding. But had no access to my account because I was not at work a few weeks. AFAIK this decision is superfluous meanwhile :) But honestly I even could not say if it should be removed or not.

@factoritbv
Copy link

During our thorough testing (mac/linux/raspberry) of our improvements suggested in #126, we have never run into the issue of file descriptors with a value above 1024. However, it seems to be an issue on (some?) linux systems.

Maybe you could have a look and improve our suggested code with your polling mechanism. I'm not sure if it is as portable as select(), which is basically available on every UNIX system. But if it makes it more stable / flexible, I think the code in #126 can be improved with use of the poll() function.

@hiddenalpha
Copy link
Author

... I'm not sure if it is as portable as select(), ...

Yeah :( unfortunately poll is not as portable as select. This was the reason we kept select as a fallback.

... we have never run into the issue of file descriptors with a value above 1024. However, it seems to be an issue on (some?) linux systems.

Unluckily I do not know the exact reason why we ran into such FDs. I only saw that we had them. There were some suspicions that it could come for example from docker or that we use the lib in a jetty where half a dozen apps are running performing lots of file/network IO.

@hiddenalpha hiddenalpha deleted the bugfix-usePollToOvercomeSegfaultsOnFdGt1024 branch July 27, 2025 08:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Better waitEvents implementation for *nix-like systems

5 participants