Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-41687: Fix sendfile implementation to work with Solaris #22040

Merged
merged 4 commits into from
Sep 5, 2020
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Next Next commit
bpo-41687: Fix sendfile implementation on Solaris
  • Loading branch information
kulikjak committed Sep 1, 2020
commit 769ef151311a69186026aa8ed05a235229adaec1
2 changes: 2 additions & 0 deletions Lib/test/test_asyncio/test_sendfile.py
Original file line number Diff line number Diff line change
Expand Up @@ -446,6 +446,8 @@ def test_sendfile_ssl_close_peer_after_receiving(self):
self.assertEqual(srv_proto.data, self.DATA)
self.assertEqual(self.file.tell(), len(self.DATA))

@unittest.skipIf(sys.platform.startswith('sunos'),
"Doesn't work on Solaris")
def test_sendfile_close_peer_in_the_middle_of_receiving(self):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This single test fails with "AssertionError: ConnectionError not raised" and I am not sure why. With this patch, sendfile is not called if an offset is too high, which is slightly different from other platforms where it is called and sends nothing; but I don't think that causes this issue.

It seems to me that the entire message is sent and close_after doesn't work as expected. But I don't see into asyncio internals that much, so it's just a guess.

It was failing before as well, so it doesn't seem to be a regression. If you have any idea why that might be the case, how to test it further, or how to fix that, I will gladly do so (I don't really want to skip it).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was failing before as well, so it doesn't seem to be a regression. If you have any idea why that might be the case, how to test it further, or how to fix that, I will gladly do so (I don't really want to skip it).

Yeah, I don't like skipping it either. Unfortunately I won't be able to assist with the debugging, but if you have time please try to figure this out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am getting closer. It looks like no matter the BUF_SIZE, sendfile always sends at least 131072 bytes, so it sends everything before transport is closed and hence no ConnectionError is raised. If I understand it correctly, the intended behavior is to limit sendfile to send at maximum BUF_SIZE bytes, meaning that transport is closed long before everything is sent (correct me if I'm wrong). I found out that the test passes if DATA size is doubled.

My guess is that this might be a difference in how Solaris handles those buffer limits; I will keep digging.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe I found the culprit - the reason is that as per documentation: "At present, lowering SO_RCVBUF on a TCP connection after it has been established has no effect". This is on Oracle Solaris, but the bug is old enough that it should be present in all OpenSolaris forks unless they fixed it themselves.

I don't know If you'd prefer having it skipped here for all Solarises or if we should skip it internally since it is an OS bug, not a difference (I think that both are valid).

I checked Illumos source code and I believe they are also affected by this SO_RCVBUF related bug (but I might have missed their fix somewhere where I was not looking and I cannot verify that on Illumos machine).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's skip the test on Solaris and add an elaborate comment why, basically summarizing

"At present, lowering SO_RCVBUF on a TCP connection after it has been established has no effect". This is on Oracle Solaris, but the bug is old enough that it should be present in all OpenSolaris forks unless they fixed it themselves.

srv_proto, cli_proto = self.prepare_sendfile(close_after=1024)
with self.assertRaises(ConnectionError):
Expand Down
17 changes: 17 additions & 0 deletions Modules/posixmodule.c
Original file line number Diff line number Diff line change
Expand Up @@ -9518,6 +9518,23 @@ os_sendfile_impl(PyObject *module, int out_fd, int in_fd, PyObject *offobj,
if (!Py_off_t_converter(offobj, &offset))
return NULL;

#if defined(__sun) && defined(__SVR4)
int res;
struct stat st;

do {
Py_BEGIN_ALLOW_THREADS
res = fstat(in_fd, &st);
Py_END_ALLOW_THREADS
} while (res != 0 && errno == EINTR && !(async_err = PyErr_CheckSignals()));
if (ret < 0)
return (!async_err) ? posix_error() : NULL;

if (offset >= st.st_size) {
return Py_BuildValue("i", 0);
}
#endif

do {
Py_BEGIN_ALLOW_THREADS
ret = sendfile(out_fd, in_fd, &offset, count);
Expand Down