ARROW-1756: [Python] Fix large file read/write error#1276
ARROW-1756: [Python] Fix large file read/write error#1276Licht-T wants to merge 5 commits intoapache:masterfrom
Conversation
wesm
left a comment
There was a problem hiding this comment.
Is this issue limited to Mac? I thought the bug reporter was on Linux, but either way this is a good thing to fix
cpp/src/arrow/io/file.cc
Outdated
There was a problem hiding this comment.
Since N is never changed, just use nbytes? Maybe make it const int64_t nbytes in the function arguments for good measure
cpp/src/arrow/io/file.cc
Outdated
cpp/src/arrow/io/file.cc
Outdated
There was a problem hiding this comment.
Use a more descriptive variable name than i
cpp/src/arrow/io/file.cc
Outdated
python/pyarrow/tests/test_feather.py
Outdated
There was a problem hiding this comment.
Any data will do here, so you could do np.arange here to make this faster rather than generating so much random data
python/pyarrow/tests/test_feather.py
Outdated
|
Thanks for looking at this |
|
@wesm |
|
@wesm I found the reason why. http://man7.org/linux/man-pages/man2/write.2.html
|
|
OK, thanks. I'll wait on you to address the code comments and then we can merge this |
3b5782a to
81c1972
Compare
|
@wesm Fixed. Now it works fine on Linux! |
xhochy
left a comment
There was a problem hiding this comment.
+1, thanks for taking a deep dive into those internals.
Will merge once the build is green.
|
@xhochy I want to solve the Windows issue as same. |
|
@Licht-T yes, please open a JIRA. I would merge this PR once Travis is happy but sadly C++ builds keep timeouting on Travis. I will re-trigger later again and merge on success. |
cpp/src/arrow/io/file.cc
Outdated
| #define ARROW_IO_MAX_CONUT INT32_MAX | ||
| #else | ||
| // see notes on Linux read/write manpage | ||
| #define ARROW_IO_MAX_CONUT 0x7ffff000 |
cpp/src/arrow/io/file.cc
Outdated
| *bytes_read = 0; | ||
|
|
||
| while (*bytes_read != -1 && *bytes_read < nbytes) { | ||
| int64_t i = std::min(static_cast<int64_t>(ARROW_IO_MAX_CONUT), nbytes - *bytes_read); |
There was a problem hiding this comment.
We should use descriptive variable names. I will fix
Change-Id: I14967fb34936b61bc1fd636b702888e95a332765
|
This is infinite-looping on Linux, debugging |
Change-Id: I00ee07fdc0d25f399610db49b2adfa65088e9d63
|
+1 |
|
thanks @Licht-T! |
This is the part of ARROW-1756.