-
Notifications
You must be signed in to change notification settings - Fork 3.9k
ARROW-699: [C++] Resolve Arrow and Arrow IPC build issues on Windows; #449
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
wesm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See code comments. There are some code formatting issues that would be fixed by clang-format (I think you can get this on Windows: http://llvm.org/builds/), but the make format target only works on Linux/OS X
cpp/src/arrow/io/file.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to use the C++11 codecvt header for this? e.g. https://github.com/wesm/feather/blob/master/cpp/src/feather/io.cc#L169
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that the Google style guide forbids mutable reference arguments: https://google.github.io/styleguide/cppguide.html#Reference_Arguments, so this would need to be std::wstring* result
cpp/src/arrow/io/file.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See Google style guide on variable names https://google.github.io/styleguide/cppguide.html#Variable_Names
cpp/src/arrow/io/file.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Variables name
cpp/src/arrow/io/io-file-test.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this code copy and pasted from https://msdn.microsoft.com/en-us/library/a9yf33zb.aspx? If so, what is the license? It looks like this is MS Public License, but we need to be sure. In general, if any code is contributed to an ASF project, it must be your original work, otherwise it must be attributed to a source with a compatible license. I might suggest writing a new function here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See "Category A" licenses in https://www.apache.org/legal/resolved.html#category-a
cpp/src/arrow/io/io-file-test.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't use Hungarian notation (https://google.github.io/styleguide/cppguide.html#Windows_Code), see note below though
cpp/src/arrow/io/io-file-test.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function name
cpp/src/arrow/io/file.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this code is from StackOverflow: http://stackoverflow.com/a/26914562/776560 -- you cannot use StackOverflow code in ASF projects.
Since we have C++11, I think we should be able to use the locale/codecvt libraries in the STL
|
codecvt isn't available in gcc, see https://github.com/wesm/feather/blob/master/cpp/src/feather/io.cc#L81 |
wesm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, there's a minor rebase conflict, would you mind rebasing? I'm not sure what to do on the possibility of a UTF16 decoding error, we can either address it in this patch or in a follow up patch -- in the latter case, can you open a JIRA? Thanks!
cpp/src/arrow/io/file.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems this can fail either with an std::range_error or a decoding error set in the utf16_converter.state():
https://msdn.microsoft.com/en-us/library/ee191684.aspx#Anchor_5
Might be worth writing a unit test to trigger these cases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, sure, I will create separate ticket for that. I have seen other places ( https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/file.cc#L134 ) without exception handling. It seems all of them should be covered at once.
… Running unit tests in Appveyor.
wesm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, merging. thanks @Maxris! The build has passed here https://ci.appveyor.com/project/MaxRisuhin/arrow/build/1.0.23
Author: Antoine Pitrou <antoine@python.org> Closes apache#449 from pitrou/PARQUET-1071-arrow-filewriter-close-idempotent and squashes the following commits: 8c3fb19 [Antoine Pitrou] PARQUET-1071: Check that arrow::FileWriter::Close() is idempotent Change-Id: Ie3e34cdf2cd1f095a34ebf89abfb3c951b744c5e
Fixes #13. --------- Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Resolve Arrow and Arrow IPC build issues on Windows; Running unit tests in Appveyor.
Changes description:
Current file.cc implementation ( https://github.com/apache/arrow/compare/master...MaxRis:ARROW-699?expand=1#diff-1b2fb57add5bb8f21e28a707f24462b0L161 ) assumes that input file name is encoded with utf-16 inside std::string. But unit tests are passing just utf-8 compatible C-strings.
Util method Utf8ToUtf16 introduced ( https://github.com/apache/arrow/compare/master...MaxRis:ARROW-699?expand=1#diff-1b2fb57add5bb8f21e28a707f24462b0R156 ) to convert utf-8 to utf-16 (std::wstring).
io-file-test has FIleIsClosed method which uses method _close(FILE_HANDLE) to test if file handle valid or invalid. Result is interpreted as file was closed or not. MSVC C runtime implementation by default crashes application if input param is invalid. To overwrite this behavior it's needed to set custom hander (https://github.com/apache/arrow/compare/master...MaxRis:ARROW-699?expand=1#diff-05724c5d85bf64720fa85ef3012e470dR61). More info here: https://msdn.microsoft.com/en-us/library/ksazx244.aspx
Message and FileWriter classes keeps their internal implementation as private member of unique_ptr of FORWARD declared type, for example:
MSVC compiler requires constructor and destructor of Message class be defined. Currently, they are defined by default, and because of this, compiler places auto generated code into the .hpp file, which is not visible for others libs during the linking (We got unresolved linking issues). The solution is to define destructors explicitly. More there http://stackoverflow.com/a/42158611/2266412 and there http://stackoverflow.com/a/6089065/2266412