ARROW-4911: [R] Progress towards completing windows support #3932
| @@ -0,0 +1,9 @@ | |||
| # Download static arrow from rwinlib | |||
This seems weird to me as a non-R user. Can you explain to me what this is about and whether this is going to be permanent?
I also find it weird to not have the ability to build the static Arrow library using endogenous (aka inside apache/arrow) scripts
Yes for sure. This is to help us get the package in CRAN.
While CRAN supports compiling packages with native code, it has several restrictions. Therefore, it is common practice on Windows to precompile binaries that are downloaded when the package is built and tested on the CRAN build machines.
Short term, I think we can start by compiling ourselves the arrow binaries, but long term, there is nothing preventing us from automating this process in the arrow codebase.
Out of curiosity, how does this work with pip install packages? Does https://pypi.org provide sufficient infrastructure to compile all arrow binaries from source? Or do arrow maintainers upload precompiled binaries? How are those binaries built? Are the pip install binaries built automatically in the arrow project?
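The download-at-build-time approach described in this comment can be sketched roughly as follows. This is an illustration in Python, not the script added by this PR. The archive URL pattern, the `api.h` marker file, and the function names are all assumptions made for the example.

```python
import urllib.request
import zipfile
from pathlib import Path

# Assumption: GitHub's standard tag-archive URL for an rwinlib-style repo.
ARCHIVE_URL = "https://github.com/rwinlib/arrow/archive/v{version}.zip"


def archive_url(version: str) -> str:
    """Build the download URL for a given library version."""
    return ARCHIVE_URL.format(version=version)


def needs_download(windows_dir: Path, version: str) -> bool:
    """Return True when the unpacked static library for `version` is absent."""
    # Hypothetical marker: one header expected inside the unpacked archive.
    marker = windows_dir / f"arrow-{version}" / "include" / "arrow" / "api.h"
    return not marker.exists()


def fetch_static_libs(windows_dir: Path, version: str) -> bool:
    """Download and unpack the archive once; return True if a download happened."""
    if not needs_download(windows_dir, version):
        return False  # already unpacked, so repeated builds skip the network
    windows_dir.mkdir(parents=True, exist_ok=True)
    zip_path, _headers = urllib.request.urlretrieve(archive_url(version))
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(windows_dir)
    return True
```

Checking for a marker file first keeps the step idempotent, which matters when the same source tree is built more than once.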
On https://pypi.org we provide pre-compiled binaries that are built and uploaded on each Arrow release. We have build infrastructure in https://github.com/apache/arrow/tree/master/dev/tasks to build all the binary releases of Arrow directly using CI services (for Python and CentOS/Debian packages). pip install from the source package on PyPI only works when you have the C++ binaries installed. We hope to eventually provide a self-contained source Python package, but as >95% of all users use the binary packages, this is not a concern at the moment.
Short term, I think we can start by compiling ourselves the arrow binaries, but long term, there is nothing preventing us from automating this process in the arrow codebase.
Yes, we should automate this for R/CRAN as we do it for Python. As far as I understand it currently, you would need a static libarrow.a, libparquet.a and possibly a libarrow_bundled_thirdparty_dependencies.a built with MinGW available that is then pulled by CRAN to build the final package? It would then be sufficient to update the build of these static libs on each release?
Yes, the following libraries would be needed:
- parquet
- arrow
- thrift
- boost_regex-mt-s
- double-conversion
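As a rough illustration of how the list above might translate into linker flags, here is a small Python sketch. The library directory and the helper name are hypothetical, and the actual `Makevars.win` in this PR may be laid out differently. The order of the list is kept as given, since with static libraries a library typically needs to appear before the ones it depends on.

```python
# Library names exactly as listed in the comment above, in the same order.
STATIC_LIBS = ["parquet", "arrow", "thrift", "boost_regex-mt-s", "double-conversion"]


def linker_flags(lib_dir: str, libs=STATIC_LIBS) -> str:
    """Build a flag string: one -L search path followed by one -l flag per library."""
    return " ".join([f"-L{lib_dir}"] + [f"-l{name}" for name in libs])


# Hypothetical library directory, for illustration only.
print(linker_flags("../windows/lib"))
```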
To automate building the R binaries in the arrow repo we would have to mimic the following repos (for the packages we need, not all):
It's probably not a ton of work since they already contain AppVeyor scripts; however, we might hit additional issues we are currently not aware of while publishing to CRAN. So I would suggest we focus on publishing in CRAN and then start automating additional build components as needed.
BTW I'm a bit lost as to how getting on CRAN is going to be possible without Linux packages in the upstream package managers. This hasn't been explained anywhere
Yes for sure, CRAN requires two major platforms, which we now support: OS X and Windows. Here is the reference:
Package authors should make all reasonable efforts to provide cross-platform portable code. Packages will not normally be accepted that do not run on at least two of the major R platforms. Cases for Windows-only packages will be considered, but CRAN may not be the most appropriate place to host them.
|
Are we missing anything else to merge this one? Thanks! |
|
Actually, this looks related to this change: Are there any useful pointers out there to troubleshoot this tool? Or can someone already understand what these failures mean? Investigating... |
|
Figured it out, output from running Apache Rat locally... Updating PR... |
r/src/Makevars.win
Outdated
| # specific language governing permissions and limitations | ||
| # under the License. | ||
|
|
||
| VERSION = 0.12.0.9000 |
Could you also update dev/release/00-prepare.sh when you need to put version in files?
See also: https://github.com/apache/arrow/blob/master/dev/release/00-prepare.sh#L92-L98
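To make the request concrete, the kind of version bump a release script would apply to a `VERSION = ...` line can be sketched as below. This is a Python illustration only. The real `00-prepare.sh` is a shell script whose contents are not shown here, and `set_version` is a hypothetical helper.

```python
import re


def set_version(makevars_text: str, new_version: str) -> str:
    """Replace the value on the `VERSION = ...` line, leaving other lines untouched."""
    pattern = re.compile(r"^(VERSION\s*=\s*).*$", flags=re.MULTILINE)
    # A function replacement avoids ambiguity between group references and digits.
    updated, count = pattern.subn(lambda m: m.group(1) + new_version, makevars_text)
    if count != 1:
        raise ValueError(f"expected exactly one VERSION line, found {count}")
    return updated


example = "# under the License.\n\nVERSION = 0.12.0.9000\n"
print(set_version(example, "0.13.0"))
```

Failing loudly when the line is missing or duplicated is a deliberate choice: a silent no-op during release preparation would leave a stale version in the file.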
Sure thing, updated.
|
hi, what's the best way to get Arrow running within R on Windows? I notice that the conda pkg r-arrow https://anaconda.org/conda-forge/r-arrow does not have a Windows build target - are there any out there? |
|
@cmorgan my understanding is that the R community intends to make a CRAN submission after the 0.13 release is out (being voted on right now) |
This now installs cleanly on win builder without the warnings. |
|
There is only the expected |
|
ah, did this need to go into 0.13? If you have to submit a patched version to CRAN it's up to the R community |
|
Those are only changes in the r/ directory, likely we would release something after arrow 0.13 is out so that we can effectively depend on it. |
…ersion 0.13.0 of C++ arrow lib on windows, enable parquet
|
This PR is what I'd like to submit as the first release of the R package. it passes I can submit to CRAN either way, but I guess it would be better that this is merged first ? |
|
Win builder results: https://win-builder.r-project.org/t8xaTt5HRKiO/ |
|
results on 🍎 with home brew on this pr: https://travis-ci.org/romainfrancois/arrow-r-ci/builds/515248734 ✅ |
|
@romainfrancois you're a committer, so you can merge this when you're satisfied. If you publish to CRAN, please be careful not to advertise the package as being an official artifact |
|
I still have to learn how exactly a pr is squashed. It does not look like you’d just use the github ui as usual. I’ll get up to speed on that. This is far from done, so we probably won’t advertise too much anyway. What would it mean to be an official artifact though ? |
|
You have never merged a patch? The merge tool is here, take a look at the README https://github.com/apache/arrow/tree/master/dev#arrow-developer-scripts |
|
Official releases are only the signed and checksum'd artifacts that are voted on by the PMC. Take a look at the source and binary artifacts in the 0.13.0 release vote |
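For readers unfamiliar with what "checksum'd" involves, here is a minimal Python sketch of checking a downloaded artifact against a published SHA-256 value. It assumes the checksum file holds the hex digest as its first whitespace-separated token. It covers only the checksum half and does not verify the signature mentioned in the comment. The function names are made up for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large artifacts do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: Path, checksum_line: str) -> bool:
    """Compare a file's digest with the first token of a checksum line."""
    # Assumption: the line looks like "<hex digest>  <file name>" or just the digest.
    expected = checksum_line.split()[0].lower()
    return sha256_of(path) == expected
```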
|
Thanks for the info. |
|
@romainfrancois is it ready to go? |
|
yes @kszucs but I'll take this as an opportunity to learn about the merge tools ... on it today. |
Support for building [R] package in Windows x64/x86.
Fixes https://issues.apache.org/jira/browse/ARROW-4911 which helps us work towards releasing in CRAN: https://issues.apache.org/jira/browse/ARROW-3204.