Changes to build to fix creation of wheels. #840

Merged (16 commits) on Aug 22, 2017

@robertnishihara commented Aug 16, 2017
Collaborator

This addresses #843.

It should also do the following.

  • It addresses #813 ("Ray installation fails on CentOS") through a hack. On CentOS, Arrow is installed to $ARROW_HOME/lib64 instead of $ARROW_HOME/lib, so in that case we just copy the libraries to $ARROW_HOME/lib. This is not the right solution.
  • It should fix #806 ("Not possible to install Ray from git"). After this PR, it should be possible to install Ray from GitHub using
    pip install git+https://github.com/ray-project/ray.git#subdirectory=python
    
    In the past, this would fail in certain settings (e.g., the installation would compile against the wrong Python). You can test that it works on this PR with
    pip install git+https://github.com/robertnishihara/ray.git@9a91b9cf037d1affcd18659af1ad2167c3799226#subdirectory=python
    
  • It also fixes a flatbuffers version conflict in which Arrow was sometimes built using flatbuffers 1.6.0 while Ray pulled in flatbuffers 1.7.1. Now on Linux we compile flatbuffers ourselves and pass that into the Arrow compilation. Note that we're compiling flatbuffers twice (once for Ray and once for Arrow), though it really only needs to be done once.
  • We're also compiling boost now on Linux (so that we can compile with -fPIC and link it statically into arrow and numbuf).
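
For readers who want to see the CentOS workaround from the first bullet in concrete terms, here is a minimal Python 3 sketch. It is not the code from this PR: the function name `mirror_lib64_to_lib` is hypothetical, and it assumes the libraries are plain files sitting directly under `$ARROW_HOME/lib64`.

```python
import os
import shutil


def mirror_lib64_to_lib(arrow_home):
    """Copy files from <arrow_home>/lib64 into <arrow_home>/lib.

    Hypothetical helper illustrating the workaround described above.
    Returns True if at least one file was copied, False otherwise.
    """
    lib64 = os.path.join(arrow_home, "lib64")
    lib = os.path.join(arrow_home, "lib")
    if not os.path.isdir(lib64):
        # Nothing to do on platforms that install straight into lib.
        return False
    os.makedirs(lib, exist_ok=True)
    copied = False
    for name in os.listdir(lib64):
        src = os.path.join(lib64, name)
        dst = os.path.join(lib, name)
        # Only top-level files are mirrored; subdirectories are skipped,
        # and files already present in lib are left untouched.
        if os.path.isfile(src) and not os.path.exists(dst):
            shutil.copy2(src, dst)
            copied = True
    return copied
```

As the description says, copying is a stopgap; teaching the build to look in `lib64` directly would avoid duplicating the libraries.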

Some changes we should consider making.

  • Compile flatbuffers and boost on Mac also.
  • Only pull in flatbuffers once and pass the same one into Ray and Arrow.
  • Remove installation of boost from the installation instructions.
  • In setup.py, move cython from install_requires to setup_requires (and maybe setuptools_scm as well; why not numpy too?); see https://github.com/apache/arrow/blob/master/python/setup.py.
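
The last suggestion above might look roughly like the following sketch. The helper `split_requirements` and the `BUILD_ONLY` set are hypothetical names introduced only for illustration; `setup_requires` and `install_requires` are the standard setuptools keywords. Whether numpy or setuptools_scm should also move is left open, as in the list above.

```python
# Packages assumed (for this illustration) to be needed only at build time.
BUILD_ONLY = {"cython"}


def split_requirements(requirements, build_only=BUILD_ONLY):
    """Split a flat requirement list into (setup_requires, install_requires)."""
    setup_requires = [r for r in requirements if r.lower() in build_only]
    install_requires = [r for r in requirements if r.lower() not in build_only]
    return setup_requires, install_requires


if __name__ == "__main__":
    # Hypothetical requirement list; the real one in setup.py is longer.
    setup_reqs, install_reqs = split_requirements(["numpy", "cython"])
    print("setup_requires =", setup_reqs)
    print("install_requires =", install_reqs)
```

The two lists would then be passed to `setup(setup_requires=..., install_requires=...)`, so that users installing a prebuilt wheel do not pull in Cython.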

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/1600/

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/1602/

@robertnishihara
Collaborator Author

This might not completely work yet. I just pip installed one of the wheels I built after this change and saw the following error (upon importing Ray).

Python 2.7.12 (default, Nov 19 2016, 06:48:10) 
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import ray
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ubuntu/.local/lib/python2.7/site-packages/ray/__init__.py", line 13, in <module>
    from ray.worker import (register_class, error_info, init, connect, disconnect,
  File "/home/ubuntu/.local/lib/python2.7/site-packages/ray/worker.py", line 23, in <module>
    import pyarrow.plasma as plasma
  File "/home/ubuntu/.local/lib/python2.7/site-packages/ray/pyarrow_files/pyarrow/__init__.py", line 32, in <module>
    from pyarrow.lib import cpu_count, set_cpu_count
ImportError: libboost_system.so.1.60.0: cannot open shared object file: No such file or directory
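
An error like this usually means a compiled extension was linked dynamically against a library that is not present on the target machine. One way to check a built `.so` before shipping a wheel is to scan the output of `ldd` for unresolved entries. The sketch below assumes a Linux host with `ldd` available; the function names are hypothetical.

```python
import subprocess


def parse_missing_libs(ldd_output):
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            # Lines look like: "\tlibfoo.so.1 => not found"
            missing.append(line.split("=>")[0].strip())
    return missing


def missing_shared_libs(path):
    """Run ldd on a shared object and list its unresolved dependencies."""
    result = subprocess.run(
        ["ldd", path],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return parse_missing_libs(result.stdout)
```

Calling `missing_shared_libs` on the pyarrow extension module inside the installed package would be expected to list the boost library named in the traceback if it is still linked dynamically.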

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/1603/

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/1604/

@robertnishihara
Collaborator Author

See the relevant discussion in ARROW-1368: https://issues.apache.org/jira/browse/ARROW-1368

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/1608/

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/1609/

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/1610/

@robertnishihara changed the title from "Changes to build to fix creation of manylinux wheels." to "Changes to build to address some of the problems with building the wheels." on Aug 17, 2017

@robertnishihara
Collaborator Author

After compiling boost ourselves with -fPIC (on Linux), things are more or less working. However, Ray still does not work on CentOS. In particular, the plasma manager crashes as soon as it starts up (there's an assertion failure somewhere in flatbuffers).

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/1639/

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/1640/

@robertnishihara
Collaborator Author

robertnishihara commented Aug 21, 2017

The failure I'm currently seeing when starting a plasma manager on CentOS (or when using a wheel built on CentOS) is an assertion failure somewhere in flatbuffers, immediately after the plasma manager starts:

plasma_manager: /ray/ray/python/ray/core/flatbuffers_ep-prefix/src/flatbuffers_ep-install/include/flatbuffers/flatbuffers.h:582: uint8_t* flatbuffers::vector_downward::make_space(size_t): Assertion `cur_ >= buf_' failed.

Running this in gdb and typing bt strangely gives

(gdb) bt
No stack.

Searching manually, the backtrace is something like

  1. main
  2. start_server
  3. PlasmaManagerState_init
  4. state->plasma_conn->Connect
  5. SendConnectRequest
  6. CreatePlasmaConnectRequest
  7. builder_.Finish

Actually, now that I look at it, this call to CreatePlasmaConnectRequest is using the flatbuffers that Ray pulls in, but it should probably be using the one that Arrow pulls in.

@robertnishihara
Collaborator Author

It looks like the problem was a flatbuffer version mismatch (I'm still a little confused about which flatbuffers library the plasma manager's plasma store client was using though).

Anyway, both Arrow and Ray supposedly use flatbuffers 1.7.1, but the wheels are being built on a docker image that already contains flatbuffers 1.6.0 (see https://github.com/apache/arrow/blob/master/python/manylinux1/scripts/build_flatbuffers.sh), so that was the version of flatbuffers being used by Arrow (in /usr/bin/flatc).
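
A quick guard against this kind of mismatch is to check which `flatc` is first on the PATH and what version it reports before building. The sketch below is only an illustration with hypothetical function names; it relies on `flatc --version` printing a line that contains a dotted version number, and the 1.7.1 target comes from the comment above.

```python
import re
import shutil
import subprocess

EXPECTED_FLATC = (1, 7, 1)  # version both Ray and Arrow are supposed to use


def parse_flatc_version(text):
    """Extract (major, minor, patch) from flatc's version output, or None."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def flatc_on_path():
    """Return (path, version) for the first flatc on PATH, or (None, None)."""
    path = shutil.which("flatc")
    if path is None:
        return None, None
    result = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return path, parse_flatc_version(result.stdout)


if __name__ == "__main__":
    path, version = flatc_on_path()
    if path is not None and version != EXPECTED_FLATC:
        print("warning: %s reports %s, expected %s" % (path, version, EXPECTED_FLATC))
```

On the docker image described above, a check like this would presumably flag /usr/bin/flatc as reporting 1.6.0.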

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/1642/

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/1643/

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/1644/

@robertnishihara changed the title from "Changes to build to address some of the problems with building the wheels." to "Changes to build to fix creation of wheels." on Aug 21, 2017

@robertnishihara
Collaborator Author

cc @rshin, it looks like the problem was caused at least in part by a flatbuffers version mismatch. Ray was building flatbuffers 1.7.1, but Arrow was finding flatbuffers 1.6.0 which was already in the docker image that I was using.

@pcmoritz merged commit be4beb1 into ray-project:master on Aug 22, 2017
@pcmoritz deleted the pythonarrowcmake branch on August 22, 2017 00:49
@@ -81,12 +81,15 @@ def has_ext_modules(self):
# The BinaryDistribution argument triggers build_ext.
distclass=BinaryDistribution,
install_requires=["numpy",
"cython",

If you build your wheel correctly, cython should only be a build/setup_requires but not needed for installation.

Collaborator Author

Thanks! Trying to fix this in #878.

Successfully merging this pull request may close these issues.

Not possible to install Ray from git