Allow greater than memory allocation for plasma store on Mac #3450

Closed
devin-petersohn opened this issue Dec 1, 2018 · 5 comments · Fixed by #3464

@devin-petersohn
Member

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): OSX 10.12
  • Ray installed from (source or binary): source
  • Python version: 3.6
  • Exact command to reproduce: (See below)

Describe the problem

In Modin, it would be great if we could specify plasma_directory="/tmp" and object_store_memory=n * physical_mem. It seems to work fine on Ubuntu, but I am getting the following error on Mac:

Exception: The requested object store memory size is greater than the total available memory.

Source code / logs

import ray
from psutil import virtual_memory

# Request an object store 8x the size of physical memory, backed by /tmp.
mem_bytes = virtual_memory().total
object_store_memory = 8 * mem_bytes
plasma_directory = "/tmp"
ray.init(
    redirect_output=True,
    include_webui=False,
    redirect_worker_output=True,
    use_raylet=True,
    ignore_reinit_error=True,
    plasma_directory=plasma_directory,
    object_store_memory=object_store_memory,
)
@robertnishihara
Collaborator

Really, that works on Linux? I don't think you actually want this behavior because it will start swapping (even before it hits the total amount of memory) and freeze your laptop.

@devin-petersohn
Member Author

I guess it doesn't on 0.6. On 0.5.X it worked.

I don't understand why this would cause the laptop to freeze. The OS would manage the swapping, and it would be slower than in-memory. Would you expect that to break the system?

@robertnishihara
Collaborator

Maybe not literally freeze, but in the past, when I've started using too much memory, I've seen things become sufficiently unresponsive that I've had to reboot the machine.

@devin-petersohn
Member Author

Just to clarify, here is what we need:

  • plasma store on disk ("/tmp")
  • support for larger-than-memory dataframes via the object_store_memory parameter

The OS handles the paging. We have been experimenting with this on 0.5.3, and while it is slower than purely in-memory, it works and allows dataframes of tens of GB on a laptop. This is a very important requirement in Modin.
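
To illustrate the mechanism being relied on here, the following is a minimal sketch of a file-backed memory mapping, where the OS pages data between RAM and a file on disk as needed. It is an illustration only and is not taken from Ray or plasma; the helper name `open_file_backed_buffer` is hypothetical, and the 1 MiB size is chosen just to keep the demo small.

```python
import mmap
import os
import tempfile


def open_file_backed_buffer(directory, size_bytes):
    """Hypothetical helper: map a file in `directory` into memory.

    The returned buffer is backed by the file, so the OS decides which
    pages stay in RAM and which are written out to disk.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # Grow the (empty) temp file to the requested size before mapping it.
        os.ftruncate(fd, size_bytes)
        # mmap keeps its own handle to the file, so fd can be closed afterwards.
        return mmap.mmap(fd, size_bytes), path
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        buf, path = open_file_backed_buffer(directory, 1 << 20)  # 1 MiB
        buf[:5] = b"hello"       # write through the mapping
        print(bytes(buf[:5]))    # read it back
        buf.close()              # release the mapping before cleanup
```

The size of such a mapping is bounded by the file system and address space rather than by physical memory, which is why a store backed by a directory like "/tmp" can in principle exceed RAM, at the cost of paging.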

@atumanov
Contributor

atumanov commented Dec 2, 2018

I think it's a reasonable request for the Modin use case. We already provide the ability to specify the plasma directory as a mount point (e.g., for huge pages). We should try mounting a large file as a tmpfs mount point, passing that to the object store, and evaluating performance. This, combined with explicitly specifying object store memory, should just work. If there's some internal check in Python that overrides the specified object store memory, capping it to available system memory, I'd say it's a bug, because the plasma dir could point to a larger pool of memory.
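
As a rough sketch of the alternative to capping at system memory, a size check could compare the requested size against the capacity of whatever backs the plasma directory. This is not Ray's actual code (the referenced commits describe removing the check instead); `check_object_store_size` and its `capacity_fn` parameter are hypothetical names used only for illustration.

```python
import shutil


def _directory_capacity(path):
    # Total size in bytes of the file system that contains `path`.
    return shutil.disk_usage(path).total


def check_object_store_size(requested_bytes, plasma_directory, capacity_fn=None):
    """Hypothetical check: validate against the backing directory, not RAM.

    `capacity_fn` is an assumption added for testability; by default the
    capacity is the total size of the file system holding `plasma_directory`.
    """
    if requested_bytes <= 0:
        raise ValueError("The requested object store size must be positive.")
    capacity = (capacity_fn or _directory_capacity)(plasma_directory)
    if requested_bytes > capacity:
        raise ValueError(
            "The requested object store size ({} bytes) exceeds the capacity "
            "of {} ({} bytes).".format(requested_bytes, plasma_directory, capacity)
        )
    return requested_bytes
```

With a check of this shape, a request such as 8x physical memory would be accepted whenever the directory's file system is large enough, and rejected only when the backing pool itself is too small.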

devin-petersohn added a commit to devin-petersohn/ray that referenced this issue Dec 4, 2018
pcmoritz pushed a commit that referenced this issue Dec 7, 2018
* Removing the check about the size re: #3450

* Addressing comments

* Update services.py