Closed
System information
Ray 0.7.0-dev
Describe the problem
The following workload hangs on test_get_parallel(n=5), printing console messages like:
pyarrow.lib.PlasmaStoreFull: object does not fit in the plasma store
(pid=23447) 2019-05-27 17:58:37,324 INFO worker.py:392 -- The object with ID ObjectID(c3e2fb3725ff2634d366070bb2e4689c01000000) already exists in the object store.
A hang is not the expected behavior: the workload should raise an error instead. Ideally it would run successfully, since the working set can actually fit in memory, but that is outside the scope of this issue.
Source code / logs
import numpy as np
import time

import ray


@ray.remote
class Actor(object):
    def some_expensive_task(self):
        return np.zeros(25 * 1024 * 1024, dtype=np.uint8)


def test_get_serial():
    a = Actor.remote()
    i = 0
    start = time.time()
    for _ in range(100):
        ray.get(a.some_expensive_task.remote())
        i += 1
        if i % 10 == 0:
            print("Calls per second", i / (time.time() - start))
    a.__ray_terminate__.remote()


def test_get_parallel(n=10):
    i = 0
    actors = [Actor.remote() for _ in range(n)]
    start = time.time()
    for _ in range(100):
        pending = [a.some_expensive_task.remote() for a in actors]
        while pending:
            [done], pending = ray.wait(pending, num_returns=1)
            i += 1
            if i % 10 == 0:
                print("Calls per second", i / (time.time() - start))
    for a in actors:
        a.__ray_terminate__.remote()


if __name__ == "__main__":
    ray.init(object_store_memory=100 * 1024 * 1024)
    print("Test serial")
    test_get_serial()
    print("Test parallel 1")
    test_get_parallel(n=1)
    print("Test parallel 5")
    test_get_parallel(n=5)
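
For reference, here is a rough sizing sketch of the script above. It uses only the two sizes that appear in the script (the 25 MiB array and the 100 MiB `object_store_memory`). The helper names `in_flight_bytes` and `fits` are made up for illustration. It assumes no per-object overhead and no eviction, so it is only a back-of-the-envelope estimate and not a description of how the plasma store accounts for memory.

```python
# Back-of-the-envelope sizing for the reproduction script.
# Assumption: payload bytes only; store metadata and serialization
# overhead are ignored, and nothing is evicted.
MIB = 1024 * 1024

# np.zeros(25 * 1024 * 1024, dtype=np.uint8) holds one byte per element.
object_bytes = 25 * MIB
# ray.init(object_store_memory=100 * 1024 * 1024)
store_bytes = 100 * MIB


def in_flight_bytes(n_actors):
    """Payload bytes if every actor's result sits in the store at once."""
    return n_actors * object_bytes


def fits(n_actors):
    """True if all results of one round fit in the store simultaneously."""
    return in_flight_bytes(n_actors) <= store_bytes


# Largest number of 25 MiB results the store could hold at the same time.
max_simultaneous = store_bytes // object_bytes

for n in (1, 5):
    print(n, in_flight_bytes(n) // MIB, "MiB, fits:", fits(n))
```

Under these assumptions, one round with n=5 can produce 125 MiB of results before the driver consumes any of them, which may be why PlasmaStoreFull shows up for n=5 but not for n=1. The driver only looks at one result at a time, so the workload could still proceed if already-consumed results were released, which is consistent with the remark that the working set fits in memory.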