Skip to content

Addresses missed comments from multichunk object transfer PR. #1908

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Apr 16, 2018

Conversation

elibol
Copy link
Contributor

@elibol elibol commented Apr 15, 2018

Addresses missed comments from #1827.
This also fixes a bug when passing in the object manager configuration.

object manager config bug fix.
addresses other comments from ray-project#1827.
@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4911/
Test PASSed.

@@ -196,6 +219,21 @@ class RayConfig {
/// corresponding driver. Since spillback currently occurs on a 100ms timer,
/// a value of 100 corresponds to a warning every 10 seconds.
int64_t actor_creation_num_spillbacks_warning_;

/// Time out, in milliseconds, to wait before retrying a failed pull in the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

timeout should be one word

actor_creation_num_spillbacks_warning_(100) {}
actor_creation_num_spillbacks_warning_(100),
// TODO: Setting this to large values results in latency, which needs to
// be addressed. This timeout is often on the critical path to object
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical path "for"

store_notification_(main_service, config.store_socket_name),
// release_delay of 2 * config.max_sends is to ensure the pool does not release
store_notification_(main_service, config_.store_socket_name),
// release_delay of 2 * config_.max_sends is to ensure the pool does not release
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just use plasma_default_release_delay_? Coupling it with max_sends seems pretty strange to me

Copy link
Contributor Author

@elibol elibol Apr 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the scenario we had in mind when we implemented this. If we have max_sends objects split into 2 chunks, and the first chunk of each is gotten and released before each second chunk is gotten, then any release delay <max_sends (or <=max_sends if the bound is inclusive) will result in the first object actually being released. When we get the second chunk of the first object that was gotten, the associated get call to the object store client will incur the usual cost of a get. Perhaps a release delay of max_sends+1 is sufficient.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4919/
Test PASSed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4925/
Test FAILed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4931/
Test FAILed.

@pcmoritz
Copy link
Contributor

retest this please

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4932/
Test PASSed.

@pcmoritz pcmoritz merged commit cff3776 into ray-project:master Apr 16, 2018
royf added a commit to royf/ray that referenced this pull request Apr 22, 2018
* master:
  Handle interrupts correctly for ASIO synchronous reads and writes. (ray-project#1929)
  [DataFrame] Adding read methods and tests (ray-project#1712)
  Allow task_table_update to fail when tasks are finished. (ray-project#1927)
  [rllib] Contribute DDPG to RLlib (ray-project#1877)
  [xray] Workers blocked in a `ray.get` release their resources (ray-project#1920)
  Raylet task dispatch and throttling worker startup (ray-project#1912)
  [DataFrame] Eval fix (ray-project#1903)
  [tune] Polishing docs (ray-project#1846)
  [tune] [rllib] Automatically determine RLlib resources and add queueing mechanism for autoscaling (ray-project#1848)
  Preemptively push local arguments for actor tasks (ray-project#1901)
  [tune] Allow fetching pinned objects from trainable functions (ray-project#1895)
  Multithreading refactor for ObjectManager. (ray-project#1911)
  Add slice functionality (ray-project#1832)
  [DataFrame] Pass read_csv kwargs to _infer_column (ray-project#1894)
  Addresses missed comments from multichunk object transfer PR. (ray-project#1908)
  Allow numpy arrays to be passed by value into tasks (and inlined in the task spec). (ray-project#1816)
  [xray] Lineage cache requests notifications from the GCS about remote tasks (ray-project#1834)
  Fix UI issue for non-json-serializable task arguments. (ray-project#1892)
  Remove unnecessary calls to .hex() for object IDs. (ray-project#1910)
  Allow multiple raylets to be started on a single machine. (ray-project#1904)

# Conflicts:
#	python/ray/rllib/__init__.py
#	python/ray/rllib/dqn/dqn.py
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants