[Core] fix inaccurate Raylet log message for aborting object creation (ray-project#24450)

Many log messages of the form "Not enough memory to create requested object ..." appeared when running shuffle tests, even when object store memory was far from full.

It seems the Raylet logs "Not enough memory to create requested object ..." whenever ObjectBufferPool::AbortCreate() is called. However, ObjectBufferPool::AbortCreate() is reached through 3 different codepaths:

    1. ObjectManager::ReceiveObjectChunk()
    2. PullManager::UpdatePullsBasedOnAvailableMemory() -> cancel_pull_request_
    3. PullManager::CancelPull() -> cancel_pull_request_

Only codepath (2) is caused by insufficient object store memory. The logging in ObjectBufferPool::AbortCreate() is therefore moved to the callsites, which have more context about the situation and can log more accurate messages.

Also change the log in codepath (3) to DEBUG, because cancellation is expected behavior and the message can be quite spammy when running shuffle / sort workloads.
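
The idea of moving a log from a shared callee to its callsites can be sketched in isolation. The snippet below is a minimal illustration and not Ray code: BufferPool, Log, LogSink, Severity, CountAtOrAbove, and the three OnXxx functions are hypothetical stand-ins, and object IDs are plain strings. The message texts and severities mirror the ones in the diff.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; none of these names are Ray's API.
enum class Severity { kDebug, kInfo };

struct LogRecord {
  Severity severity;
  std::string message;
};

// Collects records in memory so the sketch needs no logging library.
std::vector<LogRecord> &LogSink() {
  static std::vector<LogRecord> sink;
  return sink;
}

void Log(Severity severity, const std::string &message) {
  LogSink().push_back({severity, message});
}

// Counts the records a reader would see at a given minimum severity.
int CountAtOrAbove(Severity threshold) {
  int count = 0;
  for (const auto &record : LogSink()) {
    if (record.severity >= threshold) ++count;
  }
  return count;
}

// The callee only does the work. It cannot know why it was called,
// so it does not log a reason.
class BufferPool {
 public:
  void AbortCreate(const std::string &object_id) {
    assert(!object_id.empty());
    aborted_.push_back(object_id);
  }
  const std::vector<std::string> &aborted() const { return aborted_; }

 private:
  std::vector<std::string> aborted_;
};

// Callsite 1: a chunk arrived for an object that is no longer being pulled.
void OnChunkForInactivePull(BufferPool &pool, const std::string &object_id) {
  Log(Severity::kInfo,
      "Aborting object creation because it is no longer actively pulled: " + object_id);
  pool.AbortCreate(object_id);
}

// Callsite 2: the pull was deactivated because memory ran short.
void OnInsufficientMemory(BufferPool &pool, const std::string &object_id) {
  Log(Severity::kDebug,
      "Not enough memory to create requested object " + object_id + ", aborting.");
  pool.AbortCreate(object_id);
}

// Callsite 3: the pull request was cancelled, which is expected behavior.
void OnPullCancelled(BufferPool &pool, const std::string &object_id) {
  Log(Severity::kDebug,
      "Pull cancellation requested for object " + object_id + ", aborting creation.");
  pool.AbortCreate(object_id);
}
```

Calling the three functions once each aborts three objects but produces only one record at INFO or above, which is the effect the commit is after: each abort is explained by the caller that knows the cause, and the expected, high-volume cases stay out of the default log.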
mwtian authored May 5, 2022
1 parent a424e91 commit 68c4023
Showing 3 changed files with 6 additions and 2 deletions.
2 changes: 0 additions & 2 deletions src/ray/object_manager/object_buffer_pool.cc
@@ -150,8 +150,6 @@ void ObjectBufferPool::WriteChunk(const ObjectID &object_id,

 void ObjectBufferPool::AbortCreate(const ObjectID &object_id) {
   absl::MutexLock lock(&pool_mutex_);
-  RAY_LOG(INFO) << "Not enough memory to create requested object " << object_id
-                << ", aborting";
   AbortCreateInternal(object_id);
 }

2 changes: 2 additions & 0 deletions src/ray/object_manager/object_manager.cc
@@ -609,6 +609,8 @@ bool ObjectManager::ReceiveObjectChunk(const NodeID &node_id,
     // have to check again here because the pull manager runs in a different
     // thread and the object may have been deactivated right before creating
     // the chunk.
+    RAY_LOG(INFO) << "Aborting object creation because it is no longer actively pulled: "
+                  << object_id;
     buffer_pool_.AbortCreate(object_id);
     return false;
   }
4 changes: 4 additions & 0 deletions src/ray/object_manager/pull_manager.cc
@@ -343,6 +343,8 @@ void PullManager::UpdatePullsBasedOnAvailableMemory(int64_t num_bytes_available)

   // Call the cancellation callbacks outside of the lock.
   for (const auto &obj_id : object_ids_to_cancel) {
+    RAY_LOG(DEBUG) << "Not enough memory to create requested object " << obj_id
+                   << ", aborting.";
     cancel_pull_request_(obj_id);
   }

@@ -385,6 +387,8 @@ std::vector<ObjectID> PullManager::CancelPull(uint64_t request_id) {
         *request_queue, bundle_it, highest_req_id_being_pulled, &object_ids_to_cancel);
     for (const auto &obj_id : object_ids_to_cancel) {
       // Call the cancellation callback outside of the lock.
+      RAY_LOG(DEBUG) << "Pull cancellation requested for object " << obj_id
+                     << ", aborting creation.";
       cancel_pull_request_(obj_id);
     }
   }
