Skip to content

Conversation

@afender
Copy link
Member

@afender afender commented May 29, 2020

double-free bug found by @rlratzel
pagerank and tmp thrust vectors are both scope managed now, but raw pointers to their device memory are std::swapped inside the solver. By the time the scope ends for pagerank its device pointer may be pointing to the data that was freed by tmp.
Just doing a cpy for now as a fix since this may soon be replaced by the pattern accelerator.
A thrust::swap would probably be ideal if the vectors were passed to this section of the code instead of raw pointers.

@afender afender requested a review from a team as a code owner May 29, 2020 23:14
@afender afender requested a review from rlratzel May 29, 2020 23:14
@GPUtester
Copy link
Contributor

Please update the changelog in order to start CI tests.

View the gpuCI docs here.

Copy link
Contributor

@seunghwak seunghwak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are passing non-owning pointers, they should be marked as const (mark the pointer itself is const, not the values pointed by the pointer, or we can use iterators instead to be more generic as well) to be safe.

Better establish this convention in our codebase, but hope we can retire this code sooner than later.

@BradReesWork BradReesWork added this to the 0.15 milestone Jun 1, 2020
@afender afender changed the base branch from branch-0.15 to branch-0.14 June 1, 2020 15:37
@afender afender modified the milestones: 0.15, 0.14 Jun 1, 2020
@afender afender requested review from a team as code owners June 1, 2020 15:39
number: {{ git_revision_count }}
string: cuda{{ cuda_version }}_{{ git_revision_count }}
number: {{ GIT_DESCRIBE_NUMBER }}
string: cuda{{ cuda_version }}_{{ GIT_DESCRIBE_HASH }}_{{ GIT_DESCRIBE_NUMBER }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something is wrong with how you branched or you're targeting the wrong branch. These are v0.15 changes.

// A thrust::swap would probably be more efficent if the vectors were passed everywhere
// instead of pointers. std::swap is unsafe though. Just copying for now, as this may soon be
// replaced by the pattern accelerator.
copy(n, pr, tmp);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and just FYI, what you could do instead is something like

pagerank(value_t* pr, ...) {
  rmm::device_uvector<value_t> tmp{};

  value_t* p_new = pr;
  value_t* p_old = tmp.data().get();
  ...
  // at the end of iteration
  if (iter % 2 == 0) {
    p_new = tmp.data().get();
    p_old = pr;
  }
  else {
    p_new = pr;
    p_old = tmp.data().get();
  } 
}

@BradReesWork BradReesWork modified the milestones: 0.14, 0.15 Jun 2, 2020
@BradReesWork BradReesWork changed the base branch from branch-0.14 to branch-0.15 June 2, 2020 00:03
@afender
Copy link
Member Author

afender commented Jun 3, 2020

#923 made it to 0.15 as well

@afender afender closed this Jun 3, 2020
@afender afender self-assigned this Jul 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants