-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add dropout_path
as a structured form of dropout_edge
#5531
Conversation
Codecov Report
@@ Coverage Diff @@
## master #5531 +/- ##
==========================================
+ Coverage 83.69% 83.72% +0.02%
==========================================
Files 346 346
Lines 19049 19080 +31
==========================================
+ Hits 15943 15974 +31
Misses 3106 3106
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
A quick question (maybe stupid): how to disable the GitHub bot that always removes labels when I push the code? That's a bit annoying :-( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding this.
Let's also add it to the |
Since there are some big changes made, I'm afraid it is still necessary to check it again :( |
|
||
row, col = edge_index | ||
sample_mask = torch.rand(row.size(0), device=edge_index.device) <= p | ||
start = row[sample_mask].repeat(walks_per_node) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this mean we'll start many more than walks_per_node
random walks for nodes with a high out-degree.
Sorry for the back and forth, but I felt the way you defined start perviously was better and actually meant we start exactly walks_per_node
random walks`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Or maybe we can add an option such as walk_by='edge'
or walk_by='node'
to align them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I am missing something. Why do we want to support walk_by='edge'
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh sorry I meant we can sample starting nodes node-wise (previous version) or edge-wise (current version). For edge-wise sampling, as you mentioned, we can start random walks for nodes with a high out-degree, this seems to make sense for graphs with particular imbalanced structures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the last one that needs to be resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No strong opinion. I think we can safely merge this and wait on user feedback. To me, it makes sense to sample per edge since we are dropping paths.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just add a comment for now and clarify in the doc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, we can let p
accept Tensor
to specify the nodes to start random walks, rather than randomly sampled from nodes and edges. Let's leave it as future work and merge it now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies, i forgot to respond. its good that you merged it :).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks perfect :)
Co-authored-by: Matthias Fey <matthias.fey@tu-dortmund.de>
) * add dropout_path * test * doc * doc and annotation * add is_sorted argument * doc * drop force_undirected argument * permute edge ids if edges were sorted * update test * update test * update README * drop p and q; rename r to p * changelog * sample on row * Update torch_geometric/utils/dropout.py Co-authored-by: Matthias Fey <matthias.fey@tu-dortmund.de> * Update torch_geometric/utils/dropout.py Co-authored-by: Matthias Fey <matthias.fey@tu-dortmund.de>
This PR implements
DropPath
from MaskGAE: Masked Graph Modeling Meets Graph Autoencoders. DropPath is a structured form of DropEdge, which drops a group of edges (paths) based on random walks.DropPath follows three steps to sample edges to drop:
r
from a Bernoulli distribution. (0<=r
<=1),Link to #5452