Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add dropout_path as a structured form of dropout_edge #5531

Merged
merged 18 commits into from
Sep 30, 2022
Merged

Conversation

EdisonLeeeee
Copy link
Contributor

@EdisonLeeeee EdisonLeeeee commented Sep 25, 2022

This PR implements DropPath from MaskGAE: Masked Graph Modeling Meets Graph Autoencoders. DropPath is a structured form of DropEdge, which drops a group of edges (paths) based on random walks.

DropPath follows three steps to sample edges to drop:

  • Sample a set of root nodes $R$ with probability r from a Bernoulli distribution. (0<=r<=1),
  • Perform random walks starting from root nodes $R$
  • Drop edges sampled by random walks

Link to #5452

@codecov
Copy link

codecov bot commented Sep 25, 2022

Codecov Report

Merging #5531 (c87956f) into master (7fb1eb6) will increase coverage by 0.02%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #5531      +/-   ##
==========================================
+ Coverage   83.69%   83.72%   +0.02%     
==========================================
  Files         346      346              
  Lines       19049    19080      +31     
==========================================
+ Hits        15943    15974      +31     
  Misses       3106     3106              
Impacted Files Coverage Δ
torch_geometric/utils/dropout.py 100.00% <100.00%> (ø)

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@EdisonLeeeee EdisonLeeeee self-assigned this Sep 25, 2022
@EdisonLeeeee
Copy link
Contributor Author

A quick question (maybe stupid): how to disable the GitHub bot that always removes labels when I push the code? That's a bit annoying :-(

Copy link
Member

@wsad1 wsad1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this.

@rusty1s
Copy link
Member

rusty1s commented Sep 28, 2022

Let's also add it to the CHANGELOG.md. Feel free to merge afterwards.

@EdisonLeeeee
Copy link
Contributor Author

Since there are some big changes made, I'm afraid it is still necessary to check it again :(


row, col = edge_index
sample_mask = torch.rand(row.size(0), device=edge_index.device) <= p
start = row[sample_mask].repeat(walks_per_node)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this mean we'll start many more than walks_per_node random walks for nodes with a high out-degree.
Sorry for the back and forth, but I felt the way you defined start perviously was better and actually meant we start exactly walks_per_node random walks`.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Or maybe we can add an option such as walk_by='edge' or walk_by='node' to align them?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I am missing something. Why do we want to support walk_by='edge'?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry I meant we can sample starting nodes node-wise (previous version) or edge-wise (current version). For edge-wise sampling, as you mentioned, we can start random walks for nodes with a high out-degree, this seems to make sense for graphs with particular imbalanced structures.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the last one that needs to be resolved.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No strong opinion. I think we can safely merge this and wait on user feedback. To me, it makes sense to sample per edge since we are dropping paths.

Copy link
Member

@rusty1s rusty1s Sep 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just add a comment for now and clarify in the doc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, we can let p accept Tensor to specify the nodes to start random walks, rather than randomly sampled from nodes and edges. Let's leave it as future work and merge it now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies, i forgot to respond. its good that you merged it :).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries :)

Copy link
Member

@rusty1s rusty1s left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks perfect :)

EdisonLeeeee and others added 2 commits September 30, 2022 00:12
Co-authored-by: Matthias Fey <matthias.fey@tu-dortmund.de>
@EdisonLeeeee EdisonLeeeee merged commit d8800b6 into master Sep 30, 2022
@EdisonLeeeee EdisonLeeeee deleted the dropout_path branch September 30, 2022 07:53
JakubPietrakIntel pushed a commit to JakubPietrakIntel/pytorch_geometric that referenced this pull request Nov 25, 2022
)

* add dropout_path

* test

* doc

* doc and annotation

* add is_sorted argument

* doc

* drop force_undirected argument

* permute edge ids if edges were sorted

* update test

* update test

* update README

* drop p and q; rename r to p

* changelog

* sample on row

* Update torch_geometric/utils/dropout.py

Co-authored-by: Matthias Fey <matthias.fey@tu-dortmund.de>

* Update torch_geometric/utils/dropout.py

Co-authored-by: Matthias Fey <matthias.fey@tu-dortmund.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants