[Roadmap] GraphGym via PyTorch Lightning and Hydra 🚀 #5132

Open
2 of 16 tasks
rusty1s opened this issue Aug 4, 2022 · 10 comments

Comments

@rusty1s
Member

rusty1s commented Aug 4, 2022

🚀 The feature, motivation and pitch

The overall goal of this roadmap is to ensure a tighter connection between PyG core and the GraphGym configuration manager. A further goal is to avoid re-inventing the wheel in GraphGym and to make use of popular open-source frameworks wherever applicable, e.g., for configuration management, training, logging, and autoML.

As such, this roadmap is structured into several components: general improvements (e.g., tighter connection between PyG and GraphGym), PyTorch Lightning integration, and Hydra integration as our configuration tool.

General Roadmap

  • Add register functionality to models in PyG core
  • Remove any layer/model definition of GraphGym and move it to PyG core
  • Expose a graphgym bash script in a bin/ folder - GraphGym usage should not require manually cloning PyG
  • Better and more user-friendly documentation
  • Adding HeteroData support
  • Adding pooling layers
  • ...
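
To illustrate the "register functionality" item above, here is a minimal sketch of a name-to-class registry in plain Python. It is not the PyG or GraphGym API: `MODEL_REGISTRY`, `register_model`, `build_model`, and `ToyGCN` are hypothetical names chosen only for this example, and `ToyGCN` is a stand-in that stores hyperparameters instead of building real layers.

```python
from typing import Dict, Optional

# Hypothetical registry: maps a string key (as it would appear in a config
# file) to the class that implements it.
MODEL_REGISTRY: Dict[str, type] = {}


def register_model(key: str, cls: Optional[type] = None):
    """Register a class under `key`; usable as a direct call or a decorator."""

    def _register(target: type) -> type:
        if key in MODEL_REGISTRY:
            raise KeyError(f"Model '{key}' is already registered")
        MODEL_REGISTRY[key] = target
        return target

    # Direct call: register_model("ToyGCN", ToyGCN)
    if cls is not None:
        return _register(cls)
    # Decorator usage: @register_model("ToyGCN")
    return _register


@register_model("ToyGCN")
class ToyGCN:
    """Stand-in for a real model class; it only stores its hyperparameters."""

    def __init__(self, in_channels: int, hidden_channels: int,
                 out_channels: int, num_layers: int):
        self.in_channels = in_channels
        self.hidden_channels = hidden_channels
        self.out_channels = out_channels
        self.num_layers = num_layers


def build_model(name: str, **kwargs):
    """Look up a registered model by name and instantiate it."""
    return MODEL_REGISTRY[name](**kwargs)


# A config entry such as `model: ToyGCN` can now be turned into an object.
model = build_model("ToyGCN", in_channels=34, hidden_channels=16,
                    out_channels=4, num_layers=2)
```

With a registry of this kind, a configuration file only needs to name a model; the lookup from name to class happens in one place, which is what would let GraphGym configurations refer to models defined in PyG core.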

PyTorch Lightning Integration

Integrating PyTorch Lightning can improve the GraphGym training experience with respect to scalability, mixed-precision support, logging, and checkpointing.

  • Integrate a LightningModule into GraphGym
  • Update train method with PL Trainer and the LightningModule implementations
  • Refactor load_ckpt and save_ckpt with PL checkpoint save and load method
  • Integrate LightningDataset, LightningNodeData and LightningLinkData modules
  • ...

Hydra Integration

Users of PyG should be able to write GraphGym configurations that make full use of PyG functionality. In particular, we want to allow access to any dataset, any data transformation pipeline, and any GNN layer/model. For this, we need to follow a structured/composable configuration, e.g., as introduced here:

defaults:
  - dataset: KarateClub
  - transform@dataset.transform:
      - NormalizeFeatures
      - AddSelfLoops
  - model: GCN
  - optimizer: Adam
  - lr_scheduler: ReduceLROnPlateau
  - _self_

model:
  in_channels: 34
  out_channels: 4
  hidden_channels: 16
  num_layers: 2

  • Use variable interpolation, e.g., model.in_channels = ${dataset.num_features} and model.out_channels = ${dataset.num_classes}
  • ...
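
The variable interpolation mentioned above can be sketched in plain Python as follows. This is only an illustration of the idea, not how Hydra or OmegaConf implement it: `lookup` and `resolve` are hypothetical helpers, only whole-string references of the form `${a.b}` inside nested dicts are handled, and the `dataset` values are filled in by hand to match the `in_channels: 34` and `out_channels: 4` of the config above.

```python
import re
from typing import Any, Dict

# Matches a value that consists solely of a reference such as
# "${dataset.num_features}".
_REFERENCE = re.compile(r"^\$\{([\w.]+)\}$")


def lookup(root: Dict[str, Any], dotted_key: str) -> Any:
    """Follow a dotted key such as 'dataset.num_features' through nested dicts."""
    node: Any = root
    for part in dotted_key.split("."):
        node = node[part]
    return node


def resolve(root: Dict[str, Any], node: Any) -> Any:
    """Return a copy of `node` with every '${...}' reference replaced."""
    if isinstance(node, dict):
        return {key: resolve(root, value) for key, value in node.items()}
    if isinstance(node, str):
        match = _REFERENCE.match(node)
        if match:
            # Resolve recursively in case the target is itself a reference.
            return resolve(root, lookup(root, match.group(1)))
    return node


# Illustrative config: the dataset statistics are written out by hand here,
# whereas in practice they would come from the loaded dataset.
config = {
    "dataset": {"name": "KarateClub", "num_features": 34, "num_classes": 4},
    "model": {
        "in_channels": "${dataset.num_features}",
        "out_channels": "${dataset.num_classes}",
        "hidden_channels": 16,
        "num_layers": 2,
    },
}

resolved = resolve(config, config)
print(resolved["model"])
```

The benefit is that the model's input and output sizes no longer need to be hard-coded per dataset: switching the dataset entry updates them automatically.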

Weights & Biases Integration (TBD)

  • ...

AutoML (TBD)

  • ...

cc @pyg-team/biotax-team

@rusty1s rusty1s added the feature label Aug 4, 2022
@rusty1s rusty1s changed the title GraphGym [Roadmap] GraphGym via PyTorch Lightning and Hydra Aug 4, 2022
@rusty1s rusty1s changed the title [Roadmap] GraphGym via PyTorch Lightning and Hydra [Roadmap] GraphGym via PyTorch Lightning and Hydra 🚀 Aug 4, 2022
@rusty1s rusty1s self-assigned this Aug 4, 2022
@rusty1s rusty1s pinned this issue Aug 4, 2022
@rusty1s rusty1s assigned rusty1s and unassigned rusty1s Aug 4, 2022
@julian-q

Integrate LightningDataset, LightningNodeData and LightningLinkData modules

New here: what do LightningNodeData and LightningLinkData refer to?

Refactor load_ckpt and save_ckpt with PL checkpoint save and load method

Is this still needed after #4689?

@rusty1s
Member Author

rusty1s commented Sep 15, 2022

@julian-q Welcome :) LightningDataset, LightningNodeData and LightningLinkData refer to our helper data modules to connect PyG with PL, see here. Currently, they are not used within GraphGym.

Is this still needed after #4689?

I assume so. load_ckpt and save_ckpt don't look like they currently make use of PL checkpoints.

@shenoynikhil
Contributor

I would like to contribute to this task. I have previously worked on using PyTorch Lightning and Hydra together in this repo.

@rusty1s
Member Author

rusty1s commented Oct 25, 2022

This is amazing. We should collect some information about how we want to integrate Hydra into GraphGym, as I believe we need a new config layout. I started something a long time ago but did not finish it, see here, here and here. I would very much appreciate some advice and insights from you!

@shenoynikhil
Contributor

I'll spend some time going through the links you shared and start a draft PR for this. Hope to get your guidance on it as well :).

@wsad1 wsad1 unpinned this issue Jan 27, 2023
@rajveer43
Contributor

@rusty1s I would like to try this!

@rusty1s
Member Author

rusty1s commented Aug 11, 2023

I would like to try this!

Cool :) We were sadly a bit lazy in the further development of GraphGym, so happy to see some activity back on this :)

@rajveer43
Contributor

rajveer43 commented Aug 11, 2023

I would like to try this!

Cool :) We were sadly a bit lazy in the further development of GraphGym, so happy to see some activity back on this :)

Okay, I will work on this from Monday! I know how to code it. Could you just tell me exactly where to put the code, i.e., which files to edit?

rusty1s added a commit that referenced this issue Sep 1, 2023
Part of #5132.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: rusty1s <matthias.fey@tu-dortmund.de>
@RagnarokAnsh

@rusty1s Is this still open? Can I contribute?

@rusty1s
Member Author

rusty1s commented Sep 21, 2023

This roadmap is in a fuzzy state right now. There exist a few PRs already, like #5626, but I haven't really had time to merge them yet.

JakubPietrakIntel pushed a commit that referenced this issue Sep 27, 2023
Part of #5132.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: rusty1s <matthias.fey@tu-dortmund.de>
5 participants