-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fixed output scaling bug and removed double residual edges (was already implemented in global architecture) #482
Conversation
Codecov Report
@@ Coverage Diff @@
## main #482 +/- ##
==========================================
- Coverage 71.45% 71.45% -0.01%
==========================================
Files 93 93
Lines 8520 8512 -8
==========================================
- Hits 6088 6082 -6
+ Misses 2432 2430 -2
Flags with carried forward coverage won't be shown. Click here to find out more.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're accidentally squaring the activations, which I don't think we want to do
graphium/nn/pyg_layers/gps_pyg.py
Outdated
h_local *= 1 / self.scale_activations(h_local, self.output_scale) | ||
h_local *= self.scale_activations(h_local, self.output_scale) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
scale_activations
returns h_local * scale_factor
, so this is equivalent to h_local = h_local ** 2 * output_scale
, which I don't think we want.
Also, I don't understand why we need separate functions for residual_add
and scale_activations
? It seems like overkill to me to have functions that only do one operation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True we actually only want h_local = self.scale_activations(h_local, self.output_scale)
It was initially implemented as separate functions, don't think there is a specific benefit to it, but also don't think it hurts to be separate
Changelogs
Checklist:
feature
,fix
ortest
(or ask a maintainer to do it for you).discussion related to that PR