[ENH] Adding Time Mixingup Contrastive Learning to Self Supervised module #3015

Conversation
    The value that controls the logits smoothness.
backbone_network : aeon Network, default = None
    The backbone network used for the SSL model,
    it can be any network from the aeon.networks
This looks very "squished", i.e. it doesn't go all the way to the margin (wrapping too early), which just adds lots of unnecessary lines to the file.
fixed thanks
Parameters
----------
alpha : float, default = 0.2
    The alpha value for the Beta distribution.
I don't think saying alpha is "the alpha value for..." is very descriptive. What does it do in the beta distribution?
added an explanation
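For context, a quick sketch of what alpha controls (values are illustrative, not from the PR): the mixing coefficient lambda is drawn from Beta(alpha, alpha), and smaller alpha pushes lambda toward 0 or 1, so the mixed series stays close to one of its two sources.

```python
import numpy as np

# Illustrative only: lambda is drawn from Beta(alpha, alpha).
# Small alpha (e.g. 0.2) concentrates mass near 0 and 1, so the mixed
# series stays close to one of the two originals; alpha = 1 gives a
# uniform mixing coefficient.
rng = np.random.default_rng(seed=0)
for alpha in (0.2, 1.0, 5.0):
    lam = rng.beta(alpha, alpha, size=5)
    print(alpha, np.round(lam, 3))
```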
    module on condition for its structure to be
    configured as "encoder", see _config attribute.
    For TimeMCL, the default network used is
    FCNNetwork(n_layers=3,
Missing "n_filters=[128, 256, 128]" from default implementation that is specified on 208
thanks fixed
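For reference, the corrected default backbone would look like this (reconstructed from the two comments above; check line 208 of the PR for the exact call):

```python
from aeon.networks import FCNNetwork

# Default backbone as described in the review: three convolutional
# layers with 128, 256 and 128 filters respectively.
backbone = FCNNetwork(n_layers=3, n_filters=[128, 256, 128])
```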
    The number of samples per gradient update.
use_mini_batch_size : bool, default = False
    Whether or not to use the mini batch size formula.
n_epochs : int, default = 1000
Mismatch between the docstring and the code default value, which is n_epochs = 2000.
fixed
callbacks: Callback | list[Callback] | None = None,
batch_size: int = 64,
use_mini_batch_size: bool = False,
n_epochs: int = 2000,
Mismatch with the docstring.
fixed
    epoch trained, using the base class method
    save_last_model_to_file.
save_init_model : bool, default = False
    Whether to save the initialization of the model.
Double space before " model".
fixed
callbacks : keras callback or list of callbacks,
    default = None
    The default list of callbacks are set to
    ModelCheckpoint and ReduceLROnPlateau.
I think this is fairly ambiguous about what this actually does. Could we have a more detailed explanation?
added an explanation
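A minimal sketch of the default-callback behaviour the docstring describes (the helper name and the exact arguments are illustrative assumptions, not aeon's code): if the user passes nothing, fall back to checkpointing the best weights and reducing the learning rate when the loss plateaus.

```python
import tensorflow as tf

def resolve_callbacks(callbacks=None, file_path="best_model.keras"):
    # Hypothetical helper: use the user's callbacks if given,
    # otherwise fall back to the two defaults named in the docstring.
    if callbacks is None:
        callbacks = [
            tf.keras.callbacks.ModelCheckpoint(
                filepath=file_path, monitor="loss", save_best_only=True
            ),
            tf.keras.callbacks.ReduceLROnPlateau(
                monitor="loss", factor=0.5, patience=50, min_lr=1e-4
            ),
        ]
    elif not isinstance(callbacks, list):
        callbacks = [callbacks]
    return callbacks
```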
return model

def _mixup_loss(self, z1, z2, z_augmented, contrastive_weight): |
If I understand this function correctly, it returns the mean and not the per-sample losses. Maybe it should be renamed to something that reflects that it is the _mixup_mean, not the _mixup_loss.
Thanks, I didn't notice I call reduce_mean inside the function. I removed it from the function's return and kept it in _fit, where the function is called, hence kept the "loss" name.
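A minimal sketch of that refactor (names, the similarity computation, and the simplified two-way targets are illustrative assumptions, not the PR's code; the actual MCL loss also contrasts against the other series in the batch): the loss function returns per-sample values and the caller takes the mean.

```python
import tensorflow as tf

def mixup_loss_per_sample(z1, z2, z_aug, lam, temperature=0.1):
    # Cosine-similarity logits between the mixed embedding and its
    # two source embeddings, scaled by a temperature.
    z1 = tf.math.l2_normalize(z1, axis=-1)
    z2 = tf.math.l2_normalize(z2, axis=-1)
    z_aug = tf.math.l2_normalize(z_aug, axis=-1)
    sim1 = tf.reduce_sum(z_aug * z1, axis=-1) / temperature
    sim2 = tf.reduce_sum(z_aug * z2, axis=-1) / temperature
    logits = tf.stack([sim1, sim2], axis=-1)
    # Soft targets (lam, 1 - lam) reflect the mixing proportions.
    targets = tf.stack([lam, 1.0 - lam], axis=-1)
    # Per-sample cross-entropy; no reduce_mean in here.
    return tf.keras.losses.categorical_crossentropy(
        targets, logits, from_logits=True
    )

# The caller averages over the batch inside _fit:
# loss = tf.reduce_mean(mixup_loss_per_sample(z1, z2, z_aug, lam))
```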
"The parameter backbone_network", "should be an aeon network." | ||
) | ||
|
||
X = X.transpose(0, 2, 1) |
Could be nice to have some input validation for alpha > 0 and mixup_temperature > 0
added
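A sketch of the suggested checks (the helper name, placement, and message wording are illustrative assumptions; the PR may inline these in _fit instead):

```python
def _check_hyperparameters(alpha: float, mixup_temperature: float) -> None:
    # Hypothetical validation helper for the two hyperparameters
    # discussed above; both must be strictly positive.
    if alpha <= 0:
        raise ValueError(f"alpha must be positive, got {alpha}.")
    if mixup_temperature <= 0:
        raise ValueError(
            f"mixup_temperature must be positive, got {mixup_temperature}."
        )
```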
-------
output : a compiled Keras Model
"""
import numpy as np
numpy doesn't need to be in here.
thanks
Adding a state-of-the-art model, TimeMCL [1], to populate the self-supervised module; after this I will probably add a base class to contain the load/save functions.
The model takes as input two time series x1 and x2, with no information on the labels, and augments a third series by mixing them up as x3 = lambda * x1 + (1 - lambda) * x2. A contrastive learning loss then matches the augmented sample x3 back to x1 and x2 while discarding the rest of the series in the batch, as sketched below.
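A minimal sketch of that augmentation step, assuming NumPy arrays of shape (batch, channels, length) (illustrative, not the PR's exact implementation):

```python
import numpy as np

def mixup(x1, x2, alpha=0.2, rng=None):
    # One mixing coefficient per sample, drawn from Beta(alpha, alpha),
    # broadcast over channels and time steps.
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha, size=(x1.shape[0], 1, 1))
    x3 = lam * x1 + (1.0 - lam) * x2
    return x3, lam.squeeze()
```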
The original code is in PyTorch; I adapted it to TensorFlow/Keras.
[1] Wickstrøm, Kristoffer, Michael Kampffmeyer, Karl Øyvind Mikalsen, and Robert Jenssen. "Mixing up contrastive learning: Self-supervised representation learning for time series." Pattern Recognition Letters 155 (2022): 54-61.