Version
See the console output for PyABSA, Torch, Transformers Version
PyABSA: 2.4.0
Torch: 2.0.1
transformers: 4.31.0
Describe the bug
The sample code below is from the examples-v2/aspect_term_extraction folder:
trainer = ATEPC.ATEPCTrainer(
config=config,
dataset=dataset,
from_checkpoint="english", # if you want to resume training from our pretrained checkpoints, you can pass the checkpoint name here
auto_device=DeviceTypeOption.AUTO, # use cuda if available
checkpoint_save_mode=ModelSaveOption.SAVE_MODEL_STATE_DICT, # save state dict only instead of the whole model
load_aug=False, # there are some augmentation dataset for integrated datasets, you use them by setting load_aug=True to improve performance
)
Running the above code raises the following error:
File /databricks/conda/lib/python3.9/site-packages/torch/nn/modules/module.py:2041, in Module.load_state_dict(self, state_dict, strict)
2036 error_msgs.insert(
2037 0, 'Missing key(s) in state_dict: {}. '.format(
2038 ', '.join('"{}"'.format(k) for k in missing_keys)))
2040 if len(error_msgs) > 0:
-> 2041 raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
2042 self.__class__.__name__, "\n\t".join(error_msgs)))
2043 return _IncompatibleKeys(missing_keys, unexpected_keys)
RuntimeError: Error(s) in loading state_dict for FAST_LCF_ATEPC:
Unexpected key(s) in state_dict: "bert4global.embeddings.position_ids".
However, if I comment out the line from_checkpoint="english", the trainer is created without this error.
Code To Reproduce
trainer = ATEPC.ATEPCTrainer(
config=config,
dataset=dataset,
from_checkpoint="english", # if you want to resume training from our pretrained checkpoints, you can pass the checkpoint name here
auto_device=DeviceTypeOption.AUTO, # use cuda if available
checkpoint_save_mode=ModelSaveOption.SAVE_MODEL_STATE_DICT, # save state dict only instead of the whole model
load_aug=False, # there are some augmentation dataset for integrated datasets, you use them by setting load_aug=True to improve performance
)
Expected behavior
Could you please troubleshoot this bug? I think resuming training from your pretrained checkpoints, such as "english", is very important. Otherwise, training from scratch will produce an underperforming model.
Yes, I know that disabling checkpoint resuming works, but I am wondering how we can resume from the checkpoint, as it is a very important feature in some situations. Thanks in advance for any help with this bug.
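One possible explanation, offered as an assumption rather than a confirmed diagnosis: recent transformers releases appear to no longer store position_ids in a BERT state dict, so a checkpoint saved under an older release carries a key that the freshly built model does not expect. A minimal sketch of a workaround is to drop that key before the state dict is loaded. The helper name drop_unexpected_keys and the constant STALE_SUFFIXES are hypothetical and not part of PyABSA, and where PyABSA loads the checkpoint internally is not shown here.

```python
from collections import OrderedDict

# Key suffixes to discard (hypothetical constant, not a PyABSA name).
# "embeddings.position_ids" matches the unexpected key in the error above.
STALE_SUFFIXES = ("embeddings.position_ids",)


def drop_unexpected_keys(state_dict, suffixes=STALE_SUFFIXES):
    """Return a copy of state_dict without keys ending in any of the suffixes."""
    return OrderedDict(
        (key, value)
        for key, value in state_dict.items()
        if not key.endswith(suffixes)
    )


if __name__ == "__main__":
    # Stand-in values; a real state dict maps parameter names to tensors.
    checkpoint = OrderedDict(
        [
            ("bert4global.embeddings.position_ids", [0, 1, 2]),
            ("bert4global.embeddings.word_embeddings.weight", [0.1, 0.2]),
        ]
    )
    cleaned = drop_unexpected_keys(checkpoint)
    print(list(cleaned))
    # With a real model, the next step would be: model.load_state_dict(cleaned)
```

Alternatively, torch's Module.load_state_dict accepts strict=False, which skips the key-mismatch error but also hides genuinely missing keys, so filtering one known key is the narrower change. Pinning transformers to a release older than 4.31 may also avoid the mismatch, though that is untested here.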
Thank you very much in advance!
The above example code is available at:
https://github.com/yangheng95/PyABSA/blob/v2/examples-v2/aspect_term_extraction/Aspect_Term_Extraction.ipynb