Closed
Description
Is this a new feature, an improvement, or a change to existing functionality?
New Feature
How would you describe the priority of this feature request
Medium
Please provide a clear description of problem this feature solves
I ran into the following problems with the SID training notebook:
- The training dataset path is invalid and needs to be updated.
- The model path is invalid and needs to be updated.
- The training cell fails with this error:
AttributeError: 'BertEncoder' object has no attribute 'gradient_checkpointing'
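One possible cause, which is an assumption and not confirmed in this issue, is that the model object was pickled with an older `transformers` release whose `BertEncoder` did not set `gradient_checkpointing` as an instance attribute, and is now loaded with a newer release that reads it. A minimal workaround sketch under that assumption follows; the helper name `patch_gradient_checkpointing` is hypothetical, and it relies only on the model exposing `modules()` as `torch.nn.Module` does.

```python
def patch_gradient_checkpointing(model, class_names=("BertEncoder",)):
    """Set gradient_checkpointing=False on matching submodules that lack it.

    Assumption: `model` exposes `modules()` like torch.nn.Module does.
    Returns the number of submodules that were patched.
    """
    patched = 0
    for module in model.modules():
        is_target = type(module).__name__ in class_names
        if is_target and not hasattr(module, "gradient_checkpointing"):
            # Disabled checkpointing is the library's default behaviour.
            module.gradient_checkpointing = False
            patched += 1
    return patched
```

It would be called once on the loaded model object, before the training loop starts. If the count returned is 0, the attribute was already present and the error likely has a different cause.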
I ran into the following problems with the SID training script:
- The training dataset path in the example usage is invalid and needs to be updated.
- The model path in the example usage is invalid and needs to be updated.
- Running the script produces this traceback:
File "sid-minibert-20211021-script.py", line 31, in <module>
from sklearn.metrics import (f1_score, accuracy_score,
File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/sklearn/__init__.py", line 80, in <module>
from .base import clone
File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/sklearn/base.py", line 21, in <module>
from .utils import _IS_32BIT
File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/sklearn/utils/__init__.py", line 23, in <module>
from .class_weight import compute_class_weight, compute_sample_weight
File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/sklearn/utils/class_weight.py", line 7, in <module>
from .validation import _deprecate_positional_args
File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/sklearn/utils/validation.py", line 26, in <module>
from .fixes import _object_dtype_isnan
File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/sklearn/utils/fixes.py", line 18, in <module>
import scipy.stats
File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/scipy/stats/__init__.py", line 467, in <module>
from ._stats_py import *
File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/scipy/stats/_stats_py.py", line 46, in <module>
from . import distributions
File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/scipy/stats/distributions.py", line 8, in <module>
from ._distn_infrastructure import (rv_discrete, rv_continuous, rv_frozen)
File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/scipy/stats/_distn_infrastructure.py", line 24, in <module>
from scipy import optimize
File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/scipy/__init__.py", line 211, in __getattr__
return _importlib.import_module(f'scipy.{name}')
File "/opt/conda/envs/morpheus/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/scipy/optimize/__init__.py", line 413, in <module>
from ._linprog import linprog, linprog_verbose_callback
File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/scipy/optimize/_linprog.py", line 21, in <module>
from ._linprog_highs import _linprog_highs
File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/scipy/optimize/_linprog_highs.py", line 20, in <module>
from ._highs._highs_wrapper import _highs_wrapper
The above error goes away if the sklearn import is moved before the torch imports, but then this new error is seen:
Data Preprocessing...
Traceback (most recent call last):
File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/cudf/utils/utils.py", line 256, in __getattr__
return self[key]
File "/opt/conda/envs/morpheus/lib/python3.8/contextlib.py", line 75, in inner
return func(*args, **kwds)
File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/cudf/core/dataframe.py", line 1178, in __getitem__
return self._get_columns_by_label(arg, downcast=True)
File "/opt/conda/envs/morpheus/lib/python3.8/contextlib.py", line 75, in inner
return func(*args, **kwds)
File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/cudf/core/dataframe.py", line 1914, in _get_columns_by_label
new_data = super()._get_columns_by_label(labels, downcast)
File "/opt/conda/envs/morpheus/lib/python3.8/contextlib.py", line 75, in inner
return func(*args, **kwds)
File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/cudf/core/frame.py", line 411, in _get_columns_by_label
return self._data.select_by_label(labels)
File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/cudf/core/column_accessor.py", line 343, in select_by_label
return self._select_by_label_grouped(key)
File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/cudf/core/column_accessor.py", line 463, in _select_by_label_grouped
result = self._grouped_data[key]
KeyError: 'text'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "sid-minibert-20211021-script.py", line 240, in <module>
main()
File "sid-minibert-20211021-script.py", line 217, in main
train_dataloader, val_dataloader, idx2label = data_preprocessing(
File "sid-minibert-20211021-script.py", line 62, in data_preprocessing
tokenizer_output = cased_tokenizer(df.text,
File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/cudf/utils/utils.py", line 258, in __getattr__
raise AttributeError(
AttributeError: DataFrame object has no attribute text
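The traceback above shows `df.text` failing because the loaded DataFrame has no `text` column. A small guard run before tokenization would turn this into a clearer message. This is only a sketch: `require_columns` is a hypothetical helper, and the only column name taken from the issue is `text`; whatever else the script expects is not known here.

```python
def require_columns(df, required):
    """Raise a descriptive ValueError if any required column is absent.

    Works with any object exposing an iterable `columns` attribute,
    which both pandas and cudf DataFrames do.
    """
    available = [str(c) for c in df.columns]
    missing = [name for name in required if name not in available]
    if missing:
        raise ValueError(
            f"missing column(s) {missing}; available columns: {available}"
        )
    return df
```

Calling `require_columns(df, ["text"])` right after the dataset is read would list the columns that are actually present, which helps tell a wrong dataset path apart from a changed file schema.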
Describe your ideal solution
The errors may result from using different versions of torch and transformers than those used when the notebook and script were created. One solution would be to document those versions so they can be installed before running. Alternatively, the notebook and script could be updated to work with the torch version included with Morpheus.
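As a starting point for documenting versions, the environment in which the notebook and script are known to work could be recorded with a short snippet like the one below. It is a sketch using only the standard library; the package list is an assumption based on the imports visible in the tracebacks, and the distribution names may differ in a conda environment.

```python
from importlib import metadata


def installed_versions(
    packages=("torch", "transformers", "scikit-learn", "scipy", "cudf"),
):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}=={version}" if version else f"{name}: not installed")
```

The printed list could be pasted into the notebook and the script's docstring so that users can compare it against their own environment.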
Describe any alternatives you have considered
No response
Additional context
No response
Code of Conduct
- I agree to follow this project's Code of Conduct
- I have searched the open feature requests and have found no duplicates for this feature request