AssertionError: db_dir must be specified #9

Open
usanzhu opened this issue Apr 16, 2023 · 4 comments

usanzhu commented Apr 16, 2023

Hello, I ran into the error above while trying to run the CPSC2021 dataset. How can I resolve it?

Collaborator

wenh06 commented Apr 17, 2023

Was the error raised when using torch_ecg.databases.CPSC2021 or torch_ecg.databases.datasets.CPSC2021Dataset?

Currently, passing an empty db_dir should raise a warning rather than an error, for example:

from torch_ecg.databases import CPSC2021

dr = CPSC2021()

TorchECG-CPSC2021 - INFO - Please wait patiently to let the reader aggregate statistics on the whole dataset...
TorchECG-CPSC2021 - INFO - Done in 0.00651 seconds!
TorchECG-CPSC2021 - INFO - Please wait several minutes patiently to let the reader list records for each diagnosis...
TorchECG-CPSC2021 - INFO - Done in 0.00097 seconds!
/home/wenh06/Jupyter/wenhao/workspace/torch_ecg/torch_ecg/databases/base.py:161: RuntimeWarning: db_dir is not specified, using default /home/wenh06/.cache/torch_ecg/data/cpsc2021 as the storage path
warnings.warn(
/home/wenh06/Jupyter/wenhao/workspace/torch_ecg/torch_ecg/databases/base.py:169: RuntimeWarning: /home/wenh06/.cache/torch_ecg/data/cpsc2021 does not exist. It is now created. Please check if it is set correctly. Or if you may want to download the database into this folder, please use the download() method.
warnings.warn(

Collaborator

wenh06 commented Apr 19, 2023

I might have found the problem. The db_dir field of the config class instance passed to torch_ecg.databases.datasets.CPSC2021Dataset is None by default, so it has to be set explicitly.
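A minimal sketch of the situation described above, using stand-ins rather than the real torch_ecg classes: `train_cfg` and `resolve_db_dir` are hypothetical names introduced only for illustration, and the path is a placeholder. It only shows why a config whose db_dir is left at None trips the assertion, and that setting the field beforehand avoids it.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the training config; the real config class and
# its other fields are not shown in this thread.
train_cfg = SimpleNamespace(db_dir=None)


def resolve_db_dir(config):
    """Mimic the check described above: refuse to continue without db_dir."""
    assert config.db_dir is not None, "db_dir must be specified"
    return config.db_dir


# With the default of None, the assertion fires.
try:
    resolve_db_dir(train_cfg)
except AssertionError as err:
    print(f"AssertionError: {err}")

# Setting the field before building the dataset avoids the assertion.
train_cfg.db_dir = "/path/to/cpsc2021"  # placeholder path, adjust locally
print(resolve_db_dir(train_cfg))
```

In the real code the equivalent step would be assigning db_dir on the config instance before it is passed to the dataset class.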

Author

usanzhu commented Apr 19, 2023

Thank you, I solved this problem. But two more problems have come up:

The first one is:

train_config.main.loss_kw = ED(gamma_pos=0, gamma_neg=1, implementation="deep-psp")

In this line, the function 'ED' is not defined.

The other one comes up when I use the CPSC2021 sample data provided in the documentation:

ValueError: a must be a sequence or an integer, not <class 'set'>

It appears when the code runs to this line:
ds_train = CPSC2021(TrainCfg, training=True, task=task, lazy=False)

which is splitting the dataset into test and training sets:
DEFAULTS.RNG_sample(
    afp_subjects, round(len(afp_subjects) * _test_ratio / 100)
).tolist()

I think if I use my own dataset, this problem may occur again, so I would appreciate it if you could help me out with this.

Collaborator

wenh06 commented Apr 19, 2023

ED stands for easydict.EasyDict, which was previously used as the configuration class and has since been replaced by torch_ecg.cfg.CFG. I plan to replace torch_ecg.cfg.CFG with a dataclass because it still has bugs that are hard to fix, but I do not yet have a clear idea of how to do the replacement.
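For anyone still hitting the undefined ED in older scripts, here is a hedged sketch of how the alias could be defined. It assumes the easydict package is installed; if it is not, it falls back to a small stand-in class written only for this example (it is not torch_ecg.cfg.CFG and does not reproduce its behavior). In current code, torch_ecg.cfg.CFG would presumably take the place of ED.

```python
try:
    # The alias the older benchmark code relied on.
    from easydict import EasyDict as ED
except ImportError:
    class ED(dict):
        """Illustrative stand-in: a dict whose keys are also attributes."""

        def __getattr__(self, name):
            try:
                return self[name]
            except KeyError as exc:
                raise AttributeError(name) from exc

        def __setattr__(self, name, value):
            self[name] = value


# The keyword arguments from the line quoted in the question.
loss_kw = ED(gamma_pos=0, gamma_neg=1, implementation="deep-psp")
print(loss_kw.gamma_neg, loss_kw["implementation"])
```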

The second error occurs because the first argument of DEFAULTS.RNG_sample cannot be a set. This is fixed in de53269. The bug had already been fixed in cpsc2021_dataset.py but was left untreated in the train_hybrid_cpsc2021 benchmark study.
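The set problem can be reproduced outside torch_ecg. The sketch below assumes DEFAULTS.RNG_sample behaves like NumPy's Generator.choice with replace=False, which raises the same message when given a set. The helper `sample_test_subjects`, the subject names, and the ratio are made up for illustration, and converting to a sorted list is just one way around the error, not necessarily the exact change made in de53269.

```python
import numpy as np

rng = np.random.default_rng(42)

# Made-up subject IDs held in a set, as in the failing call.
afp_subjects = {f"subject_{i:02d}" for i in range(10)}
test_ratio = 20  # percent of subjects to put in the test split


def sample_test_subjects(subjects, ratio_percent, rng):
    """Pick the test subjects without replacement from any collection."""
    # Sets are neither sequences nor integers, so convert first; sorting
    # also makes the result reproducible for a fixed seed.
    pool = sorted(subjects)
    n_test = round(len(pool) * ratio_percent / 100)
    return rng.choice(pool, n_test, replace=False).tolist()


# Passing the set directly reproduces the error from the question.
try:
    rng.choice(afp_subjects, 2, replace=False)
except ValueError as err:
    print(f"ValueError: {err}")

test_subjects = sample_test_subjects(afp_subjects, test_ratio, rng)
print(test_subjects)
```

The same conversion applies to a custom dataset: as long as the collection of subject IDs is turned into a list (or array) before sampling, the split should not hit this error.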

I should make a plan to update the benchmark studies and corresponding test files.
