Issues on training model "_classcond" #3

Open
usernameisntavailableble opened this issue Oct 4, 2023 · 2 comments

usernameisntavailableble commented Oct 4, 2023

Hey! I was trying out threeseqdel_classcond (or threeseqabs_classcond), but I ran into two problems.

  1. Preprocessing the directory into a .npz file fails, e.g.
python data/qd.py /the/output/dir/cat threeseqdel_classcond

raises the following error:

Traceback (most recent call last):
  File "SOMEPATH/chirodiff_test/data/qd.py", line 281, in <module>
    dummy_sample = ds[0]
  File "SOMEPATH/chirodiff_test/data/qd.py", line 140, in __getitem__
    return self.represent(self.get_sketch(i))
  File "SOMEPATH/chirodiff_test/data/qd.py", line 201, in represent
    label = torch.tensor(sketch.label, dtype=torch.int64)
TypeError: an integer is required (got type NoneType)
  2. If I skip this step, as suggested in README.md, and start training directly. In config.yml I changed repr to threeseqdel_classcond. Under /the/output/dir/ I have two folders, cat and dog, each containing some sketches, so I set num_classes: typing.Optional[int] = 2 in config.yml, as suggested by line 74 of main.py, and set root_dir in config.yml to /the/output/dir/. The error then becomes
SOMEPATH/pytorch_lightning/utilities/data.py:103: UserWarning: Total length of `CombinedLoader` across ranks is zero. Please make sure this was your intention.
  rank_zero_warn(
`Trainer.fit` stopped: No training batches.

I guess this could be fixed easily if you could point out any special settings or path choices that the class-conditional models need?

Many thanks!

@dasayan05
Owner

It's been a while since I touched this code, but I can see what is going wrong in point 1. For the class-conditional case, you need multiple class folders, and you should point the script at the root folder, i.e. do this

python data/qd.py /the/output/dir/ threeseqdel_classcond
~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^~~~~~~~~~~~

.. and make sure the root folder contains one folder per class, each holding the sketches of that class

/the/output/dir/cat
/the/output/dir/dog
/the/output/dir/whatever

This populates the sketch.label property, which is where the error is coming from (it was None in your traceback). Maybe the readme isn't super clear about the class conditional part.
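To illustrate why the root folder matters, here is a minimal sketch of deriving integer labels from class folders. It is not the code in data/qd.py: the helper name `class_labels`, the alphabetical ordering, and the placeholder file names are all assumptions made for the example.

```python
import tempfile
from pathlib import Path


def class_labels(root_dir):
    """Map each immediate subfolder of root_dir to an integer label.

    Hypothetical helper: sorting by folder name is an assumption,
    not necessarily what data/qd.py does.
    """
    folders = sorted(p.name for p in Path(root_dir).iterdir() if p.is_dir())
    return {name: idx for idx, name in enumerate(folders)}


# Build a throwaway layout that mirrors /the/output/dir/{cat,dog,whatever}.
with tempfile.TemporaryDirectory() as root:
    for name in ("cat", "dog", "whatever"):
        class_dir = Path(root) / name
        class_dir.mkdir()
        (class_dir / "sketch_0.txt").touch()  # placeholder sketch file

    # Pointing at the root folder finds one label per class folder.
    labels = class_labels(root)

    # Pointing at a single class folder finds no subfolders, hence no labels.
    leaf_labels = class_labels(Path(root) / "cat")

print(labels)
print(leaf_labels)
```

Under these assumptions, passing /the/output/dir/cat leaves nothing to derive a label from, which is consistent with sketch.label being None in the traceback above.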

I have never seen the second error, but I wouldn't recommend the second approach anyway. The first approach will work if everything is set up correctly.

@usernameisntavailableble
Author

Thank you for your last message. It totally worked!

I am now training with threeseqabs_classcond on 7 categories with 50000 sketches each, but the following error comes up.

Traceback (most recent call last):
  File "SOMEPATH/chirodiff_test/main.py", line 944, in <module>
    cli = LightningCLI(
  File "SOMEPATH/conda_envs/cdiffnew/lib/python3.10/site-packages/pytorch_lightning/cli.py", line 359, in __init__
    self._run_subcommand(self.subcommand)
  File "SOMEPATH/conda_envs/cdiffnew/lib/python3.10/site-packages/pytorch_lightning/cli.py", line 650, in _run_subcommand
    fn(**fn_kwargs)
  File "SOMEPATH/conda_envs/cdiffnew/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 532, in fit
    call._call_and_handle_interrupt(
  File "SOMEPATH/conda_envs/cdiffnew/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 43, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "SOMEPATH/conda_envs/cdiffnew/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 571, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "SOMEPATH/conda_envs/cdiffnew/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 980, in _run
    results = self._run_stage()
  File "SOMEPATH/conda_envs/cdiffnew/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1023, in _run_stage
    self.fit_loop.run()
  File "SOMEPATH/conda_envs/cdiffnew/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 202, in run
    self.advance()
  File "SOMEPATH/conda_envs/cdiffnew/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 355, in advance
    self.epoch_loop.run(self._data_fetcher)
  File "SOMEPATH/conda_envs/cdiffnew/lib/python3.10/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 133, in run
    self.advance(data_fetcher)
  File "SOMEPATH/conda_envs/cdiffnew/lib/python3.10/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 190, in advance
    batch = next(data_fetcher)
  File "SOMEPATH/conda_envs/cdiffnew/lib/python3.10/site-packages/pytorch_lightning/loops/fetchers.py", line 126, in __next__
    batch = super().__next__()
  File "SOMEPATH/conda_envs/cdiffnew/lib/python3.10/site-packages/pytorch_lightning/loops/fetchers.py", line 58, in __next__
    batch = next(iterator)
  File "SOMEPATH/conda_envs/cdiffnew/lib/python3.10/site-packages/pytorch_lightning/utilities/combined_loader.py", line 285, in __next__
    out = next(self._iterator)
  File "SOMEPATH/conda_envs/cdiffnew/lib/python3.10/site-packages/pytorch_lightning/utilities/combined_loader.py", line 65, in __next__
    out[i] = next(self.iterators[i])
  File "SOMEPATH/conda_envs/cdiffnew/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 633, in __next__
    data = self._next_data()
  File "SOMEPATH/conda_envs/cdiffnew/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1345, in _next_data
    return self._process_data(data)
  File "SOMEPATH/conda_envs/cdiffnew/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1371, in _process_data
    data.reraise()
  File "SOMEPATH/conda_envs/cdiffnew/lib/python3.10/site-packages/torch/_utils.py", line 644, in reraise
    raise exception
IndexError: Caught IndexError in DataLoader worker process 1.
Original Traceback (most recent call last):
  File "/mnt/qb/work/bethge/bkr863/conda_envs/cdiffnew/lib/python3.10/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/mnt/qb/work/bethge/bkr863/conda_envs/cdiffnew/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/mnt/qb/work/bethge/bkr863/conda_envs/cdiffnew/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/mnt/qb/work/bethge/bkr863/conda_envs/cdiffnew/lib/python3.10/site-packages/torch/utils/data/dataset.py", line 298, in __getitem__
    return self.dataset[self.indices[idx]]
  File "/mnt/qb/work/bethge/bkr863/chirodiff_test/data/qd.py", line 130, in __getitem__
    return tuple(
  File "/mnt/qb/work/bethge/bkr863/chirodiff_test/data/qd.py", line 131, in <genexpr>
    torch.from_numpy(self.data[attr][i]) for attr in self.attrs
IndexError: index 348354 is out of bounds for axis 0 with size 348084
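For what it's worth, 7 categories of 50000 sketches would give 350000 samples, and the failing index 348354 lies below that while the indexed array has only 348084 rows. One possible way to narrow this down, assuming the preprocessed file is a .npz archive whose arrays are all supposed to share the same first-axis length, is to compare those lengths. The helper names and the attribute names "points" and "label" below are hypothetical; the demo uses small in-memory arrays rather than the real file.

```python
import numpy as np


def first_axis_lengths(arrays):
    """Return the number of rows of each named array."""
    return {name: len(arr) for name, arr in arrays.items()}


def too_short_for(arrays, index):
    """Return the names of arrays that cannot be indexed at `index`."""
    return [name for name, arr in arrays.items() if index >= len(arr)]


# Synthetic stand-in for the preprocessed data (attribute names are made up).
# For a real file one could build the dict like this:
#   data = np.load("path/to/file.npz")
#   arrays = {name: data[name] for name in data.files}
arrays = {
    "points": np.zeros((10, 3)),
    "label": np.zeros(8, dtype=np.int64),
}

lengths = first_axis_lengths(arrays)
short = too_short_for(arrays, 9)

print(lengths)
print(short)
```

If the lengths differ, or the dataset reports a length larger than some array, the sampled indices can run past the end of the shorter array, which would match the IndexError above. This is only a diagnostic guess, not a confirmed cause.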
