Commit aae797f

[Fault Tolerance] Don't check the len of a dataset, but its instance. (#10432)

Co-authored-by: Thomas Chaton <thomas@grid.ai>
awaelchli and tchaton committed Nov 9, 2021
1 parent 295f62f commit aae797f
Showing 2 changed files with 5 additions and 3 deletions.
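
The old branch called `len()` on the dataset directly. That raises `TypeError` for a map-style dataset that does not implement `__len__`, and evaluates to a falsy `0` for an empty dataset, which the truthiness check would route into the `MisconfigurationException` branch. A minimal sketch of the first failure mode (`NoLenDataset` is illustrative, not from this commit):

from torch.utils.data import Dataset

class NoLenDataset(Dataset):
    # a map-style dataset that never implements __len__
    def __getitem__(self, index):
        return index

len(NoLenDataset())  # TypeError: object of type 'NoLenDataset' has no len()

The patch instead asks `get_len` for the length, which falls back to `float("inf")` when `__len__` is unavailable, so the decision rests on the dataset's type and a finite-length check rather than on truthiness.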
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -17,6 +17,7 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - Fixed the logging with `on_step=True` in epoch-level hooks causing unintended side-effects. Logging with `on_step=True` in epoch-level hooks will now correctly raise an error ([#10409](https://github.com/PyTorchLightning/pytorch-lightning/pull/10409))
 - Fixed deadlocks for distributed training with `RichProgressBar` ([#10428](https://github.com/PyTorchLightning/pytorch-lightning/pull/10428))
 - Fixed an issue where the model wrapper in Lite converted non-floating point tensors to float ([#10429](https://github.com/PyTorchLightning/pytorch-lightning/pull/10429))
+- Fixed an issue with inferring the dataset type in fault-tolerant training ([#10432](https://github.com/PyTorchLightning/pytorch-lightning/pull/10432))
 
 
 ## [1.5.0] - 2021-11-02
7 changes: 4 additions & 3 deletions pytorch_lightning/trainer/data_loading.py
@@ -37,7 +37,7 @@
     CaptureMapDataset,
     FastForwardSampler,
 )
-from pytorch_lightning.utilities.data import has_iterable_dataset, has_len_all_ranks
+from pytorch_lightning.utilities.data import get_len, has_iterable_dataset, has_len_all_ranks
 from pytorch_lightning.utilities.enums import DistributedType
 from pytorch_lightning.utilities.exceptions import MisconfigurationException
 from pytorch_lightning.utilities.imports import _fault_tolerant_training
@@ -282,10 +282,11 @@ def _get_dataloader_init_kwargs(
dl_kwargs["sampler"] = None

if _fault_tolerant_training():
if isinstance(dl_kwargs["dataset"], IterableDataset):
dataset = dl_kwargs["dataset"]
if isinstance(dataset, IterableDataset):
# wrap the `IterableDataset` into a `CaptureIterableDataset` to record sampler states.
dl_kwargs["dataset"] = CaptureIterableDataset(dataset=dl_kwargs["dataset"])
elif len(dl_kwargs["dataset"]):
elif get_len(dataset) != float("inf"):
dl_kwargs["dataset"] = CaptureMapDataset(dataset=dl_kwargs["dataset"])
else:
raise MisconfigurationException(
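
A rough sketch of how the patched branch classifies datasets, assuming `get_len` returns the length when `__len__` is implemented and `float("inf")` otherwise (its documented fallback; both dataset classes below are illustrative):

from torch.utils.data import Dataset, IterableDataset
from pytorch_lightning.utilities.data import get_len

class MapStyle(Dataset):
    def __len__(self):
        return 10

    def __getitem__(self, index):
        return index

class Stream(IterableDataset):
    def __iter__(self):
        return iter(range(10))

print(get_len(MapStyle()))  # 10 -> finite, so it is wrapped in CaptureMapDataset
print(get_len(Stream()))    # inf -> but IterableDataset is caught by the isinstance branch first

An empty map-style dataset now reports `get_len(...) == 0`, which is finite and therefore wrapped, where the old `len(...)` truthiness check would have fallen through to `MisconfigurationException`.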
