Loading partial dataset when debugging #2538

@reachtarunhere

I am using PyTorch Lightning along with datasets (thanks for so many datasets already prepared and the great splits).

Every time I call load_dataset for the imdb dataset, it takes some time, even if I specify a split containing very few samples. I guess this is due to hashing, as mentioned in other issues.

Is there a way to load only part of the dataset with load_dataset? This would really speed up my workflow.
Something like a debug mode would really help. Thanks!
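
To make the request concrete, here is a rough sketch of the kind of helper I have in mind. It assumes the streaming mode of load_dataset (streaming=True) is available in the installed version, which I have not verified for my setup. The names load_debug_subset and loader are hypothetical and only there for illustration; loader lets the helper be tried with a stand-in instead of the real library.

```python
from itertools import islice


def load_debug_subset(name, split="train", n=100, loader=None):
    """Return the first `n` examples of a split as a list of dicts.

    `load_debug_subset` and `loader` are hypothetical names for this sketch.
    By default the helper calls `datasets.load_dataset` in streaming mode.
    """
    if loader is None:
        # Imported lazily so the helper can be defined without `datasets` installed.
        from datasets import load_dataset
        loader = load_dataset
    # Assumption: streaming=True yields examples lazily instead of
    # preparing the whole dataset up front.
    stream = loader(name, split=split, streaming=True)
    # islice stops after n examples, so only the start of the split is read.
    return list(islice(stream, n))
```

With the real library this would be called as load_debug_subset("imdb", split="train", n=100). The result is a plain list, so it would still need to be wrapped in a torch Dataset or passed to a DataLoader for use with PyTorch Lightning.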
