[BUG] dask_cudf.read_json seems to be freezing when given a path with large number of files #12951

Open
@VibhuJawa

Describe the bug

dask_cudf.read_json appears to hang when given a glob path that matches a large number of files. Passing the list of files directly works.

The following call appears to hang:

text_ddf = dask_cudf.read_json(f'{INPUT_PATH}/data/*', engine='cudf', lines=True)

The following works:

import os

files = list(map(lambda x: os.path.join(data_path, x), os.listdir(data_path)))
text_ddf = dask_cudf.read_json(files, engine='cudf', lines=True)
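Until the glob case is fixed, the working variant can be wrapped so callers still pass a pattern. The sketch below is only a workaround under stated assumptions, not a diagnosis of the hang: expand_json_paths and read_json_from_glob are hypothetical helper names, and the dask_cudf.read_json arguments are copied from the working call in this report. Note that glob with a trailing * skips hidden files and the helper filters out subdirectories, so the result can differ from os.listdir.

```python
import glob
import os


def expand_json_paths(pattern):
    """Expand a glob pattern into a sorted list of regular files."""
    # glob.glob makes no ordering guarantee, so sort for a deterministic list.
    return sorted(p for p in glob.glob(pattern) if os.path.isfile(p))


def read_json_from_glob(pattern):
    """Hypothetical helper: expand the glob first, then pass the list on."""
    import dask_cudf  # imported lazily so the path helper works without a GPU

    files = expand_json_paths(pattern)
    if not files:
        raise FileNotFoundError(f"no files match {pattern!r}")
    # Same arguments as the working call in this report.
    return dask_cudf.read_json(files, engine="cudf", lines=True)
```

Usage would look like text_ddf = read_json_from_glob(f'{INPUT_PATH}/data/*'), which keeps the call site close to the original while avoiding the glob path inside dask_cudf.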

Additional context
CC: @ayushdg

Labels

bug: Something isn't working
cuIO: cuIO issue
dask: Dask issue
libcudf: Affects libcudf (C++/CUDA) code.
