### Describe the bug Running ``` import datasets ds = datasets.load_dataset('bigcode/the-stack-dedup', streaming=True) ``` takes about 2.5 minutes! I would expect this to be near instantaneous. With other datasets, the runtime is one or two seconds. ### Environment info - `datasets` version: 2.11.0 - Platform: macOS-13.3.1-arm64-arm-64bit - Python version: 3.10.10 - Huggingface_hub version: 0.13.4 - PyArrow version: 11.0.0 - Pandas version: 2.0.0