Support for parallel blob downloads

**Is your feature request related to a problem? Please describe.**

I have a large list of blobs I want to download using the Azure SDK. I can loop over the list and call the client's `download_blob` for each blob sequentially, but this is very slow.


I implemented a class derived from `azure.storage.blob.ContainerClient` that uses `ThreadPoolExecutor` to do the downloads in parallel, with a new method with this interface:

```
    def download_blobs_to_files(
        self,
        blob_filename_pairs: Iterable[Tuple[str, str]],
        concurrency_limit: int = 1000,
        verbose: bool = False,
    ) -> int:
        """Downloads a list of blobs from an Azure blob container to local files.

        Args:
            blob_filename_pairs: Pairs of (blob name, local file path).
            concurrency_limit: Maximum number of threads.
            verbose: Controls the verbosity of the function.
        """
```

It works, but it is a bit brittle, and it is not clear how to automatically choose the right number of threads (`concurrency_limit`). Ideally this would be a feature supported by the Azure SDK. It seems to me to be a frequent user need.
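For reference, here is a minimal sketch of the kind of workaround described above, written as a standalone function rather than a `ContainerClient` subclass. It is not the exact implementation from this issue. It assumes `container_client` behaves like `azure.storage.blob.ContainerClient` (a `download_blob(name)` call returning an object with `readinto(stream)`). The default of 16 threads and the meaning of the return value (number of blobs downloaded) are assumptions made for illustration.

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Iterable, Tuple


def download_blobs_to_files(
    container_client,
    blob_filename_pairs: Iterable[Tuple[str, str]],
    concurrency_limit: int = 16,  # assumed default, not a recommendation
) -> int:
    """Download each (blob name, local path) pair; return the number downloaded."""

    def download_one(blob_name: str, path: str) -> None:
        # Create the target directory if the local path has one.
        directory = os.path.dirname(path)
        if directory:
            os.makedirs(directory, exist_ok=True)
        with open(path, "wb") as handle:
            # download_blob returns a downloader object; readinto writes
            # the blob contents into the open file.
            container_client.download_blob(blob_name).readinto(handle)

    completed = 0
    with ThreadPoolExecutor(max_workers=concurrency_limit) as pool:
        futures = [
            pool.submit(download_one, blob, path)
            for blob, path in blob_filename_pairs
        ]
        for future in as_completed(futures):
            future.result()  # re-raises any exception from that download
            completed += 1
    return completed
```

Even in this small form, the caller still has to pick `concurrency_limit` by hand and decide what to do when one download fails, which is the part that feels brittle.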




Support for parallel blob downloads #40270
