Download a blob with Python

This article shows how to download a blob using the Azure Storage client library for Python. You can download blob data to various destinations, including a local file path, stream, or text string. You can also open a blob stream and read from it.

Prerequisites

  • This article assumes you already have a project set up to work with the Azure Blob Storage client library for Python. To learn about setting up your project, including package installation, adding import statements, and creating an authorized client object, see Get started with Azure Blob Storage and Python.
  • The authorization mechanism must have permissions to perform a download operation. To learn more, see the authorization guidance for the Get Blob REST API operation.

Download a blob

You can use the download_blob method of a BlobClient object to download a blob.

The download_blob method returns a StorageStreamDownloader object. During a download, the client libraries split the download request into chunks, where each chunk is downloaded with a separate Get Blob range request. This behavior depends on the total size of the blob and how the data transfer options are set.

Download to a file path

The following example downloads a blob to a file path:

import os
from azure.storage.blob import BlobServiceClient

def download_blob_to_file(self, blob_service_client: BlobServiceClient, container_name):
    blob_client = blob_service_client.get_blob_client(container=container_name, blob="sample-blob.txt")
    # Replace 'filepath' and 'filename' with the local destination for the blob
    with open(file=os.path.join(r'filepath', 'filename'), mode="wb") as sample_blob:
        download_stream = blob_client.download_blob()
        sample_blob.write(download_stream.readall())
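
For large blobs, readall() holds the entire blob in memory before anything is written to disk. The following is a minimal sketch of an alternative that writes each chunk as it arrives. The helper name write_chunks_to_file and the stand-in data are hypothetical and exist only so the sketch runs without a storage account; with a real client, you would pass the iterator returned by blob_client.download_blob().chunks().

```python
import os
import tempfile
from typing import Iterable


def write_chunks_to_file(chunks: Iterable[bytes], file_path: str) -> int:
    """Write each chunk to file_path as it arrives and return the total byte count."""
    total = 0
    with open(file_path, mode="wb") as local_file:
        for chunk in chunks:
            local_file.write(chunk)
            total += len(chunk)
    return total


# Stand-in data (an assumption for illustration). With a real client, pass
# blob_client.download_blob().chunks() instead of this list.
fake_chunks = [b"hello, ", b"blob ", b"storage"]
target_path = os.path.join(tempfile.mkdtemp(), "sample-blob.txt")
bytes_written = write_chunks_to_file(fake_chunks, target_path)
print(f"Wrote {bytes_written} bytes to {target_path}")
```

Because the helper accepts any iterable of bytes, it can be exercised locally before it is pointed at a real download stream.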

Download to a stream

The following example downloads a blob to a stream. In this example, StorageStreamDownloader.readinto downloads the blob contents to a stream and returns the number of bytes read:

import io
from azure.storage.blob import BlobServiceClient

def download_blob_to_stream(self, blob_service_client: BlobServiceClient, container_name):
    blob_client = blob_service_client.get_blob_client(container=container_name, blob="sample-blob.txt")

    # readinto() downloads the blob contents to a stream and returns the number of bytes read
    stream = io.BytesIO()
    num_bytes = blob_client.download_blob().readinto(stream)
    print(f"Number of bytes: {num_bytes}")
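
One detail worth keeping in mind when you read the data back: writing to an io.BytesIO object leaves its position at the end of the written data, so a subsequent read() returns nothing until the stream is rewound. The sketch below shows this with plain io.BytesIO. The write call is a stand-in for readinto, on the assumption that readinto writes to the stream sequentially.

```python
import io

stream = io.BytesIO()

# Stand-in for: num_bytes = blob_client.download_blob().readinto(stream)
num_bytes = stream.write(b"sample blob contents")

# The position is now at the end of the stream, so this read returns b""
at_end = stream.read()

# Rewind to the start before reading the downloaded data
stream.seek(0)
contents = stream.read()

print(f"Number of bytes: {num_bytes}")
print(f"Read before rewinding: {at_end!r}")
print(f"Read after rewinding: {contents!r}")
```

Calling stream.getvalue() is another option for io.BytesIO, because it returns the full buffer regardless of the current position.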

Download a blob in chunks

The following example downloads a blob and iterates over chunks in the download stream. In this example, StorageStreamDownloader.chunks returns an iterator, which allows you to read the blob content in chunks:

def download_blob_chunks(self, blob_service_client: BlobServiceClient, container_name):
    blob_client = blob_service_client.get_blob_client(container=container_name, blob="sample-blob.txt")

    # This returns a StorageStreamDownloader
    stream = blob_client.download_blob()
    chunk_list = []

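    # Read data in chunks to avoid loading all into memory at once
    for chunk in stream.chunks():
        # Process your data (anything can be done here - 'chunk' is a bytes object).
        # Appending every chunk to a list, as done here for illustration, still keeps the whole blob in memory.
        chunk_list.append(chunk)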
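
As one illustration of per-chunk processing that keeps memory use bounded, the following sketch computes a SHA-256 checksum incrementally. The helper name sha256_of_chunks and the stand-in chunk list are hypothetical; with a real client, you would pass stream.chunks() from the example above.

```python
import hashlib
from typing import Iterable


def sha256_of_chunks(chunks: Iterable[bytes]) -> str:
    """Feed each chunk to the hash as it arrives and return the hex digest."""
    digest = hashlib.sha256()
    for chunk in chunks:
        # Only the current chunk is held in memory at this point
        digest.update(chunk)
    return digest.hexdigest()


# Stand-in data (an assumption for illustration). With a real client, pass
# blob_client.download_blob().chunks() instead of this list.
fake_chunks = [b"abc", b"def"]
checksum = sha256_of_chunks(fake_chunks)
print(f"SHA-256: {checksum}")
```

The result matches hashing the concatenated bytes in one call, so the chunk boundaries don't affect the checksum.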

Download to a string

The following example downloads blob contents as text. In this example, the encoding parameter is necessary for readall() to return a string; otherwise, it returns bytes:

def download_blob_to_string(self, blob_service_client: BlobServiceClient, container_name):
    blob_client = blob_service_client.get_blob_client(container=container_name, blob="sample-blob.txt")

    # encoding param is necessary for readall() to return str, otherwise it returns bytes
    downloader = blob_client.download_blob(max_concurrency=1, encoding='UTF-8')
    blob_text = downloader.readall()
    print(f"Blob contents: {blob_text}")

Download a block blob with configuration options

You can define client library configuration options when downloading a blob. These options can be tuned to improve performance and enhance reliability. The following code examples show how to define configuration options for a download both at the method level, and at the client level when instantiating BlobClient. These options can also be configured for a ContainerClient instance or a BlobServiceClient instance.

Specify data transfer options on download

You can set configuration options when instantiating a client to optimize performance for data transfer operations. You can pass the following keyword arguments when constructing a client object in Python:

  • max_chunk_get_size - The maximum chunk size used for downloading a blob. Defaults to 4 MiB.
  • max_single_get_size - The maximum size for a blob to be downloaded in a single call. If the total blob size exceeds max_single_get_size, the remainder of the blob data is downloaded in chunks. Defaults to 32 MiB.

For download operations, you can also pass the max_concurrency argument when calling download_blob. This argument defines the maximum number of parallel connections for the download operation.
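
To make the interaction between these two sizes concrete, the following sketch models how a blob of a given size might be divided into byte ranges. It is a simplified model based only on the descriptions above, not the client library's implementation, and the function name plan_download_ranges is hypothetical.

```python
from typing import List, Tuple

MiB = 1024 * 1024


def plan_download_ranges(
    blob_size: int,
    max_single_get_size: int = 32 * MiB,
    max_chunk_get_size: int = 4 * MiB,
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) byte ranges for a simplified download plan."""
    if blob_size <= 0:
        return []

    # The first request covers at most max_single_get_size bytes
    first_end = min(blob_size, max_single_get_size) - 1
    ranges = [(0, first_end)]

    # Any remainder is split into chunks of at most max_chunk_get_size bytes
    start = first_end + 1
    while start < blob_size:
        end = min(start + max_chunk_get_size, blob_size) - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


# A 42 MiB blob with the default sizes: one 32 MiB request, then 4 + 4 + 2 MiB
ranges = plan_download_ranges(blob_size=42 * MiB)
for start, end in ranges:
    print(f"bytes={start}-{end} ({end - start + 1} bytes)")
```

In this model, a blob no larger than max_single_get_size needs a single request, and max_concurrency would bound how many of the remaining chunk requests run in parallel.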

The following code example shows how to specify data transfer options when creating a BlobClient object, and how to download data using that client object. The values provided in this sample aren't intended to be a recommendation. To properly tune these values, you need to consider the specific needs of your app.

import os
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobClient

def download_blob_transfer_options(self, account_url: str, container_name: str, blob_name: str):
    # Create a BlobClient object with data transfer options for download
    blob_client = BlobClient(
        account_url=account_url,
        container_name=container_name,
        blob_name=blob_name,
        credential=DefaultAzureCredential(),
        max_single_get_size=1024*1024*32, # 32 MiB
        max_chunk_get_size=1024*1024*4 # 4 MiB
    )

    # Replace 'file_path' and 'file_name' with the local destination for the blob
    with open(file=os.path.join(r'file_path', 'file_name'), mode="wb") as sample_blob:
        # Allow up to two parallel connections for this download
        download_stream = blob_client.download_blob(max_concurrency=2)
        sample_blob.write(download_stream.readall())

Resources

To learn more about how to download blobs using the Azure Blob Storage client library for Python, see the following resources.

REST API operations

The Azure SDK for Python contains libraries that build on top of the Azure REST API, allowing you to interact with REST API operations through familiar Python paradigms. The client library methods for downloading blobs use the Get Blob REST API operation.

Code samples

Client library resources