Upload a block blob with Python

Article
08/20/2024

This article shows how to upload a blob using the Azure Storage client library for Python. You can upload data to a block blob from a file path, a stream, a binary object, or a text string. You can also upload blobs with index tags.

To learn about uploading blobs using asynchronous APIs, see Upload blobs asynchronously.

Prerequisites

Azure subscription - create one for free
Azure storage account - create a storage account
Python 3.8+

Set up your environment

If you don't have an existing project, this section shows you how to set up a project to work with the Azure Blob Storage client library for Python. For more details, see Get started with Azure Blob Storage and Python.

To work with the code examples in this article, follow these steps to set up your project.

Install packages

Install the following packages using pip install:

pip install azure-storage-blob azure-identity

Add import statements

Add the following import statements:

import io
import os
import uuid
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient, ContainerClient, BlobBlock, BlobClient, StandardBlobTier

Authorization

The authorization mechanism must have the necessary permissions to upload a blob. For authorization with Microsoft Entra ID (recommended), you need Azure RBAC built-in role Storage Blob Data Contributor or higher. To learn more, see the authorization guidance for Put Blob (REST API) and Put Block (REST API).

Create a client object

To connect an app to Blob Storage, create an instance of BlobServiceClient. The following example shows how to create a client object using DefaultAzureCredential for authorization:

# TODO: Replace <storage-account-name> with your actual storage account name
account_url = "https://<storage-account-name>.blob.core.windows.net"
credential = DefaultAzureCredential()

# Create the BlobServiceClient object
blob_service_client = BlobServiceClient(account_url, credential=credential)

You can also create client objects for specific containers or blobs, either directly or from the BlobServiceClient object. To learn more about creating and managing client objects, see Create and manage client objects that interact with data resources.

Upload data to a block blob

To upload a blob using a stream or a binary object, use the following method:

upload_blob

This method creates a new blob from a data source with automatic chunking, meaning that the data source may be split into smaller chunks and uploaded. To perform the upload, the client library may use either Put Blob or a series of Put Block calls followed by Put Block List. This behavior depends on the overall size of the object and how the data transfer options are set.

Upload a block blob from a local file path

The following example uploads a file to a block blob using a ContainerClient object:

def upload_blob_file(self, blob_service_client: BlobServiceClient, container_name: str):
    container_client = blob_service_client.get_container_client(container=container_name)
    with open(file=os.path.join('filepath', 'filename'), mode="rb") as data:
        blob_client = container_client.upload_blob(name="sample-blob.txt", data=data, overwrite=True)

Upload a block blob from a stream

The following example creates random bytes of data and uploads a BytesIO object to a block blob using a BlobClient object:

def upload_blob_stream(self, blob_service_client: BlobServiceClient, container_name: str):
    blob_client = blob_service_client.get_blob_client(container=container_name, blob="sample-blob.txt")
    input_stream = io.BytesIO(os.urandom(15))
    blob_client.upload_blob(input_stream, blob_type="BlockBlob")

Upload binary data to a block blob

The following example uploads binary data to a block blob using a BlobClient object:

def upload_blob_data(self, blob_service_client: BlobServiceClient, container_name: str):
    blob_client = blob_service_client.get_blob_client(container=container_name, blob="sample-blob.txt")
    data = b"Sample data for blob"

    # Upload the blob data - default blob type is BlockBlob
    blob_client.upload_blob(data, blob_type="BlockBlob")
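
The introduction also mentions uploading a text string, which the examples above don't cover. The following is a minimal sketch of one way to do it, assuming `blob_client` is a BlobClient created as in the previous examples. The helper name `upload_blob_text` and the choice to encode the string explicitly before uploading are illustrative assumptions, not part of the original samples.

```python
from typing import Any


def upload_blob_text(blob_client: Any, text: str, encoding: str = "utf-8") -> int:
    """Encode `text` and upload it as a block blob; return the number of bytes sent.

    `blob_client` is assumed to be an azure.storage.blob.BlobClient, or any
    object that exposes a compatible upload_blob method (hypothetical helper).
    """
    payload = text.encode(encoding)
    # overwrite=True replaces an existing blob that has the same name
    blob_client.upload_blob(payload, blob_type="BlockBlob", overwrite=True)
    return len(payload)
```

Encoding explicitly keeps the byte count and character encoding visible to the caller, which can help when the blob is later read back as text.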

Upload a block blob with index tags

The following example uploads a block blob with index tags:

def upload_blob_tags(self, blob_service_client: BlobServiceClient, container_name: str):
    container_client = blob_service_client.get_container_client(container=container_name)
    sample_tags = {"Content": "image", "Date": "2022-01-01"}
    with open(file=os.path.join('filepath', 'filename'), mode="rb") as data:
        blob_client = container_client.upload_blob(name="sample-blob.txt", data=data, tags=sample_tags)
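
Before uploading with tags, it can be useful to check the tag dictionary locally. The sketch below assumes the service limits of at most 10 tags per blob, keys of 1 to 128 characters, and values of 0 to 256 characters; confirm these against the current blob index tags documentation. The function name `validate_index_tags` is hypothetical, and the check doesn't cover the allowed character set.

```python
from typing import Dict, List

# Assumed service limits for blob index tags; verify before relying on them
MAX_TAGS = 10
MAX_KEY_LENGTH = 128
MAX_VALUE_LENGTH = 256


def validate_index_tags(tags: Dict[str, str]) -> List[str]:
    """Return a list of problems found; an empty list means no problem was detected."""
    problems: List[str] = []
    if len(tags) > MAX_TAGS:
        problems.append(f"too many tags: {len(tags)} > {MAX_TAGS}")
    for key, value in tags.items():
        if not 1 <= len(key) <= MAX_KEY_LENGTH:
            problems.append(f"key length out of range: {key!r}")
        if len(value) > MAX_VALUE_LENGTH:
            problems.append(f"value too long for key {key!r}")
    return problems
```

For the sample tags used above, `validate_index_tags({"Content": "image", "Date": "2022-01-01"})` returns an empty list.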

Upload a block blob with configuration options

You can define client library configuration options when uploading a blob. These options can be tuned to improve performance, enhance reliability, and optimize costs. The following code examples show how to define configuration options for an upload both at the method level, and at the client level when instantiating BlobClient. These options can also be configured for a ContainerClient instance or a BlobServiceClient instance.

Specify data transfer options for upload

You can set configuration options when instantiating a client to optimize performance for data transfer operations. You can pass the following keyword arguments when constructing a client object in Python:

max_block_size - The maximum chunk size for uploading a block blob in chunks. Defaults to 4 MiB.
max_single_put_size - If the blob size is less than or equal to max_single_put_size, the blob is uploaded with a single Put Blob request. If the blob size is larger than max_single_put_size or unknown, the blob is uploaded in chunks using Put Block and committed using Put Block List. Defaults to 64 MiB.

For more information on transfer size limits for Blob Storage, see Scale targets for Blob storage.

For upload operations, you can also pass the max_concurrency argument when calling upload_blob. This argument defines the maximum number of parallel connections to use when the blob is uploaded in chunks, that is, when its size exceeds max_single_put_size (64 MiB by default).
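
To make the interaction between these settings concrete, the following sketch models which REST operations an upload would use for a given blob size. It is a simplified illustration of the rules described above, not the client library's implementation, and `plan_upload` is a hypothetical helper name.

```python
import math
from typing import List, Optional

MIB = 1024 * 1024


def plan_upload(
    blob_size: Optional[int],
    max_single_put_size: int = 64 * MIB,
    max_block_size: int = 4 * MIB,
) -> List[str]:
    """Return the REST operations this simplified model expects for an upload.

    Pass blob_size=None to model a stream whose length isn't known in advance.
    """
    if blob_size is not None and blob_size <= max_single_put_size:
        # Small enough to send in one request
        return ["Put Blob"]
    if blob_size is None:
        # Unknown length: blocks are staged until the stream is exhausted
        return ["Put Block (repeated until the stream ends)", "Put Block List"]
    # One Put Block per chunk of at most max_block_size bytes, then a commit
    block_count = math.ceil(blob_size / max_block_size)
    return ["Put Block"] * block_count + ["Put Block List"]
```

With the defaults, a 100 MiB blob maps to 25 Put Block calls plus one Put Block List, while an 8 MiB blob maps to a single Put Blob. In this model, max_concurrency only matters on the chunked path, because a single Put Blob request has nothing to parallelize.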

The following code example shows how to specify data transfer options when creating a BlobClient object, and how to upload data using that client object. The values provided in this sample aren't intended to be a recommendation. To properly tune these values, you need to consider the specific needs of your app.

def upload_blob_transfer_options(self, account_url: str, container_name: str, blob_name: str):
    # Create a BlobClient object with data transfer options for upload
    blob_client = BlobClient(
        account_url=account_url, 
        container_name=container_name, 
        blob_name=blob_name,
        credential=DefaultAzureCredential(),
        max_block_size=1024*1024*4, # 4 MiB
        max_single_put_size=1024*1024*8 # 8 MiB
    )
    
    with open(file=os.path.join(r'file_path', blob_name), mode="rb") as data:
        # BlobClient.upload_blob returns a dict of blob properties, so don't rebind blob_client
        blob_client.upload_blob(data=data, overwrite=True, max_concurrency=2)

To learn more about tuning data transfer options, see Performance tuning for uploads and downloads with Python.

Set a blob's access tier on upload

You can set a blob's access tier on upload by passing the standard_blob_tier keyword argument to upload_blob. Azure Storage offers different access tiers so that you can store your blob data in the most cost-effective manner based on how it's being used.

The following code example shows how to set the access tier when uploading a blob:

def upload_blob_access_tier(self, blob_service_client: BlobServiceClient, container_name: str, blob_name: str):
    blob_client = blob_service_client.get_blob_client(container=container_name, blob=blob_name)
    
    # Upload blob to the cool tier
    with open(file=os.path.join(r'file_path', blob_name), mode="rb") as data:
        # BlobClient.upload_blob returns a dict of blob properties, so don't rebind blob_client
        blob_client.upload_blob(data=data, overwrite=True, standard_blob_tier=StandardBlobTier.COOL)

Setting the access tier is only allowed for block blobs. You can set the access tier for a block blob to Hot, Cool, Cold, or Archive. To set the access tier to Cold, you must use a minimum client library version of 12.15.0.
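
If your app might run against an older installation of the client library, you can check the version before requesting the Cold tier. The following is a minimal sketch; the helper names `parse_version` and `supports_cold_tier` are hypothetical, and the parser deliberately ignores pre-release suffixes such as `b1`.

```python
from typing import Tuple

# Minimum azure-storage-blob version that supports the Cold tier, per this article
MIN_COLD_TIER_VERSION = (12, 15, 0)


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse 'major.minor.patch' into a tuple of ints, ignoring any suffix."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    # Pad short versions such as '12.15' so tuple comparison behaves as expected
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def supports_cold_tier(version: str) -> bool:
    """Return True if the given library version is at least 12.15.0."""
    return parse_version(version) >= MIN_COLD_TIER_VERSION
```

The installed version string can be read with `importlib.metadata.version("azure-storage-blob")` from the standard library and passed to `supports_cold_tier`. Comparing integer tuples avoids the pitfall of string comparison, where "12.9.0" would sort after "12.15.0".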

To learn more about access tiers, see Access tiers overview.

Upload a block blob by staging blocks and committing

You can have greater control over how to divide uploads into blocks by manually staging individual blocks of data. When all of the blocks that make up a blob are staged, you can commit them to Blob Storage.

Use the following method to create a new block to be committed as part of a blob:

stage_block

Use the following method to write a blob by specifying the list of block IDs that make up the blob:

commit_block_list

The following example reads data from a file and stages blocks to be committed as part of a blob:

def upload_blocks(self, blob_container_client: ContainerClient, local_file_path: str, block_size: int):
    file_name = os.path.basename(local_file_path)
    blob_client = blob_container_client.get_blob_client(file_name)

    with open(file=local_file_path, mode="rb") as file_stream:
        block_id_list = []

        while True:
            buffer = file_stream.read(block_size)
            if not buffer:
                break

            block_id = uuid.uuid4().hex
            block_id_list.append(BlobBlock(block_id=block_id))

            blob_client.stage_block(block_id=block_id, data=buffer, length=len(buffer))

        blob_client.commit_block_list(block_id_list)
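
The example above uses random UUID-based block IDs. As an alternative, the sketch below generates sequential, zero-padded IDs, so that a failed block could be staged again under the same ID. It relies on the Put Block requirement that all block IDs for a given blob have the same length. The generator name `iter_blocks` and the `block-000000` ID format are illustrative assumptions, not part of the original sample.

```python
import io
from typing import BinaryIO, Iterator, Tuple


def iter_blocks(stream: BinaryIO, block_size: int) -> Iterator[Tuple[str, bytes]]:
    """Yield (block_id, chunk) pairs read from `stream` in block_size pieces.

    Block IDs are zero-padded so that every ID has the same length.
    """
    if block_size <= 0:
        raise ValueError("block_size must be a positive integer")
    index = 0
    while True:
        chunk = stream.read(block_size)
        if not chunk:
            break
        yield f"block-{index:06d}", chunk
        index += 1


# Local demonstration with an in-memory stream; no storage account is needed
blocks = list(iter_blocks(io.BytesIO(b"abcdefghij"), block_size=4))
```

Each pair could then be passed to stage_block, with a BlobBlock collected per ID for commit_block_list, following the same pattern as the example above.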

Upload blobs asynchronously

The Azure Blob Storage client library for Python supports uploading blobs asynchronously. To learn more about project setup requirements, see Asynchronous programming.

Follow these steps to upload a blob using asynchronous APIs:

Add the following import statements:

import asyncio
import os

from azure.identity.aio import DefaultAzureCredential
from azure.storage.blob.aio import BlobServiceClient, BlobClient, ContainerClient

Add code to run the program using asyncio.run. This function runs the passed coroutine, main() in our example, and manages the asyncio event loop. Coroutines are declared with the async/await syntax. In this example, the main() coroutine first creates the top level BlobServiceClient using async with, then calls the method that uploads the blob. Note that only the top level client needs to use async with, as other clients created from it share the same connection pool.

async def main():
    sample = BlobSamples()

    # TODO: Replace <storage-account-name> with your actual storage account name
    account_url = "https://<storage-account-name>.blob.core.windows.net"
    credential = DefaultAzureCredential()

    async with BlobServiceClient(account_url, credential=credential) as blob_service_client:
        await sample.upload_blob_file(blob_service_client, "sample-container")

if __name__ == '__main__':
    asyncio.run(main())

Add code to upload the blob. The following example uploads a blob from a local file path using a ContainerClient object. The code is the same as the synchronous example, except that the method is declared with the async keyword and the await keyword is used when calling the upload_blob method.

async def upload_blob_file(self, blob_service_client: BlobServiceClient, container_name: str):
    container_client = blob_service_client.get_container_client(container=container_name)
    with open(file=os.path.join('filepath', 'filename'), mode="rb") as data:
        blob_client = await container_client.upload_blob(name="sample-blob.txt", data=data, overwrite=True)

With this basic setup in place, you can implement other examples in this article as coroutines using async/await syntax.
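
One benefit of the asynchronous client is that several uploads can be in flight at once. The following sketch shows one possible pattern using asyncio.gather with a semaphore to cap concurrency. It assumes `container_client` is an asynchronous ContainerClient (from azure.storage.blob.aio) or a compatible object; the helper name `upload_many` and the `limit` parameter are hypothetical.

```python
import asyncio
from typing import Any, Dict, List


async def upload_many(container_client: Any, blobs: Dict[str, bytes], limit: int = 4) -> List[str]:
    """Upload each name/data pair concurrently and return the uploaded names.

    At most `limit` uploads run at the same time.
    """
    semaphore = asyncio.Semaphore(limit)

    async def upload_one(name: str, data: bytes) -> str:
        async with semaphore:
            await container_client.upload_blob(name=name, data=data, overwrite=True)
        return name

    # gather preserves the order of the awaitables it is given
    return await asyncio.gather(*(upload_one(name, data) for name, data in blobs.items()))
```

Inside the `async with BlobServiceClient(...)` block shown earlier, this could be called as `await upload_many(blob_service_client.get_container_client("sample-container"), {"a.txt": b"1"})`. The right value for `limit` depends on your app and network conditions.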

Resources

To learn more about uploading blobs using the Azure Blob Storage client library for Python, see the following resources.

Code samples

View synchronous or asynchronous code samples from this article (GitHub)

REST API operations

The Azure SDK for Python contains libraries that build on top of the Azure REST API, allowing you to interact with REST API operations through familiar Python paradigms. The client library methods for uploading blobs use the following REST API operations:

Put Blob (REST API)
Put Block (REST API)
