Copy a blob with asynchronous scheduling using Python
This article shows how to copy a blob with asynchronous scheduling using the Azure Storage client library for Python. You can copy a blob from a source within the same storage account, from a source in a different storage account, or from any accessible object retrieved via HTTP GET request on a given URL. You can also abort a pending copy operation.
The client library methods covered in this article use the Copy Blob REST API operation, and can be used when you want to perform a copy with asynchronous scheduling. For most copy scenarios where you want to move data into a storage account and have a URL for the source object, see Copy a blob from a source object URL with Python.
Prerequisites
- Azure subscription - create one for free
- Azure storage account - create a storage account
- Python 3.8+
Set up your environment
If you don't have an existing project, this section shows you how to set up a project to work with the Azure Blob Storage client library for Python. For more details, see Get started with Azure Blob Storage and Python.
To work with the code examples in this article, follow these steps to set up your project.
Install packages
Install the following packages using pip install
:
pip install azure-storage-blob azure-identity
Add import statements
Add the following import
statements:
import datetime
from azure.identity import DefaultAzureCredential
from azure.storage.blob import (
BlobServiceClient,
BlobClient,
BlobLeaseClient,
BlobSasPermissions,
generate_blob_sas
)
Authorization
The authorization mechanism must have the necessary permissions to perform a copy operation, or to abort a pending copy. For authorization with Microsoft Entra ID (recommended), the least privileged Azure RBAC built-in role varies based on several factors. To learn more, see the authorization guidance for Copy Blob (REST API) or Abort Copy Blob (REST API).
Create a client object
To connect an app to Blob Storage, create an instance of BlobServiceClient. The following example shows how to create a client object using DefaultAzureCredential
for authorization:
# TODO: Replace <storage-account-name> with your actual storage account name
account_url = "https://<storage-account-name>.blob.core.windows.net"
credential = DefaultAzureCredential()
# Create the BlobServiceClient object
blob_service_client = BlobServiceClient(account_url, credential=credential)
You can also create client objects for specific containers or blobs, either directly or from the BlobServiceClient
object. To learn more about creating and managing client objects, see Create and manage client objects that interact with data resources.
About copying blobs with asynchronous scheduling
The Copy Blob
operation can finish asynchronously and is performed on a best-effort basis, which means that the operation isn't guaranteed to start immediately or complete within a specified time frame. The copy operation is scheduled in the background and performed as the server has available resources. The operation can complete synchronously if the copy occurs within the same storage account.
A Copy Blob
operation can perform any of the following actions:
- Copy a source blob to a destination blob with a different name. The destination blob can be an existing blob of the same blob type (block, append, or page), or it can be a new blob created by the copy operation.
- Copy a source blob to a destination blob with the same name, which replaces the destination blob. This type of copy operation removes any uncommitted blocks and overwrites the destination blob's metadata.
- Copy a source file in the Azure File service to a destination blob. The destination blob can be an existing block blob, or can be a new block blob created by the copy operation. Copying from files to page blobs or append blobs isn't supported.
- Copy a snapshot over its base blob. By promoting a snapshot to the position of the base blob, you can restore an earlier version of a blob.
- Copy a snapshot to a destination blob with a different name. The resulting destination blob is a writeable blob and not a snapshot.
The source blob for a copy operation may be one of the following types: block blob, append blob, page blob, blob snapshot, or blob version. The copy operation always copies the entire source blob or file. Copying a range of bytes or set of blocks isn't supported.
If the destination blob already exists, it must be of the same blob type as the source blob, and the existing destination blob is overwritten. The destination blob can't be modified while a copy operation is in progress, and a destination blob can only have one outstanding copy operation.
To learn more about the Copy Blob
operation, including information about properties, index tags, metadata, and billing, see Copy Blob remarks.
Copy a blob with asynchronous scheduling
This section gives an overview of methods provided by the Azure Storage client library for Python to perform a copy operation with asynchronous scheduling.
The following methods wrap the Copy Blob REST API operation, and begin an asynchronous copy of data from the source blob:
The start_copy_from_url
returns a dictionary containing copy_status and copy_id. The copy_status property is success if the copy completed synchronously or pending if the copy has been started asynchronously.
Copy a blob from a source within Azure
If you're copying a blob within the same storage account, the operation can complete synchronously. Access to the source blob can be authorized via Microsoft Entra ID, a shared access signature (SAS), or an account key. For an alterative synchronous copy operation, see Copy a blob from a source object URL with Python.
If the copy source is a blob in a different storage account, the operation can complete asynchronously. The source blob must either be public or authorized via SAS token. The SAS token needs to include the Read ('r') permission. To learn more about SAS tokens, see Delegate access with shared access signatures.
The following example shows a scenario for copying a source blob from a different storage account with asynchronous scheduling. In this example, we create a source blob URL with an appended user delegation SAS token. The example shows how to generate the SAS token using the client library, but you can also provide your own. The example also shows how to lease the source blob during the copy operation to prevent changes to the blob from a different client. The Copy Blob
operation saves the ETag
value of the source blob when the copy operation starts. If the ETag
value is changed before the copy operation finishes, the operation fails.
def copy_from_source_in_azure_async(self, source_blob: BlobClient, destination_blob: BlobClient, blob_service_client: BlobServiceClient):
# Lease the source blob during copy to prevent other clients from modifying it
lease = BlobLeaseClient(client=source_blob)
sas_token = self.generate_user_delegation_sas(blob_service_client=blob_service_client, source_blob=source_blob)
source_blob_sas_url = source_blob.url + "?" + sas_token
# Create an infinite lease by passing -1 as the lease duration
lease.acquire(lease_duration=-1)
# Start the copy operation - specify False for the requires_sync parameter
copy_operation = dict()
copy_operation = destination_blob.start_copy_from_url(source_url=source_blob_sas_url, requires_sync=False)
# If start_copy_from_url returns copy_status of 'pending', the operation has been started asynchronously
# You can optionally add logic here to wait for the copy operation to complete
# Release the lease on the source blob
lease.break_lease()
def generate_user_delegation_sas(self, blob_service_client: BlobServiceClient, source_blob: BlobClient):
# Get a user delegation key
delegation_key_start_time = datetime.datetime.now(datetime.timezone.utc)
delegation_key_expiry_time = delegation_key_start_time + datetime.timedelta(hours=1)
key = blob_service_client.get_user_delegation_key(
key_start_time=delegation_key_start_time,
key_expiry_time=delegation_key_expiry_time
)
# Create a SAS token that's valid for one hour, as an example
sas_token = generate_blob_sas(
account_name=blob_service_client.account_name,
container_name=source_blob.container_name,
blob_name=source_blob.blob_name,
account_key=None,
user_delegation_key=key,
permission=BlobSasPermissions(read=True),
expiry=datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(hours=1),
start=datetime.datetime.now(datetime.timezone.utc)
)
return sas_token
Note
User delegation SAS tokens offer greater security, as they're signed with Microsoft Entra credentials instead of an account key. To create a user delegation SAS token, the Microsoft Entra security principal needs appropriate permissions. For authorization requirements, see Get User Delegation Key.
Copy a blob from a source outside of Azure
You can perform a copy operation on any source object that can be retrieved via HTTP GET request on a given URL, including accessible objects outside of Azure. The following example shows a scenario for copying a blob from an accessible source object URL.
def copy_from_external_source_async(self, source_url: str, destination_blob: BlobClient):
# Start the copy operation - specify False for the requires_sync parameter
copy_operation = dict()
copy_operation = destination_blob.start_copy_from_url(source_url=source_url, requires_sync=False)
# If start_copy_from_url returns copy_status of 'pending', the operation has been started asynchronously
# You can optionally add logic here to wait for the copy operation to complete
Check the status of a copy operation
To check the status of an asynchronous Copy Blob
operation, you can poll the get_blob_properties method and check the copy status.
The following code example shows how to check the status of a pending copy operation:
def check_copy_status(self, destination_blob: BlobClient):
# Get the copy status from the destination blob properties
copy_status = destination_blob.get_blob_properties().copy.status
return copy_status
Abort a copy operation
Aborting a pending Copy Blob
operation results in a destination blob of zero length. However, the metadata for the destination blob has the new values copied from the source blob or set explicitly during the copy operation. To keep the original metadata from before the copy, make a snapshot of the destination blob before calling one of the copy methods.
To abort a pending copy operation, call the following operation:
This method wraps the Abort Copy Blob REST API operation, which cancels a pending Copy Blob
operation. The following code example shows how to abort a pending Copy Blob
operation:
def abort_copy(self, destination_blob: BlobClient):
# Get the copy operation details from the destination blob properties
copy_status = destination_blob.get_blob_properties().copy.status
copy_id = destination_blob.get_blob_properties().copy.id
# Check the copy status and abort if pending
if copy_status == 'pending':
destination_blob.abort_copy(copy_id)
print(f"Copy operation {copy_id} has been aborted")
Resources
To learn more about copying blobs with asynchronous scheduling using the Azure Blob Storage client library for Python, see the following resources.
Code samples
REST API operations
The Azure SDK for Python contains libraries that build on top of the Azure REST API, allowing you to interact with REST API operations through familiar Python paradigms. The client library methods covered in this article use the following REST API operations:
- Copy Blob (REST API)
- Abort Copy Blob (REST API)
Client library resources
Related content
- This article is part of the Blob Storage developer guide for Python. To learn more, see the full list of developer guide articles at Build your Python app.