Use blob index tags to manage and find data with Python
This article shows how to use blob index tags to manage and find data using the Azure Storage client library for Python.
To learn about setting blob index tags using asynchronous APIs, see Set blob index tags asynchronously.
Prerequisites
- This article assumes you already have a project set up to work with the Azure Blob Storage client library for Python. To learn about setting up your project, including package installation, adding import statements, and creating an authorized client object, see Get started with Azure Blob Storage and Python.
- The authorization mechanism must have permissions to work with blob index tags. To learn more, see the authorization guidance for the following REST API operations: Get Blob Tags, Set Blob Tags, and Find Blobs by Tags.
About blob index tags
Blob index tags categorize data in your storage account using key-value tag attributes. These tags are automatically indexed and exposed as a searchable multi-dimensional index to easily find data. This article shows you how to set, get, and find data using blob index tags.
Blob index tags aren't supported for storage accounts with hierarchical namespace enabled. To learn more about the blob index tag feature along with known issues and limitations, see Manage and find Azure Blob data with blob index tags.
Set tags
You can set index tags if your code has authorized access to blob data through one of the following mechanisms:
- Security principal that is assigned an Azure RBAC role with the Microsoft.Storage/storageAccounts/blobServices/containers/blobs/tags/write action. The Storage Blob Data Owner is a built-in role that includes this action.
- Shared Access Signature (SAS) with permission to access the blob's tags (t permission)
- Account key
For more information, see Setting blob index tags.
You can set tags by using the following method:
- BlobClient.set_blob_tags
The tags specified in this method replace all existing tags on the blob. To preserve existing values, retrieve them first and include them in the call to this method. The following example shows how to set tags:
def set_blob_tags(self, blob_service_client: BlobServiceClient, container_name):
    blob_client = blob_service_client.get_blob_client(container=container_name, blob="sample-blob.txt")
    # Get any existing tags for the blob if they need to be preserved
    tags = blob_client.get_blob_tags()
    # Add or modify tags
    updated_tags = {'Sealed': 'false', 'Content': 'image', 'Date': '2022-01-01'}
    tags.update(updated_tags)
    blob_client.set_blob_tags(tags)
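Because setting tags replaces the whole tag set, the read-modify-write step can be isolated in a small pure function and tested without a storage account. The following is a minimal sketch, not part of the client library: the name merge_tags is hypothetical, and the limit of 10 tags per blob is an assumption based on the documented service limits, so verify it against the limitations article before relying on it.

```python
# Hypothetical helper, not part of azure-storage-blob.
# Assumption: the service allows at most 10 index tags per blob.
MAX_TAGS_PER_BLOB = 10


def merge_tags(existing, updates):
    """Return a new dict with `updates` applied on top of `existing`.

    The input dicts are not modified. The result is what you would
    pass to set_blob_tags so that existing tags are preserved.
    """
    merged = dict(existing or {})
    merged.update(updates)
    if len(merged) > MAX_TAGS_PER_BLOB:
        raise ValueError(
            f"{len(merged)} tags exceed the assumed limit of {MAX_TAGS_PER_BLOB}"
        )
    return merged
```

With a blob client in hand, the call would look like blob_client.set_blob_tags(merge_tags(blob_client.get_blob_tags(), updated_tags)).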
You can delete all tags by passing an empty dict object into the set_blob_tags method:
def clear_blob_tags(self, blob_service_client: BlobServiceClient, container_name):
    blob_client = blob_service_client.get_blob_client(container=container_name, blob="sample-blob.txt")
    # Pass in empty dict object to clear tags
    tags = dict()
    blob_client.set_blob_tags(tags)
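Since set_blob_tags replaces the full tag set, removing only some tags follows the same pattern: read the current tags, drop the unwanted keys, and write back what remains. The sketch below shows only the filtering step. The function name remove_tags is hypothetical and is not part of the client library.

```python
# Hypothetical helper, not part of azure-storage-blob.
def remove_tags(existing, keys_to_remove):
    """Return a copy of `existing` without the given tag keys.

    Keys that are not present are ignored. The input dict is not modified.
    """
    drop = set(keys_to_remove)
    return {key: value for key, value in existing.items() if key not in drop}
```

The result can then be passed to set_blob_tags, for example blob_client.set_blob_tags(remove_tags(blob_client.get_blob_tags(), ["Sealed"])).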
Get tags
You can get index tags if your code has authorized access to blob data through one of the following mechanisms:
- Security principal that is assigned an Azure RBAC role with the Microsoft.Storage/storageAccounts/blobServices/containers/blobs/tags/read action. The Storage Blob Data Owner is a built-in role that includes this action.
- Shared Access Signature (SAS) with permission to access the blob's tags (t permission)
- Account key
For more information, see Getting and listing blob index tags.
You can get tags by using the following method:
- BlobClient.get_blob_tags
The following example shows how to retrieve and iterate over the blob's tags:
def get_blob_tags(self, blob_service_client: BlobServiceClient, container_name):
    blob_client = blob_service_client.get_blob_client(container=container_name, blob="sample-blob.txt")
    tags = blob_client.get_blob_tags()
    print("Blob tags: ")
    for k, v in tags.items():
        print(k, v)
Filter and find data with blob index tags
You can use index tags to find and filter data if your code has authorized access to blob data through one of the following mechanisms:
- Security principal that is assigned an Azure RBAC role with the Microsoft.Storage/storageAccounts/blobServices/containers/blobs/filter/action action. The Storage Blob Data Owner is a built-in role that includes this action.
- Shared Access Signature (SAS) with permission to filter blobs by tags (f permission)
- Account key
For more information, see Finding data using blob index tags.
Note
You can't use index tags to retrieve previous versions. Tags for previous versions aren't passed to the blob index engine. For more information, see Conditions and known issues.
You can find data by using the following method:
- ContainerClient.find_blobs_by_tags
The following example finds and lists all blobs tagged as an image:
def find_blobs_by_tags(self, blob_service_client: BlobServiceClient, container_name):
    container_client = blob_service_client.get_container_client(container=container_name)
    query = "\"Content\"='image'"
    blob_list = container_client.find_blobs_by_tags(filter_expression=query)
    print("Blobs tagged as images")
    for blob in blob_list:
        print(blob.name)
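Writing filter expressions by hand gets error-prone once several conditions are combined, because keys take double quotes and values take single quotes. The following sketch builds the expression string from a list of conditions. The function build_tag_filter is hypothetical, not a library API, and the supported operator set (=, >, >=, <, <=, joined with AND) reflects my understanding of the Find Blobs by Tags filter syntax, so check it against the REST API reference.

```python
# Hypothetical helper, not part of azure-storage-blob.
# Assumption: the filter syntax supports these comparison operators
# and combines conditions with AND.
ALLOWED_OPERATORS = {"=", ">", ">=", "<", "<="}


def build_tag_filter(conditions):
    """Build a filter expression from (key, operator, value) tuples.

    Keys are wrapped in double quotes and values in single quotes.
    """
    parts = []
    for key, operator, value in conditions:
        if operator not in ALLOWED_OPERATORS:
            raise ValueError(f"Unsupported operator: {operator!r}")
        if '"' in key or "'" in value:
            raise ValueError("Quote characters are not allowed in keys or values")
        parts.append('"' + key + '" ' + operator + " '" + value + "'")
    return " AND ".join(parts)
```

The returned string can be passed as the filter_expression argument, for example build_tag_filter([("Content", "=", "image"), ("Date", ">=", "2022-01-01")]).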
Set blob index tags asynchronously
The Azure Blob Storage client library for Python supports working with blob index tags asynchronously. To learn more about project setup requirements, see Asynchronous programming.
Follow these steps to set blob index tags using asynchronous APIs:
Add the following import statements:
import asyncio
from azure.identity.aio import DefaultAzureCredential
from azure.storage.blob.aio import BlobServiceClient
Add code to run the program using asyncio.run. This function runs the passed coroutine, main() in our example, and manages the asyncio event loop. Coroutines are declared with the async/await syntax. In this example, the main() coroutine first creates the top-level BlobServiceClient using async with, then calls the method that sets the blob index tags. Note that only the top-level client needs to use async with, as other clients created from it share the same connection pool.
async def main():
    sample = BlobSamples()
    # TODO: Replace <storage-account-name> with your actual storage account name
    account_url = "https://<storage-account-name>.blob.core.windows.net"
    credential = DefaultAzureCredential()
    async with BlobServiceClient(account_url, credential=credential) as blob_service_client:
        await sample.set_blob_tags(blob_service_client, "sample-container")

if __name__ == '__main__':
    asyncio.run(main())
Add code to set the blob index tags. The code is the same as the synchronous example, except that the method is declared with the async keyword and the await keyword is used when calling the get_blob_tags and set_blob_tags methods.
async def set_blob_tags(self, blob_service_client: BlobServiceClient, container_name):
    blob_client = blob_service_client.get_blob_client(container=container_name, blob="sample-blob.txt")
    # Get any existing tags for the blob if they need to be preserved
    tags = await blob_client.get_blob_tags()
    # Add or modify tags
    updated_tags = {'Sealed': 'false', 'Content': 'image', 'Date': '2022-01-01'}
    tags.update(updated_tags)
    await blob_client.set_blob_tags(tags)
With this basic setup in place, you can implement other examples in this article as coroutines using async/await syntax.
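As one illustration of converting another example, the sketch below rewrites the tag retrieval as a coroutine and then uses asyncio.gather to fetch tags for several blobs concurrently. It assumes blob_service_client is the asynchronous BlobServiceClient from azure.storage.blob.aio, where get_blob_client is a regular method and get_blob_tags must be awaited. The function names get_blob_tags_async and get_tags_for_blobs are hypothetical.

```python
import asyncio


async def get_blob_tags_async(blob_service_client, container_name, blob_name):
    # get_blob_client is synchronous; only the service call is awaited
    blob_client = blob_service_client.get_blob_client(container=container_name, blob=blob_name)
    return await blob_client.get_blob_tags()


async def get_tags_for_blobs(blob_service_client, container_name, blob_names):
    """Fetch tags for several blobs concurrently, keyed by blob name."""
    names = list(blob_names)
    results = await asyncio.gather(
        *(get_blob_tags_async(blob_service_client, container_name, name) for name in names)
    )
    return dict(zip(names, results))
```

Inside main(), this could be called as await get_tags_for_blobs(blob_service_client, "sample-container", ["sample-blob.txt"]).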
Resources
To learn more about how to use index tags to manage and find data using the Azure Blob Storage client library for Python, see the following resources.
REST API operations
The Azure SDK for Python contains libraries that build on top of the Azure REST API, allowing you to interact with REST API operations through familiar Python paradigms. The client library methods for managing and using blob index tags use the following REST API operations:
- Get Blob Tags (REST API)
- Set Blob Tags (REST API)
- Find Blobs by Tags (REST API)
Code samples
- View synchronous or asynchronous code samples from this article (GitHub)