Requirements/Limitations on Tags

Dominik2359 20 Reputation points
2024-08-23T07:34:04.9633333+00:00

We have x million files (and growing). Does it make sense to keep the original file name as an index tag, or should we keep it in properties? The original file name is quasi-unique, so there will be around x million tags. Will search on tags be performant with that many tags? Are there any limitations on tags? I guess tags grouping the file names would make more sense.

Azure Blob Storage
An Azure service that stores unstructured data in the cloud as blobs.

Accepted answer
  1. Nehruji R 7,211 Reputation points Microsoft Vendor
    2024-08-26T07:15:53.0433333+00:00

    Hello Dominik2359,

    Greetings! Welcome to Microsoft Q&A Platform.

    Blob index tags provide data management and discovery capabilities by using key-value index tag attributes. You can categorize and find objects within a single container or across all containers in your storage account. As data requirements change, objects can be dynamically categorized by updating their index tags. Objects can remain in-place with their current container organization.

    For more information, refer to this article:

    • Use blob index tags to manage and find data on Azure Blob Storage

    Index large datasets:

    • Using tags for indexing can be efficient, especially if your system supports multi-dimensional categorization. For example, Azure Blob Storage natively indexes blob tags, allowing for quick data retrieval.
    • Tags can handle a large number of files, but performance may vary depending on the indexing system and infrastructure.
    • Tags can make searches faster if the indexing system is optimized for tag-based queries. However, with millions of tags, the system’s ability to handle such volume efficiently is crucial.
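
    A tag-based query against blob index tags is expressed as a filter string. The following is a minimal sketch of building such a string, assuming the documented Find Blobs by Tags syntax (tag keys in double quotes, values in single quotes, conditions joined with AND). The helper name `tag_filter` and the tag names `category` and `year` are hypothetical and used only for illustration.

```python
# Operators accepted in a Find Blobs by Tags filter expression.
ALLOWED_OPERATORS = {"=", ">", ">=", "<", "<="}


def tag_filter(conditions, container=None):
    """Build a filter expression from (key, operator, value) tuples.

    All conditions are combined with AND. If `container` is given,
    the query is scoped to that container.
    """
    parts = []
    if container is not None:
        parts.append(f"@container = '{container}'")
    for key, op, value in conditions:
        if op not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported operator: {op!r}")
        # Keys are double-quoted, values are single-quoted.
        parts.append(f"\"{key}\" {op} '{value}'")
    return " AND ".join(parts)


# Hypothetical tags: a broad category plus a year, compared as strings.
expr = tag_filter(
    [("category", "=", "invoices"), ("year", ">=", "2023")],
    container="docs",
)
print(expr)
```

    With the `azure-storage-blob` package, a string like this would typically be passed as `filter_expression` to `BlobServiceClient.find_blobs_by_tags`. Tag values are strings, so range comparisons are lexicographic.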

    Properties:

    Storing the original file name in properties can be beneficial if your search system is optimized for property-based searches. This approach can also help in reducing the complexity of managing tags.

    Scalability: Properties are generally scalable, but the search performance will depend on how well the indexing system handles property-based searches.

    Search Efficiency: Searching by properties can be efficient, especially if the indexing system supports property-based queries effectively.

    Recommendations:

    Grouping filenames into tags can make sense if you have a logical categorization that can reduce the number of unique tags. This can improve search performance and manageability.
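
    As a rough illustration of grouping, the sketch below derives a small set of low-cardinality tags from a file name instead of storing the name itself. The scheme (file extension plus a two-character hash bucket) and the names `group_tags`, `ext`, and `bucket` are assumptions made for this example, not a recommended layout; a real categorization should follow how you actually search.

```python
import hashlib
from pathlib import PurePosixPath


def group_tags(original_name: str) -> dict:
    """Derive grouping tags from a file name (illustrative scheme only)."""
    # Lower-cased extension without the dot, or "none" if there is none.
    ext = PurePosixPath(original_name).suffix.lstrip(".").lower() or "none"
    # First two hex digits of a SHA-256 hash: at most 256 distinct buckets.
    bucket = hashlib.sha256(original_name.encode("utf-8")).hexdigest()[:2]
    return {"ext": ext, "bucket": bucket}


tags = group_tags("Report 2024.PDF")
print(tags["ext"])  # pdf
```

    The same file name always maps to the same bucket, so a lookup can first narrow the search to one bucket by tag and then match the exact name among the results.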

    Otherwise, you can consider a hybrid approach where you use tags for broad categorization and properties for more specific details. This can balance performance and manageability.
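
    The hybrid approach could look roughly like the sketch below: a broad category goes into an index tag, and the original file name goes into blob metadata. The function name `build_payload` and the keys `category` and `original_name` are hypothetical. Percent-encoding the file name is an assumption made here because metadata travels as HTTP headers, so non-ASCII characters need some encoding.

```python
from urllib.parse import quote, unquote


def build_payload(original_name: str, category: str) -> tuple:
    """Split information into index tags and metadata (illustrative)."""
    # Broad, low-cardinality value: suitable for an index tag.
    tags = {"category": category}
    # Specific, near-unique value: kept in metadata, percent-encoded.
    metadata = {"original_name": quote(original_name, safe="")}
    return tags, metadata


tags, metadata = build_payload("Bericht März.pdf", "invoices")
print(tags)
print(unquote(metadata["original_name"]))
```

    With the `azure-storage-blob` package, the two dictionaries would typically be applied with `BlobClient.set_blob_tags` and `BlobClient.set_blob_metadata`.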

    Ensure that your indexing system is optimized for the type of searches you need. Some systems are better suited for tag-based searches, while others excel with property-based searches.

    Limitations

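    System Capabilities: The limitations will largely depend on the capabilities of your indexing system. Ensure that it can handle the volume and complexity of your data. For blob index tags specifically, the documented limits are: at most 10 tags per blob, tag keys of 1 to 128 characters, tag values of 0 to 256 characters, and case-sensitive string keys and values restricted to letters, digits, space, and the characters + - . / : = _. A file name containing other characters cannot be stored unmodified as a tag value.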

    Search Performance: With millions of tags, search performance can degrade if the system is not optimized. Regularly monitor and optimize your indexing strategy.

    Refer to https://learn.microsoft.com/en-us/azure/storage/blobs/storage-manage-find-blobs?tabs=azure-portal
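
    Before writing tags at scale, it can help to check candidate tags locally. The sketch below assumes the limits documented for blob index tags (at most 10 tags per blob, keys of 1 to 128 characters, values of 0 to 256 characters, and a restricted character set). The function name `tag_problems` is hypothetical, and the service remains the authority on what is accepted.

```python
import re

MAX_TAGS_PER_BLOB = 10
# Letters, digits, space, and + - . / : = _
ALLOWED_CHARS = re.compile(r"[A-Za-z0-9 +\-./:=_]*")


def tag_problems(tags: dict) -> list:
    """Return a list of human-readable problems; empty if none found."""
    problems = []
    if len(tags) > MAX_TAGS_PER_BLOB:
        problems.append(f"{len(tags)} tags exceed the limit of {MAX_TAGS_PER_BLOB}")
    for key, value in tags.items():
        if not 1 <= len(key) <= 128:
            problems.append(f"key {key!r}: length must be 1 to 128")
        if len(value) > 256:
            problems.append(f"value for {key!r}: length must be at most 256")
        if not ALLOWED_CHARS.fullmatch(key) or not ALLOWED_CHARS.fullmatch(value):
            problems.append(f"tag {key!r}: contains a character outside the allowed set")
    return problems


print(tag_problems({"category": "invoices"}))      # no problems
print(tag_problems({"name": "Bericht März.pdf"}))  # 'ä' is not allowed
```

    A check like this shows quickly whether original file names would survive as tag values or need to be normalized or moved to metadata.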

    Hope this answer helps! Please let us know if you have any further queries. I’m happy to assist you further.


    Please do not forget to “Accept the answer” and “up-vote” wherever the information provided helps you; this can be beneficial to other community members.
