transfer a huge amount of data from sharepoint to blob with Python

Etienne WAGNER 20 Reputation points
2024-07-19T15:49:30.67+00:00

Hello everyone,

I am trying to copy a huge amount of data from sharepoint to a blob storage. I have coded it in Python using MsGraph SDK and Azure Storage SDK.

I successfully handled the throtling errors from MsGraph API, however I have trouble to handle throtling erros from Azure Storage SDK. I believe that Azure Storage SDK might not be the best way to perform this task. Could you advise me some more efficient technology from Microsoft to load this data into the blob ?

Any help is welcome.

Thank you.

Azure Storage Accounts
Azure Storage Accounts
Globally unique resources that provide access to data management services and serve as the parent namespace for the services.
3,208 questions
0 comments No comments
{count} votes

Accepted answer
  1. Amrinder Singh 5,155 Reputation points Microsoft Employee
    2024-07-19T16:35:37.4733333+00:00

    Hi Etienne WAGNER - Thanks for reaching out over Q&A Forum

    I would suggest you to check the below link for a similar ask regarding different option to copy data from sharepoint to Azure Storage.

    https://learn.microsoft.com/en-us/answers/questions/1513111/how-to-copy-folders-and-files-from-a-sharepoint-si

    Apart from that, you mentioned regarding throttling concern and it will be important to understand at which layer the throttling is happening:

    Management Plane/RP Level: These limits are hardcoded and can't be increased. There error will typically be of error code 429 in here.

    https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/request-limits-and-throttling#resource-provider-limits

    Data Plane level due to exceeding of bandwidth limits: Below is the link for scalability limits and it is suggested for application to make calls within the scalability target. In case there are higher bandwidth required, you can explore checking with the quota team regarding the feasibility of the same.

    https://learn.microsoft.com/en-us/azure/storage/common/scalability-targets-standard-account?toc=%2Fazure%2Fstorage%2Fblobs%2Ftoc.json&bc=%2Fazure%2Fstorage%2Fblobs%2Fbreadcrumb%2Ftoc.json

    Server Side Throttling - This can be due to any backend load balancing event which should last for longer time.

    https://learn.microsoft.com/en-us/azure/storage/blobs/storage-performance-checklist#timeout-and-server-busy-errors

    Having implementation of best practices of retry along with exponential backoff shall help minimizing the impact to the application.

    Hope that helps!

    Let me know if there are any queries/concerns, will be glad to assist.


    Please do not forget to "Accept the answer” and “up-vote” wherever the information provided helps you, this can be beneficial to other community members.

    0 comments No comments

1 additional answer

Sort by: Most helpful
  1. Nehruji R 8,066 Reputation points Microsoft Vendor
    2024-07-22T06:13:07.51+00:00

    Hello Etienne WAGNER,

    Greetings! Welcome to Microsoft Q&A Platform.

    I understand that you're facing challenges with throttling errors in the Azure Storage SDK and would like to find more efficient solutions with other Microsoft technologies. Where you might consider using Microsoft's SharePoint Migration Tool or SharePoint Online Management Shell in combination with AzCopy. Here's an alternative approach:

    SharePoint Migration Tool:

    Microsoft provides the SharePoint Migration Tool, which is designed to help migrate content from on-premises SharePoint sites and file shares to SharePoint Online and OneDrive for Business. It also supports migration from SharePoint Online to SharePoint Online.

    You can use this tool to copy the content from your SPO library to an Azure Blob container. Here's a high-level outline of the process:

    • Download and install the SharePoint Migration Tool.
    • Set up a new migration project and configure the source (SPO library) and destination (Azure Blob container).
    • Run the migration project to copy the files and folders from SPO to the Azure Blob container.

    This tool handles not just files but also folder structures and metadata. You might need to adapt it slightly for your specific case, but it's generally designed for these types of tasks.

    SharePoint Online Management Shell with AzCopy:

    You can use the SharePoint Online Management Shell to connect to your SharePoint Online environment and retrieve the files' URLs. Then, use AzCopy to copy the files from SharePoint Online to the Azure Blob container.

    Here's a rough outline of the steps:

    • Use SharePoint Online Management Shell to connect to your SharePoint Online environment.
    • Retrieve the URLs of the files you want to copy.
    • Use AzCopy to copy the files from their URLs to the Azure Blob container.

    This approach might require some scripting to automate the process, especially if you have a large number of files.

    Be aware that the exact steps and scripts you need might depend on your specific environment, permissions, and other factors. 

    Similar thread for reference - https://learn.microsoft.com/en-us/answers/questions/1513111/how-to-copy-folders-and-files-from-a-sharepoint-si,https://stackoverflow.com/questions/64703982/most-efficient-way-to-upload-large-amounts-of-data-to-azure-blobstorage.

    reference - https://learn.microsoft.com/en-us/azure/storage/common/storage-solution-large-dataset-moderate-high-network.Hope this information helps! Please let us know if you have any further queries. I’m happy to assist you further.   


    Please "Accept the answer” and “up-vote” wherever the information provided helps you, this can be beneficial to other community members.

    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.