How to analyze data within OneDrive for Business with Azure Cognitive Services tools?

Azure Azure 0 Reputation points
2023-07-01T04:29:30.33+00:00

Hello,

I need to analyze a large amount of data stored in OneDrive for Business. Is there a way to use the Azure Cognitive Services Python API, Cognitive Search specifically, to analyze the data in place, without downloading it from OneDrive and uploading it somewhere else? If not, what is the best non-manual way to copy a large amount of data from OneDrive for Business to either an Azure VM or Azure Blob Storage? Thank you.

Any suggestions or advice would be extremely helpful and appreciated.

OneDrive
A Microsoft file hosting and synchronization service.
Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.

2 answers

  1. Ziggy Zulueta 495 Reputation points MVP
    2023-07-01T13:56:32.3266667+00:00

    Hi,

    I believe you want a search capability that can access all your files in OneDrive, yes?

    I believe the best way is through Azure Cognitive Search. With Azure Cognitive Search, both text data and images can be made searchable, so you do not need Cognitive Services for Vision to get the data you need.
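
    As a rough sketch of what indexing could look like once the files sit in an Azure Blob container (moving them there is discussed further down in this answer), the snippet below builds the two REST request bodies that Azure Cognitive Search expects for a blob data source and an indexer. The service, data source, container, and index names are placeholders, the API version is an assumption, and the target index is assumed to exist already.

```python
API_VERSION = "2020-06-30"  # assumption: use whichever version your service supports


def datasource_request(service, name, connection_string, container):
    """Return (url, body) for creating a blob data source through the REST API."""
    url = (
        f"https://{service}.search.windows.net/datasources/{name}"
        f"?api-version={API_VERSION}"
    )
    body = {
        "name": name,
        "type": "azureblob",  # data source type for Azure Blob Storage
        "credentials": {"connectionString": connection_string},
        "container": {"name": container},
    }
    return url, body


def indexer_request(service, name, datasource_name, index_name):
    """Return (url, body) for an indexer that reads the data source into an index."""
    url = (
        f"https://{service}.search.windows.net/indexers/{name}"
        f"?api-version={API_VERSION}"
    )
    body = {
        "name": name,
        "dataSourceName": datasource_name,
        "targetIndexName": index_name,  # hypothetical index, must already exist
    }
    return url, body
```

    Each body would be sent with an HTTP PUT to its URL, with `Content-Type: application/json` and the service's admin key in the `api-key` header.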

    For your videos, you may need to use Azure Video Indexer and save the results in the same location as your text data and images.

    The last issue is where all your data is stored. Azure Cognitive Search does not appear to support OneDrive as a data source, so you would have to move all your OneDrive data to Azure Blob Storage. These links may help:

    https://powerusers.microsoft.com/t5/Building-Flows/Copy-new-files-and-folders-from-OneDrive-to-azure-blob-storage/td-p/1941854

    https://learn.microsoft.com/en-us/answers/questions/464671/copy-files-from-onedrive-and-transfer-to-azure-blo
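
    If you would rather script the move than use a flow, here is a minimal sketch of one way to do it with the Microsoft Graph REST API and the `azure-storage-blob` package. It is not taken from the links above. The helper names are made up for illustration, the access token and drive ID are assumed to be obtained elsewhere, and the files are streamed through the machine running the script rather than saved to disk.

```python
GRAPH = "https://graph.microsoft.com/v1.0"


def iter_drive_files(get_json, drive_id):
    """Yield every file item in a drive, descending into folders and following paging links.

    get_json is any callable that takes a URL and returns the parsed JSON response.
    """
    pending = [f"{GRAPH}/drives/{drive_id}/root/children"]
    while pending:
        url = pending.pop()
        while url:
            page = get_json(url)
            for item in page.get("value", []):
                if "folder" in item:
                    pending.append(f"{GRAPH}/drives/{drive_id}/items/{item['id']}/children")
                elif "file" in item:
                    yield item
            url = page.get("@odata.nextLink")  # absent on the last page


def blob_name_for(item):
    """Build a blob name that mirrors the item's folder path in the drive."""
    parent = item.get("parentReference", {}).get("path", "")
    folder = parent.split("root:", 1)[-1].strip("/")  # path may be URL-encoded
    return f"{folder}/{item['name']}" if folder else item["name"]


def copy_drive_to_container(token, drive_id, connection_string, container):
    """Hypothetical wiring: stream each OneDrive file into a blob container."""
    # Imported lazily so the helpers above work without these packages installed.
    import requests
    from azure.storage.blob import ContainerClient

    session = requests.Session()
    session.headers["Authorization"] = f"Bearer {token}"
    blobs = ContainerClient.from_connection_string(connection_string, container)

    def get_json(url):
        response = session.get(url, timeout=60)
        response.raise_for_status()
        return response.json()

    for item in iter_drive_files(get_json, drive_id):
        # Assumes the listing includes the pre-authenticated download URL;
        # if it does not, request /drives/{drive-id}/items/{item-id}/content instead.
        source = item["@microsoft.graph.downloadUrl"]
        with requests.get(source, stream=True, timeout=300) as download:
            download.raise_for_status()
            blobs.upload_blob(blob_name_for(item), download.raw, overwrite=True)
```

    Calling `copy_drive_to_container(token, drive_id, connection_string, "onedrive-copy")` from an Azure VM keeps the traffic inside Azure. For a large drive, expect throttling from Graph and plan for retries.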

  2. brtrach-MSFT 17,391 Reputation points Microsoft Employee
    2023-08-02T20:58:07.9966667+00:00

    @Azure Azure I understand that you are having trouble with the Graph API script because of the 429 error code. This status code (Too Many Requests) means you are exceeding the API's rate limit and are being throttled. One way around it is to add a delay between requests, ideally honoring the Retry-After header that Graph returns with a 429 response. You can also try the Microsoft Graph SDK for Python, whose built-in retry handling is designed to back off on throttled requests.
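
    To illustrate the delay approach, here is a minimal sketch of a retry wrapper. `send_with_retry` is a made-up helper, not part of any SDK. It waits for the number of seconds given in `Retry-After` when that header is present and falls back to exponential backoff otherwise.

```python
import time


def send_with_retry(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-429 response or the attempts run out.

    send is any zero-argument callable returning an object with .status_code
    and .headers (a requests.Response fits). The last response is returned
    even if it is still a 429, so the caller can decide what to do with it.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429 or attempt == max_attempts - 1:
            return response
        try:
            # Prefer the server's hint, given in seconds.
            delay = float(response.headers.get("Retry-After"))
        except (TypeError, ValueError):
            # Header missing or not a number: back off exponentially instead.
            delay = base_delay * 2 ** attempt
        sleep(delay)
```

    With `requests`, a call would look like `send_with_retry(lambda: session.get(url, timeout=60))`.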

    Azure Data Factory is a cloud-based data integration service that allows you to create data-driven workflows to move data between different sources and destinations. You can use Azure Data Factory to copy data from OneDrive for Business to Azure Blob Storage without having to download the data to your local machine.

    To use Azure Data Factory, you would create a pipeline that defines the data flow from the source (OneDrive for Business) to the destination (Azure Blob Storage), with the Azure Blob Storage connector on the destination side. Check the current connector list for the source side first: OneDrive for Business is built on SharePoint Online, so the source may need to be configured through the SharePoint Online or HTTP connector rather than a dedicated OneDrive connector.

    Once you have created the pipeline, you can schedule it to run at specific intervals or trigger it manually. Azure Data Factory also provides built-in support for retry policies and parallelism, which can help you work around throttling.
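
    For the manual trigger, a pipeline run can also be started from Python with the `azure-mgmt-datafactory` management SDK. The sketch below assumes the pipeline already exists. `run_pipeline_and_wait` and `make_client` are illustrative helpers, and the resource group, factory, and pipeline names are placeholders.

```python
import time

# Run states that mean the pipeline has not finished yet.
ACTIVE_STATES = ("Queued", "InProgress", "Canceling")


def run_pipeline_and_wait(client, resource_group, factory, pipeline,
                          poll_seconds=30, sleep=time.sleep):
    """Start a pipeline run and poll until it reaches a final state.

    client is expected to be a DataFactoryManagementClient (see make_client).
    Returns the final status string, for example "Succeeded" or "Failed".
    """
    run = client.pipelines.create_run(resource_group, factory, pipeline, parameters={})
    while True:
        status = client.pipeline_runs.get(resource_group, factory, run.run_id).status
        if status not in ACTIVE_STATES:
            return status
        sleep(poll_seconds)


def make_client(subscription_id):
    """Build the management client; imports are lazy so the helper above stays standalone."""
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient

    return DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)
```

    A call would look like `run_pipeline_and_wait(make_client("<subscription id>"), "<resource group>", "<factory>", "<pipeline>")`. For recurring copies, a schedule trigger defined in the factory is usually a better fit than polling from a script.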
