Move data to and from Azure Blob storage
The Team Data Science Process (TDSP) requires that data be ingested or loaded into a variety of storage environments so that it can be processed or analyzed in the most appropriate way at each stage of the process. Azure Blob Storage has its own comprehensive documentation; this section of the TDSP documentation provides a summary to get you started.
Different technologies for moving data
The following articles describe how to move data to and from Azure Blob storage using different technologies.
Which method is best for you depends on your scenario. The Scenarios for advanced analytics in Azure Machine Learning article helps you determine the resources you need for a variety of data science workflows used in the advanced analytics process.
Note
For a complete introduction to Azure Blob storage, refer to Azure Blob Basics and to Azure Blob Service.
Using Azure Data Factory
As an alternative, you can use Azure Data Factory to do the following:
- Create and schedule a pipeline that downloads data from Azure Blob storage.
- Pass the data to a published Azure Machine Learning web service.
- Receive the predictive analytics results.
- Upload the results to storage.
For more information, see Create predictive pipelines using Azure Data Factory and Azure Machine Learning.
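The order of the four steps above can be illustrated with a small local sketch. This is not Azure Data Factory code: Data Factory pipelines are authored and scheduled in the service itself, and every name below (`run_scoring_pipeline`, `make_in_memory_storage`, and the `score` callable standing in for the published web service) is a hypothetical stand-in chosen for illustration. The sketch only shows how data moves from storage, through scoring, and back to storage.

```python
from typing import Callable, Dict, Tuple


def run_scoring_pipeline(
    download: Callable[[str], bytes],
    score: Callable[[bytes], bytes],
    upload: Callable[[str, bytes], None],
    input_blob: str,
    output_blob: str,
) -> bytes:
    """Run the download -> score -> upload flow once and return the results."""
    data = download(input_blob)       # Step 1: download data from storage
    results = score(data)             # Steps 2 and 3: send data to the scorer, receive results
    upload(output_blob, results)      # Step 4: upload the results to storage
    return results


def make_in_memory_storage(
    initial: Dict[str, bytes],
) -> Tuple[Dict[str, bytes], Callable[[str], bytes], Callable[[str, bytes], None]]:
    """Hypothetical in-memory stand-in for a blob container, for local experiments."""
    blobs = dict(initial)

    def download(name: str) -> bytes:
        return blobs[name]

    def upload(name: str, data: bytes) -> None:
        blobs[name] = data

    return blobs, download, upload


if __name__ == "__main__":
    blobs, download, upload = make_in_memory_storage({"input.csv": b"a,b,c"})
    # bytes.upper is a placeholder for a call to a real scoring service.
    run_scoring_pipeline(download, bytes.upper, upload, "input.csv", "scored.csv")
    print(sorted(blobs))
```

Passing the storage and scoring operations in as callables keeps the flow itself independent of any particular SDK, so the stand-ins can later be swapped for real storage and web service calls.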
Prerequisites
This article assumes that you have an Azure subscription, a storage account, and the corresponding storage key for that account. Before uploading or downloading data, you must know your Azure Storage account name and account key.
- To set up an Azure subscription, see Free one-month trial.
- For instructions on creating a storage account and for getting account and key information, see About Azure Storage accounts.
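As a rough illustration of how the account name and account key are used, the following Python sketch builds a storage connection string and uses it to upload and download a single blob. It assumes the `azure-storage-blob` package (version 12 or later) is installed and that the account lives in the public Azure cloud (`core.windows.net`). The function names, container name, and file paths are hypothetical and not part of the original article.

```python
def build_connection_string(account_name: str, account_key: str) -> str:
    """Assemble a storage connection string from the account name and key."""
    if not account_name or not account_key:
        raise ValueError("Both the storage account name and the account key are required.")
    return (
        "DefaultEndpointsProtocol=https;"
        f"AccountName={account_name};"
        f"AccountKey={account_key};"
        "EndpointSuffix=core.windows.net"  # assumption: public Azure cloud
    )


def upload_file(connection_string: str, container_name: str, blob_name: str, local_path: str) -> None:
    """Upload a local file to a blob, replacing the blob if it already exists."""
    # Imported lazily so build_connection_string works without the SDK installed.
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connection_string)
    blob = service.get_blob_client(container=container_name, blob=blob_name)
    with open(local_path, "rb") as data:
        blob.upload_blob(data, overwrite=True)


def download_file(connection_string: str, container_name: str, blob_name: str, local_path: str) -> None:
    """Download a blob and write its contents to a local file."""
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connection_string)
    blob = service.get_blob_client(container=container_name, blob=blob_name)
    with open(local_path, "wb") as target:
        target.write(blob.download_blob().readall())
```

A call might look like `upload_file(build_connection_string(name, key), "mycontainer", "data.csv", "data.csv")`, where the container is assumed to exist already. In practice, read the key from an environment variable or a secret store rather than hard-coding it in a script.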
Contributors
This article is maintained by Microsoft. It was originally written by the following contributors.
Principal author:
- Mark Tabladillo | Senior Cloud Solution Architect
Next steps
- Introduction to Azure Blob Storage
- Copy and move blobs from one container or storage account to another
- What is the Team Data Science Process (TDSP)?