Install the azureml-fsspec package:
pip install azureml-fsspec
Note: The accepted URI format for the datastore URI is:
azureml://subscriptions/([^/]+)/resourcegroups/([^/]+)/workspaces/([^/]+)/datastores/([^/]+)/paths/([^/]+)
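As a quick sanity check, you can match a URI against that pattern before handing it to pandas. This is a minimal sketch; the subscription, resource group, workspace, and datastore values below are placeholders, not real Azure resources:

```python
import re

# Accepted datastore URI format (the regex from the note above)
DATASTORE_URI = (
    r'azureml://subscriptions/([^/]+)/resourcegroups/([^/]+)'
    r'/workspaces/([^/]+)/datastores/([^/]+)/paths/([^/]+)'
)

# placeholder values for illustration
uri = ('azureml://subscriptions/0000-1111/resourcegroups/my-rg'
       '/workspaces/my-ws/datastores/my-store/paths/folder/file.csv')

match = re.match(DATASTORE_URI, uri)
if match is None:
    raise ValueError(f'not a valid datastore URI: {uri}')

# the first four capture groups identify the datastore; the last one
# is only the first path segment, since [^/]+ stops at a slash
subscription, resource_group, workspace, datastore, first_segment = match.groups()
print(datastore)  # -> my-store
```

Note that `re.match` anchors only at the start, so nested paths after `/paths/` still match even though the final group captures just the first segment.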
With the package installed, this should work:
# azureml-fsspec registers the azureml:// protocol with fsspec,
# so pandas can read the URI directly; no other import is needed
import pandas as pd
# credentials and variables
subscription = '<subscription_id>'
resource_group = '<resource_group>'
workspace = '<workspace>'
datastore_name = '<datastore>'
path_on_datastore = '<path>'
file = '<myfile.csv>'
# generate uri:
uri = f'azureml://subscriptions/{subscription}/resourcegroups/{resource_group}/workspaces/{workspace}/datastores/{datastore_name}/paths/{path_on_datastore}/{file}'
# read via pandas
df = pd.read_csv(uri)
See Azure Machine Learning - Access Data from Azure Cloud Storage During Interactive Development for details.
Alternatively, you can use the AzureMachineLearningFileSystem
class from the same package:
import pandas
from azureml.fsspec import AzureMachineLearningFileSystem
# instantiate the file system from the datastore URI
# (note the plural 'datastores' segment, matching the accepted format)
fs = AzureMachineLearningFileSystem('azureml://subscriptions/<subid>/resourcegroups/<rgname>/workspaces/<workspace_name>/datastores/<datastorename>')
fs.ls()  # list folders/files in the datastore
# use an open context
with fs.open('./folder1/file1.csv') as f:
    # process the open file handle, e.g. load it into a DataFrame
    df = pandas.read_csv(f)