Azure Machine Learning - Access uri_folder dataset (ML v2) from notebook (not job)

G Cocci 211 Reputation points Microsoft Employee
2022-10-26T14:26:01.98+00:00

Hi all,

I've registered a Data Lake Gen2 datastore in Azure Machine Learning that points to a container holding a hierarchy of folders with Avro files, and on top of it I registered a uri_folder dataset (ML v2).

Now I want to access these folders from a notebook and convert them into a pandas DataFrame to do some data exploration.

I searched the documentation but only found examples that run a job and use this type of dataset as an input. I need to be able to explore it from a notebook.

Is it possible? How can I do it?

Thanks

Azure Data Lake Storage
Azure Machine Learning

Accepted answer
  1. Ramr-msft 17,741 Reputation points
    2022-10-27T12:45:05.14+00:00

    @G Cocci Thanks for the question. Reading the Avro files directly is currently not supported. Here is the documentation for accessing the datastore with a uri_folder dataset:
    https://learn.microsoft.com/en-us/azure/machine-learning/migrate-to-v2-resource-datastore

    Mapping Data Flow in Azure Data Factory supports Avro as a source type: https://learn.microsoft.com/en-us/azure/data-factory/data-flow-source#supported-sources
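
For readers who want to try this from a notebook anyway, here is a minimal sketch of one possible approach. It is not confirmed anywhere in this thread. It assumes the `azureml-fsspec`, `fastavro`, and `pandas` packages are installed, that the installed SDK version can open the datastore through `AzureMachineLearningFileSystem`, and that its `glob` follows the usual fsspec pattern rules. The helper names `build_datastore_uri`, `avro_glob_pattern`, and `read_avro_folder` are hypothetical and exist only for illustration.

```python
def build_datastore_uri(subscription_id, resource_group, workspace, datastore):
    """Build the long-form azureml:// URI that identifies a registered datastore."""
    return (
        f"azureml://subscriptions/{subscription_id}"
        f"/resourcegroups/{resource_group}"
        f"/workspaces/{workspace}"
        f"/datastores/{datastore}"
    )


def avro_glob_pattern(folder):
    """Return a pattern matching every .avro file below `folder` (hypothetical helper)."""
    return "/" + folder.strip("/") + "/**/*.avro"


def read_avro_folder(datastore_uri, folder):
    """Read all Avro files under `folder` into a single pandas DataFrame."""
    # Imported lazily so the pure helpers above work without these packages.
    import pandas as pd
    from fastavro import reader
    from azureml.fsspec import AzureMachineLearningFileSystem

    fs = AzureMachineLearningFileSystem(datastore_uri)
    records = []
    # Assumption: glob accepts a recursive "**" pattern relative to the datastore root.
    for file_path in fs.glob(avro_glob_pattern(folder)):
        with fs.open(file_path) as f:
            # fastavro's reader yields one dict per Avro record.
            records.extend(reader(f))
    return pd.DataFrame.from_records(records)
```

In a notebook this would be called as `df = read_avro_folder(build_datastore_uri("<subscription>", "<resource-group>", "<workspace>", "<datastore>"), "<folder>")`, with the placeholders replaced by your own values. All records are collected in memory before the DataFrame is built, so for a large folder hierarchy it may be better to start with a narrower folder.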

