
Azure Machine Learning - Access uri_folder dataset (ML v2) from notebook (not job)

G Cocci 226 Reputation points Microsoft Employee
2022-10-26T14:26:01.98+00:00

Hi all,

I've registered a Data Lake Gen2 datastore in Azure Machine Learning that points to a container with a hierarchy of folders containing Avro files, and on top of it I registered a uri_folder dataset (ML v2).

Now I want to access these folders from a notebook and load them into a pandas DataFrame to do some data exploration.

I searched the documentation, but I only found examples that run a job and use this type of dataset as an input. I need to be able to explore it from a notebook.

Is it possible? How can I do it?

Thanks

Azure Data Lake Storage

An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.

Azure Machine Learning

Answer accepted by question author
  1. Ramr-msft 17,836 Reputation points
    2022-10-27T12:45:05.14+00:00

    @G Cocci Thanks for the question. Accessing the Avro files this way is currently not supported. Here is the documentation for accessing the datastore using a uri_folder dataset:
    https://learn.microsoft.com/en-us/azure/machine-learning/migrate-to-v2-resource-datastore

    Mapping Data Flow supports Avro as a source type: https://learn.microsoft.com/en-us/azure/data-factory/data-flow-source#supported-sources
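
A possible workaround that is not confirmed anywhere in this thread: the sketch below lists the Avro files under a folder and concatenates their records into one pandas DataFrame. It assumes the `azureml-fsspec`, `fastavro`, and `pandas` packages are installed in the notebook kernel, that `folder_uri` is an `azureml://` datastore or data asset URI, and that the glob pattern matches your folder layout. The helper names `select_avro_paths` and `load_avro_folder` are hypothetical.

```python
from typing import Iterable, List


def select_avro_paths(paths: Iterable[str]) -> List[str]:
    """Keep only paths ending in .avro (case-insensitive), sorted for a stable order."""
    return sorted(p for p in paths if p.lower().endswith(".avro"))


def load_avro_folder(folder_uri: str, pattern: str = "**/*.avro"):
    """Read every Avro file matched by `pattern` into a single pandas DataFrame.

    Assumption: `folder_uri` is an azureml:// URI that the notebook identity
    is allowed to read, and all files share a compatible schema.
    """
    # Imported lazily so select_avro_paths stays usable without these packages.
    import pandas as pd
    from azureml.fsspec import AzureMachineLearningFileSystem
    from fastavro import reader

    fs = AzureMachineLearningFileSystem(folder_uri)
    records = []
    for path in select_avro_paths(fs.glob(pattern)):
        # Each Avro container file carries its own schema; reader() yields dicts.
        with fs.open(path, "rb") as handle:
            records.extend(reader(handle))
    return pd.DataFrame.from_records(records)
```

With the v2 SDK, the URI can typically be taken from the `path` attribute of the asset returned by `MLClient.data.get(name=..., version=...)`. Everything is loaded into memory, so for large folders it may be safer to narrow `pattern` to a single subfolder first.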


0 additional answers
