Hi,
I need to create a dataset in Azure Machine Learning service from an Azure Data Lake Gen2 registered as a Datastore. The data in the lake consists of thousands of Avro files written by Event Hub Capture following the pattern [EventHub]/[Partition]/[YYYY]/[MM]/[DD]/[HH]/[mm]/[ss], so there is one path for each file.
According to the datasets documentation it is recommended "... creating dataset referencing less than 100 paths in datastores for optimal performance."
What would be the alternative/recommended approach for my application? Streaming data is continuously captured by the Event Hub.
Thanks
Hi,
You can create the dataset with a globbing pattern, so a single path specification covers all the captured files instead of one path per file:

from azureml.core import Dataset
ds = Dataset.File.from_files(path=(datastore, '[EventHub]/[Partition]/**'))

The mount time should be less than 1 min.
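To see why a single `**` pattern is enough for the deeply nested capture layout, here is a purely local sketch using Python's `pathlib` (this only illustrates glob matching; the directory names and file contents are made up and this is not the Azure ML API):

```python
import tempfile
from pathlib import Path

# Mimic the Event Hub Capture layout: hub/partition/YYYY/MM/DD/HH/mm/ss/file.avro
root = Path(tempfile.mkdtemp())
for partition in ("0", "1"):
    for second in ("00", "30"):
        leaf = root / "myhub" / partition / "2023" / "01" / "15" / "10" / "05" / second
        leaf.mkdir(parents=True)
        (leaf / "capture.avro").write_text("")  # empty stand-in for an Avro file

# One recursive glob matches every file, however deep the date/time folders go
files = sorted(root.glob("myhub/*/**/*.avro"))
print(len(files))  # 4: both partitions, both timestamps
```

The same idea applies to the dataset definition above: `'[EventHub]/[Partition]/**'` resolves to all captured files under that prefix, so the dataset references one glob path rather than thousands of individual paths.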