How do I prevent a dataset in AzureML from adding subfolders?

Sjoerd Braaksma 0 Reputation points
2023-05-23T07:13:18.3133333+00:00

I have been working with datasets in Azure Machine Learning and recently discovered the handy new (experimental) filter feature for datasets.

Our current dataset consists of the following folder structure (in reality it is, of course, a flat file system):

main_folder:

  • Contains 848 images for training, .tif format
  • sub_folder
    • Contains cutouts & transformations of the above images, ~900,000, also .tif format.

I only want to create a dataset from the 848 files contained directly in main_folder. However, every way of doing so (whether filtering by extension, size, etc.) has some major limitations in this case, namely:

  1. It doesn't change the registered dataset. The registered dataset always contains all subfolders.
  2. Every filter action still loops over the ~900,000 images in sub_folder to check the condition, making every operation very slow.

Is there a way to create a dataset from a folder without evaluating every sub-folder below it?
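
To make the goal concrete, here is a minimal sketch of the kind of thing I am after. It assumes the v1 Python SDK (azureml-core) and that a single-level glob such as `main_folder/*.tif` is resolved without descending into sub_folder. That behaviour is an assumption on my part, not something I have confirmed. The helper names and the datastore name are hypothetical; only `Datastore.get` and `Dataset.File.from_files` are SDK calls.

```python
from pathlib import PurePosixPath


def top_level_glob(folder: str, extension: str) -> str:
    """Build a single-level glob such as 'main_folder/*.tif' (no '**')."""
    return f"{folder.strip('/')}/*.{extension.lstrip('.')}"


def matches_top_level(blob_path: str, folder: str, extension: str) -> bool:
    """Local illustration of the intended semantics: direct children only."""
    path = PurePosixPath(blob_path)
    wanted_suffix = f".{extension.lstrip('.')}"
    return str(path.parent) == folder.strip("/") and path.suffix == wanted_suffix


def create_top_level_dataset(workspace, datastore_name: str, folder: str,
                             extension: str = "tif"):
    """Create a FileDataset from the files directly inside `folder` (SDK v1)."""
    # Imported lazily so the helpers above work without azureml-core installed.
    from azureml.core import Dataset, Datastore

    datastore = Datastore.get(workspace, datastore_name)
    # Assumption: '*' stays within one folder level; only '**' would recurse.
    pattern = top_level_glob(folder, extension)
    return Dataset.File.from_files(path=(datastore, pattern))
```

Usage would be something like `create_top_level_dataset(ws, "my_datastore", "main_folder")`, followed by registering the returned dataset. Whether the service still lists sub_folder while resolving the pattern is exactly what I am unsure about.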