Azure ML Dataset and Snapshot

keonabut 11 Reputation points Microsoft Employee
2020-06-01T22:24:46.117+00:00

Hi experts,

My customer wants to snapshot datasets for reproducibility. I found the method "create_snapshot", but it is deprecated. Is there an alternative way to snapshot a dataset?

Thanks,
Keita

Azure Machine Learning

1 answer

  1. GiftA-MSFT 11,151 Reputation points
    2020-06-02T21:08:04.04+00:00

    Currently, datasets don't have snapshot capabilities. However, you can develop a heuristic where you snapshot the data yourself at the blob level (that is, if the customer stores the data in Blob storage). With the new dataset API, you can version and track datasets. A version refers to your data but does not create a point-in-time snapshot of it, so if the underlying files change, the version changes with them. Hence, we recommend that you organize your data into folders, so that each batch of new data is written to its own new folder. An earlier version then refers only to the old folder, while a newer version refers to the old folder plus the new folder. Please check out this document on how to version and track Azure Machine Learning datasets for reproducibility.
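
The folder-per-batch approach from the answer above could be sketched roughly as follows with the `azureml-core` (v1) Python SDK. This is a minimal sketch, not an official recipe: the helper names (`drop_folder`, `version_globs`, `register_version`), the dataset name `sales`, and the date-named folder layout are all hypothetical, and it assumes a file dataset on a registered blob datastore. Because a version only references paths, reproducibility still depends on never modifying files inside a folder once it has been written.

```python
from datetime import date


def drop_folder(base: str, day: date) -> str:
    """Return the folder for one data drop, e.g. 'sales/2020-06-01' (hypothetical layout)."""
    return f"{base.rstrip('/')}/{day.isoformat()}"


def version_globs(folders):
    """Return one glob pattern per folder, matching every file underneath it."""
    return [f"{folder.rstrip('/')}/**" for folder in folders]


def register_version(workspace, datastore, name, folders):
    """Register a new dataset version that references exactly the given folders.

    azureml-core is imported lazily so the two helpers above can be used
    (and tested) without the SDK installed.
    """
    from azureml.core import Dataset

    paths = [(datastore, pattern) for pattern in version_globs(folders)]
    dataset = Dataset.File.from_files(path=paths)
    # create_new_version=True adds a version under the same name instead of failing
    # because the name is already registered.
    return dataset.register(workspace=workspace, name=name, create_new_version=True)


# Usage sketch (needs a real workspace; the datastore and dataset names are assumptions):
#
#   from azureml.core import Workspace, Datastore, Dataset
#   ws = Workspace.from_config()
#   store = Datastore.get(ws, "workspaceblobstore")
#   day1 = drop_folder("sales", date(2020, 6, 1))
#   day2 = drop_folder("sales", date(2020, 6, 2))
#   register_version(ws, store, "sales", [day1])          # old folder only
#   register_version(ws, store, "sales", [day1, day2])    # old folder plus new folder
#   first = Dataset.get_by_name(ws, "sales", version=1)   # retrieve the earlier version
```

Passing the full list of folders for each version keeps every version explicit about which data it covers, rather than relying on a single wildcard over a parent folder whose contents grow over time.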