Data Labeling project fails with “Dataset refresh has failed” despite valid PNGs and credentials

Question

Matti Akbari (Microsoft Employee)

Description:

I'm stuck trying to create a Data Labeling project in Azure Machine Learning Studio, and I keep getting this error:

"The dataset refresh has failed. Verify the project's datastore credentials are correct and the dataset contains datapoints."

I've spent hours troubleshooting this. Here's what I've confirmed:

✅ My Blob Storage container has valid .png images (no nested folders, standard size, visible in portal preview).

✅ I registered the dataset using a script, and ml_client.data.get() confirms the path is correct.

✅ My workspace managed identity has these roles on the storage account:
– Reader
– Storage Blob Data Reader

✅ My own user identity also has those same roles.

✅ I enabled “Use workspace managed identity for data preview and profiling” during datastore credential setup.

✅ I'm using identity-based access, not shared keys.

✅ I even deleted and recreated the ML workspace, datastore, and dataset from scratch. Same issue.

✅ I tried flattening the folder structure (no recursion), renaming the dataset, and re-saving it with different paths. Still nothing.

🧠 I suspect the Data Labeling UI is failing to preview the dataset properly even though it’s 100% accessible from code and the images are valid. Maybe a regression? It was working before.
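
For anyone trying to reproduce this, the ml_client.data.get() check mentioned above can be sketched roughly as follows. This is a minimal sketch, not the exact script used: the helper names (summarize_data_asset, build_client), the asset name and version in the usage note, and the "single PNG" heuristic are all assumptions made for illustration. It only reports what the registered asset says about itself, so the asset type and path can be compared against what the labeling project expects.

```python
def summarize_data_asset(ml_client, name, version):
    """Fetch a registered data asset and return the fields worth checking.

    `ml_client` is expected to behave like azure.ai.ml.MLClient,
    i.e. expose `data.get(name=..., version=...)`.
    """
    asset = ml_client.data.get(name=name, version=version)
    path = str(asset.path)
    return {
        "name": asset.name,
        "version": asset.version,
        "type": str(asset.type),  # e.g. "uri_folder" or "uri_file"
        "path": path,
        # Assumption for illustration only: flag a path that ends in ".png",
        # since that would register one image instead of the whole container.
        "points_at_single_png": path.lower().endswith(".png"),
    }


def build_client(subscription_id, resource_group, workspace):
    """Create a real MLClient using identity-based auth.

    The SDK imports are local so summarize_data_asset can be exercised
    without azure-ai-ml installed.
    """
    from azure.ai.ml import MLClient
    from azure.identity import DefaultAzureCredential

    return MLClient(
        DefaultAzureCredential(),
        subscription_id=subscription_id,
        resource_group_name=resource_group,
        workspace_name=workspace,
    )
```

Hypothetical usage: summarize_data_asset(build_client(sub_id, rg, ws), "labeling-images", "1"), where the subscription, resource group, workspace, asset name, and version are placeholders. Note that this only confirms what the SDK returns for the asset; it does not exercise the preview path the Data Labeling UI uses.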
