Getting invalid dataflow in Azure ML Studio batch inference. Worked weeks ago

David-3633 131 Reputation points
2023-10-11T16:31:17.99+00:00

I have some code that performs batch inference on Azure. The directories I want to run inference on are listed in an MLTable file, which I then pass to the batch inference pipeline component. It worked fine a few weeks ago when I last modified it, but when I ran it yesterday something had changed, and I now get an error saying the dataflow is invalid:

[2023-10-11 16:07:46Z] Job failed, job RunId is <run id>. Error: {"Error":{"Code":"UserError","Severity":null,"Message":"
Error Code: ScriptExecution.StreamAccess.Unexpected
Native Error: error in streaming from input data sources
    StreamError(Unknown(\"Dataflow at inmemory://dataflow/<dataflow id> is not valid.\", Some(DataflowInvalid(\"inmemory://dataflow/<dataflow id>\", VisitError(ExecutionError(StreamError(InvalidInput(InvalidUri { message: \"invalid uri format\", uri: \"azureml://subscriptions/<subscription id>/resourcegroups/<resource group name>/workspaces/<workspace name>/datastores/workspaceblobstore/paths/LocalUpload/<another id>/input_data/azureml://datastores/<datastore name>/paths/<folder name>/**.png\" }))))))))
=> Dataflow at inmemory://dataflow/<dataflow id> is not valid.
    Unknown(\"Dataflow at inmemory://dataflow/<dataflow id> is not valid.\", Some(DataflowInvalid(\"inmemory://dataflow/<dataflow id>\", VisitError(ExecutionError(StreamError(InvalidInput(InvalidUri { message: \"invalid uri format\", uri: \"azureml://subscriptions/<subscription id>/resourcegroups/<resource group name>/workspaces/<workspace name>/datastores/workspaceblobstore/paths/LocalUpload/<another id>/input_data/azureml://datastores/<datastore name>/paths/<folder name>/**.png\" })))))))
=> Dataflow at inmemory://dataflow/<dataflow id> is not valid.
    DataflowInvalid(\"inmemory://dataflow/<dataflow id>\", VisitError(ExecutionError(StreamError(InvalidInput(InvalidUri { message: \"invalid uri format\", uri: \"azureml://subscriptions/<subscription id>/resourcegroups/<resource group name>/workspaces/<workspace name>/datastores/workspaceblobstore/paths/LocalUpload/<another id>/input_data/azureml://datastores/<datastore name>/paths/<folder name>/**.png\" })))))
Error Message: Got unexpected error: Dataflow at inmemory://dataflow/<dataflow id> is not valid.. DataflowInvalid(\"inmemory://dataflow/<dataflow id>\", VisitError(ExecutionError(StreamError(InvalidInput(InvalidUri { message: \"invalid uri format\", uri: \"azureml://subscriptions/<subscription id>/resourcegroups/<resource group name>/workspaces/<workspace name>/datastores/workspaceblobstore/paths/LocalUpload/<another id>/input_data/azureml://datastores/<datastore name>/paths/<folder name>/**.png\" })))))| session_id=<session id>","MessageFormat":null,"MessageParameters":{},"ReferenceCode":null,"DetailsUri":null,"Target":null,"Details":[],"InnerError":null,"DebugInfo":null,"AdditionalInfo":null},"Correlation":null,"Environment":null,"Location":null,"Time":"0001-01-01T00:00:00+00:00","ComponentName":"CommonRuntime"}

(I've cleaned out all IDs as I don't know what's sensitive)
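
One detail in the log that may help narrow this down: the URI reported as invalid contains the `azureml://` scheme twice, which suggests (this is a guess from the log alone, not confirmed behaviour) that the short-form URI was treated as a relative path and appended to the upload location. A minimal sketch for spotting this in a resolved URI, where `split_embedded_uri` is a hypothetical helper and not part of any Azure SDK:

```python
SCHEME = "azureml://"


def split_embedded_uri(uri: str):
    """Return (prefix, embedded_uri) if a second azureml:// appears inside uri, else None."""
    # Start searching after the leading scheme so only a nested occurrence matches.
    idx = uri.find(SCHEME, len(SCHEME))
    if idx == -1:
        return None
    return uri[:idx], uri[idx:]


# The URI from the error message above, with the same placeholders.
failing = (
    "azureml://subscriptions/<subscription id>/resourcegroups/<resource group name>"
    "/workspaces/<workspace name>/datastores/workspaceblobstore/paths/LocalUpload/"
    "<another id>/input_data/"
    "azureml://datastores/<datastore name>/paths/<folder name>/**.png"
)

result = split_embedded_uri(failing)
if result is not None:
    prefix, embedded = result
    print("base location:", prefix)
    print("embedded URI: ", embedded)
```

The embedded part is exactly the short-form URI I wrote, and the prefix is a `LocalUpload` path on `workspaceblobstore` that I never specified.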

I had been using short-form URIs like

azureml://datastores/{datastore}/paths/{folder}/**.png

as used here, for example, but this appears to have suddenly broken. I tried switching to fully qualified URIs (including my subscription ID, resource group, etc.), but that doesn't work either: it seems to add an extra slash after the scheme, which I didn't have in my code:

azureml uri must follow pattern azureml://subscriptions/<subscription>/resourcegroups/<resourcegroup>/workspaces/<workspace>/...", uri: "azureml:///subscriptions/<subscription id>/resourcegroups/<resource group name>/workspaces/<workspace name>/datastores/<datastore name>/paths/<folder name>/**.png"

Note that I replaced the subscription name etc. with placeholders in the second string.

The Python string I had defined was

f'azureml://datastores/<datastore name>/paths/{folder}/**.png'
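
To rule out a malformed string on my side, the URIs can be checked before they are written into the MLTable. This is only a sketch: the two patterns are inferred from the URIs and the error text quoted in this post, not from an official grammar, and `classify_uri` and `build_short_uri` are hypothetical helper names.

```python
import re

# Short form, as in azureml://datastores/{datastore}/paths/{folder}/**.png
SHORT_FORM = re.compile(r"^azureml://datastores/[^/]+/paths/.+$")

# Long form, following the pattern quoted in the second error message.
LONG_FORM = re.compile(
    r"^azureml://subscriptions/[^/]+/resourcegroups/[^/]+"
    r"/workspaces/[^/]+/datastores/[^/]+/paths/.+$"
)


def classify_uri(uri: str) -> str:
    """Return 'long', 'short', or 'invalid' for an azureml:// datastore URI."""
    if LONG_FORM.match(uri):
        return "long"
    if SHORT_FORM.match(uri):
        return "short"
    return "invalid"


def build_short_uri(datastore: str, folder: str, pattern: str = "**.png") -> str:
    """Build a short-form URI, trimming stray slashes around the folder."""
    return f"azureml://datastores/{datastore}/paths/{folder.strip('/')}/{pattern}"


# 'mystore' and 'images' are made-up example values.
uri = build_short_uri("mystore", "images")
print(uri, "->", classify_uri(uri))

# The triple-slash variant from the second error is rejected by both patterns.
bad = "azureml:///subscriptions/s/resourcegroups/r/workspaces/w/datastores/d/paths/f/**.png"
print(bad, "->", classify_uri(bad))
```

My strings pass this kind of check, so the extra slash and the `LocalUpload` prefix seem to be added somewhere after my code hands the URI over.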

I need this fixed quite urgently, so any help would be appreciated.

Azure Machine Learning