Azureml dataset consumption does not work when whitespaces are in the file name

Jan-Ruben Schmid 6 Reputation points
2022-02-17T15:21:46.257+00:00

Hello,
Previously, when I exported data from a data labeling project to an Azure ML dataset, I could consume it with azureml.contrib.dataset.

This no longer seems to be supported, so I tried to download the data with the Dataset class from azureml.core instead.
The download seems to work only for files that have no whitespace in their names.
The dataset is a Tabular dataset. I download it with:

from azureml.core import Workspace, Dataset
workspace = Workspace(subscription_id, resource_group, workspace_name)
dataset = Dataset.get_by_name(workspace, name='my_dataset_name')
dataset.download('image_url', target_path='./', overwrite=True)

Error:

AzureMLException:
Message: Some files have failed to download:('workspaceblobstore/data/quay_data_cra/IMG_0082%20copy%207.jpg', 'Microsoft.DPrep.ErrorValues.DownloadFailed')
('workspaceblobstore/data/quay_data_cra/IMG_0543%20copy%205.jpg', 'Microsoft.DPrep.ErrorValues.DownloadFailed')
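
The `%20` sequences in the failed paths are the percent-encoded form of a space, so both entries correspond to file names that contain spaces. The following is a minimal sketch, using only the Python standard library, that decodes the reported paths back to their literal names. The names `failed_paths` and `decode_path` are introduced here for illustration only; this does not fix the download, it only helps identify which files are affected.

```python
from urllib.parse import quote, unquote

# Paths copied from the error message above.
failed_paths = [
    "workspaceblobstore/data/quay_data_cra/IMG_0082%20copy%207.jpg",
    "workspaceblobstore/data/quay_data_cra/IMG_0543%20copy%205.jpg",
]


def decode_path(path: str) -> str:
    """Turn percent-encoded sequences such as %20 back into literal characters."""
    return unquote(path)


decoded_paths = [decode_path(p) for p in failed_paths]

for original, decoded in zip(failed_paths, decoded_paths):
    print(f"{original} -> {decoded}")
```

Re-encoding a decoded name with `quote` gives back the form shown in the error message, which is consistent with the spaces being the only special characters in these names.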

How can I download the dataset?

Azure Machine Learning

4 answers

  1. YutongTie-MSFT 51,766 Reputation points
    2022-03-01T10:57:25.963+00:00

    Hello @Jan-Ruben Schmid

    We are sorry that we have not heard back from you. As an update, I have forwarded this issue to the product team to investigate.

    As a workaround, please try "file_handling_option" to see if that works for you. Please let us know if you are still blocked. Thanks a lot.

    Regards,
    Yutong


  2. Jan-Ruben Schmid 6 Reputation points
    2022-03-08T11:26:04.7+00:00

    Hello @YutongTie-MSFT ,
    I have just installed the latest version of azureml-core (1.39.0), and the issue still does not seem to be fixed. Is there any progress on your side?

    Regards,
    Jan-Ruben


  3. Edgar Bahilo Rodríguez 6 Reputation points
    2022-03-25T16:39:32.207+00:00

    Hi @YutongTie-MSFT

    I have exactly the same problem despite my files not having blanks.

    My Azure dataset name is:

    preprocess_config.json

    The path in the blob store is:

    preprocess_configs/preprocess_config_day2022-03-24_16:28.json.

    When trying to download:

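    import argparse

    from azureml.core import Dataset, Workspace

    # Assumption: the workspace is loaded from a local config.json;
    # the original snippet does not show how `ws` was created.
    ws = Workspace.from_config()

    parser = argparse.ArgumentParser()

    # DEFAULT ARGUMENTS
    parser.add_argument('--dataset', type=str,
                        help='Data to score')
    parser.add_argument("--model_name", type=str,
                        help="Model to be used for scoring",
                        default="model_12")
    parser.add_argument("--register_config_name", type=str,
                        help="Name of the registered preprocessing config dataset",
                        default="preprocess_config.json")
    args, _ = parser.parse_known_args()

    # Look up the registered dataset by name and download its files
    preprocessing_config_dataset = Dataset.get_by_name(ws, name=args.register_config_name, version='latest')
    test = preprocessing_config_dataset.download(target_path='.', overwrite=True)
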
    > Traceback (most recent call last):  
    >   File "<stdin>", line 1, in <module>  
    >   File "/anaconda/envs/xgboost_gpu/lib/python3.8/site-packages/azureml/data/_loggerfactory.py", line 132, in wrapper  
    >     return func(*args, **kwargs)  
    >   File "/anaconda/envs/xgboost_gpu/lib/python3.8/site-packages/azureml/data/file_dataset.py", line 177, in download  
    >     download_list = _get_and_validate_download_list(download_records,  
    >   File "/anaconda/envs/xgboost_gpu/lib/python3.8/site-packages/azureml/data/file_dataset.py", line 687, in _get_and_validate_download_list  
    >     _download_error_handler(error_list)  
    >   File "/anaconda/envs/xgboost_gpu/lib/python3.8/site-packages/azureml/data/dataset_error_handling.py", line 188, in _download_error_handler  
    >     raise UserErrorException(message) if all_user_errors else AzureMLException(message)  
    > azureml.exceptions._azureml_exception.UserErrorException: UserErrorException:  
    >         Message: Some files have failed to download:('Path', 'Microsoft.DPrep.ErrorValues.InvalidArgument')  
    >         InnerException None  
    >         ErrorResponse   
    > {  
    >     "error": {  
    >         "code": "UserError",  
    >         "message": "Some files have failed to download:('Path', 'Microsoft.DPrep.ErrorValues.InvalidArgument')"  
    >     }  
    > }  
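
The blob path in this answer has no spaces, but it does contain a colon (`16:28`). Nothing in this thread confirms that the colon is what triggers `InvalidArgument`; it is only a guess worth ruling out, since some file systems (Windows, for example) do not allow colons in file names. The sketch below flags characters that may deserve a second look before a path is assumed to be clean. The rule set `SUSPECT_PATTERNS` and the helper `suspect_features` are hypothetical and are not part of the Azure ML SDK.

```python
import re
from typing import List

# Hypothetical rule set, not taken from Azure ML: patterns to double-check
# before assuming that a blob path is unproblematic.
SUSPECT_PATTERNS = {
    "whitespace": re.compile(r"\s"),
    "percent-encoding": re.compile(r"%[0-9A-Fa-f]{2}"),
    "colon": re.compile(r":"),
}


def suspect_features(path: str) -> List[str]:
    """Return the names of all suspect patterns that occur in the path."""
    return [name for name, pattern in SUSPECT_PATTERNS.items() if pattern.search(path)]


# Blob path quoted in the answer above.
blob_path = "preprocess_configs/preprocess_config_day2022-03-24_16:28.json"
print(suspect_features(blob_path))
```

If a check like this flags a path, one way to test the guess would be to register a copy of the file under a name without the flagged character and retry the download.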
    

  4. Anonymous
    2023-05-02T21:21:19.15+00:00

    Did this issue get resolved? I am facing the same problem, and file_handling_option does not work.

