OSError: Cannot save file into a non-existent directory


Ankit19 Gupta 46

I am using Azure ML Studio to read data from a CSV file through a data asset, test5, and to write data to a CSV file in my current working directory (which is failing). I am submitting a job using a compute cluster and a custom environment, following the instructions from the tutorial: https://learn.microsoft.com/en-us/azure/machine-learning/tutorial-azure-ml-in-a-day

I have written the code in a notebook cell as:

# Handle to the workspace  
from azure.ai.ml import MLClient  
  
# Authentication package  
from azure.identity import DefaultAzureCredential  
credential = DefaultAzureCredential()  
  
# Get a handle to the workspace  
ml_client = MLClient(  
    credential=credential,  
    subscription_id="abc",  
    resource_group_name="xyz",  
    workspace_name="pqr",  
)  
from azure.ai.ml import command  
from azure.ai.ml import Input  
  
registered_model_name = "read_data"  
env_name = "docker-context"  
job = command(inputs=dict(  
        data=Input(  
            type="uri_file",  
            path="azureml:test5:1",  
        ),  
        registered_model_name=registered_model_name  
    ),     
    code="./src/",  # location of source code  
    command="python main.py --data ${{inputs.data}} --registered_model_name ${{inputs.registered_model_name}}",  
    environment="docker-context:10",  
    compute="amlcluster01",  
    experiment_name="read_data1",  
    display_name="read_data2",  
    )  
ml_client.create_or_update(job)

This works fine. The content of the main.py is:

import os  
import argparse  
import pandas as pd  
  
def main():  
    print("Hello")  
    # input and output arguments  
    parser = argparse.ArgumentParser()  
    parser.add_argument("--data", type=str, help="path to input data")  
    parser.add_argument("--registered_model_name", type=str, help="model name")  
    args = parser.parse_args()  
    print(" ".join(f"{k}={v}" for k, v in vars(args).items()))  
    print("input data:", args.data)  
    read_data=pd.read_csv(args.data)  
    #read_data=pd.read_parquet(args.data, engine='pyarrow')  
    #credit_df = pd.read_excel(args.data, header=1, index_col=0)  
    print(read_data)  
    read_data.to_csv(r'/home/azureuser/cloudfiles/code/Users/Ankit19.Gupta/azureml-in-a-day/src/file3.csv')  
  
    print("Hello World !")  
  
if __name__ == "__main__":  
    main()

Here, all lines of code work fine except read_data.to_csv(r'/home/azureuser/cloudfiles/code/Users/Ankit19.Gupta/azureml-in-a-day/src/file3.csv').

It shows the error message as: OSError: Cannot save file into a non-existent directory:/home/azureuser/cloudfiles/code/Users/Ankit19.Gupta/azureml-in-a-day/src

Can anyone please help me save a DataFrame to a CSV file in my current working directory from a job? Any help would be appreciated.

Answer 1

Hello @Ankit19 Gupta,

Thanks for using the Microsoft Q&A platform. For the official guidance on reading and writing data in a job, please refer to the sample below for ML SDK v2: https://github.com/Azure/azureml-examples/blob/sdk-preview/sdk/assets/data/data.ipynb

If that's not what you want, I did some research and found a Stack Overflow thread about the same issue: https://stackoverflow.com/questions/47143836/pandas-dataframe-to-csv-raising-ioerror-no-such-file-or-directory

It seems this error occurs because to_csv creates the file if it doesn't exist, as you said, but it does not create directories that don't exist. Make sure the directory you are trying to save your file in has been created first, as below:

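import os

outname = 'name.csv'
outdir = './dir'

# Create the directory (and any missing parent directories) if it does not exist yet
os.makedirs(outdir, exist_ok=True)

fullname = os.path.join(outdir, outname)

# df is the DataFrame you want to save
df.to_csv(fullname)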
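
If the directory still appears to be missing after this, it may help to check what the running script actually sees. Below is a minimal diagnostic sketch that could be dropped into main.py; the helper name `nearest_existing_ancestor` is made up for illustration, and the target path is the one from the question. It only uses the standard library and reports how much of the path exists on the machine running the code.

```python
import os


def nearest_existing_ancestor(path):
    """Return the deepest directory along `path` that exists on this machine."""
    current = os.path.abspath(path)
    while not os.path.isdir(current):
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return current


# Path taken from the question; replace it with the directory you expect to exist
target_dir = "/home/azureuser/cloudfiles/code/Users/Ankit19.Gupta/azureml-in-a-day/src"

print("working directory:", os.getcwd())
print("target exists:", os.path.isdir(target_dir))
print("nearest existing ancestor:", nearest_existing_ancestor(target_dir))
```

If the nearest existing ancestor turns out to be far up the tree, the path most likely belongs to a different machine than the one running the script, and creating the directory there will not make the file show up where you expect it.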

Please give it a try. I hope this helps. Let me know how it goes, and we are happy to help further.

Regards,
Yutong

Please accept the answer if you find it helpful, to support the community. Thanks a lot.

Ankit19 Gupta 46 Reputation points

2022-10-24T07:10:31.077+00:00

Hi @YutongTie-MSFT, thank you for the answer. I have tried this code and I am still getting the same error. I have already created these folders in the Azure ML workspace.

The issue is this: when I submit the job, the code is uploaded to the cloud and runs under a different path there, and the file is saved in that working directory. I tried the following code to see which directory the file is saved in:

file_path = os.path.join(os.getcwd(), 'file3.csv')
read_data.to_csv(file_path)
print(f"read_data saved to {file_path}")

It shows the output as `read_data saved to /mnt/azureml/cr/j/28bec1ee580a400894b48e5d8576f6ca/exe/wd/file3.csv`.

So I can see the path of the current working directory in the cloud, but I am unable to access it through the terminal, and therefore I cannot save this file to the src directory in my Azure ML workspace. Could you please help?
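
One possible way around this, sketched under assumptions rather than taken from the thread: the job runs on the compute cluster, so its working directory is not the notebook's file system, and a file written there has to be sent to a location the job exposes. In SDK v2 a job can declare a folder output, for example `outputs=dict(out_dir=Output(type="uri_folder"))` with `Output` imported from `azure.ai.ml`, and pass it to the script by adding `--out_dir ${{outputs.out_dir}}` to the command string. The output name `out_dir`, the helper names, and the `./outputs` default below are all hypothetical choices for illustration.

```python
import argparse
import os


def prepare_output_path(out_dir, filename):
    """Create out_dir if it is missing and return the full path for filename."""
    os.makedirs(out_dir, exist_ok=True)
    return os.path.join(out_dir, filename)


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    # --out_dir is a hypothetical argument; ./outputs is an assumed default
    parser.add_argument(
        "--out_dir", type=str, default="./outputs", help="folder for output files"
    )
    return parser.parse_args(argv)


# In main.py, after reading the data, the two helpers would be used like this:
#   args = parse_args()
#   read_data.to_csv(prepare_output_path(args.out_dir, "file3.csv"))
```

After the job completes, the file should then be listed under the job's outputs in the studio, and it can probably be fetched back with `ml_client.jobs.download(name=<job name>, output_name="out_dir", download_path=<local folder>)`. Treat the exact arguments as something to confirm against the SDK reference.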
