OSError: Cannot save file into a non-existent directory

Question

I am using Azure ML Studio to read data from a CSV file through a data asset named test5, and to write the data to a CSV file in my current working directory. The write step is failing. I am submitting a job using a compute cluster and a custom environment, following the instructions in this tutorial: https://learn.microsoft.com/en-us/azure/machine-learning/tutorial-azure-ml-in-a-day

I have written the code in a notebook cell as:

# Handle to the workspace  
from azure.ai.ml import MLClient  
  
# Authentication package  
from azure.identity import DefaultAzureCredential  
credential = DefaultAzureCredential()  
  
# Get a handle to the workspace  
ml_client = MLClient(  
    credential=credential,  
    subscription_id="abc",  
    resource_group_name="xyz",  
    workspace_name="pqr",  
)  
from azure.ai.ml import command  
from azure.ai.ml import Input  
  
registered_model_name = "read_data"  
env_name = "docker-context"  
job = command(inputs=dict(  
        data=Input(  
            type="uri_file",  
            path="azureml:test5:1",  
        ),  
        registered_model_name=registered_model_name  
    ),     
    code="./src/",  # location of source code  
    command="python main.py --data ${{inputs.data}} --registered_model_name ${{inputs.registered_model_name}}",  
    environment="docker-context:10",  
    compute="amlcluster01",  
    experiment_name="read_data1",  
    display_name="read_data2",  
    )  
ml_client.create_or_update(job)

This works fine. The content of main.py is:

import os  
import argparse  
import pandas as pd  
  
def main():  
    print("Hello")  
    # input and output arguments
    parser = argparse.ArgumentParser()  
    parser.add_argument("--data", type=str, help="path to input data")  
    parser.add_argument("--registered_model_name", type=str, help="model name")  
    args = parser.parse_args()  
    print(" ".join(f"{k}={v}" for k, v in vars(args).items()))  
    print("input data:", args.data)  
    read_data=pd.read_csv(args.data)  
    #read_data=pd.read_parquet(args.data, engine='pyarrow')  
    #credit_df = pd.read_excel(args.data, header=1, index_col=0)  
    print(read_data)  
    read_data.to_csv(r'/home/azureuser/cloudfiles/code/Users/Ankit19.Gupta/azureml-in-a-day/src/file3.csv')  
  
    print("Hello World !")  
  
if __name__ == "__main__":  
    main()

Here, all lines of code work fine except read_data.to_csv(r'/home/azureuser/cloudfiles/code/Users/Ankit19.Gupta/azureml-in-a-day/src/file3.csv').

It fails with the error message: OSError: Cannot save file into a non-existent directory:/home/azureuser/cloudfiles/code/Users/Ankit19.Gupta/azureml-in-a-day/src

Can anyone please help me save a DataFrame to a CSV file in my current working directory from a job? Any help would be appreciated.

Answer

Hello @Ankit19 Gupta

Thanks for using the Microsoft Q&A platform. For the official guidance on "Reading and Writing data in a job", please refer to the sample below for ML SDK v2: https://github.com/Azure/azureml-examples/blob/sdk-preview/sdk/assets/data/data.ipynb
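As a rough sketch of the script side of that pattern: the job runs on the compute cluster, so a path that exists in the notebook environment (such as /home/azureuser/cloudfiles/...) may simply not exist there. One option is to let main.py receive its output folder as an argument instead of hard-coding it. The `--output_dir` argument, its `./outputs` default, and the `output_file` helper below are hypothetical names for illustration, not part of the original script. On the job side this would typically be paired with an `outputs` entry that is passed into the command string, which is not shown here.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the script arguments; argv=None means "use sys.argv"."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--data", type=str, help="path to input data")
    # Hypothetical argument: a folder the job is allowed to write to.
    parser.add_argument(
        "--output_dir",
        type=str,
        default="./outputs",
        help="folder to write results into",
    )
    return parser.parse_args(argv)


def output_file(args, filename):
    """Return the path of `filename` inside args.output_dir.

    The folder (and any missing parents) is created first, so a later
    write to the returned path does not fail on a missing directory.
    """
    out_dir = Path(args.output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir / filename
```

In main.py the write would then become something like `read_data.to_csv(output_file(args, "file3.csv"))`, so the script no longer depends on a directory that only exists outside the job.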

If that's not what you want, I did some research and found a thread about the same issue on Stack Overflow: https://stackoverflow.com/questions/47143836/pandas-dataframe-to-csv-raising-ioerror-no-such-file-or-directory

It seems this error occurs because to_csv creates the file if it doesn't exist, but it does not create directories that don't exist. Make sure the directory you are trying to save your file in has been created first, as below:

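import os
import pandas as pd

df = pd.DataFrame({"a": [1, 2]})  # placeholder data for illustration

outname = 'name.csv'
outdir = './dir'

# Create the directory (and any missing parents); no error if it already exists
os.makedirs(outdir, exist_ok=True)

fullname = os.path.join(outdir, outname)

df.to_csv(fullname)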
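One detail worth checking for a deeply nested path like the one in the question: `os.mkdir` only creates the last path component, while `os.makedirs` also creates any missing parent folders. The small sketch below shows the difference in a temporary folder; the nested folder names are made up for illustration.

```python
import os
import tempfile

# Work inside a throwaway temp folder so nothing is left behind.
with tempfile.TemporaryDirectory() as base:
    # Hypothetical nested path whose parent folders do not exist yet.
    nested = os.path.join(base, "Users", "someone", "src")

    try:
        os.mkdir(nested)  # only creates the last component
        mkdir_ok = True
    except FileNotFoundError:
        mkdir_ok = False  # parents "Users/someone" are missing

    os.makedirs(nested, exist_ok=True)  # creates all missing parents
    makedirs_ok = os.path.isdir(nested)

print("os.mkdir succeeded:", mkdir_ok)
print("os.makedirs succeeded:", makedirs_ok)
```

So when the target folder may be several levels deep, `os.makedirs(outdir, exist_ok=True)` is the safer call before `to_csv`.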

Please give it a try. I hope this helps; let me know how it goes and we are happy to help.

Regards,
Yutong

Please kindly accept the answer if you find it helpful, to support the community. Thanks a lot.
