Disk Full Error When Running Azure ML Jobs Using Custom Environments

Lasal Jayawardena 1 Reputation point
2022-11-08T11:34:03.717+00:00

I get a disk full error while running a model training job with the Azure ML SDK, launched from Azure DevOps. The job uses a custom environment from the Azure ML workspace.

I am using Azure CLI tasks in Azure DevOps to launch these training jobs. How can I resolve this?

Error Message:
"error": {
"code": "UserError",
"message": "{\"Compliant\":\"Disk full while running job. Please consider reducing amount of data accessed, or upgrading VM SKU. Total space: 14045 MB, available space: 1103 MB.\"}\n{\n \"code\": \"DiskFullError\",\n \"target\": \"\",\n \"category\": \"UserError\",\n \"error_details\": []\n}",
"messageParameters": {},
"details": []
},
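
To see where the space goes, a check along these lines could be added at the top of cnn_training.py to log free disk space before and after the data is downloaded. This is only a minimal sketch using the Python standard library: `disk_report` is a hypothetical helper name, the paths checked are assumptions about the compute node, and the MB definition may not match the one used in the error message.

```python
import shutil


def disk_report(path="/"):
    """Return total and available disk space for `path` in MB.

    Assumption: 1 MB = 1024 * 1024 bytes; the Azure ML error message
    may use a different definition.
    """
    usage = shutil.disk_usage(path)
    mb = 1024 * 1024
    return {
        "path": path,
        "total_mb": usage.total // mb,
        "available_mb": usage.free // mb,
    }


if __name__ == "__main__":
    # Paths are illustrative; adjust to wherever the job writes data.
    for path in ("/", "/tmp"):
        report = disk_report(path)
        print(
            f"{report['path']}: total {report['total_mb']} MB, "
            f"available {report['available_mb']} MB"
        )
```

Printing this at the start of the run and again after the download step would show whether the dataset itself or something else (for example the Docker image or the project snapshot) fills the roughly 14 GB disk.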

The .runconfig file is as follows:

framework: Python  
script: cnn_training.py  
communicator: None  
autoPrepareEnvironment: true  
maxRunDurationSeconds:  
nodeCount: 1  
environment:  
  name: cnn_training  
  python:  
    userManagedDependencies: true  
    interpreterPath: python  
  docker:  
    enabled: true  
    baseImage: 54646eeace594cf19143dad3c7f31661.azurecr.io/azureml/azureml_b17300b63a1c2abb86b2e774835153ee  
    sharedVolumes: true  
    gpuSupport: false  
    shmSize: 2g  
    arguments: []  
history:  
  outputCollection: true  
  snapshotProject: true  
  directoriesToWatch:  
  - logs  
dataReferences:  
  workspaceblobstore:  
    dataStoreName: workspaceblobstore  
    pathOnDataStore: dataname  
    mode: download  
    overwrite: true  
    pathOnCompute:   