Cannot upload local files to AzureML datastore (python SDK)

Schade 1 Reputation point
2020-07-08T08:37:19.043+00:00

Hi everybody,

I just started learning how to use MS Azure and I got stuck with an apparently trivial issue.

I have my own pet ML project, a python script that runs a classification analysis with Tensorflow and Keras.
It runs smoothly locally and I am happy with it.

Now I am trying to run this script on Azure ML, hoping to take advantage of the available computing power and, in general, to gain some experience with the Azure services. I am a bit old style and I like the idea of running my code from my local IDE rather than in a notebook. Because of this, I focused on the Python SDK libraries.

I created a free trial account on Azure and created a workspace. In order to adapt my original code to the new task, I followed the example at https://learn.microsoft.com/en-us/azure/machine-learning/service/tutorial-train-models-with-aml?WT.mc_id=aisummit-github-amynic

The problem arises when I try to upload my locally stored training data to the datastore of the workspace. The data is saved locally in a parquet file, about 70 MB in size. The transfer fails after some time with a ProtocolError, and after that it keeps retrying and failing with a NewConnectionError.

The snippet that reproduces the error is:

import numpy as np
import pandas as pd
from os import listdir
from os.path import join as osjoin

import azureml.core
from azureml.core import Workspace, Experiment, Dataset, Datastore
from azureml.core.compute import AmlCompute, ComputeTarget

workdir = "."

# Set up the Azure ML workspace:
# load the workspace configuration from the config.json file in the current folder.
try:
    ws = Workspace.from_config()
except Exception as e:
    print(f"Could not load AML workspace: {e}")

# Collect the local parquet files to upload
datadir = osjoin(workdir, "data")
local_files = [osjoin(datadir, f) for f in listdir(datadir) if ".parquet" in f]

# Get the workspace's default datastore and upload the prepared data
datastore = ws.get_default_datastore()
datastore.upload_files(files=local_files, target_path=None, show_progress=True)
  
  

Everything runs smoothly until the last line. The program starts to upload the file; I can see outbound traffic on my VPN monitor. Judging from the upload speed and the size of the file, I would say it uploads completely, or close to it, and then I get this message*:

WARNING - Retrying (Retry(total=2, connect=3, read=2, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', OSError("(10054, 'WSAECONNRESET')"))': /azureml-blobstore-xxx/creditcard.parquet?comp=block&blockid=TURBd01...TURB...RA%3D%3D  
WARNING - Retrying (Retry(total=1, connect=2, read=2, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x000002210A8BAF48>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed')': /azureml-blobstore-xxx/creditcard.parquet?comp=block&blockid=TURBd01...TURB...RA%3D%3D  
WARNING - Retrying (Retry(total=0, connect=1, read=2, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x000002210B446748>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed')': /azureml-blobstore-xxx/creditcard.parquet?comp=block&blockid=TURBd01...TURB...RA%3D%3D  
WARNING - Retrying (Retry(total=2, connect=2, read=3, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x000002210A8B5148>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed')': /azureml-blobstore-xxx/creditcard.parquet?comp=block&blockid=TURBd01...TURB...RA%3D%3D  
WARNING - Retrying (Retry(total=1, connect=1, read=3, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x000002210A891288>, 'Connection to creditfraudws2493375317.blob.core.windows.net timed out. (connect timeout=20)')': /azureml-blobstore-xxx/creditcard.parquet?comp=block&blockid=TURBd01...TURB...RA%3D%3D  
WARNING - Retrying (Retry(total=0, connect=0, read=3, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x000002210A8BD3C8>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed')': /azureml-blobstore-xxx/creditcard.parquet?comp=block&blockid=TURBd01...TURB...RA%3D%3D  
  

From the initial ProtocolError, I understand that the Azure cloud server bounces me back, but it is unclear to me why. Checking the workspace from the Azure portal, I would guess that the container of the workspace is still empty, but I am not 100% sure I checked that correctly.

Maybe I have misunderstood the different components of the storage services in Azure ML and I am not using the API correctly. Am I doing something wrong? Is there a way for me to extract more information about the reasons for this error?
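
The only extra diagnostic I can think of is enabling debug logging before the upload. A minimal sketch (the logger names "urllib3" and "azureml" are my assumptions about where the HTTP layer and the SDK log; adjust if your SDK version logs elsewhere):

import logging

# Sketch: route HTTP and SDK logs to the console at DEBUG level to see each request/retry.
logging.basicConfig(level=logging.DEBUG,
                    format="%(asctime)s %(name)s %(levelname)s %(message)s")
logging.getLogger("urllib3").setLevel(logging.DEBUG)
logging.getLogger("azureml").setLevel(logging.DEBUG)

# ...then run the upload as before:
# datastore.upload_files(files=local_files, target_path=None, show_progress=True)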

Thanks a lot in advance for any help you can provide

[*] (I manually edited portions of the error message obfuscating the blobstore name)


2 answers

  1. Jason Koh 6 Reputation points
    2020-10-21T22:53:06.057+00:00

    @romungi-MSFT

    I am having the same issue with the tutorial here: https://learn.microsoft.com/en-us/azure/machine-learning/tutorial-1st-experiment-bring-data

       WARNING - Retrying (Retry(total=0, connect=3, read=0, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection abo  
       rted.', timeout('The write operation timed out'))': /azureml-blobstore-6cf75ce5-9da9-4149-bcfd-c844582dc038/datasets/cifar10/cifar-10-batche  
       s-py/test_batch  
       Uploading ./data/cifar-10-batches-py/data_batch_2  
       --- Logging error ---
       Traceback (most recent call last):  
         File "/home/user01/ws/azure-ml-tutorial/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 677, in urlopen  
           chunked=chunked,  
         File "/home/user01/ws/azure-ml-tutorial/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 392, in _make_request  
           conn.request(method, url, **httplib_request_kw)  
         File "/home/user01/anaconda3/lib/python3.7/http/client.py", line 1252, in request  
           self._send_request(method, url, body, headers, encode_chunked)  
         File "/home/user01/anaconda3/lib/python3.7/http/client.py", line 1298, in _send_request  
           self.endheaders(body, encode_chunked=encode_chunked)  
         File "/home/user01/anaconda3/lib/python3.7/http/client.py", line 1247, in endheaders  
           self._send_output(message_body, encode_chunked=encode_chunked)  
         File "/home/user01/anaconda3/lib/python3.7/http/client.py", line 1065, in _send_output  
           self.send(chunk)  
         File "/home/user01/anaconda3/lib/python3.7/http/client.py", line 987, in send  
           self.sock.sendall(data)  
         File "/home/user01/anaconda3/lib/python3.7/ssl.py", line 1034, in sendall  
           v = self.send(byte_view[count:])  
         File "/home/user01/anaconda3/lib/python3.7/ssl.py", line 1003, in send  
           return self._sslobj.write(data)  
       socket.timeout: The write operation timed out  
         
       During handling of the above exception, another exception occurred:  
    

    The three small files upload successfully, but the upload fails on the actual data files.

    I'm using Ubuntu 18.04 and Python 3.7.6, on a home Wi-Fi connection that I don't think has a firewall blocking this.

    Any idea?


  2. Maxime Lemeitre 1 Reputation point
    2021-10-18T12:05:25.383+00:00

    Each Azure ML workspace comes with a default datastore:

    from azureml.core import Workspace
    ws = Workspace.from_config()
    datastore = ws.get_default_datastore()
    
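
    From there you can push your local files into that datastore, for example (a minimal sketch; the src_dir and target_path values are placeholders to replace with your own paths):

    # Upload a local folder into the workspace's default blob datastore.
    # "./data" and "datasets/creditcard" are placeholder paths for this example.
    datastore.upload(src_dir="./data",
                     target_path="datasets/creditcard",
                     overwrite=True,
                     show_progress=True)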

    When declaring BlobService, pass in protocol='http' to force the service to communicate over HTTP. Note that your container must be configured to allow requests over HTTP (which it does by default).

    # BlobService comes from the legacy azure-storage SDK (newer releases use
    # BlockBlobService, and azure-storage-blob v12 uses BlobServiceClient instead).
    client = BlobService(STORAGE_ACCOUNT, STORAGE_KEY, protocol="http")
    

    You can also read more about the os.path.expanduser method. Some example code can be found here: https://gist.github.com/drdarshan/92fff2a12ad9946892df
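
    For instance (just to illustrate expanduser; the file name is a placeholder):

    import os

    # Expand "~" to the current user's home directory before passing the path on.
    local_file = os.path.expanduser("~/data/creditcard.parquet")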
