Cannot upload local files to AzureML datastore (python SDK)

Schade 1 Reputation point
2020-07-08T08:37:19.043+00:00

Hi everybody,

I just started learning how to use MS Azure and I got stuck with an apparently trivial issue.

I have my own pet ML project, a python script that runs a classification analysis with Tensorflow and Keras.
It runs smoothly locally and I am happy with it.

Now I am trying to run this script on Azure ML, hoping to take advantage of the available computing power and, in general, to gain some experience with the Azure services. I am a bit old style and I like the idea of running my code from my local IDE rather than in a notebook. Because of this, I focused on the Python SDK libraries.

I created a free trial account on Azure and created a workspace. In order to adapt my original code to the new task, I followed the example at https://learn.microsoft.com/en-us/azure/machine-learning/service/tutorial-train-models-with-aml?WT.mc_id=aisummit-github-amynic

The problem arises when I try to upload my locally stored training data to the datastore of the workspace. The data is saved locally in a parquet file, about 70 MB in size. The transfer fails after some time with a ProtocolError, and after that it keeps retrying and failing with a NewConnectionError.

The snippet that reproduces the error is:

import numpy as np
import pandas as pd
from os import listdir
from os.path import join as osjoin

import azureml.core
from azureml.core import Workspace, Experiment, Dataset, Datastore
from azureml.core.compute import AmlCompute, ComputeTarget

workdir = "."

# Set up the Azure ML workspace:
# load the workspace configuration from the config.json file in the current folder.
try:
    ws = Workspace.from_config()
except Exception as e:
    print(f"Could not load AML workspace: {e}")

# Collect the local parquet files to upload
datadir = osjoin(workdir, "data")
local_files = [osjoin(datadir, f) for f in listdir(datadir) if ".parquet" in f]

# Get the workspace's default datastore and upload the prepared data
datastore = ws.get_default_datastore()
datastore.upload_files(files=local_files, target_path=None, show_progress=True)
  
  

Everything runs smoothly until the last line. The program starts to upload the file; I can see outbound traffic on my VPN monitor. Judging from the upload speed and the size of the file, I would say it uploads completely, or close to it, and then I get this message*:

WARNING - Retrying (Retry(total=2, connect=3, read=2, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', OSError("(10054, 'WSAECONNRESET')"))': /azureml-blobstore-xxx/creditcard.parquet?comp=block&blockid=TURBd01...TURB...RA%3D%3D  
WARNING - Retrying (Retry(total=1, connect=2, read=2, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x000002210A8BAF48>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed')': /azureml-blobstore-xxx/creditcard.parquet?comp=block&blockid=TURBd01...TURB...RA%3D%3D  
WARNING - Retrying (Retry(total=0, connect=1, read=2, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x000002210B446748>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed')': /azureml-blobstore-xxx/creditcard.parquet?comp=block&blockid=TURBd01...TURB...RA%3D%3D  
WARNING - Retrying (Retry(total=2, connect=2, read=3, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x000002210A8B5148>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed')': /azureml-blobstore-xxx/creditcard.parquet?comp=block&blockid=TURBd01...TURB...RA%3D%3D  
WARNING - Retrying (Retry(total=1, connect=1, read=3, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x000002210A891288>, 'Connection to creditfraudws2493375317.blob.core.windows.net timed out. (connect timeout=20)')': /azureml-blobstore-xxx/creditcard.parquet?comp=block&blockid=TURBd01...TURB...RA%3D%3D  
WARNING - Retrying (Retry(total=0, connect=0, read=3, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x000002210A8BD3C8>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed')': /azureml-blobstore-xxx/creditcard.parquet?comp=block&blockid=TURBd01...TURB...RA%3D%3D  
  

From the initial ProtocolError, I understand that the Azure cloud server bounces me back, but it is unclear to me why. Checking the workspace from the Azure portal, I would guess that the container of the workspace is still empty, but I am not 100% sure I checked that correctly.

Maybe I have misunderstood the different components of the storage services in Azure ML and I am not using the API correctly. Am I doing something wrong? Is there a way for me to extract more information about the reasons for this error?
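
The only extra diagnostic I can think of is enabling debug logging before the upload. A minimal sketch (the logger names "urllib3" and "azureml" are my assumptions about where the HTTP layer and the SDK log; adjust if your SDK version logs elsewhere):

import logging

# Sketch: route HTTP and SDK logs to the console at DEBUG level to see each request/retry.
logging.basicConfig(level=logging.DEBUG,
                    format="%(asctime)s %(name)s %(levelname)s %(message)s")
logging.getLogger("urllib3").setLevel(logging.DEBUG)
logging.getLogger("azureml").setLevel(logging.DEBUG)

# ...then run the upload as before:
# datastore.upload_files(files=local_files, target_path=None, show_progress=True)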

Thanks a lot in advance for any help you can provide

[*] (I manually edited portions of the error message obfuscating the blobstore name)


2 answers

  1. Jason Koh 6 Reputation points
    2020-10-21T22:53:06.057+00:00

    @romungi-MSFT

    I am having the same issue with the tutorial here: https://learn.microsoft.com/en-us/azure/machine-learning/tutorial-1st-experiment-bring-data

       WARNING - Retrying (Retry(total=0, connect=3, read=0, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection abo  
       rted.', timeout('The write operation timed out'))': /azureml-blobstore-6cf75ce5-9da9-4149-bcfd-c844582dc038/datasets/cifar10/cifar-10-batche  
       s-py/test_batch  
       Uploading ./data/cifar-10-batches-py/data_batch_2  
       --- Logging error ---
       Traceback (most recent call last):  
         File "/home/user01/ws/azure-ml-tutorial/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 677, in urlopen  
           chunked=chunked,  
         File "/home/user01/ws/azure-ml-tutorial/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 392, in _make_request  
           conn.request(method, url, **httplib_request_kw)  
         File "/home/user01/anaconda3/lib/python3.7/http/client.py", line 1252, in request  
           self._send_request(method, url, body, headers, encode_chunked)  
         File "/home/user01/anaconda3/lib/python3.7/http/client.py", line 1298, in _send_request  
           self.endheaders(body, encode_chunked=encode_chunked)  
         File "/home/user01/anaconda3/lib/python3.7/http/client.py", line 1247, in endheaders  
           self._send_output(message_body, encode_chunked=encode_chunked)  
         File "/home/user01/anaconda3/lib/python3.7/http/client.py", line 1065, in _send_output  
           self.send(chunk)  
         File "/home/user01/anaconda3/lib/python3.7/http/client.py", line 987, in send  
           self.sock.sendall(data)  
         File "/home/user01/anaconda3/lib/python3.7/ssl.py", line 1034, in sendall  
           v = self.send(byte_view[count:])  
         File "/home/user01/anaconda3/lib/python3.7/ssl.py", line 1003, in send  
           return self._sslobj.write(data)  
       socket.timeout: The write operation timed out  
         
       During handling of the above exception, another exception occurred:  
    

    The three small files upload successfully, but the upload fails on the actual data files.

    I'm using Ubuntu 18.04 and Python 3.7.6, on a home Wi-Fi connection that I don't think has a firewall blocking this.

    Any idea?


  2. Maxime Lemeitre 1 Reputation point
    2021-10-18T12:05:25.383+00:00

    Each Azure ML workspace comes with a default datastore:

    from azureml.core import Workspace
    ws = Workspace.from_config()
    datastore = ws.get_default_datastore()
    
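
    From there you can push your local files into that datastore, for example (a minimal sketch; the src_dir and target_path values are placeholders to replace with your own paths):

    # Upload a local folder into the workspace's default blob datastore.
    # "./data" and "datasets/creditcard" are placeholder paths for this example.
    datastore.upload(src_dir="./data",
                     target_path="datasets/creditcard",
                     overwrite=True,
                     show_progress=True)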

    When declaring BlobService, pass in protocol='http' to force the service to communicate over HTTP. Note that your container must be configured to allow requests over HTTP (which it does by default).

    # BlobService comes from the legacy azure-storage SDK (newer releases use
    # BlockBlobService, and azure-storage-blob v12 uses BlobServiceClient instead).
    client = BlobService(STORAGE_ACCOUNT, STORAGE_KEY, protocol="http")
    

    You can also read more about the os.path.expanduser method. Some example code can be found here: https://gist.github.com/drdarshan/92fff2a12ad9946892df
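
    For instance (just to illustrate expanduser; the file name is a placeholder):

    import os

    # Expand "~" to the current user's home directory before passing the path on.
    local_file = os.path.expanduser("~/data/creditcard.parquet")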
