Is it possible to train TensorFlow models in Azure Machine Learning?

Kaichson 40 Reputation points

Hello experts, I like the new Azure ML system and want to use it to train my models. However, I need a TensorFlow model. Can I use Azure to train models built with an external framework such as TensorFlow? If so, how? I can't see any option to do so.

Azure Machine Learning
An Azure machine learning service for building and deploying models.

Accepted answer
  1. YutongTie-MSFT 39,441 Reputation points


    Thanks for reaching out to us. Yes, you can use Azure Machine Learning to train TensorFlow models. In fact, Azure Machine Learning provides a Python SDK that you can use to train TensorFlow models at scale.

    To train a TensorFlow model using Azure Machine Learning, you can follow these general steps:

    1. Create an Azure Machine Learning workspace. You can create a workspace using the Azure portal, Azure CLI, or Azure PowerShell.
    2. Create a compute target. A compute target is the resource that runs your training script. Azure Machine Learning supports a variety of compute targets, including Azure Machine Learning compute clusters, compute instances, and attached Kubernetes clusters such as Azure Kubernetes Service (AKS).
    3. Prepare your training data. You can use Azure Machine Learning to preprocess your data and store it in a datastore.
    4. Write your TensorFlow training script. Your script should define your model, load your data, and train your model.
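    5. Configure the training job. In the SDK v2 (azure.ai.ml), which the code sample in this answer uses, this is a command job that encapsulates your training script and specifies the configuration of your training run. The older SDK v1 used an estimator object for the same purpose.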
    6. Submit your training run. You can submit your training run using the Azure Machine Learning Python SDK.
    7. Monitor your training run. You can monitor your training run using the Azure Machine Learning Python SDK or the Azure portal.
    8. Retrieve your trained model. Once your training run is complete, you can retrieve your trained model and use it for inference.
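
As an illustration of step 4, here is a minimal sketch of what a TensorFlow training script could look like. It is not taken from the linked documentation: the MNIST dataset, the small dense network, the argument names, and the outputs folder are all assumptions chosen to keep the example short, and it assumes a recent TensorFlow 2.x release. TensorFlow is imported inside train() so that the argument parsing can be checked on a machine without TensorFlow installed.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse hyperparameters passed on the job's command line."""
    parser = argparse.ArgumentParser(description="Train a small Keras classifier")
    parser.add_argument("--epochs", type=int, default=5)
    parser.add_argument("--batch-size", type=int, default=64)
    parser.add_argument("--learning-rate", type=float, default=1e-3)
    # Assumption: files written to ./outputs are stored with the job's artifacts.
    parser.add_argument("--output-dir", default="outputs")
    return parser.parse_args(argv)


def train(args):
    """Define the model, load the data, train, and save the result."""
    # Imported lazily so parse_args() works without TensorFlow installed.
    import tensorflow as tf

    # MNIST stands in for your own data here (an assumption for illustration).
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    x_train, x_test = x_train / 255.0, x_test / 255.0

    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(28, 28)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10),  # raw logits, one per digit class
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=args.learning_rate),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    model.fit(
        x_train,
        y_train,
        epochs=args.epochs,
        batch_size=args.batch_size,
        validation_data=(x_test, y_test),
    )

    # Save the trained model so it can be retrieved after the run finishes.
    os.makedirs(args.output_dir, exist_ok=True)
    model.save(os.path.join(args.output_dir, "model.keras"))
```

To use this as the job's entry script, save it as a file such as train.py (a hypothetical name) and end it with a call to train(parse_args()) under the usual if __name__ == "__main__": guard.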

    Below is a quick code sample for step 2. It creates a GPU compute cluster, or reuses it if one with that name already exists:

    from azure.ai.ml.entities import AmlCompute

    # Assumes ml_client is an authenticated azure.ai.ml.MLClient for your workspace.
    gpu_compute_target = "gpu-cluster"
    try:
        # let's see if the compute target already exists
        gpu_cluster = ml_client.compute.get(gpu_compute_target)
        print(
            f"You already have a cluster named {gpu_compute_target}, we'll reuse it as is."
        )
    except Exception:
        print("Creating a new gpu compute target...")
        # Let's create the Azure ML compute object with the intended parameters
        # (the values below are illustrative; adjust them to your quota and budget)
        gpu_cluster = AmlCompute(
            # Name assigned to the compute cluster
            name=gpu_compute_target,
            # Azure ML Compute is the on-demand VM service
            type="amlcompute",
            # VM Family (example size; pick one available in your region)
            size="STANDARD_NC6s_v3",
            # Minimum running nodes when there is no job running
            min_instances=0,
            # Nodes in cluster
            max_instances=4,
            # How many seconds the node keeps running after the job terminates
            idle_time_before_scale_down=180,
            # Dedicated or LowPriority. The latter is cheaper but there is a chance of job termination
            tier="Dedicated",
        )
        # Now, we pass the object to MLClient's create_or_update method
        gpu_cluster = ml_client.begin_create_or_update(gpu_cluster).result()
    print(
        f"AMLCompute with name {gpu_cluster.name} is created, the compute size is {gpu_cluster.size}"
    )
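
The sample above relies on an ml_client object and stops once the cluster exists. As a rough sketch of the remaining steps with the SDK v2, the code below connects to a workspace, submits a command job to the cluster, streams its logs, and downloads its outputs. The ./src folder, the train.py script name, the --epochs argument, and the display and experiment names are hypothetical, and the environment name is left for you to supply. The Azure packages (azure-ai-ml and azure-identity) are imported inside the functions so that the job settings can be inspected without them installed.

```python
def connect(subscription_id, resource_group, workspace_name):
    """Return an MLClient for the workspace, using default Azure credentials."""
    from azure.ai.ml import MLClient
    from azure.identity import DefaultAzureCredential

    return MLClient(
        credential=DefaultAzureCredential(),
        subscription_id=subscription_id,
        resource_group_name=resource_group,
        workspace_name=workspace_name,
    )


def build_job_settings(compute_name, environment, epochs=5):
    """Collect the keyword arguments for azure.ai.ml.command() in one place."""
    return {
        "code": "./src",  # hypothetical folder that contains train.py
        "command": "python train.py --epochs ${{inputs.epochs}}",
        "inputs": {"epochs": epochs},
        "environment": environment,  # e.g. a TensorFlow environment in your workspace
        "compute": compute_name,  # the cluster created earlier
        "display_name": "tensorflow-training",  # hypothetical
        "experiment_name": "tensorflow-example",  # hypothetical
    }


def submit_training_job(ml_client, settings):
    """Submit the job, stream its logs until it finishes, and return it."""
    from azure.ai.ml import command

    job = command(**settings)
    returned_job = ml_client.jobs.create_or_update(job)
    ml_client.jobs.stream(returned_job.name)
    return returned_job


def download_outputs(ml_client, job_name, download_path="./downloaded"):
    """Download everything the finished job produced, including the model."""
    ml_client.jobs.download(name=job_name, download_path=download_path, all=True)
```

A typical call sequence would be ml_client = connect(...), then settings = build_job_settings("gpu-cluster", "<your-tensorflow-environment>"), then job = submit_training_job(ml_client, settings), and finally download_outputs(ml_client, job.name). The environment placeholder must be replaced with an environment that actually exists in your workspace; the Environments page in Azure Machine Learning studio lists the curated TensorFlow environments that are available.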

    You can find more detailed information on how to train TensorFlow models using Azure Machine Learning in the following documentation: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-train-tensorflow.

    I hope this helps.



    Please accept the answer and vote 'Yes' if you found it helpful, to support the community. Thanks a lot.
