What is machine learning?

10/18/2023

Machine learning is a data science technique that allows computers to use existing data to forecast future behaviors, outcomes, and trends. By using machine learning, computers learn without being explicitly programmed. Machine learning tools use AI systems that can identify patterns and create associations from experience with the data.

Forecasts or predictions from machine learning can make applications and devices smarter. For example, when you shop online, machine learning helps recommend other products you might want based on what you've bought. When you swipe your credit card, machine learning compares the transaction to a database of transactions and helps detect fraud. And when your robot vacuum cleaner vacuums a room, machine learning helps it decide whether the job is done.
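The idea of using existing data to forecast a future outcome can be illustrated with a minimal sketch. It fits a straight line to past observations with ordinary least squares and then predicts the next value. The data and the function names (`fit_line`, `predict`) are invented for illustration; this is not an Azure Machine Learning API, and real workloads would normally use a library such as scikit-learn.

```python
def fit_line(xs, ys):
    """Learn slope and intercept from past (x, y) pairs by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned line to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical history: sales observed in months 1 to 5.
months = [1, 2, 3, 4, 5]
sales = [12.0, 14.0, 16.0, 18.0, 20.0]

model = fit_line(months, sales)   # nothing about "sales grow by 2" is hard-coded
forecast = predict(model, 6)      # forecast for the next, unseen month
print(f"Forecast for month 6: {forecast:.1f}")
```

The relationship between month and sales is never written into the program; it is recovered from the data, which is the sense in which the computer learns without being explicitly programmed.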

Machine learning tools to fit each task

Azure Machine Learning provides all the tools developers and data scientists need for their machine learning workflows, including:

The Azure Machine Learning designer: Drag and drop modules to build your experiments and then deploy pipelines.
Jupyter notebooks: Use our tutorial series or create your own notebooks to use our SDK for Python samples.
Integration with MLflow for tracking and model management.
Prompt flow (preview) provides streamlined AI application development using Large Language Models (LLMs).
Dedicated tools for image labeling projects.
R scripts or notebooks: learn how to bring your R workloads.
The many models solution accelerator (preview) builds on Azure Machine Learning and enables you to train, operate, and manage hundreds or even thousands of machine learning models.
Visual Studio Code extension.
Machine learning CLI.
Open-source frameworks such as PyTorch, TensorFlow, scikit-learn, and many more.

To build end-to-end workflow pipelines, you can even use Kubeflow.

Build machine learning models in Python or R

Start training on your local machine using the Azure Machine Learning Python SDK or R. Then, you can scale out to the cloud. With many available compute targets, like Azure Machine Learning compute and Azure Databricks, and with advanced hyperparameter tuning services, you can build better models faster by using the power of the cloud. Using the SDK, you can also automate model training and tuning.
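To make the idea of hyperparameter tuning concrete, here is a minimal local sketch. It is not the Azure Machine Learning tuning service or SDK: it swaps in a plain exhaustive grid search over one hyperparameter (the regularization strength `alpha`) of a toy one-weight ridge regression, and picks the value with the lowest validation error. All names and data are hypothetical.

```python
def fit_ridge(xs, ys, alpha):
    """One-weight ridge regression through the origin.

    Minimizing sum((w*x - y)^2) + alpha*w^2 gives w = sum(x*y) / (sum(x*x) + alpha).
    """
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)


def mean_squared_error(weight, xs, ys):
    return sum((weight * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def grid_search(alphas, train, validation):
    """Train once per candidate alpha and keep the one with the lowest validation error."""
    trials = []
    for alpha in alphas:
        weight = fit_ridge(train[0], train[1], alpha)
        score = mean_squared_error(weight, validation[0], validation[1])
        trials.append({"alpha": alpha, "weight": weight, "val_mse": score})
    best = min(trials, key=lambda trial: trial["val_mse"])
    return best, trials


# Hypothetical data, roughly y = 2x.
train = ([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8])
validation = ([5, 6], [10.1, 11.9])

best, trials = grid_search([0.0, 0.1, 1.0, 10.0], train, validation)
for trial in trials:
    print(trial)
print("Selected alpha:", best["alpha"])
```

Each trial is independent of the others, which is why this kind of search parallelizes well when it is moved from a local machine to cloud compute.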

Build machine learning models with no-code tools

For code-free or low-code training and deployment, try:

Azure Machine Learning designer

Use the designer to prep data, train, test, deploy, manage, and track machine learning models without writing code. There's no programming required; you visually connect datasets and modules to construct your model. Try creating a pipeline in the studio.

Learn more in the Azure Machine Learning designer overview article.

Automated machine learning (AutoML) SDK

Learn how to create AutoML experiments in the easy-to-use interface.

MLOps: Deploy and lifecycle management

Machine learning operations (MLOps) is based on DevOps principles and practices that increase the efficiency of workflows, such as continuous integration, delivery, and deployment. MLOps applies these principles to the machine learning process with the goal of:

Faster experimentation and development of models
Faster deployment of models into production
Quality assurance

When you have the right model, you can deploy it to an online endpoint. For more information, see Deploy models with Azure Machine Learning.

Then you can manage your deployed models by using the Azure Machine Learning SDK for Python, Azure Machine Learning studio, or the Azure Machine Learning CLI.

These models can be consumed and return predictions either in real time or asynchronously on large quantities of data.

With advanced machine learning pipelines, you can collaborate on each step, from data preparation through model training and evaluation to deployment. Pipelines allow you to:

Automate the end-to-end machine learning process in the cloud
Reuse components and only rerun steps when needed
Use different compute resources in each step
Run batch scoring tasks
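The "reuse components and only rerun steps when needed" behavior can be sketched in a few lines. The `Pipeline` class below is a hypothetical illustration, not the Azure Machine Learning pipeline API: each step's output is cached against a fingerprint of its input, and a step is skipped when the same input arrives again. It assumes step inputs are JSON-serializable and does not model per-step compute targets.

```python
import hashlib
import json


class Pipeline:
    """Runs named steps in order and reuses cached outputs for unchanged inputs."""

    def __init__(self):
        self.steps = []     # list of (name, function)
        self.cache = {}     # step name -> (input fingerprint, output)
        self.executed = []  # steps that actually ran during the last call to run()

    def add_step(self, name, func):
        self.steps.append((name, func))
        return self

    def run(self, data):
        self.executed = []
        for name, func in self.steps:
            fingerprint = hashlib.sha256(
                json.dumps(data, sort_keys=True).encode("utf-8")
            ).hexdigest()
            cached = self.cache.get(name)
            if cached is not None and cached[0] == fingerprint:
                data = cached[1]  # input unchanged: reuse the earlier output
            else:
                data = func(data)  # input changed or never seen: rerun the step
                self.cache[name] = (fingerprint, data)
                self.executed.append(name)
        return data


def prepare(rows):
    return [value for value in rows if value is not None]


def train(rows):
    return {"mean": sum(rows) / len(rows), "rows": rows}


def evaluate(model):
    errors = [abs(value - model["mean"]) for value in model["rows"]]
    return {"mean": model["mean"], "mae": sum(errors) / len(errors)}


pipeline = (
    Pipeline()
    .add_step("prepare", prepare)
    .add_step("train", train)
    .add_step("evaluate", evaluate)
)

first = pipeline.run([1.0, None, 2.0, 3.0])
first_run_steps = list(pipeline.executed)

second = pipeline.run([1.0, None, 2.0, 3.0])   # identical input
second_run_steps = list(pipeline.executed)

third = pipeline.run([1.0, 2.0, None, 3.0])    # raw data differs, cleaned data does not
third_run_steps = list(pipeline.executed)

print(first_run_steps, second_run_steps, third_run_steps)
```

In the third run only the data preparation step executes, because its output matches what the training step has already seen; the downstream steps are reused.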

If you want to use scripts to automate your machine learning workflow, the Azure Machine Learning CLI provides command-line tools that perform common tasks, such as submitting a training run or deploying a model.

To start using Azure Machine Learning, see Next steps.

Automated machine learning

Data scientists spend an inordinate amount of time iterating over models during experimentation. Trying different algorithms and hyperparameter combinations until an acceptable model is built is taxing because the work is monotonous and unchallenging. Although this exercise can yield large gains in model efficacy, it sometimes costs too much in time and resources and thus might have a negative return on investment (ROI).

This is where automated machine learning (AutoML) comes in. It uses concepts from the research paper on probabilistic matrix factorization to implement an automated pipeline that tries intelligently selected algorithms and hyperparameter settings, based on the heuristics of the data presented and the given problem or scenario. The result of this pipeline is a set of models best suited for the given problem and dataset.
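The overall shape of such a pipeline (try several candidate algorithms and settings, score each, return a ranked set of models) can be sketched as follows. This is a deliberately simplified stand-in: it evaluates every candidate exhaustively instead of selecting them intelligently with probabilistic matrix factorization, and the candidate list, function names, and data are all hypothetical.

```python
def fit_mean(xs, ys):
    """Baseline: always predict the training mean."""
    mean = sum(ys) / len(ys)
    return lambda x: mean


def fit_linear(xs, ys):
    """Least-squares line through the training points."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return lambda x: mean_y + slope * (x - mean_x)


def fit_knn(xs, ys, k):
    """k-nearest-neighbors regression: average the targets of the k closest points."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda pair: abs(pair[0] - x))[:k]
        return sum(y for _, y in nearest) / k

    return predict


def mean_squared_error(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Each candidate is (algorithm name, training function, hyperparameters).
CANDIDATES = [
    ("mean baseline", fit_mean, {}),
    ("linear regression", fit_linear, {}),
    ("k-NN", fit_knn, {"k": 1}),
    ("k-NN", fit_knn, {"k": 3}),
]


def auto_select(train_x, train_y, val_x, val_y):
    """Train every candidate and return a leaderboard sorted by validation error."""
    results = []
    for name, fit, params in CANDIDATES:
        model = fit(train_x, train_y, **params)
        results.append(
            {
                "name": name,
                "params": params,
                "val_mse": mean_squared_error(model, val_x, val_y),
            }
        )
    return sorted(results, key=lambda result: result["val_mse"])


# Hypothetical data, roughly y = 2x + 1; validation points lie beyond the training range.
train_x = [0, 1, 2, 3, 4, 5]
train_y = [1.0, 3.1, 4.9, 7.0, 9.1, 10.9]
val_x = [6, 7]
val_y = [13.0, 15.1]

leaderboard = auto_select(train_x, train_y, val_x, val_y)
for entry in leaderboard:
    print(entry)
```

The output is a ranked set of models instead of a single answer, which mirrors the result described above and leaves the final choice to the data scientist.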

For more information on AutoML, see AutoML and MLOps with Azure Machine Learning.

Managed solutions

Azure Machine Learning provides fully managed resources such as:

Compute instances: Cloud-based VMs pre-configured with the SDK and popular data science tools such as Jupyter Notebooks and JupyterLab. For more information, see Create and manage compute instances.
Compute clusters: Train models at scale using dynamically scaling clusters of Azure virtual machines. For more information, see Create and manage compute clusters.
Serverless compute clusters: Train models on dynamically created, dynamically scaling clusters of Azure virtual machines. For more information, see Model training on serverless compute (preview).
Serverless Apache Spark: Use dynamically created Apache Spark clusters for interactive data wrangling or training machine learning models. For more information, see Serverless Spark compute.
Managed online endpoints: Deploy models as web services that client applications can consume. For more information, see Online endpoints.
Managed virtual network: Provides network isolation for Azure Machine Learning managed resources and other Azure services that Azure Machine Learning relies on. For more information, see Workspace managed network isolation.
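As a rough sketch of how a client application might consume a model deployed as a web service, the snippet below builds an HTTP scoring request with the Python standard library. The URL, the key, and the `{"data": ...}` payload shape are placeholders and assumptions; the actual scoring URI, credentials, and input schema depend on your endpoint and its scoring script. The request is only constructed here, not sent.

```python
import json
import urllib.request

# Hypothetical placeholders: substitute your endpoint's scoring URI and key.
SCORING_URI = "https://example-endpoint.example.com/score"
API_KEY = "<your-endpoint-key>"


def build_scoring_request(rows):
    """Build a JSON POST request carrying input rows for the deployed model."""
    body = json.dumps({"data": rows}).encode("utf-8")  # assumed payload shape
    return urllib.request.Request(
        SCORING_URI,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )


request = build_scoring_request([[5.1, 3.5, 1.4, 0.2]])
print(request.get_method(), request.full_url)

# To call a real endpoint, you would send the request and parse the reply:
# with urllib.request.urlopen(request) as response:
#     predictions = json.load(response)
```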

Responsible ML

Trust in the platform, process, and models must be at the core of the development and use of AI systems. As AI and autonomous systems integrate more into the fabric of society, it's important to proactively anticipate and mitigate the unintended consequences of these technologies.

Understand your models and build for fairness: Explain model behavior and uncover features that impact predictions most. Use built-in explainers for glass-box and black-box models during model training and inference. Use interactive visualizations to compare models and perform what-if analysis to improve model accuracy. Test your models for fairness using state-of-the-art algorithms. Mitigate unfairness throughout the machine learning lifecycle, compare mitigated models, and make intentional fairness versus accuracy trade-offs as desired.
Protect data privacy and confidentiality: Build models that preserve privacy using the latest innovations in differential privacy, which injects precise levels of statistical noise into data to limit the disclosure of sensitive information. Identify data leaks and intelligently limit repeat queries to manage exposure risk. Use encryption and confidential machine learning techniques to build models using confidential data securely.
Control and govern through every step of the machine learning process: Access built-in capabilities to automatically track lineage and create an audit trail across the machine learning lifecycle. Obtain complete visibility into the machine learning process by tracking datasets, models, experiments, code, and more. Use custom tags to implement model data sheets, document key model metadata, increase accountability, and ensure a responsible process.
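The differential privacy idea mentioned above, injecting calibrated statistical noise to limit disclosure, can be illustrated with the classic Laplace mechanism applied to a counting query. This is a teaching sketch under stated assumptions, not the implementation used by Azure Machine Learning and not production-grade (it ignores floating-point subtleties and privacy budget accounting). The function name `private_count` and the data are hypothetical.

```python
import random


def private_count(values, predicate, epsilon, rng):
    """Return a noisy count of matching values using the Laplace mechanism.

    Adding or removing one record changes a count by at most 1 (sensitivity 1),
    so the noise is drawn from a Laplace distribution with scale 1 / epsilon.
    """
    true_count = sum(1 for value in values if predicate(value))
    scale = 1.0 / epsilon
    # The difference of two independent exponential samples is Laplace-distributed.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


def over_40(age):
    return age > 40


rng = random.Random(42)  # seeded only to make the demonstration repeatable
ages = [23, 35, 41, 52, 67, 29, 38, 71, 45, 60]
true_answer = sum(1 for age in ages if over_40(age))

# Smaller epsilon means stronger privacy and therefore noisier answers.
strict = [private_count(ages, over_40, 0.1, rng) for _ in range(2000)]
loose = [private_count(ages, over_40, 1.0, rng) for _ in range(2000)]


def mean_abs_error(answers):
    return sum(abs(answer - true_answer) for answer in answers) / len(answers)


print("true count:", true_answer)
print("mean abs error, epsilon=0.1:", round(mean_abs_error(strict), 2))
print("mean abs error, epsilon=1.0:", round(mean_abs_error(loose), 2))
```

The comparison shows the trade-off the mechanism exposes: a smaller epsilon gives stronger privacy at the cost of less accurate answers. Limiting repeat queries matters for the same reason, since averaging many noisy answers to the same query gradually reveals the true value.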

Learn more about implementing Responsible ML.

Integration with other services

Azure Machine Learning works with other Azure platform services and integrates with open-source tools such as Git and MLflow.

Compute targets such as Azure Kubernetes Service, Azure Container Instances, Azure Databricks, Azure Data Lake Analytics, and Azure HDInsight: For more information on compute targets, see What are compute targets?
Azure Event Grid: For more information, see Consume Azure Machine Learning events.
Azure Monitor: For more information, see Monitoring Azure Machine Learning.
Data stores such as Azure Storage accounts, Azure Data Lake Storage, Azure SQL Database, Azure Database for PostgreSQL, and Azure open datasets: For more information, see Access data in Azure Storage services and Create and manage data assets.
Azure Virtual Network: For more information, see Secure experimentation and inference in a virtual network.
Azure Pipelines: For more information, see Set up MLOps with Azure DevOps.
Git repository logs: For more information, see Git integration.
MLflow: For more information, see MLflow to track metrics and deploy models.
Kubeflow: For more information, see Build end-to-end workflow pipelines.
Secure communications: Your Azure Storage account, compute targets, and other resources can be used securely inside a virtual network to train models and perform inference. For more information, see Secure experimentation and inference in a virtual network.

Next steps

Review machine learning white papers and e-books on Azure Machine Learning.
Review AI + machine learning architectures.