What is machine learning?

Machine learning is a data science technique that allows computers to use existing data to forecast future behaviors, outcomes, and trends. By using machine learning, computers learn without being explicitly programmed. Machine learning tools use AI systems that can identify patterns and create associations from experience with the data.

Automated machine learning forecasts or predictions can make applications and devices smarter. For example, when you shop online, machine learning helps recommend other products you might want based on what you've bought. Or, when swiping your credit card, machine learning compares the transaction to a database of transactions and helps detect fraud. And when your robot vacuum cleaner vacuums a room, machine learning helps it decide whether the job is done.

Machine learning tools to fit each task

Azure Machine Learning provides all the tools developers and data scientists need for their machine learning workflows, including:

To build end-to-end workflow pipelines, you can even use Kubeflow.

Build machine learning models in Python or R

Start training on your local machine using the Azure Machine Learning Python SDK or R. Then, you can scale out to the cloud. With many available compute targets, like Azure Machine Learning compute and Azure Databricks, and with advanced hyperparameter tuning services, you can build better models faster by using the power of the cloud. Using the SDK, you can also automate model training and tuning.

Build machine learning models with no-code tools

For code-free or low-code training and deployment, try:

MLOps: Deploy and lifecycle management

Machine learning operations (MLOps) is based on DevOps principles and practices that increase the efficiency of workflows. For example, continuous integration, delivery, and deployment. MLOps applies these principles to the machine learning process with the goal of:

  • Faster experimentation and development of models
  • Faster deployment of models into production
  • Quality assurance

When you have the right model, you can easily use it in an online endpoint. For more information, see Deploy models with Azure Machine Learning.

Then you can manage your deployed models by using the Azure Machine Learning SDK for Python, Azure Machine Learning studio, or the Azure Machine Learning CLI.

These models can be consumed and return predictions either in real time or asynchronously on large quantities of data.

And with advanced machine learning pipelines, you can collaborate on each step from data preparation, model training, and evaluation through deployment. Pipelines allow you to:

  • Automate the end-to-end machine learning process in the cloud
  • Reuse components and only rerun steps when needed
  • Use different compute resources in each step
  • Run batch scoring tasks

If you want to use scripts to automate your machine learning workflow, the Azure Machine Learning CLI provides command-line tools that perform common tasks, such as submitting a training run or deploying a model.

To start using Azure Machine Learning, see Next steps.

Automated machine learning

Data scientists spend an inordinate amount of time iterating over models during experimentation. Trying out different algorithms and hyperparameter combinations until an acceptable model is built is extremely taxing for data scientists due to the work's monotonous and non-challenging nature. While this exercise yields massive gains in model efficacy, it sometimes costs too much in terms of time and resources and thus might have a negative return on investment (ROI).

This is where automated machine learning (AutoML) comes in. It uses the concepts from the research paper on probabilistic matrix factorization. It implements an automated pipeline of trying out intelligently selected algorithms and hypermeter settings based on the heuristics of the data presented, considering the given problem or scenario. The result of this pipeline is a set of models best suited for the given problem and dataset.

For more information on AutoML, see AutoML and MLOps with Azure Machine Learning.

Managed solutions

Azure Machine Learning provides fully managed resources such as:

  • Compute instances: Cloud-based VMs pre-configured with the SDK and popular data science tools such as Jupyter Notebooks and JupyterLab. For more information, see Create and manage compute instances.
  • Compute clusters: Train models at scale using dynamically scaling Azure virtual machines' clusters. For more information, see Create and manage compute clusters.
  • Serverless compute clusters: Train models on dynamically created, dynamically scaling clusters of Azure virtual machines. For more information, see Model training on serverless compute (preview).
  • Serverless Apache Spark: Use dynamically created Apache Spark clusters for interactive data wrangling or training machine learning models. For more information, see Serverless Spark compute.
  • Managed online endpoints: Deploy models as web services that client applications can consume. For more information, see Online endpoints.
  • Managed virtual network: Provides network isolation for Azure Machine Learning managed resources and other Azure services that Azure Machine Learning relies on. For more information, see Workspace managed network isolation.

Responsible ML

Trust must be at the core throughout the development and use of AI systems. Trust in the platform, process, and models. As AI and autonomous systems integrate more into the fabric of society, it's important to proactively make an effort to anticipate and mitigate the unintended consequences of these technologies.

  • Understand your models and build for fairness: Explain model behavior and uncover features that impact predictions most. Use built-in explainers for glass and black-box models during model training and inference. Use interactive visualizations to compare models and perform what-if analysis to improve model accuracy. Test your models for fairness using state-of-the-art algorithms. Mitigate unfairness throughout the machine learning lifecycle, compare mitigated models, and make intentional fairness versus accuracy trade-offs as desired.
  • Protect data privacy and confidentiality: Build models that preserve privacy using the latest innovations in differential privacy, which injects precise levels of statistical noise in data to limit the disclosure of sensitive information. Identify data leaks and intelligently limit repeat queries to manage exposure risk. Use encryption and confidential machine learning techniques designed for machine learning to build models using confidential data securely.
  • Control and govern through every step of the machine learning process: Access built-in capabilities to automatically track lineage and create an audit trail across the machine learning lifecycle. Obtain complete visibility into the machine learning process by tracking datasets, models, experiments, code, and more. Use custom tags to implement model data sheets, document key model metadata, increase accountability, and ensure responsible process.

Learn more about implementing Responsible ML.

Integration with other services

Azure Machine Learning works with other Azure platform services and integrates with open-source tools such as Git and MLflow.

Next steps