Machine learning on Azure Databricks

Build, deploy, and manage machine learning applications on Azure Databricks, an integrated platform that unifies the entire ML lifecycle from data preparation to production monitoring.

Looking for generative AI and AI agents? See Build AI agents on Azure Databricks.

Train classic machine learning models

Create machine learning models with automated tools and collaborative development environments.

| Feature | Description |
| --- | --- |
| AutoML | Automatically build high-quality models with minimal code using automated feature engineering and hyperparameter tuning. |
| Databricks Runtime for ML | Pre-configured clusters with TensorFlow, PyTorch, Keras, and GPU support for deep learning development. |
| MLflow tracking | Track experiments, compare model performance, and manage the complete model development lifecycle. |
| Feature engineering | Create, manage, and serve features with automated data pipelines and feature discovery. |
| Databricks notebooks | Collaborative development environment with support for Python, R, Scala, and SQL for ML workflows. |
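
As an illustration of the experiment tracking described above, here is a minimal sketch that logs one MLflow run per hyperparameter combination so the runs can be compared side by side. The grid values, the `fake_validation_score` placeholder, and the experiment path `/Shared/example-experiment` are hypothetical; a real workflow would train and evaluate an actual model in place of the placeholder.

```python
from itertools import product


def candidate_params():
    """Return a small, hypothetical hyperparameter grid as a list of dicts."""
    grid = {"learning_rate": [0.01, 0.1], "max_depth": [3, 5]}
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*grid.values())]


def fake_validation_score(params):
    """Stand-in for real training: returns a deterministic placeholder score."""
    return round(1.0 - params["learning_rate"] / params["max_depth"], 4)


def log_runs(experiment_name="/Shared/example-experiment"):
    """Log one MLflow run per candidate so the runs can be compared later."""
    import mlflow  # imported here so the helpers above work without mlflow

    # The experiment path is an assumption; use one that exists in your workspace.
    mlflow.set_experiment(experiment_name)
    for params in candidate_params():
        with mlflow.start_run():
            mlflow.log_params(params)
            mlflow.log_metric("val_score", fake_validation_score(params))
```

Calling `log_runs()` from a notebook attached to a cluster with MLflow installed should produce four runs under the chosen experiment, each with its parameters and a `val_score` metric.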

Train deep learning models

Use built-in frameworks to develop deep learning models.

| Feature | Description |
| --- | --- |
| Distributed training | Examples of distributed deep learning using Ray, TorchDistributor, and DeepSpeed. |
| Best practices for deep learning on Databricks | Recommended practices for developing and running deep learning workloads on Databricks. |
| PyTorch | Single-node and distributed training using PyTorch. |
| TensorFlow | Single-node and distributed training using TensorFlow and TensorBoard. |

Deploy and serve models

Deploy models to production with scalable endpoints, real-time inference, and enterprise-grade monitoring.

| Feature | Description |
| --- | --- |
| Model Serving | Deploy custom models and LLMs as scalable REST endpoints with automatic scaling and GPU support. |
| AI Gateway | Govern and monitor access to generative AI models with usage tracking, payload logging, and security controls. |
| External models | Integrate third-party models hosted outside Databricks with unified governance and monitoring. |
| Foundation model APIs | Access and query state-of-the-art open models hosted by Databricks. |
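
Because Model Serving exposes models as REST endpoints, a client can query one with a plain HTTP POST. The sketch below uses only the Python standard library. It assumes an endpoint served at `/serving-endpoints/<endpoint-name>/invocations`, a bearer token for authentication, and a `dataframe_records` payload for a custom model; the workspace URL, endpoint name, and input fields are hypothetical, so check them against your own endpoint before relying on this.

```python
import json
import urllib.request


def build_invocation_request(workspace_url, endpoint_name, token, records):
    """Build (but do not send) a POST request for a serving endpoint."""
    # Assumed URL layout; confirm against the endpoint's details page.
    url = f"{workspace_url.rstrip('/')}/serving-endpoints/{endpoint_name}/invocations"
    # Assumed payload shape: one dict per input row.
    body = json.dumps({"dataframe_records": records}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def query_endpoint(request, timeout=30):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it makes the URL and payload easy to inspect or unit test without network access. Generative AI endpoints typically expect a different request body than the tabular format shown here.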

Monitor and govern ML systems

Ensure model quality, data integrity, and compliance with comprehensive monitoring and governance tools.

| Feature | Description |
| --- | --- |
| Unity Catalog | Govern data, features, models, and functions with unified access control, lineage tracking, and discovery. |
| Data profiling | Monitor data quality, model performance, and prediction drift with automated alerts and root cause analysis. |
| Anomaly detection | Monitor data freshness and completeness at the catalog level. |
| MLflow for Models | Track, evaluate, and monitor generative AI applications throughout the development lifecycle. |

Productionize ML workflows

Scale machine learning operations with automated workflows, CI/CD integration, and production-ready pipelines.

| Feature | Description |
| --- | --- |
| Models in Unity Catalog | Use the model registry in Unity Catalog to govern models centrally and manage the model lifecycle, including deployments. |
| Lakeflow Jobs | Build automated workflows and production-ready ETL pipelines for ML data processing. |
| Ray on Databricks | Scale ML workloads with distributed computing for large-scale model training and inference. |
| MLOps workflows | Implement end-to-end MLOps with automated training, testing, and deployment pipelines. |
| Git integration | Version-control ML code and notebooks with Git, and collaborate on development through shared repositories. |