@Kenneth Huddleston - Thanks for the question and for using the Microsoft Q&A platform.
Azure Databricks provides several options to export performance metrics to external tooling. Here are some of the recommended options:
Azure Monitor: Azure Monitor is the centralized monitoring and alerting service for your Azure environment. Azure Databricks diagnostic settings can route diagnostic logs to a Log Analytics workspace, a storage account, or Event Hubs. Cluster-level metrics such as CPU utilization, network activity, and memory usage are generally not sent to Azure Monitor out of the box, so collecting them there usually needs extra setup, for example a monitoring library or agent installed on the cluster through an init script.
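Databricks REST API: You can use the Databricks REST API to retrieve cluster information programmatically. As far as we are aware, the Clusters API returns cluster configuration, state, and events rather than time series for CPU utilization, network activity, or memory usage, so check the API reference for the exact data you need. For Model Serving endpoints, the metrics export API returns metrics in a Prometheus-compatible format (see the first link below). Whatever the API returns can then be forwarded to external tooling such as New Relic or Datadog.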
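As a rough illustration of the REST API route, here is a minimal Python sketch that uses only the standard library. It assumes the Clusters API `clusters/get` call with a personal access token as a Bearer token; the workspace URL, token, and cluster ID are placeholders, and the function names are made up for this example. This call returns cluster state and configuration, so please verify in the API reference which data is actually exposed for your workspace.

```python
import json
import urllib.parse
import urllib.request


def build_cluster_request(workspace_url: str, token: str, cluster_id: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the Clusters API `clusters/get` call."""
    query = urllib.parse.urlencode({"cluster_id": cluster_id})
    url = f"{workspace_url.rstrip('/')}/api/2.0/clusters/get?{query}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def fetch_cluster_info(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON body.

    urllib raises urllib.error.HTTPError for non-2xx responses.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call such as `fetch_cluster_info(build_cluster_request(url, token, cluster_id))` returns a dictionary you can forward to your monitoring tool. Keeping the request builder separate from the network call makes it easy to check the URL and headers without contacting the workspace.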
Databricks Monitoring: Databricks has built-in monitoring for clusters. The monitoring UI shows metrics such as CPU utilization, network activity, and memory usage in near real time. You can also configure alerts to notify you when performance metrics exceed certain thresholds.
Databricks Metrics Export: Azure Databricks clusters run Apache Spark, whose metrics system can publish to external sinks. Spark ships a Graphite sink and a Prometheus servlet, which you enable through Spark configuration on the cluster. Note that these sinks report Spark and JVM metrics, so host-level CPU utilization, network activity, and memory usage may still need a separate agent.
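To make the Graphite case more concrete, here is a small Python sketch that generates the Spark configuration lines for a Graphite sink. It assumes the export relies on Apache Spark's metrics system and its `org.apache.spark.metrics.sink.GraphiteSink` class, set through `spark.metrics.conf.*` properties. The helper names and the host are hypothetical, and the "key value" line format is an assumption about how the cluster's Spark config field expects entries.

```python
from typing import Dict

# Prefix for configuring a sink named "graphite" on all Spark components ("*").
_SINK_PREFIX = "spark.metrics.conf.*.sink.graphite"


def graphite_sink_conf(host: str, port: int, period_seconds: int = 10, prefix: str = "") -> Dict[str, str]:
    """Return Spark properties that enable Spark's Graphite metrics sink."""
    conf = {
        f"{_SINK_PREFIX}.class": "org.apache.spark.metrics.sink.GraphiteSink",
        f"{_SINK_PREFIX}.host": host,
        f"{_SINK_PREFIX}.port": str(port),
        f"{_SINK_PREFIX}.period": str(period_seconds),
        f"{_SINK_PREFIX}.unit": "seconds",
    }
    if prefix:
        # Optional prefix prepended to every metric name sent to Graphite.
        conf[f"{_SINK_PREFIX}.prefix"] = prefix
    return conf


def render_spark_conf(conf: Dict[str, str]) -> str:
    """Render properties as one space-separated "key value" pair per line."""
    return "\n".join(f"{key} {value}" for key, value in conf.items())


if __name__ == "__main__":
    # "graphite.example.internal" is a placeholder host, not a real endpoint.
    print(render_spark_conf(graphite_sink_conf("graphite.example.internal", 2003)))
```

The printed lines can be pasted into the cluster's Spark configuration. Please test on a non-production cluster first, since the sink needs network access from the cluster to the Graphite host.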
In terms of the safest and recommended options, Azure Monitor is the first choice: it is well supported and reliable, and it gives you a single place for monitoring and alerting across your entire Azure environment. The Databricks REST API is also a recommended approach when you need a flexible and customizable way to pull data into external tooling.
For more details, refer to the links below:
Monitor Model Serving endpoints with Prometheus and Datadog
Monitor Databricks with Datadog
Hope this helps. Do let us know if you have any further queries.
If this answers your query, please click Accept Answer and Yes for "Was this answer helpful".