Important
This feature is in Beta. It is not automatically enabled for all customers and functionality is subject to change. To request access, contact your Azure Databricks account team.
Learn how to configure endpoint telemetry to persist OpenTelemetry logs, traces, and metrics from your custom model serving endpoints to Unity Catalog tables. Use the persisted telemetry data to perform root cause analysis, monitor endpoint health, and meet compliance requirements with standard SQL queries.
Requirements
- Your workspace must be enabled for Unity Catalog. Default storage (Arclight) is not supported.
- You must have `USE CATALOG`, `USE SCHEMA`, `CREATE TABLE`, and `MODIFY` permissions on the destination Unity Catalog catalog and schema where the logs are stored.
- An existing custom model serving endpoint, or permissions to create one.
- Your workspace must be in a supported region: `canadacentral`, `westus`, `westus2`, `southcentralus`, `eastus`, `eastus2`, `centralus`, `northcentralus`, `swedencentral`, `westeurope`, `northeurope`, `uksouth`, `australiaeast`, `southeastasia`.
Step 1: Instrument your model code
Add instrumentation to your model code to capture telemetry.
1. Add application logging to your model. Endpoint telemetry automatically captures standard Python `logging` output. No OpenTelemetry SDK instrumentation is required for basic logging.

   ```python
   import logging

   import mlflow


   class MyCustomModel(mlflow.pyfunc.PythonModel):
       def predict(self, context, model_input):
           # This log will be persisted to the <prefix>_otel_logs table
           logging.warning("Received inference request")
           try:
               # Your model logic here
               result = model_input * 2
               return result
           except Exception as e:
               # Error logs are also captured with severity 'ERROR'
               logging.error(f"Inference failed: {e}")
               raise
   ```

   The root logging level is set to `WARNING`. See Troubleshooting to change the logging level.

2. (Optional) Instrument custom metrics and traces with OpenTelemetry. To capture custom metrics and traces beyond basic logging, add OpenTelemetry SDK instrumentation to your model. Expand the following section for a complete example that shows how to create counters, record spans, and attach custom attributes.
Example: Custom metrics, spans, and model logging with OpenTelemetry
Note
Due to limitations in model serialization, you must write your model to a separate file before logging to avoid errors, as shown below using `%%writefile return_input_model.py`.

```python
%%writefile return_input_model.py
import os

import mlflow
from opentelemetry.exporter.otlp.proto.http.metric_exporter import OTLPMetricExporter
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.metrics import get_meter, set_meter_provider
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.trace import get_tracer, set_tracer_provider

# ---- OTel initialization (per-worker) ----
resource = Resource.create({
    "worker.pid": str(os.getpid()),
})

otlp_trace_exporter = OTLPSpanExporter()
tracer_provider = TracerProvider(resource=resource)
tracer_provider.add_span_processor(BatchSpanProcessor(otlp_trace_exporter))
set_tracer_provider(tracer_provider)

otlp_metric_exporter = OTLPMetricExporter()
metric_reader = PeriodicExportingMetricReader(otlp_metric_exporter)
meter_provider = MeterProvider(metric_readers=[metric_reader], resource=resource)
set_meter_provider(meter_provider)

_tracer = get_tracer(__name__)
_meter = get_meter(__name__)
_prediction_counter = _meter.create_counter(
    name="prediction_count",
    description="Number of predictions made",
    unit="1",
)


class ReturnInputModel(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        self.tracer = _tracer
        self.prediction_counter = _prediction_counter

    def predict(self, context, model_input):
        with self.tracer.start_as_current_span("ReturnInputModel.predict") as span:
            span.set_attribute("input_shape", str(model_input.shape))
            span.set_attribute("input_columns", str(list(model_input.columns)))
            self.prediction_counter.add(1)
            return model_input


mlflow.models.set_model(ReturnInputModel())
```

Log and register the model:

```python
import pandas as pd

import mlflow
from mlflow.models import infer_signature

# Prepare tabular input/output for signature (pyfunc expects a DataFrame)
input_df = pd.DataFrame({"inputs": ["hello world"]})
output_df = input_df.copy()  # model returns input unchanged

# Log the model with OpenTelemetry dependencies
# (code-based logging avoids serialization issues)
with mlflow.start_run():
    signature = infer_signature(input_df, output_df)
    model_info = mlflow.pyfunc.log_model(
        name="model",
        python_model="return_input_model.py",
        signature=signature,
        input_example=input_df,
        pip_requirements=[
            "mlflow==3.1",
            "opentelemetry-sdk",
            "opentelemetry-exporter-otlp-proto-http",
        ],
    )

    # Register with serverless optimized deployment environment packing
    # Use the Unity Catalog name: catalog.schema.model_name
    registered = mlflow.register_model(
        model_info.model_uri,
        MODEL_NAME,
        env_pack="databricks_model_serving",
    )
```
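Before deploying, you can sanity-check locally which log records the default `WARNING` root level lets through. The following is a minimal sketch using only the standard `logging` module (no Databricks required); `ListHandler` is an illustrative helper, not part of the telemetry feature:

```python
import logging


class ListHandler(logging.Handler):
    """Collects log records so we can inspect what would be exported."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


# Mirror the serving default: root logger at WARNING
root = logging.getLogger()
root.setLevel(logging.WARNING)
handler = ListHandler()
root.addHandler(handler)

logging.info("dropped at the default level")   # below WARNING, not captured
logging.warning("Received inference request")  # captured
logging.error("Inference failed: boom")        # captured

captured = [(r.levelname, r.getMessage()) for r in handler.records]
```

Only the `WARNING` and `ERROR` records survive; to capture `INFO` and `DEBUG`, lower the root level as shown in Troubleshooting.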
Step 2: Prepare the Unity Catalog destination
Before creating your endpoint, ensure you have a catalog and schema ready to receive the telemetry data. Azure Databricks automatically creates the necessary tables in this schema if they do not already exist.
- In Catalog Explorer, navigate to the catalog and schema you want to use (for example, `my_catalog.observability`).
Step 3: Enable endpoint telemetry
You can enable telemetry when creating a new endpoint or add it to an existing one.
New endpoint
To enable telemetry in the UI:
- Navigate to Serving in the left sidebar.
- Click Create serving endpoint.
- In the Endpoint telemetry section (marked Preview), expand the configuration options.
- Unity Catalog location: Select the destination Catalog and Schema prepared in step 2.
- (Optional) Table prefix: Enter a prefix for the generated tables. If left blank, there is no prefix. The tables are named `<prefix>_otel_logs`, `<prefix>_otel_spans`, and `<prefix>_otel_metrics`.
- Complete the rest of the endpoint configuration (Model selection, Compute settings) and click Create.
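The UI naming convention can be expressed as a small helper. This is an illustrative sketch, not a Databricks API: `telemetry_table_names` is a hypothetical function that builds the three fully qualified table names and enforces the documented character restriction (ASCII letters, digits, and underscores only):

```python
import re

_VALID_PART = re.compile(r"^[A-Za-z0-9_]+$")  # documented name restriction


def telemetry_table_names(catalog: str, schema: str, prefix: str = "") -> dict:
    """Hypothetical helper: table names the UI convention produces."""
    parts = [catalog, schema] + ([prefix] if prefix else [])
    for part in parts:
        if not _VALID_PART.match(part):
            raise ValueError(
                f"unsupported name {part!r}: only ASCII letters, "
                "digits, and underscores are allowed"
            )
    p = f"{prefix}_" if prefix else ""
    return {
        "logs_table": f"{catalog}.{schema}.{p}otel_logs",
        "traces_table": f"{catalog}.{schema}.{p}otel_spans",
        "metrics_table": f"{catalog}.{schema}.{p}otel_metrics",
    }


names = telemetry_table_names("my_catalog", "observability", "endpoint1")
```

Note that the API (shown below) accepts arbitrary table names, so this convention only applies when the UI generates the names for you.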
To enable telemetry using the API:

```bash
curl -X POST -H "Authorization: Bearer <your-token>" \
  https://<workspace-url>/api/2.0/serving-endpoints \
  -d '{
    "name": "my-custom-logging-endpoint",
    "config": {
      "served_entities": [
        {
          "name": "my-model",
          "entity_name": "my-model",
          "entity_version": "1",
          "workload_size": "Small",
          "scale_to_zero_enabled": true
        }
      ],
      "telemetry_config": {
        "table_names": {
          "logs_table": "my_catalog.observability.custom_endpoint_logs",
          "metrics_table": "my_catalog.observability.custom_endpoint_metrics",
          "traces_table": "my_catalog.observability.custom_endpoint_spans"
        }
      }
    }
  }'
```
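The same request can be issued from Python with only the standard library. This sketch builds the payload and prepares the request with `urllib.request`; the workspace URL and token are placeholders you must replace, and the final call is left commented out:

```python
import json
import urllib.request

WORKSPACE_URL = "https://<workspace-url>"  # placeholder: your workspace URL
TOKEN = "<your-token>"                     # placeholder: your PAT

payload = {
    "name": "my-custom-logging-endpoint",
    "config": {
        "served_entities": [
            {
                "name": "my-model",
                "entity_name": "my-model",
                "entity_version": "1",
                "workload_size": "Small",
                "scale_to_zero_enabled": True,
            }
        ],
        "telemetry_config": {
            "table_names": {
                "logs_table": "my_catalog.observability.custom_endpoint_logs",
                "metrics_table": "my_catalog.observability.custom_endpoint_metrics",
                "traces_table": "my_catalog.observability.custom_endpoint_spans",
            }
        },
    },
}

req = urllib.request.Request(
    f"{WORKSPACE_URL}/api/2.0/serving-endpoints",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(req)  # uncomment once URL and token are real
```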
Existing endpoint
Note
Updating triggers a new deployment. Changes take effect once the deployment completes.
To enable telemetry in the UI:
- From the endpoint view page, on the right side panel, under the Endpoint telemetry section, click Add.
- Unity Catalog location: Select the destination Catalog and Schema prepared in step 2.
- (Optional) Table prefix: Enter a prefix for the generated tables. If left blank, there is no prefix. The tables are named `<prefix>_otel_logs`, `<prefix>_otel_spans`, and `<prefix>_otel_metrics`.
- Click Update.
Step 4: Verify and query telemetry data
After the endpoint receives traffic, telemetry data streams to the configured Unity Catalog tables.
1. Go to Catalog Explorer or the SQL Editor.
2. Locate the table named `<prefix>_otel_logs` in your configured schema.
3. Run a query to verify data is flowing:

   ```sql
   SELECT * FROM <catalog>.<schema>.<prefix>_otel_logs LIMIT 10;
   ```
Query telemetry data
The following examples show common queries.
To view the full schema of any telemetry table, run:
```sql
DESCRIBE TABLE <catalog>.<schema>.<prefix>_otel_logs;
```

Use these columns to filter and correlate telemetry data:

- `timestamp`
- `severity_text`
- `body`
- `trace_id`
- `span_id`
- `attributes`: a map that contains event-specific metadata.
Check for errors in the last hour
```sql
SELECT
  timestamp,
  severity_text,
  body,
  attributes
FROM <catalog>.<schema>.<prefix>_otel_logs
WHERE
  severity_text = 'ERROR'
  AND timestamp > current_timestamp() - INTERVAL 1 HOUR
ORDER BY timestamp DESC;
```
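Rows returned by a query like the one above can also be post-processed in Python. A small sketch, using hypothetical sample rows in the shape `(timestamp, severity_text, body)`, that buckets error records per minute:

```python
from collections import Counter
from datetime import datetime

# Hypothetical rows as the error query might return them
rows = [
    (datetime(2025, 1, 1, 12, 0, 5), "ERROR", "Inference failed: timeout"),
    (datetime(2025, 1, 1, 12, 0, 42), "ERROR", "Inference failed: timeout"),
    (datetime(2025, 1, 1, 12, 1, 10), "ERROR", "Inference failed: bad input"),
]

# Truncate each timestamp to the minute and count errors per bucket
errors_per_minute = Counter(
    ts.replace(second=0, microsecond=0)
    for ts, severity, _ in rows
    if severity == "ERROR"
)
```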
Troubleshooting
Logs not appearing in table: The root logging level defaults to WARNING to reduce overhead. To capture lower-severity logs, change the level in your model code:
```python
import logging

import mlflow


class MyModel(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        root = logging.getLogger()
        root.setLevel(logging.DEBUG)
        for handler in root.handlers:
            handler.setLevel(logging.DEBUG)
```
Limitations
The following limits apply to endpoint telemetry:

- Schema evolution on the target table is not supported.
- Only managed Delta tables are supported. External storage and Arclight default storage are not supported.
- The table location must be in the same region as your workspace.
- Only table names with ASCII letters, digits, and underscores are supported.
- Recreating a target table is not supported.
- Only single availability zone (single-az) durability is supported.
- Delivery is at-least-once. An acknowledgement from the server means the record is durable and in the Delta table.
- Records must be less than 10 MB each.
- Requests must be less than 30 MB each.
- Log lines must be less than 1 MB each.
- Telemetry latency degrades beyond 2500 QPS.
- Logs appear in the Unity Catalog table a few seconds after they are emitted.
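The per-line size limit can be guarded in model code before a log is emitted. A hypothetical sketch (the helper name, the 1 MB = 1024 × 1024 bytes interpretation, and the truncation strategy are all illustrative assumptions):

```python
MAX_LOG_LINE_BYTES = 1024 * 1024  # documented limit: log lines < 1 MB each


def truncate_log_body(message: str, limit: int = MAX_LOG_LINE_BYTES) -> str:
    """Hypothetical guard: shorten a log body that would exceed the limit."""
    encoded = message.encode("utf-8")
    if len(encoded) < limit:
        return message
    suffix = "...[truncated]"
    keep = limit - len(suffix.encode("utf-8")) - 1
    # errors="ignore" drops a partial multi-byte character at the cut point
    return encoded[:keep].decode("utf-8", errors="ignore") + suffix


short_body = truncate_log_body("Received inference request")
long_body = truncate_log_body("x" * (2 * 1024 * 1024))
```

Pass the result to `logging.warning` or `logging.error` as usual; short messages are returned unchanged.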