Configure telemetry for Databricks Apps

Important

App telemetry is in Beta.

Databricks Apps telemetry collects traces, logs, and metrics and persists them to Unity Catalog tables using the OpenTelemetry (OTel) protocol. After you enable app telemetry, Databricks automatically captures system logs and usage events such as user login and direct API requests. You can also add custom instrumentation using the OpenTelemetry SDK for your framework.

Requirements

  • Your workspace must be in a supported region: australiaeast, brazilsouth, canadacentral, centralindia, centralus, eastus, eastus2, germanywestcentral, northcentralus, northeurope, southcentralus, southeastasia, swedencentral, switzerlandnorth, uksouth, westeurope, westus, westus2, westus3.
  • To create new telemetry target tables in Unity Catalog, you need CAN MANAGE permissions on the target catalog and schema, and CREATE TABLE on the schema.
  • To write to existing telemetry target tables in Unity Catalog, you need either CAN MANAGE permissions on the target catalog and schema, or all account users must have USE CATALOG, USE SCHEMA, SELECT, and MODIFY on the target tables.
  • Target tables must be managed Delta tables in the same region as your workspace.
  • Databricks recommends enabling predictive optimization on the telemetry target tables for better query performance.
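Predictive optimization can be enabled per table in Databricks SQL. As a sketch (the table name is a placeholder for your actual telemetry table):

```sql
ALTER TABLE <catalog>.<schema>.otel_logs ENABLE PREDICTIVE OPTIMIZATION;
```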

Enable app telemetry

Note

If you created an app before the app telemetry beta, you must stop and restart it before you proceed with the following configuration steps.

To turn on telemetry for an app, configure a catalog and schema for the telemetry tables in the app settings.

  1. Open the app details page in your Azure Databricks workspace.
  2. On the overview tab, locate the App telemetry configuration section and click Add.
  3. Enter or browse to select a catalog and schema. Azure Databricks writes telemetry data to three tables in the selected location: otel_metrics, otel_spans, and otel_logs.
  4. (Optional) Specify a table prefix so that tables are named <prefix>_otel_metrics, <prefix>_otel_spans, and <prefix>_otel_logs. Azure Databricks appends to existing tables or creates them if they don't exist.
  5. Click Save.
  6. Redeploy the app so that telemetry starts flowing to Unity Catalog.

Verify telemetry data

The otel_logs table is populated automatically after redeployment. The otel_spans and otel_metrics tables are only populated after you add custom instrumentation to your app.

After you redeploy the app:

  1. Visit the app URL to generate activity.

  2. Wait a few seconds for the initial batch of data to appear.

  3. Run the following query in Databricks SQL to confirm data is flowing:

    SELECT * FROM <catalog>.<schema>.otel_logs
    LIMIT 10;
    

Query telemetry data

Useful columns for filtering and correlating telemetry data include time, service_name, trace_id, span_id, and attributes. The attributes column is a map that contains event-specific metadata such as event.name.

To view the full schema of any telemetry table, run:

DESCRIBE TABLE <catalog>.<schema>.otel_logs;

The following example queries the otel_logs table for system or custom OpenTelemetry ERROR logs from the last hour:

SELECT time, body
FROM <catalog>.<schema>.otel_logs
WHERE service_name = '<app-name>'
  AND severity_text = 'ERROR'
  AND time >= current_timestamp() - INTERVAL 1 HOUR
ORDER BY time DESC
LIMIT 100;
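Because attributes is a map column, you can also filter and group on individual keys such as event.name. The following query is a sketch (adjust the catalog, schema, and key names for your tables); it counts events by name over the last 24 hours:

```sql
SELECT attributes['event.name'] AS event_name, COUNT(*) AS events
FROM <catalog>.<schema>.otel_logs
WHERE time >= current_timestamp() - INTERVAL 24 HOURS
GROUP BY attributes['event.name']
ORDER BY events DESC;
```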

Add custom instrumentation

Add OpenTelemetry auto-instrumentation to generate custom traces, metrics, and logs. Update your app.yaml and dependency files as shown for your framework.

Streamlit

Update app.yaml:

command: ['opentelemetry-instrument', 'streamlit', 'run', 'app.py']
env:
  - name: OTEL_TRACES_SAMPLER
    value: 'always_on'

Update requirements.txt:

streamlit==1.38.0

# Auto-instrumentation
opentelemetry-distro
opentelemetry-exporter-otlp-proto-grpc

# Required for Streamlit
opentelemetry-instrumentation-tornado

# Host metrics (CPU, memory)
opentelemetry-instrumentation-system-metrics

Dash

Update app.yaml:

command: ['opentelemetry-instrument', 'python', 'app.py']
env:
  - name: OTEL_TRACES_SAMPLER
    value: 'always_on'

Update requirements.txt:

dash
dash-bootstrap-components
pandas
plotly
databricks-sql-connector
databricks-sdk
python-dotenv
dash-ag-grid
opentelemetry-distro[otlp]
opentelemetry-instrumentation-flask
opentelemetry-exporter-otlp-proto-grpc

Flask

Update app.yaml:

command: ['opentelemetry-instrument', 'flask', '--app', 'app.py', 'run', '--no-reload']
env:
  - name: OTEL_TRACES_SAMPLER
    value: 'always_on'

Update requirements.txt:

opentelemetry-distro
opentelemetry-exporter-otlp-proto-grpc
opentelemetry-instrumentation-flask

Node.js

Create an otel.js file:

'use strict';

// CommonJS requires, so this file can be preloaded with `node -r ./otel.js`.
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');
const { PeriodicExportingMetricReader } = require('@opentelemetry/sdk-metrics');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-proto');
const { OTLPMetricExporter } = require('@opentelemetry/exporter-metrics-otlp-proto');
const { OTLPLogExporter } = require('@opentelemetry/exporter-logs-otlp-proto');

const sdk = new NodeSDK({
  traceExporter: new OTLPTraceExporter(),
  metricReader: new PeriodicExportingMetricReader({
    exporter: new OTLPMetricExporter(),
    exportIntervalMillis: 10000,
  }),
  instrumentations: [
    getNodeAutoInstrumentations({
      '@opentelemetry/instrumentation-fs': { enabled: false },
    }),
  ],
});

try {
  sdk.start();
} catch (e) {
  console.error('OTel SDK failed to start', e);
}

async function shutdown() {
  try {
    await sdk.shutdown();
  } catch (e) {
    console.error('OTel SDK shutdown failed', e);
  } finally {
    process.exit(0);
  }
}
process.on('SIGTERM', shutdown);
process.on('SIGINT', shutdown);

Update package.json:

{
  "name": "nodejs-otel",
  "version": "0.1.0",
  "private": true,
  "main": "app.js",
  "scripts": {
    "start": "node -r ./otel.js app.js"
  },
  "dependencies": {
    "express": "^4.21.2",
    "morgan": "^1.10.0",
    "@opentelemetry/api": "^1.9.0",
    "@opentelemetry/sdk-node": "0.203.0",
    "@opentelemetry/auto-instrumentations-node": "0.67.3",
    "@opentelemetry/exporter-trace-otlp-proto": "0.203.0",
    "@opentelemetry/exporter-metrics-otlp-proto": "0.203.0",
    "@opentelemetry/exporter-logs-otlp-proto": "0.203.0",
    "@opentelemetry/sdk-metrics": "2.0.1"
  }
}

Environment variables

When you enable app telemetry, Databricks automatically configures environment variables in your app runtime for the OTLP collector endpoint, export protocol, resource attributes, and batch processing. For the full list of OTel environment variables, see App telemetry environment variables.
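If your app needs to inspect this configuration at runtime, it can read the standard OpenTelemetry variables from the environment. The variable names below come from the OpenTelemetry specification; which ones Databricks sets is an assumption based on the categories listed above:

```python
import os

# Standard OpenTelemetry environment variable names (from the OTel spec).
# Assumption: Databricks populates these at runtime when telemetry is enabled.
OTEL_VARS = [
    "OTEL_EXPORTER_OTLP_ENDPOINT",
    "OTEL_EXPORTER_OTLP_PROTOCOL",
    "OTEL_RESOURCE_ATTRIBUTES",
]

def read_otel_config(env=os.environ):
    """Return the OTel settings visible to the app (None when unset)."""
    return {name: env.get(name) for name in OTEL_VARS}
```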

Limits

The following limits apply to app telemetry:

| Limit | Value |
| --- | --- |
| Record size | 10 MB per record |
| Request size | 30 MB per request |
| Log line size | 1 MB per log line |
| Rate limit | 100 export requests per second per workspace |
| Durability | Single availability zone only. The export pipeline might experience downtime if the zone is unavailable. |
| Delivery | At-least-once. An acknowledgement from the server means the record has been durably written to the Delta table. |

Limitations

App telemetry has the following limitations:

  • Table names support ASCII letters, digits, and underscores only.
  • Target tables can't use Arclight default storage.
  • You can't recreate a target table.
  • App telemetry doesn't support schema evolution on target tables.