Databricks Audit Logs: Where are the log files stored? How to read them?

Mohammad Saber 591 Reputation points
2023-02-15

Hi,

I want to access the Databricks Audit Logs to check user activity, for example, the number of times a table was viewed by a user.

I have a few questions in this regard:

  1. Where are the log files stored? Are they stored on DBFS?

  2. Can I read the log files and save them as a table (let's say a Delta table)? If the log files are saved somewhere like DBFS, I might be able to read them with SQL.

I'd like to know if there is any way to get the logs as a Databricks table, that is, saving the logs as a table.

Also, I want it to work continuously, adding new logs to the table when a new event happens (not just one time).

I am not sure if I can use SQL for this purpose (instead of the REST API).

Any idea how to do that?

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.

Accepted answer
    PRADEEPCHEEKATLA-MSFT 78,986 Reputation points Microsoft Employee
    2023-02-16

    Hello @Mohammad Saber

    Thanks for the question and using MS Q&A platform.

    Azure Databricks provides comprehensive end-to-end diagnostic logs of activities performed by Azure Databricks users, allowing your enterprise to monitor detailed Azure Databricks usage patterns.

    For a list of the audit event types and their associated services, see the Events section of the documentation. Some events are emitted in the audit logs only if verbose audit logs are enabled for the workspace.
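
    If you route these diagnostic logs to a storage account (via an Azure Diagnostic setting on the workspace), you can ingest them into a Delta table continuously and then query them with SQL. Here is a minimal sketch using Auto Loader; the storage path, checkpoint location, and table name are illustrative assumptions, not fixed names:

    # A minimal sketch, not a definitive implementation.
    # Assumes the workspace's diagnostic (audit) logs are delivered as JSON
    # files to the illustrative abfss:// path below.
    log_path = "abfss://insights-logs@<storage-account>.dfs.core.windows.net/"  # assumption
    checkpoint = "dbfs:/checkpoints/audit-logs"                                 # assumption

    # Auto Loader picks up new log files as they arrive, so the table keeps
    # growing as new events happen (not just a one-time load).
    (spark.readStream
          .format("cloudFiles")
          .option("cloudFiles.format", "json")
          .option("cloudFiles.schemaLocation", checkpoint)
          .load(log_path)
          .writeStream
          .option("checkpointLocation", checkpoint)
          .toTable("audit_logs"))  # hypothetical Delta table name

    Once the stream is running, the logs behave like any other Delta table, so a plain SELECT from audit_logs works from the SQL editor, which covers the SQL part of your question.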

    By default, the logs are available on the Azure Databricks cluster. If you want to download them to your local machine, there are a couple of ways to copy the driver logs.

    Option 1: Cluster Driver Logs

    Go to the Azure Databricks workspace => select the cluster => click Driver Logs => download the files to your local machine.


    The direct print and log statements from your notebooks and libraries go to the driver logs. The logs have three outputs:

    • Standard output
    • Standard error
    • Log4j logs

    The log files are rotated periodically. Older log files appear at the top of the page, listed with timestamp information. You can download any of the logs for troubleshooting.

    Option 2: Cluster Log Delivery

    When you create a cluster, you can specify a location to deliver Spark driver and worker logs. Logs are delivered every five minutes to your chosen destination. When a cluster is terminated, Databricks guarantees to deliver all logs generated up until the cluster was terminated.

    The destination of the logs depends on the cluster ID. If the specified destination is dbfs:/cluster-log-delivery, cluster logs for 0630-191345-leap375 are delivered to dbfs:/cluster-log-delivery/0630-191345-leap375.
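
    For example, once delivery is configured, you can verify the layout from a notebook; this short sketch lists the delivered driver logs for the illustrative cluster ID above:

    # Sketch: list the delivered driver logs for the example cluster ID
    display(dbutils.fs.ls("dbfs:/cluster-log-delivery/0630-191345-leap375/driver"))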

    To configure the log delivery location:

    1. On the cluster configuration page, click the Advanced Options toggle.
    2. At the bottom of the page, click the Logging tab.
    3. Select a destination type.
    4. Enter the cluster log path.

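    The same configuration can be applied programmatically through the Clusters API by including a cluster_log_conf block in the cluster spec. A sketch follows; the workspace URL, token, and cluster settings are placeholders, not values from this thread:

    import requests

    host = "https://<workspace>.azuredatabricks.net"   # assumption
    token = "<personal-access-token>"                  # assumption

    cluster_spec = {
        "cluster_name": "logs-demo",                   # assumption
        "spark_version": "13.3.x-scala2.12",           # assumption
        "node_type_id": "Standard_DS3_v2",             # assumption
        "num_workers": 1,
        # Spark driver and worker logs are delivered here every five minutes
        "cluster_log_conf": {"dbfs": {"destination": "dbfs:/cluster-log-delivery"}},
    }

    resp = requests.post(f"{host}/api/2.0/clusters/create",
                         headers={"Authorization": f"Bearer {token}"},
                         json=cluster_spec)
    resp.raise_for_status()
    print(resp.json())  # contains the new cluster_id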

    To download the cluster logs to your local machine:

    Install the Databricks CLI, configure it with your Databricks credentials, and use the CLI's dbfs cp command. For example: dbfs cp dbfs:/FileStore/azure.txt ./azure.txt.

    If you want to download an entire folder of files, you can use dbfs cp -r <DBFS Path> <LocalPath>.

    When you configure the CLI, you are prompted for the connection details, for example:

    Databricks Host (should begin with https://): https://centralus.azuredatabricks.net/

    Username: username@microsoft.com

    Password: paste Access token

    Repeat for confirmation: paste Access token

    • Now run the command below to copy the logs to your local machine:

    dbfs cp -r dbfs:/cluster-logs/0731-081420-tees851/driver C:\Users\Azure\Desktop\Logs


    For more details, refer to Diagnostic logging in Azure Databricks.

    Hope this helps. Do let us know if you have any further queries.


    If this answers your query, do click Accept Answer and Yes if the answer was helpful. And, if you have any further query, do let us know.

    1 person found this answer helpful.

1 additional answer

    PRADEEPCHEEKATLA-MSFT 78,986 Reputation points Microsoft Employee
    2023-02-20

    Hello @Mohammad Saber

    For the SQL persona, you don't get logs ingested into Log Analytics.

    Monitor a SQL warehouse: you can view the number of queries handled by the warehouse and the number of clusters allocated to it.

    1. Click SQL Warehouses in the sidebar.
    2. Click a SQL warehouse.
    3. Click Monitoring.


    OR

    Directly from Query History:

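    If you need the query history programmatically, for example to count how many times a user queried a particular table, the Query History REST API can be polled. A minimal sketch, assuming a personal access token; the workspace URL and table name are placeholders:

    import requests

    host = "https://<workspace>.azuredatabricks.net"   # assumption
    token = "<personal-access-token>"                  # assumption

    # Sketch: fetch recent SQL warehouse query history and count queries
    # whose text references a given table (a rough proxy for "table views").
    resp = requests.get(f"{host}/api/2.0/sql/history/queries",
                        headers={"Authorization": f"Bearer {token}"},
                        params={"max_results": 100})
    resp.raise_for_status()

    table = "my_table"  # assumption
    hits = [q for q in resp.json().get("res", [])
            if table in q.get("query_text", "")]
    print(f"{len(hits)} recent queries referenced {table}")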

    Hope this helps.