Log4j databricks

Vineet S 1,390 Reputation points
2024-02-24T14:36:34.4266667+00:00

Hey, I got to know something about log4j in Databricks:

from pyspark.sql import SparkSession

class SampleApp(LoggerProvider):
    def __init__(self):
        self.spark = SparkSession.builder.getOrCreate()
        self.logger = self.get_logger(self.spark)

    def launch(self):
        self.logger.debug("some debugging message")
        self.logger.info("some info message")
        self.logger.warn("some warning message")
        self.logger.error("some error message")
        self.logger.fatal("some fatal message")

Is there any way we can track these logs and store them at the database or Hive metastore level?
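The LoggerProvider base class in the snippet is not part of Databricks itself; it is a common mixin pattern that fetches the JVM-side log4j logger through the py4j gateway. A minimal sketch of such a mixin, assuming `spark._jvm` exposes `org.apache.log4j` as it does on a standard Databricks driver (the class and method names here are illustrative, not a library API):

```python
class LoggerProvider:
    """Mixin handing out the driver JVM's log4j logger for a class."""

    def get_logger(self, spark):
        # spark._jvm bridges into the driver JVM, where log4j lives
        log4j = spark._jvm.org.apache.log4j
        return log4j.LogManager.getLogger(self.__full_name__())

    def __full_name__(self):
        # e.g. "jobs.sample.SampleApp" -- module-qualified class name,
        # used as the log4j logger name
        klass = self.__class__
        return ".".join([klass.__module__, klass.__name__])
```

Messages logged this way land in the cluster's standard log4j output (driver logs), which is what the appender configuration discussed below can then redirect.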

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.

2 answers

Sort by: Most helpful
  1. Dillon Silzer 57,826 Reputation points Volunteer Moderator
    2024-02-24T17:57:06.1433333+00:00

    Hi Vineet,

    Here is a perfect example of what you are looking for:

    https://pypi.org/project/azure-storage-logging/

    Logging handlers to send logs to Microsoft Azure Storage:

    import logging
    from azure_storage_logging.handlers import TableStorageHandler
    
    # configure the handler and add it to the logger
    logger = logging.getLogger('example')
    handler = TableStorageHandler(account_name='mystorageaccountname',
                                  account_key='mystorageaccountkey',
                                  extra_properties=('%(hostname)s',
                                                    '%(levelname)s'))
    logger.addHandler(handler)
    
    # output log messages
    logger.info('info message')
    logger.warning('warning message')
    logger.error('error message')
    

    If this is helpful please accept answer.


  2. PRADEEPCHEEKATLA 90,641 Reputation points Moderator
    2024-02-28T05:46:23.76+00:00

    @Vineet S - Thanks for the question and using MS Q&A platform.

    Yes, you can configure log4j in Azure Databricks to store the logs in a database or Hive metastore. Here are the high-level steps to configure log4j in Azure Databricks:

    • Create a log4j configuration file with the desired log levels and appenders. You can specify the desired appender to store the logs in a database or Hive metastore.
    • Upload the log4j configuration file to DBFS (Databricks File System).
    • Configure the log4j properties in your Databricks cluster by specifying the location of the log4j configuration file in DBFS.
    • Restart the Databricks cluster to apply the log4j configuration changes.
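The upload and configure steps above are usually wired together through the cluster's Spark config (cluster settings > Advanced options > Spark). A sketch, assuming the file was uploaded to a hypothetical path dbfs:/configs/log4j.properties:

```
spark.driver.extraJavaOptions -Dlog4j.configuration=file:/dbfs/configs/log4j.properties
spark.executor.extraJavaOptions -Dlog4j.configuration=file:/dbfs/configs/log4j.properties
```

Note that newer Databricks runtimes ship Log4j 2, where both the configuration file format and the system property name differ, so check the runtime version before reusing a Log4j 1.x properties file.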

    Once log4j is configured, you can use the logger in your code to log messages at different log levels. The logs will be stored in the configured appender.

    Here is an example log4j configuration file that stores the logs in a Hive metastore:

    log4j.rootLogger=INFO, hive
    
    log4j.appender.hive=org.apache.log4j.jdbc.JDBCAppender
    log4j.appender.hive.URL=jdbc:hive2://<hostname>:<port>/<database>
    log4j.appender.hive.driver=org.apache.hive.jdbc.HiveDriver
    log4j.appender.hive.user=<username>
    log4j.appender.hive.password=<password>
    log4j.appender.hive.sql=INSERT INTO logs (timestamp, level, logger, message) VALUES('%d{yyyy-MM-dd HH:mm:ss.SSS}', '%p', '%c', '%m')
    

    In this example, the logs are stored in a Hive table named logs. You can modify the SQL statement to store the logs in a different table or database.

Follow-up question: handler = TableStorageHandler(account_name='mystorageaccountname .... Is it storing in the Hive metastore? Will it automatically work for all functions?

    No, the TableStorageHandler is not storing logs in a Hive metastore. It is used to store logs in Azure Table Storage.

    The TableStorageHandler is a logging handler provided by the azure-storage-logging package. It can be used to store logs in Azure Table Storage, which is a NoSQL database service provided by Azure.

    To use the TableStorageHandler, you need to create an instance of the handler and configure it with the desired Azure Table Storage account details. Then, you can add the handler to your logger to store the logs in Azure Table Storage.

    Here is an example code snippet that shows how to use the TableStorageHandler:

from azure_storage_logging.handlers import TableStorageHandler
    import logging
    
    # Create an instance of the TableStorageHandler
handler = TableStorageHandler(account_name='mystorageaccountname', account_key='myaccountkey', table='mytablename')
    
    # Create a logger and add the handler
    logger = logging.getLogger('mylogger')
    logger.addHandler(handler)
    
    # Log some messages
    logger.debug('Debug message')
    logger.info('Info message')
    logger.warning('Warning message')
    logger.error('Error message')
    logger.critical('Critical message')
    

    In this example, the logs will be stored in the mytablename table in the mystorageaccountname Azure Table Storage account.

    To use the TableStorageHandler in all functions, you can create a separate module that initializes the logger with the TableStorageHandler and import it in all your functions.
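The shared-module approach above can be sketched with stdlib logging. Here a logging.StreamHandler stands in for the TableStorageHandler so the sketch runs without Azure credentials; in real code you would create the TableStorageHandler at the marked line instead. The module and function names are illustrative:

```python
# app_logging.py -- hypothetical shared module; every function imports get_logger()
import logging

_handler = None  # created once, shared by every logger this module hands out

def get_logger(name):
    """Return a logger wired to the shared handler."""
    global _handler
    if _handler is None:
        # In production this would be the Azure handler, e.g.:
        # _handler = TableStorageHandler(account_name=..., account_key=..., table='logs')
        _handler = logging.StreamHandler()
        _handler.setFormatter(
            logging.Formatter("%(name)s %(levelname)s %(message)s"))
    logger = logging.getLogger(name)
    if _handler not in logger.handlers:   # avoid duplicate handlers on re-import
        logger.addHandler(_handler)
    logger.setLevel(logging.INFO)
    return logger
```

Any function then calls get_logger(__name__) instead of configuring handlers itself, so all logs flow through the one shared handler.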

Hope this helps. Do let us know if you have any further queries.


    If this answers your query, do click Accept Answer and Yes for was this answer helpful. And, if you have any further query do let us know.

