About the correct setup of OpenCensus

Jona 475 Reputation points
2024-07-10T16:04:32.3866667+00:00

Hi everyone,

I have a blob-triggered function with Event Grid as the source. I'm trying to send custom data (customDimensions) to Log Analytics via Application Insights.

This is an example of the code setup:

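import logging

import azure.functions as func
from opencensus.ext.azure.log_exporter import AzureLogHandler

from services.blob_service import BlobService
from services.queue_service import QueueService

EXPERIMENT_NAME = 'CSV Compressed'
logger = logging.getLogger(__name__)
# With no arguments, the handler takes its connection string from the
# APPLICATIONINSIGHTS_CONNECTION_STRING environment variable
logger.addHandler(AzureLogHandler())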

def checker_fn(blob:func.InputStream, context:func.Context):

    blob_content = blob.read()
    size_mb = round(len(blob_content) / (1024 * 1024), 2)
    logger.info(f'File downloaded | {blob.name} | {size_mb:,} MB', extra={
        'custom_dimensions' : {
            'experiment' : EXPERIMENT_NAME,
            'blob' : blob.name,
            'size_mb' : size_mb, 
        }       
    })

However, when I look at the logs, I see two entries for the same log message: one with the custom dimensions and one without them.

log1

The entry that doesn't have operation_Name contains my custom data in the customDimensions column. I'm following this example and this doc.

I know OpenCensus is deprecated, but this is legacy code that needs to work as it is for now.

Why is this?

This is my host.json file:

{
  "version": "2.0",
  "logging": {
    "applicationInsights": {
      "samplingSettings": {
        "isEnabled": true,
        "excludedTypes": "Request"
      }
    }
  },
  "extensionBundle": {
    "id": "Microsoft.Azure.Functions.ExtensionBundle",
    "version": "[4.*, 6.0.0)"
  },
  "extensions": {
    "eventHubs": {
        "maxEventBatchSize" : 500,
        "minEventBatchSize" : 200,
        "maxWaitTime" : "00:0:05",
        "clientRetryOptions":{
            "mode" : "exponential",
            "tryTimeout" : "00:01:00",
            "delay" : "00:00:00.80",
            "maximumDelay" : "00:01:00",
            "maximumRetries" : 1
        }
    },
    "queues": {
      "maxPollingInterval": "00:00:02",
      "visibilityTimeout" : "00:02:00",
      "batchSize": 16,
      "maxDequeueCount": 1,
      "newBatchThreshold": 8,
      "messageEncoding": "base64"
    }
  }
}

Regards

Azure Monitor
An Azure service that is used to collect, analyze, and act on telemetry data from Azure and on-premises environments.
Azure Functions
An Azure service that provides an event-driven serverless compute platform.

4 answers

  1. Sina Salam 12,491 Reputation points
    2024-07-10T17:14:15.0666667+00:00

    Hello Jona,

    Welcome to Microsoft Q&A, and thank you for posting your question here with such a detailed explanation.

    Problem

    I understand that you are seeing duplicate log entries in Azure Application Insights, where one entry includes the custom dimensions and the other does not.

    Solution

    I examined a few areas of the setup you posted. Below are some potential causes and solutions:

    • Make sure the AzureLogHandler is configured correctly and that the custom dimensions are passed in the extra argument of every log call you want them on.
    • Your host.json sampling settings look correct, but it is worth verifying that no other configuration is affecting the logging behavior.
    • Duplicate log entries can occur if the handler is added to the logger more than once. Check your code to ensure that AzureLogHandler is attached only once.
    • Since OpenCensus is deprecated, there might be underlying issues with how it handles custom dimensions. Upgrading to the newer OpenTelemetry SDK would be ideal, but if that's not an option, you can try the following workaround to ensure the custom dimensions are consistently included.
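
One way two entries can arise from a single log call, even with only one handler on the module logger, is propagation: the record is handled by the module logger's handler and then again by any handler on an ancestor logger. The sketch below reproduces that pattern with the standard library only. `CapturingHandler` and `run_demo` are hypothetical names, and treating the hosting process as "a handler on the root logger" is an assumption made for illustration, not a confirmed description of the Azure Functions worker.

```python
import logging


class CapturingHandler(logging.Handler):
    """Stand-in handler that stores what it would export (hypothetical, not an Azure class)."""

    def __init__(self, keep_dimensions):
        super().__init__()
        self.keep_dimensions = keep_dimensions
        self.entries = []

    def emit(self, record):
        entry = {"message": record.getMessage()}
        if self.keep_dimensions:
            # Keys passed via `extra=` become attributes on the LogRecord.
            entry["customDimensions"] = getattr(record, "custom_dimensions", {})
        self.entries.append(entry)


def run_demo(propagate):
    # Assumed stand-in for a handler the hosting process attaches to the root logger.
    host_handler = CapturingHandler(keep_dimensions=False)
    # Stand-in for an exporter attached directly to the module logger.
    direct_handler = CapturingHandler(keep_dimensions=True)

    root = logging.getLogger()
    root.addHandler(host_handler)

    logger = logging.getLogger("demo.checker")
    logger.setLevel(logging.INFO)
    logger.addHandler(direct_handler)
    logger.propagate = propagate

    try:
        logger.info("File downloaded", extra={"custom_dimensions": {"size_mb": 1.5}})
    finally:
        # Undo the changes so the demo can be run repeatedly.
        root.removeHandler(host_handler)
        logger.removeHandler(direct_handler)
        logger.propagate = True

    return host_handler.entries + direct_handler.entries


both = run_demo(propagate=True)          # two entries, only one with dimensions
only_direct = run_demo(propagate=False)  # one entry, with dimensions
```

If this is what is happening in your app, setting `propagate = False` on the logger would remove the duplicate, but it would also drop the entry produced by the ancestor handler, which in your screenshot is the one that carries operation_Name.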

    Below is an updated example of your function, adjusted so that the handler is attached only once and the custom dimensions are passed with the log call:

    import logging

    import azure.functions as func
    from opencensus.ext.azure.log_exporter import AzureLogHandler

    EXPERIMENT_NAME = 'CSV Compressed'

    logger = logging.getLogger(__name__)
    logger.setLevel(logging.INFO)

    # Ensure the AzureLogHandler is only added once
    if not any(isinstance(handler, AzureLogHandler) for handler in logger.handlers):
        logger.addHandler(AzureLogHandler())


    def checker_fn(blob: func.InputStream, context: func.Context):
        blob_content = blob.read()
        size_mb = round(len(blob_content) / (1024 * 1024), 2)
        custom_dimensions = {
            'experiment': EXPERIMENT_NAME,
            'blob': blob.name,
            'size_mb': size_mb,
        }
        # Log the message with custom dimensions
        logger.info(f'File downloaded | {blob.name} | {size_mb:,} MB', extra={'custom_dimensions': custom_dimensions})
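
The "add the handler only once" guard used above can be checked in isolation. This is a minimal sketch using only the standard library. `ExporterHandler` and `ensure_handler` are hypothetical names standing in for AzureLogHandler and the inline `if not any(...)` check, so that the snippet runs without the Azure packages.

```python
import logging


class ExporterHandler(logging.Handler):
    """Hypothetical stand-in for AzureLogHandler so the sketch runs without Azure packages."""

    def emit(self, record):
        pass


def ensure_handler(logger, handler_cls):
    """Attach one handler_cls instance unless the logger already has one."""
    if not any(isinstance(h, handler_cls) for h in logger.handlers):
        logger.addHandler(handler_cls())
    return logger


logger = logging.getLogger("demo.guard")
ensure_handler(logger, ExporterHandler)
ensure_handler(logger, ExporterHandler)  # second call is a no-op
handler_count = sum(isinstance(h, ExporterHandler) for h in logger.handlers)
```

Note that the guard only inspects handlers attached to this particular logger. Handlers on ancestor loggers, including the root logger, are not covered by it.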
    

    You can also open the Logs view of your Application Insights resource to check whether duplicate entries still appear. Use Kusto Query Language (KQL) to filter and inspect the log entries. For example:

       traces
       | where message contains "File downloaded"
       | order by timestamp desc
    


    I hope this is helpful! Do not hesitate to let me know if you have any other questions.

    Please don't forget to close the thread by upvoting this and accepting it as an answer if it is helpful, so that others in the community facing similar issues can easily find the solution.

    Best Regards,

    Sina Salam


  2. Sina Salam 12,491 Reputation points
    2024-07-11T02:36:02.65+00:00

    Hello @Jona

    Following up on the solution above, here are the code references you asked for.

    1. You can modify your function_app.py to include the logger setup like this:
    import azure.functions as func
    import logging
    from opencensus.ext.azure.log_exporter import AzureLogHandler
    from function import checker_fn, splitter_fn, publisher_fn, consumer_fn, error_handler_fn
    from services.env_loader import EnvLoader
    # Logger configuration
    EXPERIMENT_NAME = 'CSV Compressed'
    logger = logging.getLogger(__name__)
    logger.setLevel(logging.INFO)
    # Ensure the AzureLogHandler is only added once
    if not any(isinstance(handler, AzureLogHandler) for handler in logger.handlers):
        logger.addHandler(AzureLogHandler())
    event_hub_name = EnvLoader.get_value("EVENT_HUB")
    schedule_setting = EnvLoader.get_value("SCHEDULE_SETTING")
    app = func.FunctionApp()
    @app.blob_trigger(arg_name="blob", path="landing/{name}.csv", connection="StorageConnection", source="EventGrid")
    def checker(blob: func.InputStream, context: func.Context) -> None:
        checker_fn(blob=blob, context=context, logger=logger)
    @app.blob_trigger(arg_name="blob", path="stage/{name}.csv", connection="StorageConnection", source="EventGrid")
    def splitter(blob: func.InputStream, context: func.Context) -> None:
        splitter_fn(blob=blob, context=context, logger=logger)
    @app.blob_trigger(arg_name="blob", path="rdz/{name}.csv", connection="StorageConnection", source="EventGrid")
    def publisher(blob: func.InputStream, context: func.Context) -> None:
        publisher_fn(blob=blob, context=context, logger=logger)
    @app.event_hub_message_trigger(arg_name="message", event_hub_name=event_hub_name, connection="EventHubConnection") 
    def consumer(message: func.EventHubEvent, context: func.Context):
        consumer_fn(message=message, context=context, logger=logger)
    @app.timer_trigger(arg_name="timer", run_on_startup=False, schedule=schedule_setting)
    def error_handler(timer: func.TimerRequest, context: func.Context) -> None:
        error_handler_fn(timer=timer, context=context, logger=logger)
    

    2. In your function.py, update your function definitions to accept the logger as a parameter:

    import logging
    import azure.functions as func
    EXPERIMENT_NAME = 'CSV Compressed'
    def checker_fn(blob: func.InputStream, context: func.Context, logger: logging.Logger):
        blob_content = blob.read()
        size_mb = round(len(blob_content) / (1024 * 1024), 2)
        custom_dimensions = {
            'experiment': EXPERIMENT_NAME,
            'blob': blob.name,
            'size_mb': size_mb,
        }
        # Log the message with custom dimensions
        logger.info(f'File downloaded | {blob.name} | {size_mb:,} MB', extra={'custom_dimensions': custom_dimensions})
    def splitter_fn(blob: func.InputStream, context: func.Context, logger: logging.Logger):
        # Your implementation here
        pass
    def publisher_fn(blob: func.InputStream, context: func.Context, logger: logging.Logger):
        # Your implementation here
        pass
    def consumer_fn(message: func.EventHubEvent, context: func.Context, logger: logging.Logger):
        # Your implementation here
        pass
    def error_handler_fn(timer: func.TimerRequest, context: func.Context, logger: logging.Logger):
        # Your implementation here
        pass
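
Because the logger is now passed in as a parameter, the log call can be exercised locally to confirm that it attaches the custom dimensions to the record, independently of any Azure exporter. The following is a minimal sketch under stated assumptions: `FakeBlob` and `RecordCollector` are hypothetical helpers, and the blob name and size are made-up sample values.

```python
import logging

EXPERIMENT_NAME = "CSV Compressed"


class FakeBlob:
    """Hypothetical stand-in for func.InputStream: only `name` and `read()` are modelled."""

    def __init__(self, name, data):
        self.name = name
        self._data = data

    def read(self):
        return self._data


class RecordCollector(logging.Handler):
    """Keeps every LogRecord so the attached attributes can be inspected."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


def checker_fn(blob, context, logger):
    # Same body as checker_fn above, minus the Azure type hints.
    blob_content = blob.read()
    size_mb = round(len(blob_content) / (1024 * 1024), 2)
    custom_dimensions = {
        "experiment": EXPERIMENT_NAME,
        "blob": blob.name,
        "size_mb": size_mb,
    }
    logger.info(
        f"File downloaded | {blob.name} | {size_mb:,} MB",
        extra={"custom_dimensions": custom_dimensions},
    )


collector = RecordCollector()
test_logger = logging.getLogger("demo.injected")
test_logger.setLevel(logging.INFO)
test_logger.addHandler(collector)

# A 2 MiB payload of sample bytes stands in for the real blob content.
checker_fn(FakeBlob("landing/sample.csv", b"x" * (2 * 1024 * 1024)), context=None, logger=test_logger)
record = collector.records[0]
```

If `record.custom_dimensions` holds the expected dictionary here, the function code itself is attaching the data correctly, and an entry that arrives without custom dimensions is more likely produced by a different handler than by this call.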
    

    Regards,

    Sina


  3. Jona 475 Reputation points
    2024-07-23T04:51:54.34+00:00

    @navba-MSFT can you give me a hand with this?

    Regards

  4. Jona 475 Reputation points
    2024-10-02T16:25:28.74+00:00

    Hi @Sina Salam

    Just checking in to see if there is any update. I've tried many approaches with no success. The error was reported in this GitHub issue, with no resolution:

    https://github.com/Azure/azure-functions-python-worker/issues/694

    I really need a way to send custom dimensions to monitor my solution properly. Since I think the problem is not in my code but in the Azure SDK, I kindly ask for a support ticket.

    I think that logging custom dimensions should be seamless, without any workaround. If it is not, the problem is on the Azure SDK side.

    Regards
