How to optimize the amount of data sent via the LogsIngestionClient.upload operation

Ashwin Venkatesha 60 Reputation points
2024-03-26T01:19:20.7733333+00:00

Hi,
I am using the logs ingestion client in Python to upload data.

My use case is to read messages off of AWS SQS and build payloads that can be sent via the LogsIngestionClient.

I built a simple timer-trigger function app that reads AWS SQS for new notifications, parses them for a link, and uploads that data using LogsIngestionClient.

Here is a snippet of the code:

    from azure.core.exceptions import HttpResponseError
    from azure.identity.aio import DefaultAzureCredential
    from azure.monitor.ingestion.aio import LogsIngestionClient

    async def _post_data(self, dce_endpoint, dcr_id, stream_name, credential, data):
        # A new client is created (and torn down) on every call; reusing one
        # client across flushes would avoid repeated connection setup.
        client = LogsIngestionClient(endpoint=dce_endpoint, credential=credential, logging_enable=False)
        async with client:
            try:
                await client.upload(rule_id=dcr_id, stream_name=stream_name, logs=data)
            except HttpResponseError as e:
                print(f"Upload failed: {e}")

    async def _flush(self, data: list):
        if data:
            #data = self._split_big_request(data)
            credential = DefaultAzureCredential()
            await self._post_data(self.dce_endpoint, self.dcr_id, self.stream_name, credential, data)
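The commented-out `_split_big_request` line suggests splitting oversized batches before upload. A minimal size-based splitter along those lines (a hypothetical implementation; the 1 MB per-request cap used here is an assumption — check the service limits for the actual cap):

```python
import json

# Assumed per-request cap; the real service limit may differ.
MAX_BYTES = 1_000_000

def split_big_request(logs: list, max_bytes: int = MAX_BYTES) -> list:
    """Split a list of log records into chunks whose total JSON size
    stays at or under max_bytes (a single oversized record still gets
    its own chunk)."""
    chunks, current, current_size = [], [], 0
    for record in logs:
        size = len(json.dumps(record).encode("utf-8"))
        if current and current_size + size > max_bytes:
            chunks.append(current)
            current, current_size = [], 0
        current.append(record)
        current_size += size
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be passed to `upload` as its own request.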

What I am noticing is that although there are lots of messages in the queue, the function app is very slow in consuming them, and the number of log rows processed per minute is also low.


This was obtained from metrics: even though log ingestion bytes per minute keeps increasing, log rows received stays capped at 500, which is strange.
I am using asyncio (async/await) to send the data.

How can I resolve this bottleneck and improve ingestion throughput?

Azure Functions
Microsoft Sentinel

1 answer

  1. Q&A Assist
    2024-03-26T01:19:20.7933333+00:00
    AI generated content: this answer was created with AI from Q&A Assist.

    To optimize the amount of data sent via the LogsIngestionClient.upload operation, you can consider tuning the client library's data transfer behavior based on the environment your app runs in.

    Some considerations for tuning data transfer include setting the chunk size for uploads, running uploads in parallel, and using compression to reduce the payload size.

    In your specific case, you may want to experiment with adjusting the chunk size and the degree of parallelism to see if they improve the performance of your LogsIngestionClient.upload operation. Additionally, compressing uploads reduces the amount of data sent over the network.
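    The parallelism suggestion can be sketched with a bounded asyncio fan-out: a semaphore caps the number of in-flight requests so a burst of chunks doesn't open unlimited concurrent connections. The `upload_chunk` coroutine below is a placeholder standing in for the real `client.upload` call, not part of the SDK:

```python
import asyncio

async def upload_chunk(chunk: list) -> int:
    """Placeholder for the real client.upload call; returns rows sent."""
    await asyncio.sleep(0)  # stand-in for network I/O
    return len(chunk)

async def upload_all(chunks: list, max_concurrency: int = 4) -> int:
    # Bound in-flight uploads so we parallelize without flooding the endpoint.
    sem = asyncio.Semaphore(max_concurrency)

    async def bounded(chunk):
        async with sem:
            return await upload_chunk(chunk)

    results = await asyncio.gather(*(bounded(c) for c in chunks))
    return sum(results)
```

    Raising `max_concurrency` trades higher throughput against the risk of hitting service-side rate limits.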

