Enhancing Event Processing Efficiency

Maryna Paluyanava 211 Reputation points
2024-06-06T14:56:20.08+00:00

Hello,

I read documents and use them to create events of several types. Each event has a system_id, which is its unique identifier and is stored as a property on the corresponding vertex. I send these events to Event Grid, with an Azure Storage Queue as the endpoint. An eventgrid_trigger Azure function then loads the events into Cosmos DB (Gremlin) as graph vertices. Before loading, I check whether a vertex with the same system_id already exists in the database. If it does, I update the properties of that vertex. If not, I create a new vertex.
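For readers following along, the check-then-update-or-create step described above is often written as a single Gremlin traversal using the fold().coalesce(unfold(), addV(...)) pattern, so the existence check and the write happen in one request. The following is only a minimal sketch of that idea, not the code used in this question: the helper name build_upsert_query, the "document" label, and the binding names sid and p0, p1, ... are hypothetical. It assumes the graph accepts parameter bindings and that property() on an existing vertex replaces the old value; a partitioned Cosmos DB graph would also need the partition key property set inside addV.

```python
from typing import Any, Dict, Tuple


def build_upsert_query(
    label: str, system_id: str, properties: Dict[str, Any]
) -> Tuple[str, Dict[str, Any]]:
    """Build a Gremlin upsert query string and its bindings.

    Hypothetical helper: it only builds the text and bindings, it does not
    talk to a database. Values are passed as bindings; the label and property
    keys are embedded in the query, so they are validated first.
    """
    if not label.isidentifier():
        raise ValueError(f"unsafe vertex label: {label!r}")

    bindings: Dict[str, Any] = {"sid": system_id}
    # Look the vertex up by system_id; create it only if nothing was found.
    query = (
        f"g.V().has('{label}', 'system_id', sid)"
        ".fold()"
        f".coalesce(unfold(), addV('{label}').property('system_id', sid))"
    )
    # Set the remaining properties on whichever vertex coalesce() returned.
    for index, (key, value) in enumerate(sorted(properties.items())):
        if not key.isidentifier():
            raise ValueError(f"unsafe property key: {key!r}")
        name = f"p{index}"
        bindings[name] = value
        query += f".property('{key}', {name})"
    return query, bindings


if __name__ == "__main__":
    q, b = build_upsert_query("document", "abc-1", {"title": "T", "pages": 3})
    print(q)
    print(b)
```

The returned query and bindings could then be handed to whatever Gremlin client the function already uses. Whether this removes the need for strictly sequential processing depends on how the database handles two concurrent upserts for the same system_id, which this sketch does not address.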

I need the events to be loaded into the database sequentially, so that the same vertex is never created more than once. For this reason, I have configured the host.json file as follows:

{
  "version": "2.0",
  "functionTimeout": "00:10:00",
  "logging": {
    "applicationInsights": {
      "samplingSettings": {
        "isEnabled": true,
        "excludedTypes": "Request"
      }
    }
  },
  "extensions": {
    "queues": {
      "maxPollingInterval": "00:00:02",
      "visibilityTimeout": "00:00:01",
      "batchSize": 1,
      "maxDequeueCount": 5,
      "newBatchThreshold": 0,
      "messageEncoding": "base64"
    }
  },
  "extensionBundle": {
    "id": "Microsoft.Azure.Functions.ExtensionBundle",
    "version": "[4.*, 5.0.0)"
  }
}

Additionally, I have set "WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT": "1" in local.settings.json to prevent multiple VMs from running simultaneously. The data now loads into the database without the errors I saw before (such as TooManyRequests or PreconditionFailedException), but loading takes too long.

Could you please suggest how to improve this process? Would it help to switch to Event Hub, send each of the 6 event types to its own container in the Storage account, and then process them with separate blob_trigger Azure functions?

I am using the Azure Functions Consumption plan, and there are around 500,000 vertices to load.

Many thanks

Azure Functions
An Azure service that provides an event-driven serverless compute platform.
Azure Event Grid
An Azure event routing service designed for high availability, consistent performance, and dynamic scale.