About Event Hubs partitions

Jona 455 Reputation points
2024-05-28T00:13:01.1533333+00:00

Hi,

I'm facing a challenge, and a colleague suggested that I use partitions on Event Hubs. I've been reading a lot and have some questions:

  1. How are partitions related to Event Hubs capacity? Let's suppose a Standard tier with 5 TU ... How is capacity spread across all the partitions?
  2. Is there any limitation on partitions? For example, Event Hubs has a publication limit of 1 MB.
  3. Do partitions affect Event Hubs pricing?
  4. What if the traffic is unpredictable or not uniform? Will this cause hot partitions? Are hot partitions bad for a partition strategy?
  5. Any further information about partitions would be helpful, or an architecture design or pattern from Microsoft.

Regards

Azure Event Hubs
An Azure real-time data ingestion service.

Accepted answer
  1. PRADEEPCHEEKATLA-MSFT 83,886 Reputation points Microsoft Employee
    2024-05-29T08:12:56.2533333+00:00

    @Jona - Let me clarify my previous response.

    Regarding the statement "partitions allow for multiple parallel logs to be used for the same event hub, which multiplies the available raw IO throughput capacity", what I meant is that the capacity being multiplied is that of the underlying storage and its replicas. Maintaining a log that preserves the order of events requires that these events are kept together in the underlying storage and its replicas, so a single log is bounded by the IO throughput of that storage. Partitioning allows multiple parallel logs to be used for the same event hub, which multiplies the available raw IO throughput capacity.

    Regarding your question about how capacity is spread across the partitions, you are correct that there is no capacity or bandwidth limit on individual partitions. The only limit is the maximum number of partitions allowed per SKU.

    Regarding pricing, I apologize for the confusion. The number of partitions does not affect the pricing of an event hub. The pricing of an event hub depends on the number of pricing units (throughput units) that you choose for your event hub. In general, we recommend a maximum throughput of 1 MB/s per partition. Therefore, a rule of thumb for calculating the number of partitions would be to divide the maximum expected throughput by 1 MB/s.

    So, in your case, if you want 5 TU of peak capacity, you should provision the number of partitions based on the expected peak load of your application. If you expect a peak load of 5 MB/s, you should provision at least 5 partitions to achieve the optimal throughput.
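The rule of thumb above can be written as a small calculation. This is only a sketch: the function name `min_partitions` and its parameters are made up for illustration, and the 1 MB/s per-partition figure is the recommendation quoted in this answer, not a value read from the service.

```python
import math


def min_partitions(peak_mb_per_s: float, per_partition_mb_per_s: float = 1.0) -> int:
    """Estimate a partition count from expected peak throughput.

    Divides the expected peak throughput by the recommended per-partition
    throughput (1 MB/s by default) and rounds up to a whole partition.
    """
    if peak_mb_per_s <= 0 or per_partition_mb_per_s <= 0:
        raise ValueError("throughput values must be positive")
    return max(1, math.ceil(peak_mb_per_s / per_partition_mb_per_s))


# Expected peak load of 5 MB/s, as in the example above.
print(min_partitions(5))
```

The result still has to stay within the maximum partition count allowed for the tier.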

    Regarding hot partitions, you are correct that there can be problems caused by hot partitions. To avoid hot partitions, you can use partition keys to evenly distribute traffic across partitions.

    Hope this helps. Do let us know if you have any further queries.


1 additional answer

  1. PRADEEPCHEEKATLA-MSFT 83,886 Reputation points Microsoft Employee
    2024-05-28T03:14:02.98+00:00

    @Jona - Thanks for the question and for using the MS Q&A platform.

    Event Hubs partitions are a great way to help with processing large volumes of events.

    Partitions are related to Event Hubs capacity in two ways. First, partitions allow for multiple parallel logs to be used for the same event hub, which multiplies the available raw IO throughput capacity. Second, when your application scales out to several parallel consumer processes, partitions are how your solution feeds those processes while ensuring that each event has a clear processing owner.

    Regarding your question about capacity spread across all the partitions, we recommend that you choose at least as many partitions as you expect to need during the peak load of your application for that particular event hub. The number of partitions is specified when the event hub is created, and it must be between 1 and the maximum partition count allowed for each pricing tier. For the partition count limit for each tier, you can check out this article: Azure Event Hubs quotas and limits

    There is a limit of 1 MB per event in Event Hubs, but this limit is not related to partitions. Instead, it is a limit on the size of each individual event that is published to an event hub.
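Because the limit applies to each event rather than to a partition, a producer can check the size of a payload before publishing it. The sketch below is an assumption-laden illustration: the names `MAX_EVENT_BYTES`, `encoded_size`, and `fits_in_one_event` are hypothetical, 1 MB is taken as 1024 * 1024 bytes, and the overhead of event properties and headers is ignored.

```python
import json

# Assumption: the 1 MB limit mentioned above, taken here as 1024 * 1024 bytes.
# Event properties and protocol overhead are not counted in this sketch.
MAX_EVENT_BYTES = 1024 * 1024


def encoded_size(payload: dict) -> int:
    """Return the size in bytes of the payload serialized as UTF-8 JSON."""
    return len(json.dumps(payload).encode("utf-8"))


def fits_in_one_event(payload: dict, limit: int = MAX_EVENT_BYTES) -> bool:
    """Check whether the serialized payload stays within the per-event limit."""
    return encoded_size(payload) <= limit


small = {"device": "sensor-1", "temperature": 21.5}
large = {"blob": "x" * (2 * 1024 * 1024)}
print(fits_in_one_event(small), fits_in_one_event(large))
```

A payload that fails this check would need to be split or compressed before it is sent, whichever partition it is destined for.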

    Regarding pricing, the number of partitions does not affect the pricing of an event hub. Pricing depends on the number of pricing units (throughput units) that you choose for your event hub. In general, we recommend a maximum throughput of 1 MB/s per partition. Therefore, a rule of thumb for calculating the number of partitions would be to divide the maximum expected throughput by 1 MB/s.

    If the traffic is unpredictable or not uniform, it can cause hot partitions. Hot partitions are partitions that receive a disproportionate amount of traffic compared to other partitions. This can cause processing delays and other issues. To avoid hot partitions, you can use partition keys to evenly distribute traffic across partitions.
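To see how skewed traffic produces a hot partition, here is a small simulation. It is a sketch only: events that share a partition key land on the same partition, but the SHA-256 based `assign_partition` below is a stand-in chosen for illustration, not the hash the service uses, and `partition_load`, `hot_partitions`, and the `factor` threshold are hypothetical helpers.

```python
import hashlib
from collections import Counter


def assign_partition(partition_key: str, partition_count: int) -> int:
    """Map a key to a partition with a stable hash (illustrative stand-in)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def partition_load(keys, partition_count: int) -> Counter:
    """Count how many events each partition would receive."""
    load = Counter({p: 0 for p in range(partition_count)})
    for key in keys:
        load[assign_partition(key, partition_count)] += 1
    return load


def hot_partitions(load: Counter, factor: float = 2.0) -> list:
    """Return partitions whose load exceeds `factor` times the mean load."""
    mean = sum(load.values()) / len(load)
    return sorted(p for p, n in load.items() if n > factor * mean)


# Skewed traffic: one device sends 900 of 1000 events.
skewed = ["device-1"] * 900 + [f"device-{i}" for i in range(2, 102)]
print(hot_partitions(partition_load(skewed, 4)))
```

In this setup the partition that receives `device-1` is flagged as hot, because a single key cannot be spread across partitions. A key with many distinct, similarly active values is what lets partition keys distribute traffic evenly.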

    For more information about Event Hubs partitions, you can check out this article: Features and terminology in Azure Event Hubs - Partitions. It includes information about partitioning best practices and how to calculate the number of partitions you need.

    Hope this helps. Do let us know if you have any further queries.


    If this answers your query, do click Accept Answer and Yes for "Was this answer helpful". And if you have any further queries, do let us know.