Partition Key in Azure Event Hub and Stream Analytics

Question

Karl Gardner 85

Hello, I am reading about Azure Event Hub with the ability to use a partition key to map incoming event data into specific partitions:

https://learn.microsoft.com/en-us/azure/event-hubs/event-hubs-scalability#mapping-of-events-to-partitions

I am wondering whether the partition key replaces the "PartitionId" metadata on an event in Event Hub. Normally, events are distributed across partitions in a round-robin fashion and carry metadata, including a PartitionId field, in addition to the event data. For example, see the .NET SDK documentation for PartitionContext:

https://learn.microsoft.com/en-us/dotnet/api/azure.messaging.eventhubs.consumer.partitioncontext.partitionid?view=azure-dotnet

Here, it has a PartitionId property, which I take to be just a field on the event in Event Hub. So when a partition key is specified to map incoming event data to specific partitions, I would assume the PartitionId field stays the same in Event Hub (the PartitionId is simply resolved from the hashed partition key) and a Partition Key field is added to the metadata?

If the PartitionId field doesn't change and no field is added, here is my follow-up question: how can we set a Partition Key on an Event Hub input to Stream Analytics if there is no information in Event Hub about what the partition key is?

Accepted answer

Answer 1

@Karl Gardner - Thanks for the question and for using the Microsoft Q&A platform.

In the output configuration for Azure Stream Analytics, the Partition Key column is optional for Event Hub output. If you specify a Partition Key column, Stream Analytics will use the value in that column to determine the partition to which the event should be written in Event Hub. If you don't specify a Partition Key column, Stream Analytics will use the default round-robin partitioning scheme to determine the partition to which the event should be written.

The following table has the parameters needed to configure data streams from event hubs as an output.

[Image: table of Event Hub output parameters]

When a partition key is specified to map incoming event data to specific partitions, the PartitionId field will stay the same in Event Hub and a Partition Key field is added to the metadata. The Partition Key is a sender-supplied value passed into an event hub and is processed through a static hashing function, which creates the partition assignment.
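To make the "static hashing function" idea concrete, here is a minimal sketch in Python. It is only an illustration: `assign_partition` is a hypothetical helper, and SHA-256 is an assumed stand-in, since the actual hash used by the Event Hubs service is internal and is not reproduced here. The point it shows is that the same key always resolves to the same partition for a fixed partition count.

```python
import hashlib


def assign_partition(partition_key: str, partition_count: int) -> int:
    """Map a sender-supplied key to a partition index with a fixed hash.

    Assumption: SHA-256 is used purely for illustration. The real service
    uses its own internal hash, so actual partition numbers will differ.
    """
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    # Take the first 8 bytes as an unsigned integer and fold it into range.
    return int.from_bytes(digest[:8], "big") % partition_count


if __name__ == "__main__":
    # The repeated key lands on the same partition as its first occurrence.
    for key in ["device-1", "device-2", "device-1"]:
        print(key, "->", assign_partition(key, 4))
```

Because the mapping depends only on the key and the partition count, the sender never has to know which partition a key belongs to.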

Regarding your follow-up question, to add a Partition Key in an Event Hub input to Stream Analytics, you can use the PARTITION BY clause in the query. Here is an example:

SELECT *
INTO output
FROM input
PARTITION BY PartitionKey

In this example, "PartitionKey" is the name of the field in the input data that contains the partition key. You can replace "PartitionKey" with the actual name of the field in your input data that contains the partition key.

For more details, refer to "Mapping of events to partitions" and "Use repartitioning to optimize processing with Azure Stream Analytics".

Hope this helps. Do let us know if you have any further queries.

Karl Gardner 85 Reputation points

2024-05-27T14:51:55.9133333+00:00

Hello PRADEEPCHEEKATLA-MSFT,

Thanks for answering the question. I guess I am still not understanding some parts. Let's leave Stream Analytics out of it entirely and drill down into Event Hub. Suppose I specify a particular partition key when sending events to Event Hub:

https://learn.microsoft.com/en-us/azure/event-hubs/event-hubs-features#mapping-of-events-to-partitions

Event Hub will use a static hashing function to place that event into a particular partition, and then it will add a Partition Key field to the metadata of the event. So the metadata will have both PartitionId and PartitionKey in it?

PRADEEPCHEEKATLA 90,641 Reputation points Moderator

2024-05-28T02:48:13.2433333+00:00

@Karl Gardner - When you specify a particular partition key when sending events to Event Hub, Event Hub will use a static hashing function to place that event into a particular partition, and then it will add a Partition Key field to the metadata of the event. So the metadata will have both PartitionId and PartitionKey in it.

The PartitionId field is added by Event Hub and is used to identify the partition to which the event was sent. The PartitionKey field is added by the sender and is used to map incoming event data into specific partitions.
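As a rough illustration of the two fields described above, here is a toy model in Python. It is a sketch of the description in this thread, not the Event Hubs service or any SDK: `ToyHub` and `StoredEvent` are hypothetical names, and the SHA-256 hash and the simple round-robin counter are assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass
from itertools import count
from typing import Optional


@dataclass(frozen=True)
class StoredEvent:
    body: str
    partition_id: str             # assigned by the (toy) hub
    partition_key: Optional[str]  # supplied by the sender; None if omitted


class ToyHub:
    """Hypothetical stand-in for an event hub with a fixed partition count."""

    def __init__(self, partition_count: int) -> None:
        self.partition_count = partition_count
        self._round_robin = count()  # used only for events without a key

    def send(self, body: str, partition_key: Optional[str] = None) -> StoredEvent:
        if partition_key is None:
            # No key: spread events across partitions in turn.
            index = next(self._round_robin) % self.partition_count
        else:
            # Key given: a fixed hash decides the partition (assumed hash).
            digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
            index = int.from_bytes(digest[:8], "big") % self.partition_count
        return StoredEvent(body, str(index), partition_key)


if __name__ == "__main__":
    hub = ToyHub(partition_count=4)
    print(hub.send("reading A", partition_key="device-1"))
    print(hub.send("reading B", partition_key="device-1"))
    print(hub.send("reading C"))  # no key, so partition_key stays None
```

In this model every stored event has a partition ID, while the partition key is present only when the sender supplied one.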

Hope this helps. Do let us know if you have any further queries.
