IoT Hub messages are occasionally missing from blob storage endpoint

Peter Fine 0 Reputation points
2024-07-22T16:39:30.0666667+00:00

We send approximately 200-300 million messages per month from IoT Edge devices to IoT Hub. We use message routing to write them all to Avro files in blob storage, filtered into several buckets by message type (using a routing query). This has been running, apparently successfully, for a couple of years.

We also route all messages (with no query) to a Kafka endpoint, which we ingest (as a raw Body column containing each message) into our data warehouse (self-hosted ClickHouse).

We have observed occasional batches of messages that appear in the database but not in the Avro files, yet we see no mention of dropped or orphaned messages in the metrics browser.

This is hard to quantify properly, since it requires comparing large datasets in different locations and formats. However, we have tried copying a day's Avro files into the data warehouse and see large chunks of missing data: for example, a missing minute (received by the database but not present in the Avro files) from one source partition, roughly every 5 minutes over an hour-long sample. We have observed this on different days, and it may be much more widespread. We can see directly (using Storage Explorer) that these files are missing from blob storage, so we don't think the problem lies in our comparison methodology.
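For reference, our comparison essentially buckets message timestamps into minutes on each side and diffs the buckets. A minimal sketch (the helper names are illustrative; in reality each side is fed from the Avro files and a ClickHouse query, reduced to lists of ISO-8601 timestamps):

```python
from collections import Counter

def minute_counts(timestamps):
    """Bucket ISO-8601 timestamps ('YYYY-MM-DDTHH:MM:SS...') into minutes."""
    return Counter(ts[:16] for ts in timestamps)  # keep 'YYYY-MM-DDTHH:MM'

def missing_minutes(warehouse_ts, avro_ts):
    """Minutes seen in the warehouse but entirely absent from the Avro files."""
    warehouse = minute_counts(warehouse_ts)
    avro = minute_counts(avro_ts)
    return sorted(minute for minute in warehouse if minute not in avro)
```

Running this over an hour of one partition's data is what surfaced the missing-minute pattern described above.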

We're not sure how to proceed with this since there are no apparent errors, yet the data is not being written as expected.

Azure Blob Storage
An Azure service that stores unstructured data in the cloud as blobs.
Azure IoT Hub
An Azure service that enables bidirectional communication between internet of things (IoT) devices and applications.

2 answers

  1. Sina Salam 10,416 Reputation points
    2024-07-22T18:44:45.6733333+00:00

    Hello Peter Fine,

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    Problem

    I understand that you are experiencing missing data occasionally in your Avro files despite successful message routing from IoT Edge devices to Azure Blob storage.

    Solution

    Most of all, thank you for providing the details about your setup and the issue you're facing. To resolve it, explore the potential solutions listed below:

    1. By default, Azure IoT Hub writes messages to Azure Blob storage in Avro format. Avro records both the message body and the message properties, which can make querying the data challenging. You can specify the message format using the ContentEncoding and ContentType system properties: for JSON data, set ContentEncoding to "utf-8" and ContentType to "application/json", and ensure that your device messages are correctly formatted with the matching encoding and content type.
    2. Azure Data Lake Analytics can help you query Avro data efficiently. It follows a "pay-per-query" model, which suits non-relational big data. Set up IoT Hub to route data to an Azure Blob storage endpoint, then configure that Blob storage account as an additional store in Data Lake Analytics. You can use U-SQL scripts to query the Avro data and export it to other formats (e.g., CSV) in Blob storage.
    3. Verify that your Avro files don't contain non-JSON payloads. Non-JSON data can cause issues when reading Avro files; if you encounter it, consider using Avro tools to convert it to a readable format or to filter out the problematic records.
    4. Although you mentioned not seeing any dropped or orphaned messages in the metrics browser, it's essential to continue monitoring your system. Check the IoT Hub metrics, Blob storage metrics, and any other relevant logs to identify any anomalies or patterns related to missing data.
    5. Ensure that the data synchronization process between IoT Hub, Blob storage, and your data warehouse (ClickHouse) is robust. You can consider implementing retries, error handling, and consistency checks to prevent data loss during ingestion.
    6. Finally, if you're using multiple partitions, verify that the partitioning strategy aligns with your data flow, and try to parallelize data processing to handle large volumes efficiently.
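To illustrate point 1: ContentType and ContentEncoding are system properties that travel with each device-to-cloud message. Over MQTT they are carried in the URL-encoded property bag of the publish topic, as `$.ct` and `$.ce`. A minimal sketch of that encoding (no SDK required; if you use the azure-iot-device Python SDK you would instead set `Message.content_type` and `Message.content_encoding` and the SDK builds this topic for you):

```python
import urllib.parse

def d2c_topic(device_id: str) -> str:
    """Build the MQTT device-to-cloud topic with ContentType/ContentEncoding
    system properties ($.ct / $.ce) in the URL-encoded property bag."""
    props = {"$.ct": "application/json", "$.ce": "utf-8"}
    bag = urllib.parse.urlencode(props, quote_via=urllib.parse.quote)
    return f"devices/{device_id}/messages/events/{bag}"
```

With these properties set, downstream consumers (including the routed Avro files) can decode the body as UTF-8 JSON rather than opaque bytes.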

    References

    For more reading, use the additional resources provided on the right side of this page and the following:

    Source: Azure IoT Apache Avro format - Stack Overflow. Accessed 7/22/2024.

    Source: Query Avro data by using Azure Data Lake Analytics. Accessed 7/22/2024.


    I hope this is helpful! Do not hesitate to let me know if you have any other questions.

    ** Please don't forget to close the thread here by upvoting and accepting this as an answer if it is helpful ** so that others in the community facing similar issues can easily find the solution.

    Best Regards,

    Sina Salam


  2. Sander van de Velde | MVP 32,551 Reputation points MVP
    2024-07-22T21:30:42.3966667+00:00

    Hello @Peter Fine ,

    Welcome to this moderated Azure community forum.

    Both ways of routing IoT Hub data should be reliable.

    To get more insight, you could use Azure Data Explorer external tables together with an Azure Data Explorer database connection for Event Hubs (is that the Kafka endpoint?) to do a basic comparison in an easy, non-intrusive way.

    I expect you will need to create a support ticket so the backend of your IoT Hub can be investigated.


    If the response helped, do "Accept Answer". If it doesn't work, please let us know the progress. All community members with similar issues will benefit by doing so. Your contribution is highly appreciated.

