Azure IoT - Query Data from IoT Files

Sarosh Niazi 21 Reputation points
2020-06-15T00:51:10.89+00:00

Hello,

I am using Azure (Azure Databricks, IoT Hub) to stream unstructured data from IoT devices (e.g. wind turbines). The data takes the form of thousands of files containing millions of data points captured over a period of 10 years. How do I extract a variety of metadata fields directly from these unstructured files (and not, for example, from a structured table)?

The reason is that these devices generate metadata fields such as temperature and humidity most of the time; however, a particular device may start generating new metadata fields that I am not aware of. I would like to detect this early, so that I can address it before it becomes a problem.

Specifically, I would like to see: file name (e.g. windTurbine14), metadata field names (e.g. temperature, humidity, newMetadataFieldX), and metadata field data types (e.g. double, double, double). Once I have this information, I can run analytics on this data to better visualize the new metadata fields from each file.
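To make the desired output concrete, here is a minimal sketch of the kind of report described above. It assumes, purely for illustration, that each file is newline-delimited JSON with one flat JSON object per line and a `.json` extension; the actual file format is not stated in this thread. The function names `type_name`, `infer_schema`, and `schema_report` are hypothetical, and the type labels (`double`, `long`, and so on) are only loosely modelled on Spark's naming.

```python
import json
from pathlib import Path


def type_name(value):
    """Map a parsed JSON value to a coarse type label (labels are illustrative)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    if value is None:
        return "null"
    return "array" if isinstance(value, list) else "object"


def infer_schema(lines):
    """Return {field name: sorted type labels seen} for an iterable of JSON lines.

    Assumes every non-empty line parses to a JSON object (a dict).
    """
    seen = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        for field, value in record.items():
            seen.setdefault(field, set()).add(type_name(value))
    return {field: sorted(types) for field, types in seen.items()}


def schema_report(folder):
    """Yield (file name, field name, types) rows for every *.json file in a folder."""
    for path in sorted(Path(folder).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            for field, types in infer_schema(handle).items():
                yield path.stem, field, "/".join(types)
```

Because every line of every file is scanned, a field that appears only once in a file still shows up in the report. At the scale described here the same idea would need to run on a distributed engine rather than in a single Python process.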

I would really appreciate any help that you can provide in this matter. Specifically, what queries should I be running on these files, to ensure there is 100% extraction of all metadata fields from all files?

Thanks in advance!

Azure IoT
A category of Azure services for internet of things devices.
Azure Data Explorer
An Azure data analytics service for real-time analysis on large volumes of data streaming from sources including applications, websites, and internet of things devices.
Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.

Accepted answer
  1. Uri Barash 176 Reputation points
    2020-06-15T09:14:25.527+00:00

    Hi Sarosh,

    You can definitely achieve this with Azure Data Explorer. I will reach out to you to have an in-depth discussion and see how we can assist.

    Uri

    2 people found this answer helpful.

2 additional answers

  1. Sander van de Velde 28,956 Reputation points MVP
    2020-06-15T08:48:00.673+00:00

    Thinking outside the box: could an Azure Function or Logic App be a solution, one that compares the columns of each incoming file with those of a previous file and checks whether there are any differences?

    At least, this makes sure you are not missing new or altered columns.
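
The comparison suggested above could look roughly like the following. This is a minimal sketch of the comparison logic only, not an Azure Function or Logic App implementation; it assumes each file's columns have already been reduced to a `{column name: type label}` mapping, and the function name `diff_schema` is hypothetical.

```python
def diff_schema(previous, incoming):
    """Compare two {column name: type label} mappings.

    Returns the columns that were added, removed, or changed type
    between the previous file and the incoming one.
    """
    added = {col: typ for col, typ in incoming.items() if col not in previous}
    removed = {col: typ for col, typ in previous.items() if col not in incoming}
    changed = {
        col: (previous[col], incoming[col])
        for col in incoming
        if col in previous and previous[col] != incoming[col]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Example: the incoming file carries a column the previous one did not have.
previous = {"temperature": "double", "humidity": "double"}
incoming = {"temperature": "double", "humidity": "string", "newMetadataFieldX": "double"}
result = diff_schema(previous, incoming)
```

A non-empty `added` or `changed` entry would be the signal to raise an alert before the new column reaches downstream processing.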

    3 people found this answer helpful.

  2. Sarosh Niazi 21 Reputation points
    2020-06-15T19:34:50.373+00:00

    Hi Sander and Uri,

    Thank you so much for your responses. I want to proceed with both of your answers; however, my Plan A would be to go ahead with Uri’s answer. Once Plan A is thoroughly reviewed, I will then try out Sander’s directions, as Plan B.

    Uri, I have responded to your note on LinkedIn to set up a conference call. Sander, please feel free to join the discussion as well.

    A good example of this scenario is a device that is overheating and generates an alert for it, which the client did not initially tell us about as IoT developers. I will bring a screenshot of a sample file from the device that I am using (which, by the way, is not confidential information) to our discussion. Again, there are millions of files like these, which can signal various issues with the device, and we need to design a robust data streaming solution that can offer seamless “predictive maintenance” on the data being streamed from the device.

    Best,
    Sarosh
