JamesJCClark-9738 asked SamaraSoucy-MSFT commented

Stream Analytics Output to BlobStorage Folder Structure and Time

We have a Stream Analytics job that simply processes JSON files and passes them to Cosmos DB or to another event hub. On that event hub we have Capture enabled to produce Avro files for later consumption. We are having an issue there, so we thought we would use Stream Analytics with a Blob Storage output and simply write the JSON files ourselves. However, we have run into a problem and thought we would raise it here before opening a ticket.

Path Pattern: {date}/{time}/asa-piiCleaner/{MachineId}
Date Format: YYYY/MM/DD
Time Format: HH/mm
Format: Array
Minimum rows: 1
Maximum time: 1 minute

Given this setup, I was expecting to see something like the following in blob storage:
2021/03/31/16/00/asa-piiCleaner/00dc7a2a-dbe0-4b08-b939-acf073cb88ba/jsonfile
2021/03/31/16/01/asa-piiCleaner/00dc7a2a-dbe0-4b08-b939-acf073cb88ba/jsonfile
2021/03/31/16/02/asa-piiCleaner/00dc7a2a-dbe0-4b08-b939-acf073cb88ba/jsonfile
2021/03/31/16/03/asa-piiCleaner/00dc7a2a-dbe0-4b08-b939-acf073cb88ba/jsonfile

I have yet to run this for a full hour, but so far we are not seeing what we expected. Instead, only
2021/03/31/16/00/asa-piiCleaner/00dc7a2a-dbe0-4b08-b939-acf073cb88ba/jsonfile

is being created, and that same file has now been updated for going on 10 minutes. Everything else seems to be working as designed, except that the "mm" folder is not changing.

I started this near the end of an hour, so I was able to see the path change from 03/31/15 to 03/31/16. However, the "00" folder stays constant.

Also, does anyone know how to take the stream and write individual files?

Thoughts?

azure-stream-analytics

SamaraSoucy-MSFT answered SamaraSoucy-MSFT commented

Currently only hours are supported in the default time format, producing the result you are seeing. However, setting your path pattern to {date}/{datetime:HH}/{datetime:mm}/asa-piiCleaner/{MachineId} should create the results you are expecting.



I will have to come back and try this. We have already moved to another solution, because this was not going to work as desired: even if it created a file every minute, it seems to stream data into the files. That is, it creates a file and then keeps updating it. With our Event Grid integration, a new file can be picked up, but reprocessing changes to existing files requires all files to be rescanned. An upgrade could be made to rescan and reprocess only the changed files, but the third-party vendor does not provide that at this time.

Thanks for the help. If and when we come back down this path, I will try this again.


If you do decide to come back to Stream Analytics, you may want to look at tumbling window functions. They let you group results, and the results only get written out at the end of the window: if your window is one minute, it will write only once a minute.
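
As a rough picture of what a one-minute tumbling window does, here is a small Python simulation. It is not Stream Analytics query syntax, and `tumbling_windows` is a hypothetical helper. It also assumes each window includes its start and excludes its end, which is a simplification and may differ from how Stream Analytics treats window boundaries.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def tumbling_windows(events, size=timedelta(minutes=1)):
    """Group (timestamp, payload) pairs into fixed, non-overlapping windows.

    Returns a dict keyed by window start time, in chronological order.
    Timestamps must be timezone-aware.
    """
    buckets = defaultdict(list)
    for ts, payload in events:
        # Round the timestamp down to the start of its window.
        start = ts - ((ts - _EPOCH) % size)
        buckets[start].append(payload)
    return dict(sorted(buckets.items()))


events = [
    (datetime(2021, 3, 31, 16, 0, 5, tzinfo=timezone.utc), "a"),
    (datetime(2021, 3, 31, 16, 0, 40, tzinfo=timezone.utc), "b"),
    (datetime(2021, 3, 31, 16, 1, 10, tzinfo=timezone.utc), "c"),
]
for start, batch in tumbling_windows(events).items():
    # Each window yields one batch, i.e. one write per minute.
    print(start.strftime("%Y/%m/%d/%H/%M"), batch)
```

Each window produces a single batch, so a consumer that reacts to new files would see one finished result per minute instead of a file that keeps changing.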

JS-Azure answered JamesJCClark-9738 commented

Hi, can you confirm that your outputs are being produced at the minute level? (For example, with a 10-minute tumbling window, you will get output only every 10 minutes.)
Also, what are your out-of-order and late arrival settings? The out-of-order tolerance adds some delay before the first output, and late-arriving events may be adjusted into another folder. You can also check the metrics to see whether you have late events.
Thanks,
JS (Stream Analytics)


This is what the Blob Storage output from Stream Analytics gives:
84085-image.png
and here are the event ordering settings:
84069-image.png

