Increase size of ASA Parquet file output

mpoeckl 161 Reputation points Microsoft Employee
2020-09-03T16:00:19.647+00:00

Is there a way to (further) increase the size of the Parquet files written by ASA output? The current settings are "minimum rows: 10000" and "maximum time: 2 hours". However, the files are written by ASA every 5 seconds and each file is only ~10 MB. This generates a huge number of small Parquet files, which is very bad for further processing/analysis and currently requires an additional file-consolidation step.
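(For context on the consolidation step mentioned above: before rewriting many ~10 MB files into fewer large Parquet files with a tool such as Spark or pyarrow, the grouping can be planned with a simple greedy pass. The sketch below is purely illustrative — `plan_consolidation` and the 256 MB target are my own naming and assumptions, not an ASA or Azure feature.)

```python
# Hypothetical sketch: greedily group many small Parquet files
# (~10 MB each, as in the question) into consolidation batches of
# roughly 256 MB, to be rewritten by a separate tool afterwards.

def plan_consolidation(file_sizes_mb, target_mb=256):
    """Return lists of file indices, each batch close to target_mb."""
    batches, current, current_size = [], [], 0
    for idx, size in enumerate(file_sizes_mb):
        # Flush the current batch once adding this file would exceed the target.
        if current and current_size + size > target_mb:
            batches.append(current)
            current, current_size = [], 0
        current.append(idx)
        current_size += size
    if current:
        batches.append(current)
    return batches

# 100 files of ~10 MB each -> 4 batches of 25 files each
batches = plan_consolidation([10] * 100)
print(len(batches), [len(b) for b in batches])
```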

Azure Stream Analytics
An Azure real-time analytics service designed for mission-critical workloads.

Accepted answer
  1. KranthiPakala-MSFT 46,422 Reputation points Microsoft Employee
    2020-09-03T22:23:11.567+00:00

    Hi MartinPoeckl-2046,

    I have confirmation from the product team that there is currently no way to increase the 10,000-row limit or the maximum time limit. On the ASA side, we don't want to hold too much data in a batch, as that could cause the node to crash and fail the job; hence the 10,000-row limit. Since it is a hard limit, there is no way to allow a particular customer or subscription to go beyond it.
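    (For reference, these are the batching properties on the Blob/Parquet output. The fragment below is a hedged sketch based on the shape of the Stream Analytics REST API/ARM template, where the portal's "Minimum rows" and "Maximum time" correspond to `sizeWindow` and `timeWindow`; exact property names and placement may vary by API version, and values here simply restate the limits from the question.)

    ```json
    {
      "datasource": {
        "type": "Microsoft.Storage/Blob",
        "properties": {
          "container": "output",
          "pathPattern": "parquet/{date}/{time}",
          "timeWindow": "02:00:00",
          "sizeWindow": 10000
        }
      },
      "serialization": { "type": "Parquet" }
    }
    ```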

    If you need to handle data at high scale, I would recommend adding a feature request in the Azure Stream Analytics feedback forum: https://feedback.azure.com/forums/270577-stream-analytics

    All feedback shared in this forum is actively monitored and reviewed by the ASA engineering team. Please also share the feature request link here, so that other users with a similar need can upvote and/or comment on your suggestion.

    Hope this info helps.

    Thank you


    Please consider clicking "Accept Answer" and "Upvote" on the post that helps you, as it can be beneficial to other community members.


0 additional answers
