Setting StatusFolder for HDInsight Spark job triggered through Data Factory

Shweta Chandramouli 1 Reputation point
2021-03-11T21:33:49.843+00:00

Currently our Spark job runs result in a number of folders with random GUID names being created in the root directory of the container we use as our HDInsight cluster storage. This seems to be the folder in whose context the job runs; it contains a copy of the script being used. Is there a way to specify a folder within which these GUID folders for job state can be created?

Tags: .NET | Azure HDInsight | Azure Data Factory

1 answer

  1. PRADEEPCHEEKATLA-MSFT 76,511 Reputation points Microsoft Employee
    2021-03-19T06:32:21.277+00:00

    Hello @Shweta Chandramouli ,

    Unfortunately, you cannot specify a status folder for an HDInsight Spark job triggered through Data Factory.

    Reason: Azure Data Factory creates the auto-generated GUID folder itself when the HDInsight Spark job runs, and its name and location are not configurable from the activity.
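    For reference, a typical HDInsight Spark activity definition in a Data Factory pipeline looks like the sketch below (the linked service names and file paths here are placeholders, not from your pipeline). The `typeProperties` section exposes `rootPath` and `entryFilePath`, but no property for the job status folder, which is why the GUID-named folders cannot be redirected:

    ```json
    {
      "name": "SparkActivitySample",
      "type": "HDInsightSpark",
      "linkedServiceName": {
        "referenceName": "HDInsightLinkedService",
        "type": "LinkedServiceReference"
      },
      "typeProperties": {
        "rootPath": "adfspark",
        "entryFilePath": "test.py",
        "sparkJobLinkedService": {
          "referenceName": "AzureBlobStorageLinkedService",
          "type": "LinkedServiceReference"
        },
        "getDebugInfo": "Failure"
      }
    }
    ```

    Pointing `rootPath` at a dedicated container (e.g. `adfspark` above) at least keeps the job's script copy and output out of your main storage root, even though the GUID naming itself cannot be changed.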

    I would suggest you provide feedback on the same:

    https://feedback.azure.com/forums/270578-data-factory

    All of the feedback you share in these forums will be monitored and reviewed by the Microsoft engineering teams responsible for building Azure.

    Hope this helps. Do let us know if you have any further queries.

    ------------

    Please don’t forget to Accept Answer and Up-Vote wherever the information provided helps you, this can be beneficial to other community members.