Data Factory triggered from blob storage creating multiple runs

Louise Kirkham 0 Reputation points
2024-06-20T01:00:20.3033333+00:00

I have an ADF pipeline that is currently triggered by a blob storage event (blob created, ignoring empty blobs).

Previously this was correctly triggering a single ADF run when a new file was uploaded.

I have recently introduced a Logic App which copies a new file from SharePoint to blob storage. This works successfully, writing a single file to blob storage; however, it is now triggering ten runs of the ADF pipeline.

Any advice on why this is occurring?
And any suggested solution for preventing multiple repeat runs triggered by the same file?

Currently defining my trigger with:

trigger={
    "properties": {
        "pipelines": [
            {
                "pipelineReference": {
                    "referenceName": df_pipeline_name,
                    "type": "PipelineReference"
                },
                "parameters": {
                    "filename": "@triggerBody().fileName"
                }
            }
        ],
        "type": "BlobEventsTrigger",
        "typeProperties": {
            "blobPathBeginsWith": "/data/blobs/raw/spa_initial_drafts/included",
            "blobPathEndsWith": ".docx",
            "ignoreEmptyBlobs": True,
            "scope": f"/subscriptions/{df_config['subscription_id']}/resourceGroups/{df_config['resource_group_name'].lower()}/providers/Microsoft.Storage/storageAccounts/{storage_account_name.lower()}",
            "events": [
                "Microsoft.Storage.BlobCreated"
            ]
        },
    }
}

2 answers

  1. Q&A Assist
    2024-06-20T01:00:20.35+00:00
    AI generated content. This question contains an answer created with AI from Q&A Assist.

    A likely reason for the ten runs is that the Logic App writes the file to blob storage in several operations (for example, in chunks), and more than one of those operations raises a Microsoft.Storage.BlobCreated event, so the trigger fires repeatedly for the same file. Note that "ignore empty blobs" is the boolean ignoreEmptyBlobs setting on the same BlobCreated event rather than a separate event type, so changing it is unlikely to stop repeat runs for a non-empty file; the trigger fires once per BlobCreated event that matches the path filters.
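One way to check whether a single upload really produces several BlobCreated events is to capture the raw Event Grid events (for example, by routing them to a diagnostic endpoint) and group them by blob URL. The sketch below assumes the events follow the Event Grid schema for blob storage, where each event carries `eventType` and a `data` object with `url`, `api` and `contentLength`. The function name, the storage account name and the sample events are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


def group_blob_created_events(events):
    """Group BlobCreated events by blob URL.

    Returns a dict mapping each blob URL to a list of
    (api, contentLength) tuples, one per event seen for that blob.
    """
    grouped = defaultdict(list)
    for event in events:
        # Skip anything that is not a blob-created event.
        if event.get("eventType") != "Microsoft.Storage.BlobCreated":
            continue
        data = event.get("data", {})
        grouped[data.get("url")].append(
            (data.get("api"), data.get("contentLength"))
        )
    return dict(grouped)


# Hypothetical sample: two events for the same blob, one for another event type.
BLOB_URL = (
    "https://exampleaccount.blob.core.windows.net/"
    "data/raw/spa_initial_drafts/included/draft.docx"
)
sample_events = [
    {"eventType": "Microsoft.Storage.BlobCreated",
     "data": {"url": BLOB_URL, "api": "PutBlob", "contentLength": 0}},
    {"eventType": "Microsoft.Storage.BlobCreated",
     "data": {"url": BLOB_URL, "api": "PutBlockList", "contentLength": 52431}},
    {"eventType": "Microsoft.Storage.BlobDeleted",
     "data": {"url": BLOB_URL, "api": "DeleteBlob"}},
]

if __name__ == "__main__":
    for url, calls in group_blob_created_events(sample_events).items():
        print(url, len(calls), calls)
```

If one URL shows several entries, the `api` and `contentLength` values indicate which write operations raised the events, which helps decide whether the fix belongs in the Logic App or in the pipeline.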

  2. Nandan Hegde 30,716 Reputation points MVP
    2024-07-01T06:02:54.42+00:00

    As you have the blob event trigger enabled, you can add pipeline parameters that capture the file name and the trigger start time for each run, as described here:

    https://www.c-sharpcorner.com/article/get-file-details-triggering-the-azure-data-factory-pipeline-synapse-via-storag/

    Logging these values shows which file and which trigger time produced each run. From that we can identify the likely cause of the duplicate runs and take action accordingly.
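As a minimal sketch of this debugging approach, the trigger's `parameters` block could pass the folder path and trigger start time alongside the file name, and the logged values could then be counted per file. `@triggerBody().fileName`, `@triggerBody().folderPath` and `@trigger().startTime` are the expressions ADF exposes for storage event triggers. The parameter names `folderpath` and `trigger_time`, the helper functions and the sample records are assumptions for illustration, and the pipeline would need matching parameters declared.

```python
from collections import Counter


def build_debug_parameters():
    """Parameters for the trigger's pipeline entry, extended for debugging."""
    return {
        "filename": "@triggerBody().fileName",      # already used in the question
        "folderpath": "@triggerBody().folderPath",  # hypothetical parameter name
        "trigger_time": "@trigger().startTime",     # hypothetical parameter name
    }


def count_runs_per_file(run_records):
    """Count pipeline runs per file from (file_name, trigger_time) pairs."""
    return Counter(file_name for file_name, _ in run_records)


def files_with_repeat_runs(run_records):
    """Return only the files that triggered more than one run."""
    counts = count_runs_per_file(run_records)
    return {name: n for name, n in counts.items() if n > 1}


# Hypothetical values logged by the pipeline for each run.
sample_runs = [
    ("draft_a.docx", "2024-06-20T00:52:41Z"),
    ("draft_a.docx", "2024-06-20T00:52:42Z"),
    ("draft_a.docx", "2024-06-20T00:52:44Z"),
    ("draft_b.docx", "2024-06-20T00:55:10Z"),
]

if __name__ == "__main__":
    print(build_debug_parameters())
    print(files_with_repeat_runs(sample_runs))
```

If the same file name appears with trigger times a few seconds apart, the repeat runs most likely come from several events for one upload rather than from several uploads.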
