Copy Dataverse data into Azure SQL using Synapse Link: the DataverseToSQL step is not running

Greenwood, Justin -Administrator 75 Reputation points
2023-04-03T14:57:32.3+00:00

I have managed to create an Azure Synapse Link for Dataverse, and it is all working fine.

In the Azure Synapse workspace I have connected to my source storage account and to the database.

I have set up a trigger; in fact, I have done everything described in https://learn.microsoft.com/en-us/power-apps/maker/data-platform/azure-synapse-link-pipelines?tabs=synapse-analytics.

But I have noticed it is sometimes skipping the DataverseToSQL step. Is there any reason for this? I have checked the folders and they have data in them, but for some reason it is not pulling the data over.



Accepted answer
  1. KranthiPakala-MSFT 46,642 Reputation points Microsoft Employee Moderator
    2023-04-13T21:59:05.9+00:00

    Ray Chung, Greenwood, Justin -Administrator:

    A record was created and included in incremental folder 1. The same record was later edited and included in incremental folder 2. Processing of incremental folder 1 failed, while processing of incremental folder 2 was successful. After a rerun, processing of incremental folder 2 was still successful. Which values of the record are current: those from incremental folder 1 or 2?

    When the Orchestrator pipeline encounters an unexpected error and fails for a given folder (incremental folder 1), DataverseToSQLPipelineProcessingLog gets an entry of 0 (failure) for that folder. Subsequent folders (e.g., incremental folder 2) are then marked as 3 (skipped) rather than successful (1). Because the previous folder is not successful (status code <> 1), all subsequent folders keep being skipped (status code = 3). To break out of this skip loop, please follow the resolution steps below:

    1. Identify and resolve the root cause of the pipeline failure (the query after this list can help surface the affected folders).
    2. Manually run the DataverseToSQL pipeline for the failed folder.
    3. After the manual run completes successfully, update the corresponding row in DataverseToSQLPipelineProcessingLog to 1 (success).
    4. Process the subsequent skipped (3) folders manually, sequentially, and in chronological order, and update the Status column of the corresponding DataverseToSQLPipelineProcessingLog rows to 1 (success).
    5. Once the failed folder and all the skipped folders are marked as successful, the DataverseToSQL_Orchestrator pipeline will automatically process the next folder on the next trigger.
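
    For steps 1 and 4, a query along these lines can surface the failed and skipped folders in the order they should be reprocessed. This is a minimal T-SQL sketch, assuming the dbo.[DataverseToSQLPipelineProcessingLog] table and the Status, Container, and Folder columns visible in the CheckPreviousRuns query later in this thread; the container name is a placeholder for your own:

    -- List every folder that is not successful (failed = 0, skipped = 3),
    -- oldest first; folder names are timestamps, so sorting by Folder
    -- gives chronological order.
    SELECT [Folder], [Status]
    FROM dbo.[DataverseToSQLPipelineProcessingLog]
    WHERE [Status] <> 1
      AND [Container] = '<your-dataverse-container>'
    ORDER BY [Folder] ASC;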

    Recommendation:

    Manually executing several skipped folders and updating DataverseToSQLPipelineProcessingLog by hand is time-consuming and error-prone. To avoid that manual effort, please consider building a new pipeline which:

    1. Retrieves the skipped folders in chronological order.
    2. Executes the DataverseToSQL pipeline for each of them sequentially.
    3. Updates the Status column of the corresponding DataverseToSQLPipelineProcessingLog rows to 1 (success); a sketch of this update follows. (The original answer included screenshots of a sample pipeline for processing skipped folders.)
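
    A hedged sketch of the status update such a pipeline would run after a folder has been reprocessed successfully (same table and column assumptions as the query above; the folder value is an example taken from this thread):

    -- Mark the reprocessed folder as successful so the orchestrator
    -- stops skipping the folders that come after it.
    UPDATE dbo.[DataverseToSQLPipelineProcessingLog]
    SET [Status] = 1
    WHERE [Container] = '<your-dataverse-container>'
      AND [Folder] = '2023-04-04T10.40.08Z';  -- the folder that was just reprocessed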

    Greenwood, Justin -Administrator - From your earlier response I see that 594 folders have status <> 1. Because of this, all subsequent folders will be skipped and the main Execute DataverseToSQLPipeline activity will not run. To overcome the issue, as described above, please identify why the initial folders are being skipped, reprocess them in chronological order, and update the log status to 1 for those folders. Once the backlog is cleared, subsequent folders will be copied automatically by the main Orchestrator pipeline.

    "firstRow": {
    
    		"cnt": 594
    
    	},
    

    Important note: If the pipeline is being skipped without any failure, it means that prior pipeline runs for previous folders either failed (status = 0) or were skipped (status = 3) because pipeline execution times overlapped.

    To avoid overlapping execution times, as I mentioned in my previous posts, it is necessary to set concurrency = 1 for your Orchestrator pipeline. That way only one execution is in progress at a time; subsequent pipeline runs are queued and executed in chronological order. Ensuring that the Orchestrator pipeline's concurrency setting is 1 will avoid this scenario in future. I will also provide feedback to the document/template owner to update the template so that concurrency is set to 1 by default in the template gallery.
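
    For reference, in the pipeline's JSON definition (the code view in Synapse Studio) this corresponds to the top-level concurrency property. A minimal fragment, with the activities elided:

    {
        "name": "DataverseToSQL_Orchestrator",
        "properties": {
            "concurrency": 1,
            "activities": [ ... ]
        }
    }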

    I hope this explains why folders are skipped, how that impacts subsequent folder runs, and how to reprocess the skipped folder runs.

    If this is still not clear and you are blocked, I would recommend logging a support ticket so that a support engineer can schedule a call, go through your pipeline history, and suggest the next steps to fix the problem. Please don't forget to Accept Answer and mark Yes for "was this answer helpful" wherever the information provided helps you; this can be beneficial to other community members.

    2 people found this answer helpful.

2 additional answers

  1. KranthiPakala-MSFT 46,642 Reputation points Microsoft Employee Moderator
    2023-04-11T22:06:02.6133333+00:00

    @Greenwood, Justin -Administrator Thanks for your response and additional details.

    When I looked at the query of the CheckPreviousRuns activity, it is counting rows matching:

    FROM dbo.[DataverseToSQLPipelineProcessingLog]
    WHERE Status <> 1
      AND (Container = 'dataverse-############-orgd#######' AND Folder <> '2023-04-04T10.40.08Z')

    And as per the condition defined in the If Previous Runs Unsuccessful activity, if CheckPreviousRuns returns a count > 0, it will not trigger the Execute DataverseToSQLPipeline activity.

    Condition:

    @greater(activity('CheckPreviousRuns').output.firstRow.cnt,0)
    
    

    The Execute DataverseToSQLPipeline activity runs only when the CheckPreviousRuns activity returns cnt = 0.
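
    For illustration, this gate is an IfCondition activity; in the pipeline JSON it looks roughly like the fragment below. This is a hedged sketch rather than the exact template JSON, and the branch activities are elided; the Execute DataverseToSQLPipeline activity proceeds only when the expression evaluates to false (cnt = 0):

    {
        "name": "If Previous Runs Unsuccessful",
        "type": "IfCondition",
        "typeProperties": {
            "expression": {
                "value": "@greater(activity('CheckPreviousRuns').output.firstRow.cnt,0)",
                "type": "Expression"
            }
        }
    }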

    In your case, per the CheckPreviousRuns output below, cnt = 88 (meaning you have 88 unsuccessful runs), which is why your If Previous Runs Unsuccessful condition is always true: execution enters the true path of the IF evaluation and the Execute DataverseToSQLPipeline activity is skipped.

    {
    	"firstRow": {
    		"cnt": 88
    	},
    	"effectiveIntegrationRuntime": "AutoResolveIntegrationRuntime (UK South)",
    	"billingReference": {
    		"activityType": "PipelineActivity",
    		"billableDuration": [
    			{
    				"meterType": "AzureIR",
    				"duration": 0.016666666666666666,
    				"unit": "DIUHours"
    			}
    		]
    	},
    	"durationInQueue": {
    		"integrationRuntimeQueue": 0
    	}
    }
    

    As called out by Ray Chung, the Execute DataverseToSQLPipeline activity is only triggered when the previous pipeline run's status = 1.

    Could you please set your pipeline concurrency to 1 and see if that helps to resolve the issue? This setting ensures that the previous pipeline run completes before the next run starts executing; until then, the latest pipeline runs are queued. Please note that there is a limit here: the maximum pipeline queue size is 100, and runs beyond that will fail.


    Hope this clarifies why Execute DataverseToSQLPipeline is skipped in your case. Please don't forget to Accept Answer and mark Yes for "was this answer helpful" wherever the information provided helps you; this can be beneficial to other community members.

    1 person found this answer helpful.

  2. Ray Chung 0 Reputation points
    2023-04-07T15:38:24.3333333+00:00

    I have the same issue. I figured out that the DataverseToSQL pipeline is only triggered when the check in the DataverseToSQL_Orchestrator passes (the status code of the previous runs must equal 1). In our situation the previous run was still processing when the new run was triggered. The status of the new run and the runs after it is set to code 3, which means the mentioned check always fails and DataverseToSQL is never triggered again. Do you already have a solution for this issue?

