Incremental Dynamic Data Flow with CDM dataset - Different results between Data flow debug and pipeline

D1SM4L 1 Reputation point
2022-11-08T11:51:36.27+00:00

Hello,
I have attempted to create a dynamic incremental load of data using Common Data Model (CDM) inline datasets as the source. It works in Data flow debug but fails to return data when run in a pipeline.

[image: 258294-image.png]

With the debug parameters below:

[image: 258283-image.png]

The debug returns data:
[image: 258253-image.png]

The same data flow, but run from a pipeline to simulate the real implementation:
[image: 258296-image.png]

Results:
[image: 258264-image.png]

Same parameters, same data flow, different results. Has anyone experienced this and is able to offer help or advice?

The reason for this implementation is the end-of-life of the Data Export Service (DES), which led us to switch to Azure Synapse Link. The Data Flow activity replaces the Copy Data activity (with an Azure SQL database as source) and requires no further changes to our existing ETL processes.


2 answers

  1. MartinJaffer-MSFT 26,106 Reputation points
    2022-11-09T22:55:50.66+00:00

    Hello @D1SM4L , I think I found the real cause.

    After some more consideration, I noticed in your screenshot that you have both pipeline expressions and data flow expressions, as denoted by the two different icons to the right of each value.

    [image: pipeline-vs-flow]

    This is important to note because the two have different syntax: different rules on how they must be written. For example, quoting. Data Flow expressions require single quotes around strings, while pipeline expressions do not require quotes around a string except when a function is involved (abcd on its own is okay, but @concat(ab,cd) needs to be @concat('ab','cd')). See below an example of two parameters, both strings without quotes.

    [image: 258893-image.png]

    Since pipeline expressions allow unquoted strings, the value 'cdm_worker' was passed through literally, quotes and all. In other words, the single quotes were treated as part of the string rather than as delimiters, so the data flow received 'cdm_worker' instead of cdm_worker.
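    To make the difference concrete, here is a sketch of how the same intended value cdm_worker behaves under each expression type. The // annotations are explanatory comments only, not valid expression syntax, and the parameter values shown are illustrative:

    ```
    // Entered as a pipeline expression (evaluated before the data flow starts):
    cdm_worker                   // -> cdm_worker      (unquoted literal is fine)
    'cdm_worker'                 // -> 'cdm_worker'    (quotes become part of the value!)
    @concat('cdm_','worker')     // -> cdm_worker      (strings inside functions need quotes)

    // Entered as a Data Flow expression (evaluated inside the data flow):
    'cdm_worker'                 // -> cdm_worker      (single quotes are string delimiters)
    cdm_worker                   // -> not a string: parsed as a column/identifier reference
    ```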

    Firstly, please try making all of the parameter values the same expression type. Making them all Data Flow expressions is probably best.

    This manifested as a difference between the Data Flow debug run and the pipeline run because the choice between pipeline expression and data flow expression doesn't come up until you put the Data Flow activity in a pipeline.

    Hope this helps. Please let us know if you have any further queries.


  2. D1SM4L 1 Reputation point
    2022-11-10T10:34:11.543+00:00

    Thank you for the response @MartinJaffer-MSFT. I have identified the issue as the 'Enable source change data capture' box being ticked in the Source options tab of the Data Flow source dataset.

    Following your reply, I ran a first test and it copied data across. A second test did not copy data across, which triggered a light-bulb moment: with change data capture enabled, the source only returns rows changed since the previous run, so the second run had nothing new to read.

    [image: 259010-enable-change-data-capture.png]
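    For anyone hitting the same thing, below is a minimal sketch of how that checkbox can surface in the generated data flow script. The property names (enableCdc, entity, the $entityName parameter) are assumptions for illustration, not verified against the CDM connector; check the Script view of your own data flow for the exact serialization. The // lines are annotations, not necessarily valid script syntax:

    ```
    // Hypothetical data flow script fragment for a CDM inline source.
    // enableCdc is an assumed name for the 'Enable change data capture' option:
    // when set, each triggered run only reads rows changed since the last
    // checkpoint, which is why a second run can legitimately return no data.
    source(
        allowSchemaDrift: true,
        validateSchema: false,
        entity: ($entityName),   // hypothetical data flow parameter, e.g. 'cdm_worker'
        enableCdc: true,
        format: 'cdm') ~> CdmSource
    ```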

