Error when using Window activity in Azure Data Factory Data Flow

Tobias Rognstad 20 Reputation points
2023-09-22T11:38:00.53+00:00

Hi,

I am trying to use the Window activity in a data flow to recreate a SQL-style PARTITION BY / ORDER BY ranking.

What I want is for the activity to rank rows with the same RECID and DML_Action by their LastProcessedChange_DateTime value, with the most recent (highest datetime) being given rank 1.

That way, I can filter on rank 1 and keep only the latest changes made.
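For readers who want to check the intended result outside the data flow, here is a minimal plain-Python sketch of the ranking described above. It is not Data Flow code. The sample rows are made up for illustration, and `add_rank` is a hypothetical helper name. Only the column names come from the question. It assumes rank() assigns tied timestamps the same rank, as SQL-style rank functions usually do.

```python
from collections import defaultdict
from datetime import datetime

TS = "LastProcessedChange_DateTime"

def add_rank(rows):
    """Rank rows within each (RECID, DML_Action) partition, newest first."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[(row["RECID"], row["DML_Action"])].append(row)

    ranked = []
    for members in partitions.values():
        for row in members:
            # rank = 1 + number of rows in the partition with a newer timestamp,
            # so rows with equal timestamps share a rank.
            newer = sum(1 for other in members if other[TS] > row[TS])
            ranked.append({**row, "rank": newer + 1})
    return ranked

# Hypothetical sample data, not taken from the original post.
rows = [
    {"RECID": 1, "DML_Action": "U", TS: datetime(2023, 9, 20, 10, 0)},
    {"RECID": 1, "DML_Action": "U", TS: datetime(2023, 9, 21, 9, 30)},
    {"RECID": 1, "DML_Action": "D", TS: datetime(2023, 9, 19, 8, 0)},
    {"RECID": 2, "DML_Action": "I", TS: datetime(2023, 9, 18, 12, 0)},
]

# Filtering on rank 1 keeps only the latest change per partition.
latest = [r for r in add_rank(rows) if r["rank"] == 1]
```

With this sample, `latest` holds one row per (RECID, DML_Action) pair, which is the output the Window step followed by a filter is meant to produce.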


I have set up a Window activity in which I get the columns to partition over by using the byName() function.


I have also sorted by LastProcessedChange_DateTime (descending) and added a window column named "rank" that uses the rank() function.


(Range by is set to unbounded).

However, when I try to preview the data in this step, I receive the following error:

"Spark job failed: { "text/plain": "{"runId":"fd41672e-488a-4215-91f3-c11f613f2e61","sessionId":"6ba64fed-d965-4390-b9fe-a2c77fca1287","status":"Failed","payload":{"statusCode":400,"shortMessage":"DF-WIN-003 at Window 'window1'(Line 23/Col 25): Window should reference at least one column\nDF-WIN-003 at Window 'window1'(Line 24/Col 10): Window should reference at least one column\nDF-WIN-003 at Window 'window1'(Line 25/Col 10): Window should reference at least one column","detailedMessage":"Failure 2023-09-22 11:24:09.358 failed DebugManager.processJob, run=fd41672e-488a-4215-91f3-c11f613f2e61, errorMessage=DF-WIN-003 at Window 'window1'(Line 23/Col 25): Window should reference at least one column\nDF-WIN-003 at Window 'window1'(Line 24/Col 10): Window should reference at least one column\nDF-WIN-003 at Window 'window1'(Line 25/Col 10): Window should reference at least one column"}}\n" } - RunId: fd41672e-488a-4215-91f3-c11f613f2e61"


I am lost. I have found no relevant questions when searching for this, and I am starting to wonder whether it is a bug in the data flow product itself. Any help is appreciated!

Azure Data Factory

2 answers

  1. Tobias Rognstad 20 Reputation points
    2023-09-25T13:06:22.1966667+00:00

    Solution:

    I added a DerivedColumn step before the window. It uses the parameters to select the columns and writes them out under their original names. The columns are then available for selection in the Window activity, and it works as intended.


    3 people found this answer helpful.

  2. Amira Bedhiafi 24,786 Reputation points
    2023-09-25T10:43:43.14+00:00

    Can you verify that rank() is correctly defined within the Window activity, and that it is set to work with the sort column (LastProcessedChange_DateTime)?

