Get max value from a parquet file by using ADF

Question


Zhu, Yueli YZ [NC] 280

Hi, I am able to use an ADF data flow to read a parquet file and get the number of rows in it. The following is the data flow (screenshot).

But when I followed https://stackoverflow.com/questions/74813503/how-to-take-max-value-and-replace-it-in-drived-colomn-in-data-factory and tried to get the max value of one column, it failed. As soon as I added the window transformation, I could not even run Data Preview on the source. Here is the error message:

Spark job failed: { "text/plain": "{"runId":"f128afcc-c3af-4bb7-a0a9-0544303c577b","sessionId":"0a1d08df-99ce-496f-b437-f7e098422c65","status":"Failed","payload":{"statusCode":400,"shortMessage":"DF-EXPR-010 at Window 'window1'(Line 16/Col 35): Column 'Hello' used in expression is unavailable or invalid.","detailedMessage":"Failure 2024-01-24 18:34:29.554 failed DebugManager.processJob, run=f128afcc-c3af-4bb7-a0a9-0544303c577b, errorMessage=DF-EXPR-010 at Window 'window1'(Line 16/Col 35): Column 'Hello' used in expression is unavailable or invalid."}}\n" } - RunId: f128afcc-c3af-4bb7-a0a9-0544303c577b

Do you have any suggestions on how to get the max value from a parquet file? Thanks

Answer accepted by question author


Answer 1

Subashri Vasudevan 11,306 Volunteer Moderator

Hello,

It looks like you haven't imported the schema for the source parquet file. Once you import the schema, you will see the columns listed in the source transformation. Currently it doesn't show any column info, which means the schema has not been imported.

Go to the source dataset and import the schema from the file. Then add a window transformation to find the max value.
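As a rough illustration of why the schema matters, here is a plain-Python analogy (not ADF code and not the data flow script itself). The rows, the `window_max` helper, and the `projected_columns` argument are all hypothetical; only the column name 'Hello' comes from the error message above. It sketches the idea that a window over all rows can attach the max to every row, but only if the column is known to the transformation.

```python
# Plain-Python analogy (not ADF code): hypothetical rows from a parquet file.
rows = [
    {"id": 1, "Hello": 10},
    {"id": 2, "Hello": 42},
    {"id": 3, "Hello": 7},
]

def window_max(rows, projected_columns, column, out_name="max_value"):
    """Attach the overall max of `column` to every row, like a window over all rows."""
    # Without an imported schema the projection is empty, so the column
    # cannot be resolved; this mirrors the "unavailable or invalid" error.
    if column not in projected_columns:
        raise KeyError(f"Column '{column}' used in expression is unavailable or invalid.")
    overall_max = max(row[column] for row in rows)
    return [{**row, out_name: overall_max} for row in rows]

# Schema not imported: no columns are known, so the lookup fails.
try:
    window_max(rows, projected_columns=[], column="Hello")
    failed = False
except KeyError:
    failed = True

# Schema imported: the column resolves and every row carries the max.
result = window_max(rows, projected_columns=["id", "Hello"], column="Hello")
```

In the actual data flow, the equivalent of `projected_columns` would be the projection that appears in the source transformation once the schema is imported.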

Please try this and let us know if you have any further questions. Thank you.

Zhu, Yueli YZ [NC] 280 Reputation points

2024-01-25T20:24:27.3+00:00

Thanks for your answer. I tried Import projection from the source, but got the following error.

Subashri Vasudevan 11,306 Reputation points Volunteer Moderator

2024-01-26T05:43:33.02+00:00

Alternatively, you can try importing the schema from the dataset. That should make it work from the data flow source as well.

Smaran Thoomu 32,535 Reputation points Microsoft External Staff Moderator

2024-01-29T04:01:50.55+00:00

@Zhu, Yueli YZ [NC] We haven't heard from you since the last response and are checking back to see whether you have a resolution yet. If you do, please share it with the community, as it can be helpful to others. Otherwise, please respond with more details and we will try to help.
