question

Sugiartokwan-8887 asked Sugiartokwan-8887 commented

ADF, dataflow - sort in expression language


I have an input string delimited by white space. In the expression builder I use the expression language to split it. The split works fine and returns an array, but problems arise when the numbers in the input string are not in ascending order (red box in the screenshot). So I need to sort the array values, but I don't know how to use the sort function.

Ideally, the input looks like this:

1 2 3 4
1 3 4
1 4
2 4
4

but sometimes the numbers in the input string are not in order:

1 4 2
1 2 4 3
3 4 1 2
Can anyone help me sort the array values?
What confuses me is this:

Sort expects a reference to two consecutive elements in the expression function as #item1 and #item2.

The output I expect looks like this:

input | output
1 4 2 | 1 2 4
1 2 4 3 | 1 2 3 4
3 4 1 2 | 1 2 3 4



Can anyone give me an example?

199026-sort.jpg


azure-data-factory

1 Answer

KranthiPakala-MSFT answered Sugiartokwan-8887 commented

Hello @Sugiartokwan-8887,

Thanks for the question and for using the MS Q&A platform.

Requirement:
The expected output looks like this:
input | output
1 4 2 | 1 2 4
1 2 4 3 | 1 2 3 4
3 4 1 2 | 1 2 3 4


To achieve this, you can use the expression below in the expression builder. It splits the input into an array, sorts the array, converts it to a string, and then strips the unwanted characters (the brackets and quotes).

 replace((replace((replace((toString(sort(split(input, ' '), compare(#item1, #item2)))), '","', ' ')),'["','')),'"]','')

Please note that `input` is the source column name in this sample.

Below is the output for the expression:

199817-image.png
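For readers who want to trace the steps outside ADF, here is a minimal Python sketch of the same pipeline: split, sort with a two-argument comparator, convert to text, then strip the brackets and quotes. It is an illustration only, not ADF expression syntax. The function names are hypothetical, and the assumption that the array's string form looks like `["1","2","4"]` is inferred from the replace calls in the expression above.

```python
from functools import cmp_to_key
import json


def compare(a, b):
    # Two-argument comparator: negative if a < b, zero if equal, positive if a > b.
    return (a > b) - (a < b)


def sort_tokens(text):
    # Step 1: split the space-delimited string into a list of string tokens.
    items = text.split(" ")
    # Step 2: sort using the comparator, which is called on pairs of elements.
    items = sorted(items, key=cmp_to_key(compare))
    # Step 3: serialize the list to text, e.g. ["1","2","4"] (assumed format).
    as_text = json.dumps(items, separators=(",", ":"))
    # Step 4: remove the quotes and brackets, leaving space-separated tokens.
    return as_text.replace('","', " ").replace('["', "").replace('"]', "")


def sort_tokens_numeric(text):
    # Variant that orders tokens by integer value instead of as strings.
    return " ".join(sorted(text.split(" "), key=int))


print(sort_tokens("1 4 2"))
print(sort_tokens("3 4 1 2"))
```

In this sketch the tokens are compared as strings, so a multi-digit token such as `10` would sort before `2`. The samples in this thread are all single digits, so that does not show up here; `sort_tokens_numeric` shows one way to order by value if the data requires it.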

Hope this helps. Let us know if you have any further queries.




Thank you for your effort in helping me.
Your solution works well, but some of my input rows contain a NULL value, which makes your solution fail.
The error appears whenever the input is NULL and looks like this:

Error:
Job aborted due to stage failure: Task 0 in stage 5.0 failed 1 times, most recent failure: Lost task 0.0 in stage 5.0 (TID 5, vm-30e14019, executor 1): java.lang.NullPointerException
at org.apache.spark.sql.extensions.ArrayOrder.eval(FunctionExtensions.scala:1605)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:282)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:273)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:858)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:858)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)

So, how can I modify your solution to handle a NULL input value? Thanks.


Eventually, I was able to handle the NULL value with 'Case'.
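The actual expression used for this is not shown in the thread. As a rough illustration of the idea (check for NULL first, and only split and sort when a value is present), here is a small Python sketch. The function name is hypothetical, and this is not ADF syntax.

```python
from typing import Optional


def sort_tokens_or_none(text: Optional[str]) -> Optional[str]:
    # Guard first: pass a missing value through unchanged instead of sorting it,
    # which is what triggered the NullPointerException-style failure above.
    if text is None:
        return None
    # Otherwise split on spaces, sort the tokens, and join them back together.
    return " ".join(sorted(text.split(" ")))


print(sort_tokens_or_none("1 4 2"))
print(sort_tokens_or_none(None))
```

Whether a NULL input should stay NULL or become an empty string is a design choice that depends on what the downstream sink expects.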
