Convert columns into rows in a notebook

Shambhu Rai 1,406 Reputation points
2022-07-13T17:54:09.037+00:00

Hi Expert,

I want to convert columns into rows in an Azure Databricks notebook. The conversion table maps speed ranges (100 to 200 and 200 to 400) to speed and rating values, and the expected output combines the source values with the conversion table values.

Here is my source:
[source table screenshot]

Conversion table
[conversion table screenshot]

Expected output
[expected output screenshot]

I need to convert these columns into rows using a CASE statement or any other method in Azure.


Accepted answer
    ShaikMaheer-MSFT 37,896 Reputation points Microsoft Employee
    2022-07-19T10:19:45.397+00:00

    Hi @Shambhu Rai,

    Thank you for posting query in Microsoft Q&A Platform.

    You need to add a dummy column to the source table, unpivot it, and then join the result with the conversion table.

    Kindly check the detailed implementation code below.

    from pyspark.sql.functions import lit
    from pyspark.sql.functions import expr

    # Source table: one row of column-oriented speed values
    schema1 = ['cycling','car']
    data1 = [(200,400)]

    # Conversion table: maps a speed range (min, max) to speed and rating labels
    schema2 = ['min','max','speed','rating']
    data2 = [(100,200,'Higher','good'),(200,400,'Highest','better')]

    df1 = spark.createDataFrame(data1,schema1)
    df2 = spark.createDataFrame(data2,schema2)

    display(df1)
    display(df2)

    # Add a dummy column so the unpivoted rows share a common key
    df1 = df1.withColumn('dummyCol',lit(1))
    display(df1)

    # Unpivot: stack(2, ...) turns the two columns into two (Property, Value) rows
    unpivotExpr = "stack(2, 'cycling', cycling, 'car', car) as (Property,Value)"
    unPivotDF = df1.select("dummyCol", expr(unpivotExpr))
    display(unPivotDF)

    # Join each unpivoted value to its conversion-table row on the max bound
    resultDf = unPivotDF.join(df2, unPivotDF.Value == df2.max, 'inner')
    display(resultDf.select('Property','Value','speed','rating'))

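    Since you asked about a CASE statement or any other method: the same unpivot-and-join can also be written in Spark SQL with the stack function. Below is a minimal sketch assuming the DataFrames above are registered as temp views (the view names source_table and conversion_table are illustrative):

    # Register the DataFrames as temp views (illustrative names)
    df1.createOrReplaceTempView("source_table")
    df2.createOrReplaceTempView("conversion_table")

    resultDf = spark.sql("""
        SELECT u.Property, u.Value, c.speed, c.rating
        FROM (
            SELECT stack(2, 'cycling', cycling, 'car', car) AS (Property, Value)
            FROM source_table
        ) u
        JOIN conversion_table c
            ON u.Value = c.max
    """)
    display(resultDf)

    On newer runtimes (Spark 3.4 and later) there is also a built-in DataFrame.unpivot method that can replace the stack expression.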
    Hope this helps. Please let us know if you have any further queries.

    -------------

    Please consider hitting the Accept Answer button. Accepted answers help the community as well.


0 additional answers
