Filter condition in Databricks

Shambhu Rai 1,406 Reputation points
2022-09-21T11:27:35.59+00:00

Hi Expert,
How can I apply a filter condition and a CASE condition in Databricks using PySpark?


Accepted answer
  1. ShaktiSingh-MSFT 13,996 Reputation points Microsoft Employee
    2022-09-21T16:44:33.153+00:00

    Hi @Shambhu Rai ,

    Thanks for posting your question on the Microsoft Q&A platform and for using Azure services.

    As I understand it, you want to know how to use filter and CASE conditions in Databricks with PySpark.
    Please correct me if I have misunderstood.

    For illustration, the sample below uses source data with four columns: ID, Name, Gender, and Salary.
    We first filter the data for a particular person, then use a CASE expression to create a new column with gender details.

    • Filter Condition:

    Use the filter function as shown below:

    df.filter(df.Name == "Mayra").show()  
    


    • Case condition:

    Use a SQL CASE WHEN ... THEN ... ELSE expression inside expr, as below:

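    from pyspark.sql.functions import expr

    # Add a GenderDetails column; values other than 'M' or 'F' pass through unchanged.
    # Each string fragment ends with a space so the joined SQL stays well separated.
    df1 = df.withColumn(
        "GenderDetails",
        expr("CASE WHEN Gender = 'M' THEN 'Male' "
             "WHEN Gender = 'F' THEN 'Female' "
             "ELSE Gender END"))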
    


    For more filter examples, refer to the link pyspark-where-filter.
    For a variety of CASE examples, refer to the link pyspark-when-otherwise.

    Hope this helps. Please let us know if you have any further queries.

