filter condition in databricks

Question

Shambhu Rai 1,411

Hi Expert,
How do I write a filter condition and a case condition in Databricks using PySpark?

Subashri Vasudevan 11,226 Reputation points

2022-09-21T12:35:06.967+00:00

Hi @Shambhu Rai,

Could you please elaborate on your question so that we can get back to you with the necessary details?

Thanks

Shambhu Rai 1,411 Reputation points

2022-09-21T15:04:09.6+00:00

pyspark statement for filter condition

Accepted answer

Answer 1

SSingh-MSFT 16,371 Moderator

Hi @Shambhu Rai,

Thanks for posting your question on the Microsoft Q&A platform and for using Azure services.

As I understand it, you want to know how to use filter and CASE in Databricks with PySpark.
Please correct me if I have misunderstood.

For illustration, the sample below assumes source data with four columns: ID, Name, Gender, and Salary.
We first filter the data for a particular person and then use CASE to create a new column with gender details.

Filter Condition:

Use the DataFrame filter function, as shown below:

df.filter(df.Name == "Mayra").show()

Case condition:

Use a SQL CASE WHEN ... THEN expression through expr, as shown below:

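from pyspark.sql.functions import expr

# Add a GenderDetails column that spells out the Gender code.
# Each string fragment ends with a space so the keywords do not run together.
df1 = df.withColumn("GenderDetails", expr(
    "CASE WHEN Gender = 'M' THEN 'Male' "
    "WHEN Gender = 'F' THEN 'Female' "
    "ELSE Gender END"))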

For more filter examples, refer to the link pyspark-where-filter.
For a variety of CASE examples, refer to the link pyspark-when-otherwise.

Hope this helps. Please let us know if you have any further queries.

Shambhu Rai 1,411 Reputation points

2022-09-21T19:17:46.25+00:00

How do I apply multiple filters at the DataFrame level?

Subashri Vasudevan 11,226 Reputation points

2022-09-22T05:19:26.123+00:00

Hi,

In the link provided by @SSingh-MSFT, you can find examples of filtering a DataFrame with multiple conditions. A sample is given below:

df.filter((df.state == "OH") & (df.gender == "M")).show(truncate=False)

Let us know if you are looking for the syntax of a different construct.

Thanks

Shambhu Rai 1,411 Reputation points

2022-09-22T05:45:14.3+00:00

Will it work at the DataFrame level?

Subashri Vasudevan 11,226 Reputation points

2022-09-22T05:58:06.757+00:00

Yes, it operates on a DataFrame and keeps only the rows that satisfy the condition.
