A column in a DataFrame.
Supports Spark Connect
Methods
| Method | Description |
|---|---|
| alias(*alias, **kwargs) | Returns this column aliased with a new name or names (for expressions that return more than one column, such as explode). |
| asc() | Returns a sort expression based on the ascending order of the column. |
| asc_nulls_first() | Returns a sort expression based on the ascending order of the column, with null values appearing before non-null values. |
| asc_nulls_last() | Returns a sort expression based on the ascending order of the column, with null values appearing after non-null values. |
| astype(dataType) | Alias for cast(). |
| between(lowerBound, upperBound) | Checks whether the column's values are between the given lower and upper bounds, inclusive. |
| bitwiseAND(other) | Computes the bitwise AND of this expression with another expression. |
| bitwiseOR(other) | Computes the bitwise OR of this expression with another expression. |
| bitwiseXOR(other) | Computes the bitwise XOR of this expression with another expression. |
| cast(dataType) | Casts the column to the type dataType. |
| contains(other) | Returns a boolean expression that is true if the column contains the other value. |
| desc() | Returns a sort expression based on the descending order of the column. |
| desc_nulls_first() | Returns a sort expression based on the descending order of the column, with null values appearing before non-null values. |
| desc_nulls_last() | Returns a sort expression based on the descending order of the column, with null values appearing after non-null values. |
| dropFields(*fieldNames) | An expression that drops fields in a StructType by name. |
| endswith(other) | Returns a boolean expression that is true if the string ends with other. |
| eqNullSafe(other) | Equality test that is safe for null values. |
| getField(name) | An expression that gets a field of a StructType by name. |
| getItem(key) | An expression that gets an item at an ordinal position out of a list, or an item by key out of a dict. |
| ilike(other) | SQL ILIKE expression (case-insensitive LIKE). |
| isNaN() | True if the current expression is NaN. |
| isNotNull() | True if the current expression is NOT null. |
| isNull() | True if the current expression is null. |
| isin(*cols) | A boolean expression that is true if the value of this expression is contained in the evaluated values of the arguments. |
| like(other) | SQL LIKE expression. |
| name(*alias, **kwargs) | Alias for alias(). |
| otherwise(value) | Evaluates a list of conditions and returns one of multiple possible result expressions. |
| over(window) | Defines a windowing column. |
| rlike(other) | SQL RLIKE expression (LIKE with a regex). |
| startswith(other) | Returns a boolean expression that is true if the string starts with other. |
| substr(startPos, length) | Returns a Column that is a substring of the column. |
| try_cast(dataType) | A variant of cast() that returns NULL instead of raising an error when the cast fails. |
| when(condition, value) | Evaluates a list of conditions and returns one of multiple possible result expressions. |
| withField(fieldName, col) | An expression that adds or replaces a field in a StructType by name. |
Operators
The Column class supports standard Python operators for arithmetic, comparison, and logical operations:
- Arithmetic: +, -, *, /, %, **
- Comparison: ==, !=, <, <=, >, >=
- Logical: & (AND), | (OR), ~ (NOT)
Examples
For additional simple examples that demonstrate column usage, see Column operations.
Create Column instances
Select a column from a DataFrame:
```python
df = spark.createDataFrame(
    [(2, "Alice"), (5, "Bob")], ["age", "name"])

# Access by attribute
df.name
# Column<'name'>

# Access by bracket notation
df["name"]
# Column<'name'>
```
Create a column from an expression:
```python
df.age + 1
# Column<...>

1 / df.age
# Column<...>
```
Basic column operations
```python
# Arithmetic operations
df.select(df.age + 10).show()

# Comparison operations
df.filter(df.age > 3).show()

# String operations
df.filter(df.name.startswith("A")).show()

# Null checking
df.filter(df.name.isNotNull()).show()
```
Conditional logic
```python
from pyspark.sql import functions as F

df.select(
    F.when(df.age < 3, "child")
    .when(df.age < 13, "kid")
    .otherwise("adult")
    .alias("age_group")
).show()
```
Sorting
```python
df.orderBy(df.age.desc()).show()
df.orderBy(df.age.asc_nulls_last()).show()
```