Returns the substring of str that starts at pos and has length len when str is of String type, or the slice of the byte array that starts at byte position pos and has length len when str is of Binary type.
The position is 1-based, not 0-based: pos=1 refers to the first character (or byte).
For the corresponding Databricks SQL function, see substring function.
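The 1-based indexing described above can be illustrated with a small pure-Python sketch. This is not the Spark implementation: substring_model is a hypothetical helper introduced only for illustration, and it covers only the simple case where pos is at least 1 and len is non-negative.

```python
def substring_model(value, pos, length):
    """Hypothetical model of a 1-based substring; works on str and bytes.

    Assumption: only pos >= 1 and length >= 0 are modeled here. How the
    real function treats other values is not covered by this sketch.
    """
    if pos < 1 or length < 0:
        raise ValueError("this sketch only models pos >= 1 and length >= 0")
    start = pos - 1  # convert the 1-based position to Python's 0-based index
    return value[start:start + length]


# String input: position 2 is the second character, 'p'.
print(substring_model("Spark", 2, 3))   # par

# Binary input: the same rule applies, counted in bytes.
print(substring_model(b"Spark", 2, 3))  # b'par'
```

The only difference from ordinary Python slicing is the shift by one: position 1 in the function corresponds to index 0 in Python.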
Syntax
from pyspark.databricks.sql import functions as dbf
dbf.substring(str=<str>, pos=<pos>, len=<len>)
Parameters
| Parameter | Type | Description |
|---|---|---|
| str | pyspark.sql.Column or str | Target column to work on. |
| pos | pyspark.sql.Column or str or int | Starting position in str (1-based). |
| len | pyspark.sql.Column or str or int | Length of the substring: characters for String input, bytes for Binary input. |
Returns
pyspark.sql.Column: substring of given value.
Examples
from pyspark.databricks.sql import functions as dbf

# Literal pos and len: 2 characters starting at position 1.
df = spark.createDataFrame([('abcd',)], ['s',])
df.select('*', dbf.substring(df.s, 1, 2)).show()

# pos and/or len taken from Column objects.
df = spark.createDataFrame([('Spark', 2, 3)], ['s', 'p', 'l'])
df.select('*', dbf.substring(df.s, 2, df.l)).show()
df.select('*', dbf.substring(df.s, df.p, 3)).show()
df.select('*', dbf.substring(df.s, df.p, df.l)).show()

# Columns referenced by name as strings instead of Column objects.
df = spark.createDataFrame([('Spark', 2, 3)], ['s', 'p', 'l'])
df.select('*', dbf.substring(df.s, 2, 'l')).show()
df.select('*', dbf.substring('s', 'p', 'l')).show()