The PATINDEX function is not directly available in Databricks, which runs on Apache Spark, but you can achieve similar functionality with Spark SQL functions. In SQL Server, PATINDEX returns the 1-based starting position of the first occurrence of a pattern in a string, or 0 if the pattern is not found.
In Databricks (Spark SQL), you would typically combine functions such as regexp_extract and instr to mimic this behavior. Note that PATINDEX patterns use LIKE-style wildcards (% and _), whereas regexp_extract expects a Java regular expression, so existing patterns need to be translated.
See also this older thread: https://stackoverflow.com/questions/58329209/patindex-in-spark-sql
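from pyspark.sql import functions as F

# Example DataFrame
data = [("Hello abc world",), ("abc starts here",), ("no match here",)]
df = spark.createDataFrame(data, ["text"])

# Regular expression to search for (Java regex syntax, not PATINDEX wildcards)
pattern = "abc"

# Extract the first match, then locate that matched text in the original string.
# The position is 1-based; rows without a match get 0, as with PATINDEX.
df = df.withColumn("matched", F.regexp_extract("text", pattern, 0))
df = df.withColumn(
    "pat_index",
    F.when(F.col("matched") == "", F.lit(0)).otherwise(F.expr("instr(text, matched)")),
).drop("matched")
df.show()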
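
If it helps to see the intended semantics in isolation, here is a minimal plain-Python sketch of the same idea: the 1-based position of the first regex match, or 0 when there is none. The helper name patindex_regex is hypothetical (it is not a Spark or Databricks function), and it uses Python's re module, whose syntax differs from Java regex in some details.

```python
import re
from typing import Optional


def patindex_regex(pattern: str, text: Optional[str]) -> Optional[int]:
    """Hypothetical helper, not part of Spark: PATINDEX-like lookup for a regex.

    Returns the 1-based start of the first match, 0 if there is no match,
    and None for a None input (mirroring SQL NULL propagation).
    """
    if text is None:
        return None
    match = re.search(pattern, text)
    # match.start() is 0-based, so shift by one to get a SQL-style position.
    return match.start() + 1 if match else 0


# Same sample strings as in the DataFrame example
print(patindex_regex("abc", "Hello abc world"))  # 7
print(patindex_regex("abc", "abc starts here"))  # 1
print(patindex_regex("abc", "no match here"))    # 0
```

Because this reads the position straight from the match object, it also stays correct when the matched text occurs earlier in the string in a place where the regex itself does not match (for example with anchors or word boundaries), a case where searching for the extracted text with instr can report an earlier position. A function like this could be wrapped in a PySpark UDF if you need that behavior, at the cost of the usual Python UDF overhead.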