make_valid_utf8

Returns a new string in which all invalid UTF-8 byte sequences, if any, are replaced by the Unicode replacement character (U+FFFD).
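
As a rough illustration of the replacement behavior described above, the following plain-Python sketch uses the standard library's `bytes.decode` with `errors="replace"`. It does not call Spark, and the helper name `replace_invalid_utf8` is hypothetical. It is an analogy only: the number of replacement characters Spark emits for a given malformed sequence may differ from Python's.

```python
# Plain-Python analogy only; this does not call Spark or make_valid_utf8.
def replace_invalid_utf8(raw: bytes) -> str:
    """Decode bytes as UTF-8, substituting U+FFFD for undecodable bytes."""
    return raw.decode("utf-8", errors="replace")


valid = "SparkSQL".encode("utf-8")
invalid = b"Spark\xffSQL"  # 0xFF never appears in well-formed UTF-8

print(replace_invalid_utf8(valid))          # valid input comes back unchanged
print(repr(replace_invalid_utf8(invalid)))  # the 0xFF byte becomes U+FFFD
```

Input that is already valid passes through untouched, which is why the replacement is described as applying to invalid sequences "if any".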

For the corresponding Databricks SQL function, see make_valid_utf8 function.

Syntax

from pyspark.databricks.sql import functions as dbf

dbf.make_valid_utf8(str=<str>)

Parameters

Parameter    Type                         Description
str          pyspark.sql.Column or str    A column of strings, each representing a UTF-8 byte sequence.

Returns

pyspark.sql.Column: the valid UTF-8 version of the given input string.

Examples

from pyspark.databricks.sql import functions as dbf

# "SparkSQL" is already valid UTF-8, so the string is returned unchanged.
spark.range(1).select(dbf.make_valid_utf8(dbf.lit("SparkSQL"))).show()