Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Returns a new string in which all invalid UTF-8 byte sequences, if any, are replaced by the Unicode replacement character (U+FFFD).
For the corresponding Databricks SQL function, see make_valid_utf8 function.
Syntax
from pyspark.databricks.sql import functions as dbf
dbf.make_valid_utf8(str=<str>)
Parameters
| Parameter | Type | Description |
|---|---|---|
str |
pyspark.sql.Column or str |
A column of strings, each representing a UTF-8 byte sequence. |
Returns
pyspark.sql.Column: the valid UTF-8 version of the given input string.
Examples
from pyspark.databricks.sql import functions as dbf
spark.range(1).select(dbf.make_valid_utf8(dbf.lit("SparkSQL"))).show()