encode function

Applies to: Databricks SQL and Databricks Runtime

Returns the binary representation of a string using the charSet character encoding.

Syntax

encode(expr, charSet)

Arguments

  • expr: A STRING expression to be encoded.
  • charSet: A STRING expression specifying the encoding.

Returns

A BINARY.

The following character set encodings are supported (case-insensitive):

  • 'US-ASCII': Seven-bit ASCII, ISO646-US.
  • 'ISO-8859-1': ISO Latin Alphabet No. 1, ISO-LATIN-1.
  • 'UTF-8': Eight-bit UCS Transformation Format.
  • 'UTF-16BE': Sixteen-bit UCS Transformation Format, big-endian byte order.
  • 'UTF-16LE': Sixteen-bit UCS Transformation Format, little-endian byte order.
  • 'UTF-16': Sixteen-bit UCS Transformation Format, byte order identified by an optional byte-order mark.
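
As a rough illustration of how the choice of charSet changes the result, the queries below encode the same literals with several of the encodings listed above. The results shown are derived by hand from the encoding definitions rather than captured from a cluster, and the literals are arbitrary; treat this as a sketch.

```sql
-- charSet names are case-insensitive, so 'utf-8' behaves like 'UTF-8'.
> SELECT hex(encode('Spark SQL', 'utf-8'));
 537061726B2053514C

-- Big-endian UTF-16: two bytes per character here, high byte first.
> SELECT hex(encode('Spark SQL', 'UTF-16BE'));
 0053007000610072006B002000530051004C

-- Little-endian UTF-16: the same code units with the bytes swapped.
> SELECT hex(encode('Spark SQL', 'UTF-16LE'));
 53007000610072006B002000530051004C00

-- A non-ASCII character takes one byte in ISO-8859-1 but two in UTF-8.
> SELECT hex(encode('é', 'ISO-8859-1'));
 E9

> SELECT hex(encode('é', 'UTF-8'));
 C3A9
```

For ASCII-only input, 'US-ASCII', 'ISO-8859-1', and 'UTF-8' produce identical bytes; the encodings diverge only once the string contains characters outside the seven-bit range.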

Examples

-- Wrap encode in hex because the way BINARY values are displayed depends on the UI.
> SELECT hex(encode('Spark SQL', 'UTF-16'));
 FEFF0053007000610072006B002000530051004C

> SELECT hex(encode('Spark SQL', 'US-ASCII'));
 537061726B2053514C

> SELECT decode(X'FEFF0053007000610072006B002000530051004C', 'UTF-16');
 Spark SQL
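
A round trip through encode and decode with the same charSet should return the original string. The sketch below assumes the input contains only characters that the chosen charSet can represent; the literal is arbitrary.

```sql
-- Encode to BINARY, then decode with the same charSet.
> SELECT decode(encode('Spark SQL', 'UTF-16LE'), 'UTF-16LE');
 Spark SQL
```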