Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Applies to:
Databricks Runtime 16.0 and above
Returns a struct value with the avroBin and jsonSchemaStr.
Syntax
from_avro(avroBin, jsonSchemaStr, options )
Arguments
avroBin: ABINARYexpression specifying a row of Avro data.avroSchemaSpec: The target schema in JSON format. It must match the schema encoded inavroBinas specified in to_avro().options: AMAP<STRING,STRING>literal specifying directives.
Returns
A STRUCT with field names and types based on the result of schema_of_json(jsonStr).
avroBin must be well-formed with respect to the avroSchemaSpec and options or Databricks raises an exception.
Options
| Option | Values | Description |
|---|---|---|
mode |
FAILFAST, PERMISSIVE |
Error handling mode. Default: FAILFAST. In PERMISSIVE mode, corrupt records are set to NULL instead of raising an error. |
compression |
uncompressed, snappy, deflate, bzip2, xz, zstandard |
Compression codec for encoding Avro data. |
avroSchemaEvolutionMode |
none, restart |
Schema evolution mode. Default: none. When set to restart, the query throws an UnknownFieldException when the schema changes. Restart the job to use the new schema. See Use schema evolution mode with from_avro. |
recursiveFieldMaxDepth |
Range: -1 to 15 |
Maximum recursion depth along a single recursive path. Default: -1, which does not limit recursion depth.When a shared type is reachable from many distinct schema paths, schema expansion might cause the driver to run out of memory because this option bounds depth on one path only. To workaround:
|
Examples
> SELECT from_avro(to_avro(5), '{ "type" : "int" }', NULL:MAP<STRING, STRING>);
5
> SELECT from_avro(to_avro(5, '{ "type" : "int" }'), '{ "type" : "int" }', NULL:MAP<STRING, STRING>);
5
> SELECT from_avro(to_avro(named_struct('num', 5, 'txt', 'hello')), '{ "type" : "record", "name": "my_record", "fields": [{ "name": "num", "type": "int"}, { "name": "txt", "type": "string"}]}', NULL:MAP<STRING, STRING>);
{"num":5,"txt":"hello"}
> SELECT from_avro(to_avro(named_struct('num', 5, 'txt', 'hello')),
'{ "type" : "record", "name": "my_record", "fields": [{ "name": "num", "type": "int"}, { "name": "txt", "type": "double"}]}',
map('mode', 'failfast'));
Error: Avro data is not valid for the specified schema.
> SELECT from_avro(to_avro(named_struct('num', 5, 'txt', 'hello')),
'{ "type" : "record", "name": "my_record", "fields": [{ "name": "num", "type": "int"}, { "name": "txt", "type": "double"}]}',
map('mode', 'permissive'));
{"num":null,"txt":null}