`from_avro` function

Applies to: check marked yes Databricks Runtime 16.0 and above

Returns a struct value with the avroBin and jsonSchemaStr.

Syntax

from_avro(avroBin, jsonSchemaStr, options )

Arguments

avroBin: A BINARY expression specifying a row of Avro data.
avroSchemaSpec: The target schema in JSON format. It must match the schema encoded in avroBin as specified in to_avro().
options: A MAP<STRING,STRING> literal specifying directives.

Returns

A STRUCT with field names and types based on the result of schema_of_json(jsonStr).

avroBin must be well-formed with respect to the avroSchemaSpec and options or Databricks raises an exception.

Options

Option	Values	Description
`mode`	`FAILFAST`, `PERMISSIVE`	Error handling mode. Default: `FAILFAST`. In `PERMISSIVE` mode, corrupt records are set to `NULL` instead of raising an error.
`compression`	`uncompressed`, `snappy`, `deflate`, `bzip2`, `xz`, `zstandard`	Compression codec for encoding Avro data.
`avroSchemaEvolutionMode`	`none`, `restart`	Schema evolution mode. Default: `none`. When set to `restart`, the query throws an `UnknownFieldException` when the schema changes. Restart the job to use the new schema. See Use schema evolution mode with from_avro.
`recursiveFieldMaxDepth`	Range: `-1` to `15`	Maximum recursion depth along a single recursive path. Default: `-1`, which does not limit recursion depth. When a shared type is reachable from many distinct schema paths, schema expansion might cause the driver to run out of memory because this option bounds depth on one path only. To workaround: Increase the driver size. Simplify the schema by replacing recursive references with leaf types. Parse the payload with `VARIANT` type. See Ingest data as semi-structured variant type.

Examples

> SELECT from_avro(to_avro(5), '{ "type" : "int" }', NULL:MAP<STRING, STRING>);
  5

> SELECT from_avro(to_avro(5, '{ "type" : "int" }'), '{ "type" : "int" }', NULL:MAP<STRING, STRING>);
  5

> SELECT from_avro(to_avro(named_struct('num', 5, 'txt', 'hello')), '{ "type" : "record", "name": "my_record", "fields": [{ "name": "num", "type": "int"}, { "name": "txt", "type": "string"}]}', NULL:MAP<STRING, STRING>);
  {"num":5,"txt":"hello"}

> SELECT from_avro(to_avro(named_struct('num', 5, 'txt', 'hello')),
                   '{ "type" : "record", "name": "my_record", "fields": [{ "name": "num", "type": "int"}, { "name": "txt", "type": "double"}]}',
                   map('mode', 'failfast'));
  Error: Avro data is not valid for the specified schema.

> SELECT from_avro(to_avro(named_struct('num', 5, 'txt', 'hello')),
                   '{ "type" : "record", "name": "my_record", "fields": [{ "name": "num", "type": "int"}, { "name": "txt", "type": "double"}]}',
                   map('mode', 'permissive'));
  {"num":null,"txt":null}

from_avro function

Feedback

Was this page helpful?

Last updated on 2026-04-29