Data types
Article 07/29/2024
Applies to: Databricks SQL, Databricks Runtime
For rules governing how conflicts between data types are resolved, see SQL data type rules.
Azure Databricks supports the following data types:
| Data type | Description |
| --- | --- |
| BIGINT | Represents 8-byte signed integer numbers. |
| BINARY | Represents byte sequence values. |
| BOOLEAN | Represents Boolean values. |
| DATE | Represents values comprising values of fields year, month and day, without a time-zone. |
| DECIMAL(p,s) | Represents numbers with maximum precision p and fixed scale s. |
| DOUBLE | Represents 8-byte double-precision floating point numbers. |
| FLOAT | Represents 4-byte single-precision floating point numbers. |
| INT | Represents 4-byte signed integer numbers. |
| INTERVAL intervalQualifier | Represents intervals of time either on a scale of seconds or months. |
| VOID | Represents the untyped NULL. |
| SMALLINT | Represents 2-byte signed integer numbers. |
| STRING | Represents character string values. |
| TIMESTAMP | Represents values comprising values of fields year, month, day, hour, minute, and second, with the session local timezone. |
| TIMESTAMP_NTZ | Represents values comprising values of fields year, month, day, hour, minute, and second. All operations are performed without taking any time zone into account. |
| TINYINT | Represents 1-byte signed integer numbers. |
| ARRAY<elementType> | Represents values comprising a sequence of elements with the type of elementType. |
| MAP<keyType,valueType> | Represents values comprising a set of key-value pairs. |
| STRUCT<[fieldName : fieldType [NOT NULL][COMMENT str][, …]]> | Represents values with the structure described by a sequence of fields. |
| VARIANT | Represents semi-structured data. |
| OBJECT | Represents values in a VARIANT with the structure described by a set of fields. |
Important
Delta Lake does not support the VOID type.
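The byte widths listed for the integral types determine the range of values each type can hold. As a minimal sketch, assuming the usual two's-complement encoding for signed integers (an assumption; the table states only the widths), the bounds can be computed in plain Python. The names `INTEGRAL_WIDTHS`, `signed_range`, and `fits` are hypothetical helpers for illustration and are not part of any Databricks API:

```python
# Byte widths taken from the type descriptions in the table.
INTEGRAL_WIDTHS = {"TINYINT": 1, "SMALLINT": 2, "INT": 4, "BIGINT": 8}


def signed_range(num_bytes):
    """Return (min, max) for a signed integer of the given width.

    Assumes two's-complement encoding.
    """
    bits = 8 * num_bytes
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def fits(sql_type, value):
    """Hypothetical helper: True if value lies within the range of sql_type."""
    low, high = signed_range(INTEGRAL_WIDTHS[sql_type])
    return low <= value <= high


# Print the range of every integral type.
for name, width in INTEGRAL_WIDTHS.items():
    low, high = signed_range(width)
    print(f"{name}: {low} to {high}")
```

A check such as `fits("SMALLINT", 40000)` returns `False`, which is the kind of out-of-range value to catch before writing data.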
Data types are grouped into the following classes:
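Integral numeric types represent whole numbers: TINYINT, SMALLINT, INT, and BIGINT.
Exact numeric types represent base-10 numbers: the integral numeric types and DECIMAL.
Binary floating point types use exponents and a binary representation to cover a large range of numbers: FLOAT and DOUBLE.
Numeric types represent all numeric data types: the exact numeric types and the binary floating point types.
Date-time types represent date and time components: DATE, TIMESTAMP, and TIMESTAMP_NTZ.
Simple types are types defined by holding singleton values, such as the numeric and date-time types, BINARY, BOOLEAN, and STRING.
Complex types are composed of multiple components of complex or simple types, such as ARRAY, MAP, and STRUCT.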
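The difference between exact (base-10) and binary floating point types can be seen with plain Python values. The sketch below uses only the standard library: `float` stands in for a binary floating point type and `decimal.Decimal` for an exact numeric type. The helper `fits_decimal` is hypothetical and only illustrates what "maximum precision p and fixed scale s" means; it does not describe how Databricks rounds or rejects values.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 exactly, so the sum drifts.
binary_sum_is_exact = (0.1 + 0.2 == 0.3)

# Base-10 arithmetic keeps these values exact.
decimal_sum_is_exact = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))


def fits_decimal(text, precision, scale):
    """Hypothetical helper: True if text is representable as
    DECIMAL(precision, scale) without rounding."""
    value = Decimal(text)
    quantum = Decimal(1).scaleb(-scale)   # scale 2 -> Decimal("0.01")
    quantized = value.quantize(quantum)   # force exactly `scale` fractional digits
    if quantized != value:
        return False                      # needs more fractional digits than scale
    # Total number of significant digits must not exceed the precision.
    return len(quantized.as_tuple().digits) <= precision


print(binary_sum_is_exact, decimal_sum_is_exact)
print(fits_decimal("123.45", 5, 2), fits_decimal("1234.5", 5, 2))
```

Under these assumptions, 123.45 fits DECIMAL(5,2), while 1234.5 does not, because fixing the scale at 2 leaves only three digits for the integer part.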
Applies to: Databricks Runtime
In Scala, Spark SQL data types are defined in the package org.apache.spark.sql.types. You access them by importing the package:
import org.apache.spark.sql.types._
| SQL type | Data type | Value type | API to access or create data type |
| --- | --- | --- | --- |
| TINYINT | ByteType | Byte | ByteType |
| SMALLINT | ShortType | Short | ShortType |
| INT | IntegerType | Int | IntegerType |
| BIGINT | LongType | Long | LongType |
| FLOAT | FloatType | Float | FloatType |
| DOUBLE | DoubleType | Double | DoubleType |
| DECIMAL(p,s) | DecimalType | java.math.BigDecimal | DecimalType |
| STRING | StringType | String | StringType |
| BINARY | BinaryType | Array[Byte] | BinaryType |
| BOOLEAN | BooleanType | Boolean | BooleanType |
| TIMESTAMP | TimestampType | java.sql.Timestamp | TimestampType |
| TIMESTAMP_NTZ | TimestampNTZType | java.time.LocalDateTime | TimestampNTZType |
| DATE | DateType | java.sql.Date | DateType |
| year-month interval | YearMonthIntervalType | java.time.Period | YearMonthIntervalType (3) |
| day-time interval | DayTimeIntervalType | java.time.Duration | DayTimeIntervalType (3) |
| ARRAY | ArrayType | scala.collection.Seq | ArrayType(elementType [, containsNull]) (2) |
| MAP | MapType | scala.collection.Map | MapType(keyType, valueType [, valueContainsNull]) (2) |
| STRUCT | StructType | org.apache.spark.sql.Row | StructType(fields). fields is a Seq of StructField. (4) |
| | StructField | The value type of the data type of this field (for example, Int for a StructField with the data type IntegerType) | StructField(name, dataType [, nullable]) (4) |
| VARIANT | VariantType | org.apache.spark.unsafe.type.VariantVal | VariantType |
| OBJECT | Not supported | Not supported | Not supported |
In Java, Spark SQL data types are defined in the package org.apache.spark.sql.types. To access or create a data type, use the factory methods provided in org.apache.spark.sql.types.DataTypes.
| SQL type | Data type | Value type | API to access or create data type |
| --- | --- | --- | --- |
| TINYINT | ByteType | byte or Byte | DataTypes.ByteType |
| SMALLINT | ShortType | short or Short | DataTypes.ShortType |
| INT | IntegerType | int or Integer | DataTypes.IntegerType |
| BIGINT | LongType | long or Long | DataTypes.LongType |
| FLOAT | FloatType | float or Float | DataTypes.FloatType |
| DOUBLE | DoubleType | double or Double | DataTypes.DoubleType |
| DECIMAL(p,s) | DecimalType | java.math.BigDecimal | DataTypes.createDecimalType() or DataTypes.createDecimalType(precision, scale) |
| STRING | StringType | String | DataTypes.StringType |
| BINARY | BinaryType | byte[] | DataTypes.BinaryType |
| BOOLEAN | BooleanType | boolean or Boolean | DataTypes.BooleanType |
| TIMESTAMP | TimestampType | java.sql.Timestamp | DataTypes.TimestampType |
| TIMESTAMP_NTZ | TimestampNTZType | java.time.LocalDateTime | DataTypes.TimestampNTZType |
| DATE | DateType | java.sql.Date | DataTypes.DateType |
| year-month interval | YearMonthIntervalType | java.time.Period | YearMonthIntervalType (3) |
| day-time interval | DayTimeIntervalType | java.time.Duration | DayTimeIntervalType (3) |
| ARRAY | ArrayType | java.util.List | DataTypes.createArrayType(elementType [, containsNull]) (2) |
| MAP | MapType | java.util.Map | DataTypes.createMapType(keyType, valueType [, valueContainsNull]) (2) |
| STRUCT | StructType | org.apache.spark.sql.Row | DataTypes.createStructType(fields). fields is a List or array of StructField. (4) |
| | StructField | The value type of the data type of this field (for example, int for a StructField with the data type IntegerType) | DataTypes.createStructField(name, dataType, nullable) (4) |
| VARIANT | VariantType | org.apache.spark.unsafe.type.VariantVal | VariantType |
| OBJECT | Not supported | Not supported | Not supported |
In Python, Spark SQL data types are defined in the package pyspark.sql.types. You access them by importing the package:
from pyspark.sql.types import *
| SQL type | Data type | Value type | API to access or create data type |
| --- | --- | --- | --- |
| TINYINT | ByteType | int or long (1) | ByteType() |
| SMALLINT | ShortType | int or long (1) | ShortType() |
| INT | IntegerType | int or long | IntegerType() |
| BIGINT | LongType | long (1) | LongType() |
| FLOAT | FloatType | float (1) | FloatType() |
| DOUBLE | DoubleType | float | DoubleType() |
| DECIMAL(p,s) | DecimalType | decimal.Decimal | DecimalType() |
| STRING | StringType | str | StringType() |
| BINARY | BinaryType | bytearray | BinaryType() |
| BOOLEAN | BooleanType | bool | BooleanType() |
| TIMESTAMP | TimestampType | datetime.datetime | TimestampType() |
| TIMESTAMP_NTZ | TimestampNTZType | datetime.datetime | TimestampNTZType() |
| DATE | DateType | datetime.date | DateType() |
| year-month interval | YearMonthIntervalType | Not supported | Not supported |
| day-time interval | DayTimeIntervalType | datetime.timedelta | DayTimeIntervalType (3) |
| ARRAY | ArrayType | list, tuple, or array | ArrayType(elementType, [containsNull]) (2) |
| MAP | MapType | dict | MapType(keyType, valueType, [valueContainsNull]) (2) |
| STRUCT | StructType | list or tuple | StructType(fields). fields is a list of StructField. (4) |
| | StructField | The value type of the data type of this field (for example, int for a StructField with the data type IntegerType) | StructField(name, dataType, [nullable]) (4) |
| VARIANT | VariantType | VariantVal | VariantType() |
| OBJECT | Not supported | Not supported | Not supported |
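In Python, both TIMESTAMP and TIMESTAMP_NTZ map to datetime.datetime, so the distinction between them is easy to lose. The following sketch uses only the standard library to illustrate the two semantics described earlier: an aware datetime behaves like an instant whose fields depend on the zone it is viewed in, while a naive datetime is just wall-clock fields. It does not show how PySpark converts values; `fields_in_zone` is a hypothetical helper, and the fixed UTC offsets stand in for a session time zone.

```python
from datetime import datetime, timedelta, timezone

# Naive datetime: plain wall-clock fields with no zone (TIMESTAMP_NTZ-like).
wall_clock = datetime(2024, 7, 29, 12, 0, 0)

# Aware datetime: a specific instant in time (TIMESTAMP-like).
instant = datetime(2024, 7, 29, 12, 0, 0, tzinfo=timezone.utc)


def fields_in_zone(moment, utc_offset_hours):
    """Hypothetical helper: the fields of an aware datetime as seen
    at a fixed UTC offset."""
    local = moment.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    return (local.year, local.month, local.day,
            local.hour, local.minute, local.second)


# The same instant shows different hour fields in different zones.
print(fields_in_zone(instant, 0))
print(fields_in_zone(instant, 9))

# The naive value carries no zone, so its fields never shift.
print(wall_clock.tzinfo is None)
```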
The following table lists the corresponding types for R:
| SQL type | Data type | Value type | API to access or create data type |
| --- | --- | --- | --- |
| TINYINT | ByteType | integer (1) | "byte" |
| SMALLINT | ShortType | integer (1) | "short" |
| INT | IntegerType | integer | "integer" |
| BIGINT | LongType | integer (1) | "long" |
| FLOAT | FloatType | numeric (1) | "float" |
| DOUBLE | DoubleType | numeric | "double" |
| DECIMAL(p,s) | DecimalType | Not supported | Not supported |
| STRING | StringType | character | "string" |
| BINARY | BinaryType | raw | "binary" |
| BOOLEAN | BooleanType | logical | "bool" |
| TIMESTAMP | TimestampType | POSIXct | "timestamp" |
| TIMESTAMP_NTZ | TimestampNTZType | datetime.datetime | TimestampNTZType() |
| DATE | DateType | Date | "date" |
| year-month interval | YearMonthIntervalType | Not supported | Not supported |
| day-time interval | DayTimeIntervalType | Not supported | Not supported |
| ARRAY | ArrayType | vector or list | list(type="array", elementType=elementType, containsNull=[containsNull]) (2) |
| MAP | MapType | environment | list(type="map", keyType=keyType, valueType=valueType, valueContainsNull=[valueContainsNull]) (2) |
| STRUCT | StructType | named list | list(type="struct", fields=fields). fields is a list of StructField. (4) |
| | StructField | The value type of the data type of this field (for example, integer for a StructField with the data type IntegerType) | list(name=name, type=dataType, nullable=[nullable]) (4) |
| VARIANT | Not supported | Not supported | Not supported |
| OBJECT | Not supported | Not supported | Not supported |
(1) Numbers are converted to the domain at runtime. Make sure that numbers are within range.
(2) The optional value defaults to TRUE.
(3) Interval types
YearMonthIntervalType([startField,] endField): Represents a year-month interval which is made up of a contiguous subset of the fields YEAR and MONTH. startField is the leftmost field, and endField is the rightmost field of the type. Valid values of startField and endField are 0 (YEAR) and 1 (MONTH).
DayTimeIntervalType([startField,] endField): Represents a day-time interval which is made up of a contiguous subset of the fields DAY, HOUR, MINUTE, and SECOND. startField is the leftmost field, and endField is the rightmost field of the type. Valid values of startField and endField are 0 (DAY), 1 (HOUR), 2 (MINUTE), and 3 (SECOND).
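To make "contiguous subset" concrete, the sketch below enumerates every valid (startField, endField) pair for a day-time interval using the field numbers given above. It is plain Python and does not call Spark. The helper names are hypothetical, and treating an omitted startField as equal to endField is an assumption about the single-argument form.

```python
# Field numbers for day-time intervals, as listed in the footnote.
DAY, HOUR, MINUTE, SECOND = 0, 1, 2, 3
FIELD_NAMES = ["DAY", "HOUR", "MINUTE", "SECOND"]


def valid_qualifiers():
    """Every (startField, endField) pair where startField is not to the
    right of endField, i.e. every contiguous run of fields."""
    return [(start, end)
            for start in range(len(FIELD_NAMES))
            for end in range(start, len(FIELD_NAMES))]


def describe(end, start=None):
    """Hypothetical helper: render a pair as an interval qualifier.

    Assumes an omitted startField means startField == endField.
    """
    if start is None:
        start = end
    if start > end:
        raise ValueError("startField must not be to the right of endField")
    if start == end:
        return f"INTERVAL {FIELD_NAMES[start]}"
    return f"INTERVAL {FIELD_NAMES[start]} TO {FIELD_NAMES[end]}"


for start, end in valid_qualifiers():
    print(start, end, describe(end, start))
```

Four fields yield ten contiguous runs in total, from a single field such as HOUR up to the full span DAY TO SECOND.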
(4) StructType
StructType(fields): Represents values with the structure described by a sequence, list, or array of StructFields (fields). Two fields with the same name are not allowed.
StructField(name, dataType, nullable): Represents a field in a StructType. The name of a field is indicated by name. The data type of a field is indicated by dataType. nullable indicates whether values of this field can be null. By default, values can be null.
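The rules in footnote (4) can be modeled in a few lines of plain Python. The classes below, `FieldSpec` and `StructSpec`, are hypothetical stand-ins written for illustration only; they are not the pyspark.sql.types classes and make no claim about when Spark itself enforces these rules. The sketch shows the three stated properties: a field has a name, a data type, and a nullable flag that defaults to true, and a struct rejects duplicate field names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """Hypothetical stand-in for StructField(name, dataType, nullable)."""
    name: str
    data_type: str
    nullable: bool = True  # the optional value defaults to TRUE


class StructSpec:
    """Hypothetical stand-in for StructType(fields)."""

    def __init__(self, fields):
        names = [field.name for field in fields]
        duplicates = sorted({n for n in names if names.count(n) > 1})
        if duplicates:
            # Two fields with the same name are not allowed.
            raise ValueError(f"duplicate field names: {duplicates}")
        self.fields = list(fields)

    def validate(self, row):
        """Check a dict row: only non-nullable fields reject None."""
        for field in self.fields:
            if row.get(field.name) is None and not field.nullable:
                raise ValueError(f"field {field.name!r} is not nullable")
        return True


schema = StructSpec([
    FieldSpec("id", "INT", nullable=False),
    FieldSpec("name", "STRING"),  # nullable by default
])
print(schema.validate({"id": 1, "name": None}))
```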