Built-in functions
Applies to: Databricks SQL
Databricks Runtime
This article presents links to and descriptions of built-in operators and functions for strings and binary types, numeric scalars, aggregations, windows, arrays, maps, dates and timestamps, casting, CSV data, JSON data, XPath manipulation, and other miscellaneous functions.
Also see:
Operators and predicates
For information on how operators are parsed with respect to each other, see Operator precedence.
Operator | Syntax | Description |
---|---|---|
& | expr1 & expr2 |
Returns the bitwise AND of expr1 and expr2 . |
and | expr1 and expr2 |
Returns the logical AND of expr1 and expr2 . |
* | multiplier * multiplicand |
Returns multiplier multiplied by multiplicand . |
!= | expr1 != expr2 |
Returns true if expr1 does not equal expr2 , or false otherwise. |
! | !expr |
Returns the logical NOT of a Boolean expression. |
between | expr1 [not] between expr2 and expr2 |
Tests whether expr1 is greater or equal than expr2 and less than or equal to expr3 . |
[ ] | arrayExpr [ indexExpr ] |
Returns indexExpr nd element of ARRAY arrayExpr |
[ ] | mapExpr [ keyExpr ] |
Returns value at keyExpr of MAP mapExpr |
^ | expr1 ^ expr2 |
Returns the bitwise exclusive OR (XOR) of expr1 and expr2 . |
: | jsonStr : jsonPath |
Returns fields extracted from the jsonStr . |
:: | expr :: type |
Casts the value expr to the target data type type . |
div | divisor div dividend |
Returns the integral part of the division of divisor by dividend . |
. | mapExpr . keyIdentifier |
Returns a MAP value by keyIdentifier . |
. | structExpr . fieldIdentifier |
Returns a STRUCT field by fieldIdentifier . |
== | expr1 == expr2 |
Returns true if expr1 equals expr2 , or false otherwise. |
= | expr1 = expr2 |
Returns true if expr1 equals expr2 , or false otherwise. |
>= | expr1 >= expr2 |
Returns true if expr1 is greater than or equal to expr2 , or false otherwise. |
> | expr1 > expr2 |
Returns true if expr1 is greater than expr2 , or false otherwise. |
exists | exists(query) |
Returns true if query returns at least one row, or false otherwise. |
ilike | str [not] ilike (pattern[ESCAPE escape]) |
Returns true if str does (not) match pattern with escape case-insensitively. |
ilike | str [not] ilike {ANY|SOME|ALL}([pattern[, ...]]) |
Returns true if str does (not) match any/all patterns case-insensitively. |
in | elem [not] in (expr1[, ...]) |
Returns true if elem does (not) equal any exprN . |
in | elem [not] in (query) |
Returns true if elem does (not) equal any row in query . |
is distinct | expr1 is [not] distinct from expr2 |
Tests whether the arguments do (not) have different values where NULLs are considered as comparable values. |
is false | expr is [not] false |
Tests whether expr is (not) false . |
is null | expr is [not] null |
Returns true if expr is (not) NULL . |
is true | expr is [not] true |
Tests whether expr is (not) true . |
like | str [not] like (pattern[ESCAPE escape]) |
Returns true if str does (not) match pattern with escape . |
like | str [not] like {ANY|SOME|ALL}([pattern[, ...]]) |
Returns true if str does (not) match any/all patterns. |
<=> | expr1 <=> expr2 |
Returns the same result as the EQUAL(=) for non-null operands, but returns true if both are NULL , false if one of the them is NULL . |
<= | expr1 <= expr2 |
Returns true if expr1 is less than or equal to expr2 , or false otherwise. |
<> | expr1 <> expr2 |
Returns true if expr1 does not equal expr2 , or false otherwise. |
< | expr1 < expr2 |
Returns true if expr1 is less than expr2 , or false otherwise. |
- | expr1 - expr2 |
Returns the subtraction of expr2 from expr1 . |
not | not expr |
Returns the logical NOT of a Boolean expression. |
or | expr1 or expr2 |
Returns the logical OR of expr1 and expr2 . |
% | dividend % divisor |
Returns the remainder after dividend / divisor . |
|| | expr1 || expr2 |
Returns the concatenation of expr1 and expr2 . |
| | expr1 | expr2 |
Returns the bitwise OR of expr1 and expr2 . |
+ | expr1 + expr2 |
Returns the sum of expr1 and expr2 . |
regexp | str [not] regexp regex |
Returns true if str does (not) match regex . |
regexp_like | str [not] regexp_like regex |
Returns true if str does (not) match regex . |
rlike | str [not] rlike regex |
Returns true if str does (not) match regex . |
/ | dividend / divisor |
Returns dividend divided by divisor . |
~ | ~ expr |
Returns the bitwise NOT of expr . |
Operator precedence
Precedence | Operator |
---|---|
1 | : , :: , [ ] |
2 | - (unary), + (unary), ~ |
3 | * , / , % , div |
4 | + , - , || |
5 | & |
6 | ^ |
7 | | |
8 | = , == , <=> , <> , != , < , <= , > , >= |
9 | not , exists |
10 | between , in , rlike , regexp , ilike , like , is [not] [NULL, true, false] , is [not] distinct from |
11 | and |
12 | or |
String and binary functions
Function | Description |
---|---|
expr1 || expr2 | Returns the concatenation of expr1 and expr2 . |
aes_decrypt(expr, key[, mode[, padding[, aad]]]) | Decrypts a binary expr using AES encryption. |
aes_encrypt(expr, key[, mode[, padding[, iv[, aad]]]]) | Encrypts a binary expr using AES encryption. |
ascii(str) | Returns the ASCII code point of the first character of str . |
base64(expr) | Converts expr to a base 64 string. |
bin(expr) | Returns the binary representation of expr . |
binary(expr) | Casts the value of expr to BINARY. |
bit_length(expr) | Returns the bit length of string data or number of bits of binary data. |
bitmap_count(expr) | Returns the number of bits set in a BINARY string representing a bitmap. |
btrim(str [, trimStr]) | Returns str with leading and trailing characters removed. |
char(expr) | Returns the character at the supplied UTF-16 code point. |
char_length(expr) | Returns the character length of string data or number of bytes of binary data. |
character_length(expr) | Returns the character length of string data or number of bytes of binary data. |
charindex(substr, str[, pos]) | Returns the position of the first occurrence of substr in str after position pos . |
chr(expr) | Returns the character at the supplied UTF-16 code point. |
concat(expr1, expr2[, …]) | Returns the concatenation of the arguments. |
concat_ws(sep[, expr1[, …]]) | Returns the concatenation strings separated by sep . |
contains(expr, subExpr) | Returns true if expr STRING or BINARY contains subExpr . |
crc32(expr) | Returns a cyclic redundancy check value of expr . |
decode(expr, charSet) | Translates binary expr to a string using the character set encoding charSet . |
encode(expr, charSet) | Returns the binary representation of a string using the charSet character encoding. |
endswith(expr, endExpr) | Returns true if expr STRING or BINARY ends with endExpr . |
find_in_set(searchExpr, sourceExpr) | Returns the position of a string within a comma-separated list of strings. |
format_number(expr, scale) | Formats expr like #,###,###.## , rounded to scale decimal places. |
format_number(expr, fmt) | Formats expr like fmt . |
format_string(strfmt[, obj1 [, …]]) | Returns a formatted string from printf-style format strings. |
hex(expr) | Converts expr to hexadecimal. |
str ilike (pattern[ESCAPE escape]) | Returns true if str matches pattern with escape case insensitively. |
initcap(expr) | Returns expr with the first letter of each word in uppercase. |
instr(str, substr) | Returns the (1-based) index of the first occurrence of substr in str . |
lcase(expr) | Returns expr with all characters changed to lowercase. |
left(str, len) | Returns the leftmost len characters from str . |
len(expr) | Returns the character length of string data or number of bytes of binary data. |
length(expr) | Returns the character length of string data or number of bytes of binary data. |
levenshtein(str1, str2) | Returns the Levenshtein distance between the strings str1 and str2 . |
str like (pattern[ESCAPE escape]) | Returns true if str matches pattern with escape . |
locate(substr, str[, pos]) | Returns the position of the first occurrence of substr in str after position pos . |
lower(expr) | Returns expr with all characters changed to lowercase. |
lpad(expr, len[, pad]) | Returns expr , left-padded with pad to a length of len . |
ltrim([trimstr,] str) | Returns str with leading characters within trimStr removed. |
mask(str[, upperChar[, lowerChar[, digitChar[, otherChar]]]]) | Returns a masked version of the input str . |
md5(expr) | Returns an MD5 128-bit checksum of expr as a hex string. |
octet_length(expr) | Returns the byte length of string data or number of bytes of binary data. |
overlay(input PLACING replace FROM pos [FOR len]) | Replaces input with replace that starts at pos and is of length len . |
parse_url(url, partToExtract[, key]) | Extracts a part from url . |
position(substr, str[, pos]) | Returns the position of the first occurrence of substr in str after position pos . |
position(subtr IN str) | Returns the position of the first occurrence of substr in str after position pos . |
printf(strfmt[, obj1 [, …]]) | Returns a formatted string from printf-style format strings. |
str regexp regex | Returns true if str matches regex . |
str regexp_like regex | Returns true if str matches regex . |
regexp_count(str, regexp) | Returns the number of times str matches the regexp pattern. |
regexp_extract(str, regexp[, idx]) | Extracts the first string in str that matches the regexp expression and corresponds to the regex group index. |
regexp_extract_all(str, regexp[, idx]) | Extracts the all strings in str that matches the regexp expression and corresponds to the regex group index. |
regexp_instr(str, regexp) | Returns the position of the first substring in str that matches regexp . |
regexp_replace(str, regexp, rep[, position]) | Replaces all substrings of str that match regexp with rep . |
regexp_substr(str, regexp) | Returns the first substring in str that matches regexp . |
repeat(expr, n) | Returns the string that repeats expr n times. |
replace(str, search [, replace]) | Replaces all occurrences of search with replace . |
reverse(expr) | Returns a reversed string or an array with reverse order of elements. |
right(str, len) | Returns the rightmost len characters from the string str . |
str rlike regex | Returns true if str matches regex . |
rpad(expr, len[, pad]) | Returns expr , right-padded with pad to a length of len . |
rtrim([trimStr,] str) | Returns str with trailing characters removed. |
sentences(str[, lang, country]) | Splits str into an array of array of words. |
sha(expr) | Returns a sha1 hash value as a hex string of expr . |
sha1(expr) | Returns a sha1 hash value as a hex string of expr . |
sha2(expr, bitLength) | Returns a checksum of the SHA-2 family as a hex string of expr . |
soundex(expr) | Returns the soundex code of the string. |
space(n) | Returns a string consisting of n spaces. |
split(str, regex[, limit]) | Splits str around occurrences that match regex and returns an array with a length of at most limit . |
split_part(str, delim, partNum) | Splits str around occurrences of delim and returns the partNum part. |
startswith(expr, startExpr) | Returns true if expr STRING or BINARY starts with startExpr . |
string(expr) | Casts the value expr to STRING. |
substr(expr, pos[, len]) | Returns the substring of expr that starts at pos and is of length len . |
substr(expr FROM pos[ FOR len]) | Returns the substring of expr that starts at pos and is of length len . |
substring(expr, pos[, len]) | Returns the substring of expr that starts at pos and is of length len . |
substring(expr FROM pos[ FOR len]) | Returns the substring of expr that starts at pos and is of length len . |
substring_index(expr, delim, count) | Returns the substring of expr before count occurrences of the delimiter delim . |
to_binary(expr[, fmt]) | Returns expr cast to a Binary based on fmt . |
to_char(numExpr, fmt) | Returns numExpr cast to STRING using formatting fmt .” |
to_varchar(numExpr, fmt) | Returns numExpr cast to STRING using formatting fmt .” |
translate(expr, from, to) | Returns an expr where all characters in from have been replaced with those in to . |
trim([[BOTH | LEADING | TRAILING] [trimStr] FROM] str) | Trim characters from a string. |
try_aes_decrypt(expr, key[, mode[, padding[, aad]]]) | Decrypts a binary expr using AES encryption, and return NULL in case of error. |
try_to_binary(expr [, fmt]) | Returns expr cast to BINARY based on fmt , or NULL if the input is invalid. |
ucase(expr) | Returns expr with all characters changed to uppercase. |
unbase64(expr) | Returns a decoded base64 string as binary. |
unhex(expr) | Converts hexadecimal expr to BINARY. |
upper(expr) | Returns expr with all characters changed to uppercase. |
url_decode(str) | Translates a string back from application/x-www-form-urlencoded format. |
url_encode(str) | Translates a string into application/x-www-form-urlencoded format. |
Numeric scalar functions
Function | Description |
---|---|
~ expr | Returns the bitwise NOT of expr . |
dividend / divisor | Returns dividend divided by divisor . |
expr1 | expr2 | Returns the bitwise OR of expr1 and expr2 . |
- expr | Returns the negated value of expr . |
expr1 - expr2 | Returns the subtraction of expr2 from expr1 . |
+ expr | Returns the value of expr . |
expr1 + expr2 | Returns the sum of expr1 and expr2 . |
dividend % divisor | Returns the remainder after dividend / divisor . |
expr1 ^ expr2 | Returns the bitwise exclusive OR (XOR) of expr1 and expr2 . |
expr1 & expr2 | Returns the bitwise AND of expr1 and expr2 . |
multiplier * multiplicand | Returns multiplier multiplied by multiplicand . |
abs(expr) | Returns the absolute value of the numeric value in expr . |
acos(expr) | Returns the inverse cosine (arccosine) of expr . |
acosh(expr) | Returns the inverse hyperbolic cosine of expr . |
asin(expr) | Returns the inverse sine (arcsine) of expr . |
asinh(expr) | Returns the inverse hyperbolic sine of expr . |
atan(expr) | Returns the inverse tangent (arctangent) of expr . |
atan2(exprY, exprX) | Returns the angle in radians between the positive x-axis of a plane and the point specified by the coordinates (exprX , exprY ). |
atanh(expr) | Returns inverse hyperbolic tangent of expr . |
bigint(expr) | Casts the value expr to BIGINT. |
bit_count(expr) | Returns the number of bits set in the argument. |
bit_get(expr, pos) | Returns the value of a bit in a binary representation of an integral numeric. |
bit_reverse(expr) | Returns the value obtained by reversing the order of the bits in the argument. |
bitmap_bit_position(expr) | Returns the 0-based bit position of a given BIGINT number within a bucket. |
bitmap_bucket_number(expr) | Returns the bitmap bucket number for a given BIGINT number. |
bround(expr[,targetScale]) | Returns the rounded expr using HALF_EVEN rounding mode. |
cbrt(expr) | Returns the cube root of expr . |
ceil(expr[,targetScale]) | Returns the smallest number not smaller than expr rounded up to targetScale digits relative to the decimal point. |
ceiling(expr[,targetScale]) | Returns the smallest number not smaller than expr rounded up to targetScale digits relative to the decimal point. |
conv(num, fromBase, toBase) | Converts num from fromBase to toBase . |
convert_timezone([sourceTz, ]targetTz, sourceTs) | Converts the TIMESTAMP_NTZ sourceTs from the sourceTz time zone to targetTz . |
cos(expr) | Returns the cosine of expr . |
cosh(expr) | Returns the hyperbolic cosine of expr . |
cot(expr) | Returns the cotangent of expr . |
csc(expr) | Returns the cosecant of expr . |
decimal(expr) | Casts the value expr to DECIMAL. |
degrees(expr) | Converts radians to degrees. |
divisor div dividend | Returns the integral part of the division of divisor by dividend . |
double(expr) | Casts the value expr to DOUBLE. |
e() | Returns the constant e . |
exp(expr) | Returns e to the power of expr . |
expm1(expr) | Returns exp(expr) - 1 . |
factorial(expr) | Returns the factorial of expr . |
float(expr) | Casts the value expr to FLOAT. |
floor(expr[,targetScale]) | Returns the largest number not smaller than expr rounded down to targetScale digits relative to the decimal point. |
getbit(expr, pos) | Returns the value of a bit in a binary representation of an integral numeric. |
hypot(expr1, expr2) | Returns sqrt(expr1 * expr1 + expr2 * expr2) . |
int(expr) | Casts the value expr to INTEGER. |
isnan(expr) | Returns true if expr is NaN . |
ln(expr) | Returns the natural logarithm (base e ) of expr . |
log([base,] expr) | Returns the logarithm of expr with base . |
log1p(expr) | Returns log(1 + expr) . |
log2(expr) | Returns the logarithm of expr with base 2 . |
log10(expr) | Returns the logarithm of expr with base 10 . |
mod(dividend, divisor) | Returns the remainder after dividend / divisor . |
nanvl(expr1, expr2) | Returns expr1 if it’s not NaN , or expr2 otherwise. |
negative(expr) | Returns the negated value of expr . |
pi() | Returns pi. |
pmod(dividend, divisor) | Returns the positive remainder after dividend / divisor . |
positive(expr) | Returns the value of expr . |
pow(expr1, expr2) | Raises expr1 to the power of expr2 . |
power(expr1, expr2) | Raises expr1 to the power of expr2 . |
radians(expr) | Converts expr in degrees to radians. |
rand([seed]) | Returns a random value between 0 and 1. |
randn([seed]) | Returns a random value from a standard normal distribution. |
random([seed]) | Returns a random value between 0 and 1. |
rint(expr) | Returns expr rounded to a whole number as a DOUBLE. |
round(expr[,targetScale]) | Returns the rounded expr using HALF_UP rounding mode. |
sec(expr) | Returns the secant of expr . |
sin(expr) | Returns the sine of expr . |
shiftleft(expr, n) | Returns a bitwise left shifted by n bits. |
shiftright(expr, n) | Returns a bitwise signed signed integral number right shifted by n bits. |
shiftrightunsigned(expr, n) | Returns a bitwise unsigned signed integral number right shifted by n bits. |
sign(expr) | Returns -1.0, 0.0, or 1.0 as expr is negative, 0, or positive. |
signum(expr) | Returns -1.0, 0.0, or 1.0 as expr is negative, 0, or positive. |
sinh(expr) | Returns the hyperbolic sine of expr . |
smallint(expr) | Casts the value expr to SMALLINT. |
sqrt(expr) | Returns the square root of expr . |
tan(expr) | Returns the tangent of expr . |
tanh(expr) | Returns the hyperbolic tangent of expr . |
tinyint(expr) | Casts expr to TINYINT. |
to_number(expr, fmt ) | Returns expr cast to DECIMAL using formatting fmt . |
try_add(expr1, expr2) | Returns the sum of expr1 and expr2 , or NULL in case of error. |
try_divide(dividend, divisor) | Returns dividend divided by divisor , or NULL if divisor is 0. |
try_multiply(multiplier, multiplicand) | Returns multiplier multiplied by multiplicand , or NULL on overflow. |
try_subtract(expr1, expr2) | Returns the subtraction of expr2 from expr1 , or NULL on overflow. |
try_to_number(expr, fmt ) | Returns expr cast to DECIMAL using formatting fmt , or NULL if expr does not match the format. |
width_bucket(expr, minExpr, maxExpr, numBuckets) | Returns the bucket number for a value in an equi-width histogram. |
Aggregate functions
Function | Description |
---|---|
any(expr) | Returns true if at least one value of expr in the group is true. |
any_value(expr[,ignoreNull]) | Returns any random value of expr for a group of rows. |
approx_count_distinct(expr[,relativeSD]) | Returns the estimated number of distinct values in expr within the group. |
approx_percentile(expr,percentage[,accuracy]) | Returns the approximate percentile of the expr within the group. |
approx_top_k(expr[,k[,maxItemsTracked]]) | Returns the top k most frequently occurring item values in an expr along with their approximate counts. |
array_agg(expr) | Returns an array consisting of all values in expr within the group. |
avg(expr) | Returns the mean calculated from values of a group. |
bit_and(expr) | Returns the bitwise AND of all input values in the group. |
bit_or(expr) | Returns the bitwise OR of all input values in the group. |
bit_xor(expr) | Returns the bitwise XOR of all input values in the group. |
bitmap_construct_agg(expr) | Returns the bitwise OR of all bit position values in the group. between 0 and 32767 in a group as a BINARY . |
bitmap_or_agg(expr) | Returns the bitwise OR of all BINARY input values in the group. |
bool_and(expr) | Returns true if all values in expr are true within the group. |
bool_or(expr) | Returns true if at least one value in expr is true within the group. |
collect_list(expr) | Returns an array consisting of all values in expr within the group. |
collect_set(expr) | Returns an array consisting of all unique values in expr within the group. |
corr(expr1,expr2) | Returns Pearson coefficient of correlation between a group of number pairs. |
count(*) | Returns the total number of retrieved rows in a group, including rows containing null. |
count(expr[, …]) | Returns the number of rows in a group for which the supplied expressions are all non-null. |
count_if(expr) | Returns the number of true values for the group in expr . |
count_min_sketch(column,epsilon,confidence,seed) | Returns a count-min sketch of all values in the group in column with the epsilon , confidence and seed . |
covar_pop(expr1,expr2) | Returns the population covariance of number pairs in a group. |
covar_samp(expr1,expr2) | Returns the sample covariance of number pairs in a group. |
every(expr) | Returns true if all values of expr in the group are true. |
first(expr[,ignoreNull]) | Returns the first value of expr for a group of rows. |
first_value(expr[,ignoreNull]) | Returns the first value of expr for a group of rows. |
hll_sketch_agg(expr[,lgConfigK]) | Returns a HyperLogLog sketch used to approximate a distinct values count. |
hll_union_agg(expr[,allowDifferentLgConfigK]) | Aggregates HyperLogLog sketches for a group of rows. |
kurtosis(expr) | Returns the kurtosis value calculated from values of a group. |
last(expr[,ignoreNull]) | Returns the last value of expr for the group of rows. |
last_value(expr[,ignoreNull]) | Returns the last value of expr for the group of rows. |
max(expr) | Returns the maximum value of expr in a group. |
max_by(expr1,expr2) | Returns the value of an expr1 associated with the maximum value of expr2 in a group. |
mean(expr) | Returns the mean calculated from values of a group. |
median(expr) | Returns the median calculated from values of a group. |
min(expr) | Returns the minimum value of expr in a group. |
min_by(expr1, expr2) | Returns the value of an expr1 associated with the minimum value of expr2 in a group. |
mode(expr [,deterministic]) | Returns the most frequent, not NULL , value of expr in a group. |
percentile(expr, percentage [,frequency]) | Returns the exact percentile value of expr at the specified percentage . |
percentile_approx(expr,percentage[,accuracy]) | Returns the approximate percentile of the expr within the group. |
percentile_cont(pct) WITHIN GROUP (ORDER BY key) | Returns the interpolated percentile of the key within the group. |
percentile_disc(pct) WITHIN GROUP (ORDER BY key) | Returns the discrete percentile of the key within the group. |
regr_avgx(yExpr, xExpr) | Returns the mean of xExpr calculated from values of a group where xExpr and yExpr are NOT NULL. |
regr_avgy(yExpr, xExpr) | Returns the mean of yExpr calculated from values of a group where xExpr and yExpr are NOT NULL. |
regr_count(yExpr, xExpr) | Returns the number of non-null value pairs yExpr , xExpr in the group. |
regr_intercept(yExpr, xExpr) | Returns the intercept of the uni-variate linear regression line in a group where xExpr and yExpr are NOT NULL. |
regr_r2(yExpr, xExpr) | Returns the coefficient of determination from values of a group where xExpr and yExpr are NOT NULL. |
regr_slope(yExpr, xExpr) | Returns the slope of the linear regression line of non-null value pairs yExpr , xExpr in the group. |
regr_sxx(yExpr, xExpr) | Returns the sum of squares of the xExpr values of a group where xExpr and yExpr are NOT NULL. |
regr_sxy(yExpr, xExpr) | Returns the sum of products of yExpr and xExpr calculated from values of a group where xExpr and yExpr are NOT NULL. |
regr_syy(yExpr, xExpr) | Returns the sum of squares of the yExpr values of a group where xExpr and yExpr are NOT NULL. |
skewness(expr) | Returns the skewness value calculated from values of a group. |
some(expr) | Returns true if at least one value of expr in a group is true . |
std(expr) | Returns the sample standard deviation calculated from the values within the group. |
stddev(expr) | Returns the sample standard deviation calculated from the values within the group. |
stddev_pop(expr) | Returns the population standard deviation calculated from values of a group. |
stddev_samp(expr) | Returns the sample standard deviation calculated from values of a group. |
sum(expr) | Returns the sum calculated from values of a group. |
try_avg(expr) | Returns the mean calculated from values of a group, NULL if there is an overflow. |
try_sum(expr) | Returns the sum calculated from values of a group, NULL if there is an overflow. |
var_pop(expr) | Returns the population variance calculated from values of a group. |
var_samp(expr) | Returns the sample variance calculated from values of a group. |
variance(expr) | Returns the sample variance calculated from values of a group. |
Ranking window functions
Function | Description |
---|---|
dense_rank() | Returns the rank of a value compared to all values in the partition. |
ntile(n) | Divides the rows for each window partition into n buckets ranging from 1 to at most n . |
percent_rank() | Computes the percentage ranking of a value within the partition. |
rank() | Returns the rank of a value compared to all values in the partition. |
row_number() | Assigns a unique, sequential number to each row, starting with one, according to the ordering of rows within the window partition. |
Analytic window functions
Function | Description |
---|---|
cume_dist() | Returns the position of a value relative to all values in the partition. |
lag(expr[,offset[,default]]) | Returns the value of expr from a preceding row within the partition. |
lead(expr[,offset[,default]]) | Returns the value of expr from a subsequent row within the partition. |
nth_value(expr, offset[, ignoreNulls]) | Returns the value of expr at a specific offset in the window. |
Array functions
Function | Description |
---|---|
arrayExpr[indexExpr] | Returns element at position indexExpr of ARRAY arrayExpr . |
aggregate(expr,start,merge[,finish]) | Aggregates elements in an array using a custom aggregator. |
array([expr [, …]]) | Returns an array with the elements in expr . |
array_append(array, elem) | Returns array appended by elem . |
array_compact(array) | Removes NULL values from array . |
array_contains(array,value) | Returns true if array contains value . |
array_distinct(array) | Removes duplicate values from array . |
array_except(array1,array2) | Returns an array of the elements in array1 but not in array2 . |
array_insert(array, index, elem) | Returns an expanded array where elem is inserted at the index position. |
array_intersect(array1,array2) | Returns an array of the elements in the intersection of array1 and array2 . |
array_join(array,delimiter[,nullReplacement]) | Concatenates the elements of array . |
array_max(array) | Returns the maximum value in array . |
array_min(array) | Returns the minimum value in array . |
array_position(array,element) | Returns the position of the first occurrence of element in array . |
array_prepend(array, elem) | Returns array prepended by elem . |
array_remove(array,element) | Removes all occurrences of element from array . |
array_repeat(element,count) | Returns an array containing element count times. |
array_size(array) | Returns the number of elements in array . |
array_sort(array,func) | Returns array sorted according to func . |
array_union(array1,array2) | Returns an array of the elements in the union of array1 and array2 without duplicates. |
arrays_overlap(array1, array2) | Returns true if the intersection of array1 and array2 is not empty. |
arrays_zip(array1 [, …]) | Returns a merged array of structs in which the nth struct contains all Nth values of input arrays. |
cardinality(expr) | Returns the size of expr . |
concat(expr1, expr2 [, …]) | Returns the concatenation of the arguments. |
element_at(arrayExpr, index) | Returns the element of an arrayExpr at index . |
exists(expr, pred) | Returns true if pred is true for any element in expr . |
explode(collection) | Returns rows by un-nesting collection . |
explode_outer(collection) | Returns rows by un-nesting collection using outer semantics. |
filter(expr,func) | Filters the array in expr using the function func . |
flatten(arrayOfArrays) | Transforms an array of arrays into a single array. |
forall(expr, predFunc) | Tests whether predFunc holds for all elements in the array. |
get(arrayExpr, index) | Returns the element of an arrayExpr at index , starting at 0 . |
inline(expr) | Explodes an array of structs into a table. |
inline_outer(expr) | Explodes an array of structs into a table with outer semantics. |
posexplode(expr) | Returns rows by un-nesting the array with numbering of positions. |
posexplode_outer(expr) | Returns rows by un-nesting the array with numbering of positions using OUTER semantics. |
reduce(expr,start,merge[,finish]) | Aggregates elements in an array using a custom aggregator. |
reverse(array) | Returns a reversed string or an array with reverse order of elements. |
sequence(start,stop,step) | Generates an array of elements from start to stop (inclusive), incrementing by step . |
shuffle(array) | Returns a random permutation of the array in expr . |
size(expr) | Returns the cardinality of expr . |
slice(expr,start,length) | Returns a subset of an array. |
sort_array(expr[,ascendingOrder]) | Returns the array in expr in sorted order. |
transform(expr, func) | Transforms elements in an array in expr using the function func . |
try_element_at(arrayExpr, index) | Returns the element of an arrayExpr at index , or NULL if index is out of bound. |
zip_with(expr1, expr2, func) | Merges the arrays in expr1 and expr2 , element-wise, into a single array using func . |
Map functions
Function | Description |
---|---|
mapExpr[keyExpr] | Returns value at keyExpr of MAP mapExpr . |
cardinality(expr) | Returns the size of expr . |
element_at(mapExpr, key) | Returns the value of mapExpr for key . |
explode(expr) | Returns rows by un-nesting expr . |
explode_outer(expr) | Returns rows by un-nesting expr using outer semantics. |
map([{key1, value1}[, …]]) | Creates a map with the specified key-value pairs. |
map_concat([expr1 [, …]]) | Returns the union of all expr map expressions. |
map_contains_key(map, key) | Returns true if map contains key , false otherwise. |
map_entries(map) | Returns an unordered array of all entries in map . |
map_filter(expr, func) | Filters entries in the map in expr using the function func . |
map_from_arrays(keys, values) | Creates a map with a pair of the keys and values arrays. |
map_from_entries(expr) | Creates a map created from the specified array of entries. |
map_keys(map) | Returns an unordered array containing the keys of map . |
map_values(map) | Returns an unordered array containing the values of map . |
map_zip_with(map1, map2, func) | Merges map1 and map2 into a single map. |
size(expr) | Returns the cardinality of expr . |
str_to_map(expr[,pairDelim[,keyValueDelim]]) | Returns a map after splitting expr into key-value pairs using delimiters. |
transform_keys(expr, func) | Transforms keys in a map in expr using the function func . |
transform_values(expr, func) | Transforms values in a map in expr using the function func . |
try_element_at(mapExpr, key) | Returns the value of mapExpr for key , or NULL if key does not exist. |
Date, timestamp, and interval functions
For information on date and timestamp formats, see Datetime patterns.
Function | Description |
---|---|
intervalExpr / divisor | Returns interval divided by divisor . |
- intervalExpr | Returns the negated value of intervalExpr . |
intervalExpr1 - intervalExpr2 | Returns the subtraction of intervalExpr2 from intervalExpr1 . |
datetimeExpr1 - datetimeExpr2 | Returns the subtraction of datetimeExpr2 from datetimeExpr1 . |
+ intervalExpr | Returns the value of intervalExpr . |
intervalExpr1 + intervalExpr2 | Returns the sum of intervalExpr1 and intervalExpr2 . |
intervalExpr * multiplicand | Returns intervalExpr multiplied by multiplicand . |
abs(expr) | Returns the absolute value of the interval value in expr . |
add_months(startDate,numMonths) | Returns the date that is numMonths after startDate . |
curdate() | Returns the current date at the start of query evaluation. |
current_date() | Returns the current date at the start of query evaluation. |
current_timestamp() | Returns the current timestamp at the start of query evaluation. |
current_timezone() | Returns the current session local timezone. |
date(expr) | Casts the value expr to DATE. |
date_add(startDate,numDays) | Returns the date numDays after startDate . |
date_add(unit, value, expr) | Adds value unit s to a timestamp expr . |
date_diff(unit, start, stop) | Returns the difference between two timestamps measured in unit s. |
date_format(expr,fmt) | Converts a timestamp to a string in the format fmt . |
date_from_unix_date(days) | Creates a date from the number of days since 1970-01-01 . |
date_part(field,expr) | Extracts a part of the date, timestamp, or interval. |
date_sub(startDate,numDays) | Returns the date numDays before startDate . |
date_trunc(unit,expr) | Returns timestamp truncated to the unit specified in unit . |
dateadd(startDate,numDays) | Returns the date numDays after startDate . |
dateadd(unit, value, expr) | Adds value unit s to a timestamp expr . |
datediff(endDate,startDate) | Returns the number of days from startDate to endDate . |
datediff(unit, start, stop) | Returns the difference between two timestamps measured in unit s. |
day(expr) | Returns the day of month of the date or timestamp. |
dayofmonth(expr) | Returns the day of month of the date or timestamp. |
dayofweek(expr) | Returns the day of week of the date or timestamp. |
dayofyear(expr) | Returns the day of year of the date or timestamp. |
divisor div dividend | Returns the integral part of the division of interval divisor by interval dividend . |
extract(field FROM source) | Returns field of source . |
from_unixtime(unixTime,fmt) | Returns unixTime in fmt . |
from_utc_timestamp(expr,timezone) | Returns a timestamp in expr specified in UTC in the timezone timeZone . |
hour(expr) | Returns the hour component of a timestamp. |
last_day(expr) | Returns the last day of the month that the date belongs to. |
make_date(year,month,day) | Creates a date from year , month , and day fields. |
make_dt_interval([days[, hours[, mins[, secs]]]]) | Creates an day-time interval from days , hours , mins and secs . |
make_interval(years, months, weeks, days, hours, mins, secs) | Deprecated: Creates an interval from years , months , weeks , days , hours , mins and secs . |
make_timestamp(year,month,day,hour,min,sec[,timezone]) | Creates a timestamp from year , month , day , hour , min , sec , and timezone fields. |
make_ym_interval([years[, months]]) | Creates a year-month interval from years , and months . |
minute(expr) | Returns the minute component of the timestamp in expr . |
month(expr) | Returns the month component of the timestamp in expr . |
months_between(expr1,expr2[,roundOff]) | Returns the number of months elapsed between dates or timestamps in expr1 and expr2 . |
next_day(expr,dayOfWeek) | Returns the first date which is later than expr and named as in dayOfWeek . |
now() | Returns the current timestamp at the start of query evaluation. |
quarter(expr) | Returns the quarter of the year for expr in the range 1 to 4. |
second(expr) | Returns the second component of the timestamp in expr . |
session_window(expr, gpDuration) | Creates a session-window over a timestamp expression. |
sign(expr) | Returns -1.0, 0.0, or 1.0 as interval expr is negative, 0, or positive. |
signum(expr) | Returns -1.0, 0.0, or 1.0 as interval expr is negative, 0, or positive. |
timediff(unit, start, stop) | Returns the difference between two timestamps measured in unit s. |
timestamp(expr) | Casts expr to TIMESTAMP. |
timestamp_micros(expr) | Creates a timestamp expr microseconds since UTC epoch. |
timestamp_millis(expr) | Creates a timestamp expr milliseconds since UTC epoch. |
timestamp_seconds(expr) | Creates timestamp expr seconds since UTC epoch. |
timestampadd(unit, value, expr) | Adds value unit s to a timestamp expr . |
timestampdiff(unit, start, stop) | Returns the difference between two timestamps measured in unit s. |
to_date(expr[,fmt]) | Returns expr cast to a date using an optional formatting. |
to_timestamp(expr[,fmt]) | Returns expr cast to a timestamp using an optional formatting. |
to_unix_timestamp(expr[,fmt]) | Returns the timestamp in expr as a UNIX timestamp. |
to_utc_timestamp(expr,timezone) | Returns the timestamp in expr in a different timezone as UTC. |
trunc(expr, fmt) | Returns a date with the a portion of the date truncated to the unit specified by the format model fmt . |
try_add(expr1, expr2) | Returns the sum of expr1 and expr2 , or NULL in case of error. |
try_divide(dividend, divisor) | Returns dividend divided by divisor , or NULL if divisor is 0. |
try_multiply(multiplier, multiplicand) | Returns multiplier multiplied by multiplicand , or NULL on overflow. |
try_subtract(expr1, expr2) | Returns the subtraction of expr2 from expr1 , or NULL on overflow. |
try_to_timestamp(expr[,fmt]) | Returns expr cast to a timestamp using an optional formatting, or NULL if the cast fails. |
unix_date(expr) | Returns the number of days since 1970-01-01 . |
unix_micros(expr) | Returns the number of microseconds since 1970-01-01 00:00:00 UTC . |
unix_millis(expr) | Returns the number of milliseconds since 1970-01-01 00:00:00 UTC . |
unix_seconds(expr) | Returns the number of seconds since 1970-01-01 00:00:00 UTC . |
unix_timestamp([expr[, fmt]]) | eturns the UNIX timestamp of current or specified time. |
weekday(expr) | Returns the day of the week of expr . |
weekofyear(expr) | Returns the week of the year of expr . |
year(expr) | Returns the year component of expr . |
window(expr, width[, step[, start]]) | Creates a hopping based sliding-window over a timestamp expression. |
window_time(window) | Returns the inclusive end time of a sliding-window produced by the window or session_window functions. |
H3 geospatial functions
For information about H3 geospatial functions, see H3 geospatial functions.
Cast functions and constructors
For information on casting between types, see cast function and try_cast function.
Function | Description |
---|---|
array([expr [, …]]) | Returns an array with the elements in expr . |
bigint(expr) | Casts the value expr to BIGINT . |
binary(expr) | Casts the value of expr to BINARY . |
boolean(expr) | Casts expr to BOOLEAN . |
cast(expr AS type) | Casts the value expr to the target data type type . |
expr :: type | Casts the value expr to the target data type type . |
date(expr) | Casts the value expr to DATE . |
decimal(expr) | Casts the value expr to DECIMAL . |
double(expr) | Casts the value expr to DOUBLE . |
float(expr) | Casts the value expr to FLOAT . |
int(expr) | Casts the value expr to INTEGER . |
make_date(year,month,day) | Creates a date from year , month , and day fields. |
make_dt_interval([days[, hours[, mins[, secs]]]]) | Creates an day-time interval from days , hours , mins and secs . |
make_interval(years, months, weeks, days, hours, mins, secs) | Creates an interval from years , months , weeks , days , hours , mins and secs . |
make_timestamp(year,month,day,hour,min,sec[,timezone]) | Creates a timestamp from year , month , day , hour , min , sec , and timezone fields. |
make_ym_interval([years[, months]]) | Creates a year-month interval from years , and months . |
map([{key1, value1} [, …]]) | Creates a map with the specified key-value pairs. |
named_struct({name1, val1} [, …]) | Creates a struct with the specified field names and values. |
smallint(expr) | Casts the value expr to SMALLINT . |
string(expr) | Casts the value expr to STRING . |
struct(expr1 [, …]) | Creates a STRUCT with the specified field values. |
tinyint(expr) | Casts expr to TINYINT . |
timestamp(expr) | Casts expr to TIMESTAMP . |
to_char(expr, fmt) | Returns expr cast to STRING using formatting fmt .” |
to_date(expr[,fmt]) | Returns expr cast to a date using an optional formatting. |
to_number(expr, fmt) | Returns expr cast to DECIMA L using formatting fmt . |
to_timestamp(expr[,fmt]) | Returns expr cast to a timestamp using an optional formatting. |
to_varchar(expr, fmt) | Returns expr cast to STRING using formatting fmt .” |
try_cast(expr AS type) | Casts the value expr to the target data type type safely. |
try_to_number(expr, fmt) | Returns expr cast to DECIMAL using formatting fmt , or NULL if expr is not a valid. |
CSV functions
Function | Description |
---|---|
from_csv(csvStr, schema[, options]) | Returns a struct value with the csvStr and schema . |
schema_of_csv(csv[, options]) | Returns the schema of a CSV string in DDL format. |
to_csv(expr[, options]) | Returns a CSV string with the specified struct value. |
JSON functions
Function | Description |
---|---|
jsonStr : jsonPath | Returns fields extracted from the jsonStr . |
from_json(jsonStr, schema[, options]) | Returns a struct value with the jsonStr and schema . |
get_json_object(expr, path) | Extracts a JSON object from path . |
json_array_length(jsonArray) | Returns the number of elements in the outermost JSON array. |
json_object_keys(jsonObject) | Returns all the keys of the outermost JSON object as an array. |
json_tuple(jsonStr, path1 [, …]) | Returns multiple JSON objects as a tuple. |
schema_of_json(jsonStr[, options]) | Returns the schema of a JSON string in DDL format. |
to_json(expr[, options]) | Returns a JSON string with the struct specified in expr . |
XPath and XML functions
Function | Description |
---|---|
from_xml(xmlStr, schema[, options]) | Returns a struct value parsed from the xmlStr using schema . |
schema_of_xml(xmlStr[, options]) | Returns the schema of a XML string in DDL format. |
xpath(xml, xpath) | Returns values within the nodes of xml that match xpath . |
xpath_boolean(xml, xpath) | Returns true if the xpath expression evaluates to true , or if a matching node in xml is found. |
xpath_double(xml, xpath) | Returns a DOUBLE value from an XML document. |
xpath_float(xml, xpath) | Returns a FLOAT value from an XML document. |
xpath_int(xml, xpath) | Returns a INTEGER value from an XML document. |
xpath_long(xml, xpath) | Returns a BIGINT value from an XML document. |
xpath_number(xml, xpath) | Returns a DOUBLE value from an XML document. |
xpath_short(xml, xpath) | Returns a SHORT value from an XML document. |
xpath_string(xml, xpath) | Returns the contents of the first XML node that matches the XPath expression. |
AI functions
Function | Description |
---|---|
ai_generate_text(prompt, modelName[, param1, value1] […]) | Returns text generated by a selected large language model (LLM) given the prompt. |
ai_query(endpointName, request, [, param1, value1] […]) | Invokes an existing Databricks Model Serving endpoint and parses and returns its response. |
Read functions
Function | Description |
---|---|
read_files(path, [optionKey => optionValue] [, …]) | Reads data files on cloud storage and returns it in tabular format. |
read_kafka([optionKey => optionValue] [, …]) | Reads records from an Apache Kafka cluster and returns it in tabular format. |
Miscellaneous functions
Function | Description |
---|---|
assert_true(expr) | Returns an error if expr is not true. |
CASE expr { WHEN opt1 THEN res1 } […] [ELSE def] END | Returns resN for the first optN that equals expr or def if none matches. |
CASE { WHEN cond1 THEN res1 } […] [ELSE def] END | Returns resN for the first condN that evaluates to true, or def if none found. |
cloud_files_state( { TABLE(table) | checkpoint } ) | |
coalesce(expr1, expr2 [, …]) | Returns the first non-null argument. |
cube (expr1 [, …]) | Creates a multi-dimensional cube using the specified expression columns. |
current_catalog() | Returns the current catalog. |
current_database() | Returns the current schema. |
current_metastore() | Returns the current Unity Catalog Metastore id. |
current_recipient(key) | Returns a property for the current recipient in a view shared with Delta Sharing. |
current_schema() | Returns the current schema. |
current_user() | Returns the user executing the statement. |
current_version() | Returns the current version of Azure Databricks. |
decode(expr, { key, value } [, …] [,defValue]) | Returns the value matching the key. |
elt(index, expr1 [, …] ) | Returns the nth expression. |
equal_null(expr1, expr2) | Returns true if expr1 equals expr2 or both expressions are NULL , or false otherwise. |
event_log( { TABLE(table) | pipeline_id } ) | |
greatest(expr1, expr2 [, …]) | Returns the largest value of all arguments, skipping null values. |
grouping(col) | Indicates whether a specified column in a GROUPING SET , ROLLUP , or CUBE represents a subtotal. |
grouping_id([col1 [, …]]) | Returns the level of grouping for a set of columns. |
hash(expr1 [, …]) | Returns a hashed value of the arguments. |
hll_sketch_estimate(expr) | Etimates number of distinct values collected in a HyperLogLog sketch. |
hll_union(expr1, expr2 [,allowDifferentLgConfigK]) | Combines two HyperLogLog sketches. |
java_method(class, method[, arg1 [, …]]) | Calls a method with reflection. |
if(cond, expr1, expr2) | Returns expr1 if cond is true , or expr2 otherwise. |
iff(cond, expr1, expr2) | Returns expr1 if cond is true , or expr2 otherwise. |
ifnull(expr1, expr2) | Returns expr2 if expr1 is NULL , or expr1 otherwise. |
input_file_block_length() | Returns the length in bytes of the block being read. |
input_file_block_start() | Returns the start offset in bytes of the block being read. |
input_file_name() | Returns the name of the file being read, or empty string if not available. |
is_account_group_member(group) | Returns true if the current user is a member of group at the account level. |
is_member(group) | Returns true if the current user is a member of group at the workspace level. |
isnull(expr) | Returns true if expr is NULL . |
isnotnull(expr) | Returns true if expr is not NULL . |
least(expr1 [, …]) | Returns the smallest value of all arguments, skipping null values. |
list_secrets() | Returns the keys which the user is authorized to see from Databricks secret service. |
luhn_check(numStr) | Returns true if numStr passes the Luhn algorithm check. |
monotonically_increasing_id() | Returns monotonically increasing 64-bit integers. |
nullif(expr1, expr2) | Returns NULL if expr1 equals expr2 , or expr1 otherwise. |
nvl(expr1, expr2) | Returns expr2 if expr1 is NULL , or expr1 otherwise. |
nvl2(expr1, expr2, expr3) | Returns expr2 if expr1 is not NULL , or expr3 otherwise. |
raise_error(expr) | Throws an exception with expr as the message. |
range(end) | Returns a table of values within a specified range. |
range(start, end [, step [, numParts]]) | Returns a table of values within a specified range. |
reflect(class, method[, arg1 [, …]]) | Calls a method with reflection. |
secret(scope, key) | Extracts a secret value with the given scope and key from Databricks secret service. |
session_user() | Returns the user connected to Azure Databricks. |
spark_partition_id() | Returns the current partition ID. |
sql_keywords() | Returns the set of SQL keywords in Azure Databricks. |
stack(numRows, expr1 [, …]) | Separates expr1 , …, exprN into numRows rows. |
table_changes(table_str, start [, end]) | Returns a log of changes to a Delta Lake table with Change Data Feed enabled. |
try_reflect(class, method[, arg1 [, …]]) | Calls a method with reflection, returning NULL if the method fails. |
typeof(expr) | Return a DDL-formatted type string for the data type of expr . |
user() | Returns the user executing the statement. |
uuid() | Returns an universally unique identifier (UUID) string. |
window(expr, width[, step [, start]]) | Creates a hopping based sliding-window over a timestamp expression. |
xxhash64(expr1 [, …]) | Returns a 64-bit hashed value of the arguments. |
version() | Returns the Apache Spark version. |
Feedback
Submit and view feedback for