नोट
इस पृष्ठ तक पहुंच के लिए प्राधिकरण की आवश्यकता होती है। आप साइन इन करने या निर्देशिकाएँ बदलने का प्रयास कर सकते हैं।
इस पृष्ठ तक पहुंच के लिए प्राधिकरण की आवश्यकता होती है। आप निर्देशिकाएँ बदलने का प्रयास कर सकते हैं।
Applies to:
Databricks Runtime
Spark SQL supports integration of Hive UDFs, UDAFs, and UDTFs. Similar to Spark UDFs and UDAFs, Hive UDFs work on a single row as input and generate a single row as output, while Hive UDAFs operate on multiple rows and return a single aggregated row as a result. In addition, Hive also supports UDTFs (User Defined Tabular Functions) that act on one row as input and return multiple rows as output. To use Hive UDFs/UDAFs/UTFs, the user should register them in Spark, and then use them in Spark SQL queries.
Examples
Hive has two UDF interfaces: UDF and GenericUDF.
An example below uses GenericUDFAbs derived from GenericUDF.
-- Register `GenericUDFAbs` and use it in Spark SQL.
-- Note that, if you use your own programmed one, you need to add a JAR containing it
-- into a classpath,
-- e.g., ADD JAR yourHiveUDF.jar;
CREATE TEMPORARY FUNCTION testUDF AS 'org.apache.hadoop.hive.ql.udf.generic.GenericUDFAbs';
SELECT * FROM t;
+-----+
|value|
+-----+
| -1.0|
| 2.0|
| -3.0|
+-----+
SELECT testUDF(value) FROM t;
+--------------+
|testUDF(value)|
+--------------+
| 1.0|
| 2.0|
| 3.0|
+--------------+
An example below uses GenericUDTFExplode derived from GenericUDTF.
-- Register `GenericUDTFExplode` and use it in Spark SQL
CREATE TEMPORARY FUNCTION hiveUDTF
AS 'org.apache.hadoop.hive.ql.udf.generic.GenericUDTFExplode';
SELECT * FROM t;
+------+
| value|
+------+
|[1, 2]|
|[3, 4]|
+------+
SELECT hiveUDTF(value) FROM t;
+---+
|col|
+---+
| 1|
| 2|
| 3|
| 4|
+---+
Hive has two UDAF interfaces: UDAF and GenericUDAFResolver.
An example below uses GenericUDAFSum derived from GenericUDAFResolver.
-- Register `GenericUDAFSum` and use it in Spark SQL
CREATE TEMPORARY FUNCTION hiveUDAF
AS 'org.apache.hadoop.hive.ql.udf.generic.GenericUDAFSum';
SELECT * FROM t;
+---+-----+
|key|value|
+---+-----+
| a| 1|
| a| 2|
| b| 3|
+---+-----+
SELECT key, hiveUDAF(value) FROM t GROUP BY key;
+---+---------------+
|key|hiveUDAF(value)|
+---+---------------+
| b| 3|
| a| 3|
+---+---------------+