CREATE FUNCTION (External)

Article
07/05/2024

Applies to: Databricks Runtime

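Creates a temporary or permanent external function. Temporary functions are scoped to the session, whereas permanent functions are created in the persistent catalog and are available to all sessions. The resources specified in the USING clause are made available to all executors the first time the function is executed.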

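In addition to the SQL interface, Spark lets you create custom user-defined scalar and aggregate functions using the Scala, Python, and Java APIs. For more information, see External user-defined scalar functions (UDFs) and User-defined aggregate functions (UDAFs).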

Syntax

CREATE [ OR REPLACE ] [ TEMPORARY ] FUNCTION [ IF NOT EXISTS ]
    function_name AS class_name [ resource_locations ]

Parameters

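OR REPLACE

If specified, the resources for the function are reloaded. This is mainly useful for picking up changes made to the implementation of the function. This parameter is mutually exclusive with IF NOT EXISTS; the two cannot be specified together.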
TEMPORARY

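Indicates the scope of the function being created. When TEMPORARY is specified, the created function is valid and visible only in the current session. No persistent entry is made in the catalog for this kind of function.
IF NOT EXISTS

If specified, creates the function only when it does not already exist. If the specified function already exists in the system, the statement still succeeds (no error is thrown). This parameter is mutually exclusive with OR REPLACE; the two cannot be specified together.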
function_name

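A name for the function. The function name can optionally be qualified with a schema name.

Functions created in hive_metastore can contain only alphanumeric ASCII characters and underscores.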
class_name

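The name of the class that provides the implementation for the function to be created. The implementing class should extend one of the following base classes:
- UDF or UDAF in the org.apache.hadoop.hive.ql.exec package.
- AbstractGenericUDAFResolver, GenericUDF, or GenericUDTF in the org.apache.hadoop.hive.ql.udf.generic package.
- UserDefinedAggregateFunction in the org.apache.spark.sql.expressions package.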
resource_locations

The list of resources that contain the implementation of the function along with its dependencies.

Syntax: USING { { (JAR | FILE | ARCHIVE) resource_uri } , ... }
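
As a hedged illustration of the parameters above, the following sketch creates a schema-qualified permanent function and lists more than one resource in the USING clause. The schema name `my_schema`, the class name `SimpleUdf`, and both resource paths are hypothetical placeholders rather than values defined by this article; substitute your own.

```sql
-- Sketch only: `my_schema`, `SimpleUdf`, and the paths below are assumed placeholders.
-- The function name is qualified with a schema name, and the USING clause
-- lists a JAR plus an additional FILE resource, separated by a comma.
CREATE FUNCTION my_schema.simple_udf AS 'SimpleUdf'
    USING JAR '/tmp/SimpleUdf.jar', FILE '/tmp/lookup.txt';
```

Because OR REPLACE and IF NOT EXISTS are mutually exclusive, a single statement uses at most one of them.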

Examples

-- 1. Create a simple UDF `SimpleUdf` that increments the supplied integral value by 10.
--    import org.apache.hadoop.hive.ql.exec.UDF;
--    public class SimpleUdf extends UDF {
--      public int evaluate(int value) {
--        return value + 10;
--      }
--    }
-- 2. Compile and place it in a JAR file called `SimpleUdf.jar` in /tmp.

-- Create a table called `test` and insert two rows.
> CREATE TABLE test(c1 INT);
> INSERT INTO test VALUES (1), (2);

-- Create a permanent function called `simple_udf`.
> CREATE FUNCTION simple_udf AS 'SimpleUdf'
    USING JAR '/tmp/SimpleUdf.jar';

-- Verify that the function is in the registry.
> SHOW USER FUNCTIONS;
           function
 ------------------
 default.simple_udf

-- Invoke the function. Every selected value should be incremented by 10.
> SELECT simple_udf(c1) AS function_return_value FROM test;
 function_return_value
 ---------------------
                    11
                    12

-- Create a temporary function.
> CREATE TEMPORARY FUNCTION simple_temp_udf AS 'SimpleUdf'
    USING JAR '/tmp/SimpleUdf.jar';

-- Verify that the newly created temporary function is in the registry.
-- The temporary function does not have a qualified
-- schema associated with it.
> SHOW USER FUNCTIONS;
           function
 ------------------
 default.simple_udf
    simple_temp_udf

-- 1. Modify `SimpleUdf`'s implementation to increment the supplied integral value by 20.
--    import org.apache.hadoop.hive.ql.exec.UDF;
--    public class SimpleUdfR extends UDF {
--      public int evaluate(int value) {
--        return value + 20;
--      }
--    }
-- 2. Compile and place it in a JAR file called `SimpleUdfR.jar` in /tmp.

-- Replace the implementation of `simple_udf`
> CREATE OR REPLACE FUNCTION simple_udf AS 'SimpleUdfR'
    USING JAR '/tmp/SimpleUdfR.jar';

-- Invoke the function. Every selected value should be incremented by 20.
> SELECT simple_udf(c1) AS function_return_value FROM test;
 function_return_value
 ---------------------
                    21
                    22
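
The IF NOT EXISTS behavior described under Parameters can be sketched by continuing the example above. This is an illustrative sketch, not part of the original walkthrough. It assumes `simple_udf` still exists after the OR REPLACE step; per the parameter description, the statement is expected to succeed without throwing an error because the function is already present.

```sql
-- Assumes `simple_udf` already exists from the steps above.
-- With IF NOT EXISTS, the statement does not throw an error
-- when a function with the same name is already present.
CREATE FUNCTION IF NOT EXISTS simple_udf AS 'SimpleUdf'
    USING JAR '/tmp/SimpleUdf.jar';

-- Invoke the function against the `test` table created earlier
-- to check which implementation is in effect.
SELECT simple_udf(c1) AS function_return_value FROM test;
```

Note that IF NOT EXISTS cannot be combined with OR REPLACE, so use OR REPLACE instead when the goal is to pick up a new implementation.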
