Note
Kailangan ng pahintulot para ma-access ang page na ito. Maaari mong subukang mag-sign in o magpalit ng mga direktoryo.
Ang pag-access sa pahinang ito ay nangangailangan ng pahintulot. Maaari mong subukang baguhin ang mga direktoryo.
Applies to:
Databricks SQL
Databricks Runtime
Parameter markers are named or unnamed typed placeholder variables used to supply values from the API invoking the SQL statement.
Using parameter markers protects your code from SQL injection attacks since it clearly separates provided values from the structure of the SQL statements.
You cannot mix named and unnamed parameter markers in the same SQL statement.
You can also use parameter markers in the IDENTIFIER clause, which can be used to parameterize object names. See IDENTIFIER clause.
Parameter markers can be provided by:
- Databricks SQL through the Statement Execution API.
- Python using its pyspark.sql.SparkSession.sql() API.
- Scala using its org.apache.spark.sql.SparkSession.sql() API.
- Java using its org.apache.spark.sql.SparkSession.sql() API.
The following rules apply:
Applies to:
Databricks SQL
Databricks Runtime 17.3 LTS and prior- You may reference a parameter marker in an expression
- You must not reference a parameter marker in a DDL statement, such as a generated column or
DEFAULTdefinition, a view, or a SQL function. Exceptions are references to parameter markers in theIDENTIFIERclause, which can be used to parameterize the subject of certain DDL statements. See IDENTIFIER clause.
Applies to:
Databricks Runtime 18.0 and above- You can reference a parameter marker wherever you can use a literal of the parameter marker's type. This lifts the preceding DDL restriction, allowing parameter markers in generated columns,
DEFAULTdefinitions, views, SQL functions, and string-valued DDL clauses such asLOCATION.
- You can reference a parameter marker wherever you can use a literal of the parameter marker's type. This lifts the preceding DDL restriction, allowing parameter markers in generated columns,
Named parameter markers
Applies to: Databricks Runtime
12.1 and above
Named parameter markers are typed placeholder variables. The API invoking the SQL statement must supply name-value pairs to associate each parameter marker with a value.
Syntax
:parameter_name
Parameters
-
A reference to a supplied parameter marker in form of an unqualified identifier.
Notes
You can reference the same parameter marker multiple times within the same SQL Statement. If no value has been bound to the parameter marker an UNBOUND_SQL_PARAMETER error is raised. You are not required to reference all supplied parameter markers.
The mandatory preceding : (colon) differentiates the namespace of named parameter markers from that of column names and SQL parameters.
Examples
The following example defines two parameter markers:
- later: An
INTERVAL HOURwith value 3. - x: A
DOUBLEwith value 15.0
x is referenced multiple times, while later is referenced once.
SQL
> DECLARE stmtStr = 'SELECT current_timestamp() + :later, :x * :x AS square';
> EXECUTE IMMEDIATE stmtStr USING INTERVAL '3' HOURS AS later, 15.0 AS x;
2024-01-19 16:17:16.692303 225.00
Scala
import org.apache.spark.sql.SparkSession
val spark = SparkSession
.builder()
.appName("Spark named parameter marker example")
.getOrCreate()
val argMap = Map("later" -> java.time.Duration.ofHours(3), "x" -> 15.0)
spark.sql(
sqlText = "SELECT current_timestamp() + :later, :x * :x AS square",
args = argMap).show()
// +----------------------------------------+------+
// |current_timestamp() + INTERVAL '03' HOUR|square|
// +----------------------------------------+------+
// | 2023-02-27 17:48:...|225.00|
// +----------------------------------------+------+
Java
import org.apache.spark.sql.*;
import static java.util.Map.entry;
SparkSession spark = SparkSession
.builder()
.appName("Java Spark named parameter marker example")
.getOrCreate();
Map<String, String> argMap = Map.ofEntries(
entry("later", java.time.Duration.ofHours(3)),
entry("x", 15.0)
);
spark.sql(
sqlText = "SELECT current_timestamp() + :later, :x * :x AS square",
args = argMap).show();
// +----------------------------------------+------+
// |current_timestamp() + INTERVAL '03' HOUR|square|
// +----------------------------------------+------+
// | 2023-02-27 17:48:...|225.00|
// +----------------------------------------+------+
Python
spark.sql("SELECT current_timestamp() + :later, :x * :x AS square",
args = { "later": datetime.timedelta(hours=3), "x": 15.0 }).show()
// +----------------------------------------+------+
// |current_timestamp() + INTERVAL '03' HOUR|square|
// +----------------------------------------+------+
// | 2023-02-27 17:48:...|225.00|
// +----------------------------------------+------+
Applies to: Databricks Runtime
18.0 and above
> EXECUTE IMMEDIATE 'SELECT 1::DECIMAL(:precision, :scale)' USING 6 AS precision, 4 AS scale;
1.0000
> EXECUTE IMMEDIATE 'CREATE VIEW v(c1 INT) AS SELECT :val AS c1' USING 10 AS val;
> SELECT * FROM v;
10
> EXECUTE IMMEDIATE 'CREATE TABLE T(c1 INT DEFAULT :def COMMENT \'This is a \' :com)' USING 17 AS def, 'comment' AS com;
Unnamed parameter markers
Applies to: Databricks Runtime
13.3 and above
Unnamed parameter markers are typed placeholder variables. The API invoking the SQL statement must supply an array of arguments to associate each parameter marker with a value in the order in which they appear.
Syntax
?
Parameters
?: A reference to a supplied parameter marker in form of a question mark.
Notes
Each occurrence of an unnamed parameter marker consumes a value provided by the API invoking the SQL statement in order. If no value has been bound to the parameter marker, an UNBOUND_SQL_PARAMETER error is raised. You are not required to consume all provided values.
Examples
The following example defines three parameter markers:
- An
INTERVAL HOURwith value 3. - Two
DOUBLEwith value 15.0 each.
Since the parameters are unnamed each provided value is consumed by at most one parameter.
SQL
> DECLARE stmtStr = 'SELECT current_timestamp() + ?, ? * ? AS square';
> EXECUTE IMMEDIATE stmtStr USING INTERVAL '3' HOURS, 15.0, 15.0;
2024-01-19 16:17:16.692303 225.00
Scala
import org.apache.spark.sql.SparkSession
val spark = SparkSession
.builder()
.appName("Spark unnamed parameter marker example")
.getOrCreate()
val argArr = Array(java.time.Duration.ofHours(3), 15.0, 15.0)
spark.sql(
sqlText = "SELECT current_timestamp() + ?, ? * ? AS square", args = argArr).show()
// +----------------------------------------+------+
// |current_timestamp() + INTERVAL '03' HOUR|square|
// +----------------------------------------+------+
// | 2023-02-27 17:48:...|225.00|
// +----------------------------------------+------+
Java
import org.apache.spark.sql.*;
SparkSession spark = SparkSession
.builder()
.appName("Java Spark unnamed parameter marker example")
.getOrCreate();
Object[] argArr = new Object[] { java.time.Duration.ofHours(3), 15.0, 15.0 }
spark.sql(
sqlText = "SELECT current_timestamp() + ?, ? * ? AS square",
args = argArr).show();
// +----------------------------------------+------+
// |current_timestamp() + INTERVAL '03' HOUR|square|
// +----------------------------------------+------+
// | 2023-02-27 17:48:...|225.00|
// +----------------------------------------+------+
Python
spark.sql("SELECT ? * ? * ? AS volume", args = [ 3, 4, 5 ]).show()
// +------+
// |volume|
// +------+
// | 60|
// +------+
Applies to: Databricks Runtime
18.0 and above
> EXECUTE IMMEDIATE 'SELECT 1::DECIMAL(?, ?)' USING 6, 4;
1.0000
> EXECUTE IMMEDIATE 'CREATE VIEW v(c1 INT) AS SELECT ? AS c1' USING 10;
> SELECT * FROM v;
10
> EXECUTE IMMEDIATE 'CREATE TABLE T(c1 INT DEFAULT ? COMMENT \'This is a \' ?)' USING 17, 'comment';
Parameter markers in DDL string clauses
Some DDL clauses, such as the LOCATION clause in CREATE TABLE, accept string literals rather than identifiers. You cannot use the IDENTIFIER clause for these clauses because they aren't object names.
Applies to: Databricks Runtime
18.0 and above
In Databricks Runtime 18.0 and above, you can use parameter markers directly in these clauses because Databricks Runtime supports parameter markers wherever it accepts a literal of the same type. For example:
SQL
> CREATE EXTERNAL TABLE my_table USING DELTA LOCATION :path;
Python
spark.sql(
"CREATE EXTERNAL TABLE my_table USING DELTA LOCATION :path",
args = {"path": "abfss://container@account.dfs.core.windows.net/data"})
Scala
val argMap = Map("path" -> "abfss://container@account.dfs.core.windows.net/data")
spark.sql(
sqlText = "CREATE EXTERNAL TABLE my_table USING DELTA LOCATION :path",
args = argMap)
Java
Map<String, String> argMap = Map.ofEntries(
entry("path", "abfss://container@account.dfs.core.windows.net/data")
);
spark.sql(
sqlText = "CREATE EXTERNAL TABLE my_table USING DELTA LOCATION :path",
args = argMap);
For Databricks Runtime versions prior to 18.0
Applies to:
Databricks SQL
Databricks Runtime 14.3 to 17.3 LTS
In versions prior to Databricks Runtime 18.0, Databricks Runtime doesn't allow parameter markers directly in DDL statements (except through the IDENTIFIER clause). You can use EXECUTE IMMEDIATE to build the SQL statement dynamically, concatenating the path value as a string literal:
> DECLARE path STRING DEFAULT 'abfss://container@account.dfs.core.windows.net/data';
> EXECUTE IMMEDIATE 'CREATE EXTERNAL TABLE my_table USING DELTA LOCATION \'' || path || '\'';
Note
You cannot embed a parameter marker inside a string literal (for example, 'abfss://:param/path'). Instead, pass the entire string as a single parameter, or use string concatenation to build the value before passing it. For example, use SET VARIABLE with CONCAT() to build the full path in a variable, then pass the variable to EXECUTE IMMEDIATE.