CAST_INVALID_INPUT 错误条件

SQLSTATE:22018

<expression> 类型的值 <sourceType> 无法强制转换为 <targetType>,因为其格式错误。 根据语法更正值,或更改其目标类型。 使用 try_cast 来容忍格式不正确的输入,并返回 NULL。 如有必要,设置为 <ansiConfig> “false”以绕过此错误。

参数

  • 表达式:需要被强制转换为targettype的表达式
  • sourceTypeexpression的数据类型。
  • targetType:转换操作的目标类型。
  • ansiConfig:用于更改 ANSI 模式的配置设置。

Explanation

expression无法转换为targetType,原因可能是以下之一:

  • expression 对于类型的值域来说太大。 例如,无法将数字 1000 强制转换为 TINYINT ,因为该域仅从范围 -128+127
  • expression 包含不属于类型的字符。 例如 a ,不能强制转换为任何数值类型。
  • expression 的格式无法被强制转换操作分析。 例如 1.01e1 不能强制转换为任何整型数值类型。

类型转换可能没有被显式指定,但可能已被 Azure Databricks 隐式注入。

此错误提供的上下文信息将隔离发生错误的对象和表达式。

有关域的定义和接受的文本格式,请参阅数据类型tyopeName的定义。

缓解措施

此错误的缓解措施取决于原因:

  • 是否 value 应符合指定 typeName域和格式?

    验证输入生成值并更正数据源。

  • 演员的目标太窄吗?

    通过例如将DATE移动到TIMESTAMPINT移动到BIGINTDOUBLE的方式来扩大类型。

  • 格式 value 不正确?

    请考虑使用:

    这些函数允许指定各种格式。

    将具有小数点(例如 1.0)或科学计数法(例如 1e0)的数字文本进行强制转换时,建议先将其强制转换为 DECIMALDOUBLE,然后再转换为精确类型的数字。

  • 是否预期可能出现值不正确的数据,并通过生成 NULL 来容忍这些数据?

    更改表达式以使用或注入 try_cast(value AS typeName)。 当没有满足类型的 value 时,此函数返回 NULL

    如果无法更改表达式,可以最后尝试使用 ansiConfig 暂时禁用 ANSI 模式 ansiConfig

例子

-- A view with a cast and string literals outside the domain of the target type
> CREATE OR REPLACE TEMPORARY VIEW v(c1) AS SELECT CAST(a AS SMALLINT) FROM VALUES('100'), ('50000') AS t(a);
> SELECT c1 FROM v;
 [CAST_INVALID_INPUT] The value '50000' of the type "STRING" cannot be cast to "SMALLINT" because it is malformed.
 Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead.
 If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
 == SQL of VIEW v(line 1, position 8) ==
 SELECT CAST(a AS SMALLINT) FROM VALUES('100'), ('50000') A...
        ^^^^^^^^^^^^^^^^^^^

-- Widen the target type to match the domain of the input
> CREATE OR REPLACE TEMPORARY VIEW v(c1) AS SELECT cast(a AS INTEGER) FROM VALUES('100'), ('50000') AS t(a);
> SELECT c1 FROM v;
 100
 50000

-- The input data format does not match the target type
> SELECT cast(a AS INTEGER) FROM VALUES('1.0'), ('1e0') AS t(a);
 [CAST_INVALID_INPUT] The value '1.0' of the type "STRING" cannot be cast to "INT" because it is malformed.
 Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead.
 If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
 == SQL(line 1, position 8) ==
 SELECT cast(a AS INTEGER) FROM VALUES('1.0'), ('1e0') AS ...
        ^^^^^^^^^^^^^^^^^^

-- Adjust the target type to the match the format if the format is indicative of the domain.
> SELECT cast(a AS DOUBLE) FROM VALUES('1.0'), ('1e0') AS t(a);
 1.0
 1.0

-- ALternatively double cast to preserver the target type
> SELECT cast(cast(a AS DOUBLE) AS INTEGER) FROM VALUES('1.0'), ('1e0') AS t(a);
 1
 1

-- The format of the numeric input contains display artifacts
> SELECT cast(a AS DECIMAL(10, 3)) FROM VALUES('12,345.30-'), ('12+') AS t(a);
 [CAST_INVALID_INPUT] The value '12,345.30-' of the type "STRING" cannot be cast to "DECIMAL(10,3)" because it is malformed.
 Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead.
 If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
 == SQL(line 1, position 8) ==
 SELECT cast(a AS DECIMAL(10, 3)) FROM VALUES('$<123,45.30>'), ('...
        ^^^^^^^^^^^^^^^^^^^^^^^^^

-- Use to_number() to parse formatted values
> SELECT to_number(a, '9,999,999.999S') FROM VALUES('123,45.30-'), ('12+') AS t(a);
 -12345.300
 12.000

-- The format of a date input does not match the default format
> SELECT cast(geburtsdatum AS DATE) FROM VALUES('6.6.2000'), ('31.10.1970') AS t(geburtsdatum);
 [CAST_INVALID_INPUT] The value '6.6.2000' of the type "STRING" cannot be cast to "DATE" because it is malformed.
 Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead.
 If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
 == SQL(line 1, position 8) ==
 SELECT cast(geburtsdatum AS DATE) FROM VALUES('6.6.2000'), ('31.1...
        ^^^^^^^^^^^^^^^^^^^^^^^^^^

-- Use to_date to parse the correct input format for a date
> SELECT to_date(geburtsdatum, 'dd.MM.yyyy') FROM VALUES('6.6.2000'), ('31.10.1970') AS t(geburtsdatum);
  2000-06-06
  1970-10-31

-- The type resolution of Databricks did not derive a sufficiently wide type, failing an implicit cast
> SELECT 12 * monthly AS yearly FROM VALUES ('1200'), ('1520.56') AS t(monthly);
 [CAST_INVALID_INPUT] The value '1520.56' of the type "STRING" cannot be cast to "BIGINT" because it is malformed.
 Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead.
 If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
 == SQL(line 1, position 8) ==
 SELECT 12 * monthly AS yearly FROM VALUES ('1200'),...
        ^^^^^^^^^^^^

-- Explicitly declare the expected type
> SELECT 12 * cast(monthly AS DECIMAL(8, 2)) AS yearly FROM VALUES ('1200'), ('1520.56') AS t(monthly);
 14400.00
 18246.72

-- The input data is occasionally expected to incorrect
> SELECT cast(salary AS DECIMAL(9, 2)) FROM VALUES('30000'), ('prefer not to say') AS t(salary);
 [CAST_INVALID_INPUT] The value 'prefer not to say' of the type "STRING" cannot be cast to "DECIMAL(9,2)" because it is malformed.
 Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead.
 If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
== SQL(line 1, position 8) ==
SELECT cast(salary AS DECIMAL(9, 2)) FROM VALUES('30000'), ('prefer ...
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-- Use try_cast to tolerate incorrect input
> SELECT try_cast(salary AS DECIMAL(9, 2)) FROM VALUES('30000'), ('prefer not to say') AS t(salary);
 30000.00
 NULL

-- In Databricks SQL temporarily disable ANSI mode to tolerate incorrect input.
> SET ANSI_MODE = false;
> SELECT cast(salary AS DECIMAL(9, 2)) FROM VALUES('30000'), ('prefer not to say') AS t(salary);
 30000.00
 NULL
> SET ANSI_MODE = true;

-- In Databricks Runtime temporarily disable ANSI mode to tolerate incorrect input.
> SET spark.sql.ansi.enabled = false;
> SELECT cast(salary AS DECIMAL(9, 2)) FROM VALUES('30000'), ('prefer not to say') AS t(salary);
 30000.00
 NULL
> SET spark.sql.ansi.enabled = true;