非匯總表達式 <expression> 是以未參與 GROUP BY 子句的列為基礎。
將數據行或表示式新增至 GROUP BY、匯總表達式,或如果您不在意群組內傳回哪些值,請使用 <expressionAnyValue>。
參數
-
表示式:
SELECT清單中的非匯總、非群組表達式。 -
expressionAnyValue:
expression包裝在 any_value() 聚合函數中。
解釋
在具有 GROUP BY 子句的查詢中,SELECT 名單中的本機欄位參考必須符合以下要求:
- 作為 聚合函數的參數使用,或
- 符合
GROUP BY子句上表達式的表達式的一部分。
在查詢的FROM子句中已解析為數據表參考的欄位即為本地欄位參考。
換句話說:欄位參考必須是群組索引鍵的一部分,或者它們必須是聚合的一部分。
Azure Databricks 會在「盡力而為」的基礎上比對表達式:例如,它會將 辨識為符合的表達式。
但 SELECT c1 FROM T GROUP BY c1 + 5 不相符。
減緩
錯誤的緩解取決於原因:
您是否錯過了分組欄?
將
expression,或expression的相關子運算式添加至GROUP BY子句。欄位參考是否是與
GROUP BY不同的epression表示式的一部分?比對
SELECT清單中的運算式,或簡化GROUP BY運算式。您是否錯過了匯總?
使用聚合函數包裝數據行參考。 如果您只想要群組中的代表性值,您可以使用 any_value(epression)。
例子
-- Sample data
> CREATE OR REPLACE TEMPORARY VIEW tasks(name, firstname, task, cost) AS
VALUES ('Smith' , 'Sam' , 'UNPIVOT', 10),
('Smith' , 'Sam' , 'LATERAL', 5),
('Shuster', 'Sally' , 'DELETE' , 7),
('Shuster', 'Sally' , 'GRANT' , 8);
-- `name` and `firstname` are part of the group by coumns, but incomplete
> SELECT name, firstname, sum(cost) FROM tasks GROUP BY firstname || ' ' || name;
[MISSING_AGGREGATION] The expression "name" is neither present in the group by, nor is it an aggregate function.
-- Match the GROUP BY expression
> SELECT firstname || ' ' || name, sum(cost) FROM tasks GROUP BY firstname || ' ' || name;
Sam Smith 15
Sally Shuster 15
-- Break up the GROUP BY expression
> SELECT firstname, name, sum(cost) FROM tasks GROUP BY firstname, name;
Sam Smith 15
Sally Shuster 15
-- Missing grouping column
> SELECT name, firstname, sum(cost) FROM tasks GROUP BY name;
[MISSING_AGGREGATION] The expression "firstname" is neither present in the group by, nor is it an aggregate function.
-- Add the grouping column
> SELECT firstname, name, sum(cost) FROM tasks GROUP BY firstname, name;
Sam Smith 15
Sally Shuster 15
-- Missing aggregate
> SELECT firstname, name, sum(cost), task FROM tasks GROUP BY firstname, name;
[MISSING_AGGREGATION] The expression "task" is neither present in the group by, nor is it an aggregate function.
-- Add an aggregate
> SELECT firstname, name, sum(cost), array_agg(task) FROM tasks GROUP BY firstname, name;
Sam Smith 15 ["UNPIVOT","LATERAL"]
Sally Shuster 15 ["DELETE","GRANT"]
-- Return any task
> SELECT firstname, name, sum(cost), any_value(task) FROM tasks GROUP BY firstname, name;
Sam Smith 15 LATERAL
Sally Shuster 15 DELETE