Applies to: SQL Server 2022 (16.x) Preview
This function returns an approximate interpolated value from the set of values in a group based on percentile value and sort specification. Since this is an approximate function, the output would be within rank based error bound with certain confidence. The percentile value returned by this function is based on a continuous distribution of the column values and the result would be interpolated. Due to this, the output might not be one of values in the data set. One of the common use cases for this function is to avoid the data outliers. This function can be used as an alternative to PERCENTILE_CONT for large datasets where negligible error with faster response is acceptable as compared to accurate percentile value with slow response time.
APPROX_PERCENTILE_CONT (numeric_literal) WITHIN GROUP (ORDER BY order_by_expression [ASC|DESC])
The percentile to compute. The value must range between 0.0 and 1.0.
Specifies a list of numeric values to sort and compute the percentile over. Only one order_by_expression is allowed. The default sort order is ascending (ASC). The expression must evaluate to an exact or approximate numeric type, with no other data types allowed. Exact numeric types are int, bigint, smallint, tinyint, numeric, bit, decimal, smallmoney, and money. Approximate numeric types are float and real.
Any nulls in the data set are ignored.
Approximate percentile functions use KLL sketch. The sketch is built by reading the stream of data. Due to the algorithm used, this function requires less memory than its non-approximate counterpart (PERCENTILE_CONT).
This function provides rank-based error guarantees not value based. The function implementation guarantees up to a 1.33% error.
- The output of the function may not be the same in all executions. The algorithm used for these functions is KLL sketch which is a randomized algorithm. Every time the sketch is built, random values are picked. These functions provide rank-based error guarantees not value based.
- The function implementation guarantees up to a 1.33% error bounds within a 99% confidence.
Under compatibility level 110 and higher, WITHIN GROUP is a reserved keyword. For more information, see ALTER DATABASE Compatibility Level (Transact-SQL).
The following example creates a table, populates it, and executes a sample query.
SET NOCOUNT ON GO DROP TABLE IF EXISTS tblEmployee; GO CREATE TABLE tblEmployee ( EmplId INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, DeptId INT, Salary int); GO INSERT INTO tblEmployee VALUES (1, 31),(1, 33), (1, 18), (2, 25),(2, 35),(2, 10), (2, 10),(3, 1), (3,NULL), (4,NULL), (4,NULL); GO SELECT DeptId, APPROX_PERCENTILE_CONT(0.10) WITHIN GROUP(ORDER BY Salary) AS 'P10', APPROX_PERCENTILE_CONT(0.90) WITHIN GROUP(ORDER BY Salary) AS 'P90' FROM tblEmployee GROUP BY DeptId;