TSQL - Solve it YOUR Way - Finding the MAX Value from Multiple Columns
Introduction:
As part of the blog series TSQL - Solve it YOUR Way, today's topic will cover a question recently asked in the Transact-SQL MSDN Forums here, followed by four different solutions from three of the more helpful and creative contributors in the TSQL MSDN forums, Jingyang Li, Steven Wang, and Kent Waldrop.
Topic: I have a table with three datetime columns and want to return the most recent date for each row.
In the following image, we see an example of our three datetime columns, Date1, Date2, and Date3, followed by a "MaxDate" column containing the latest date of the three. The following solutions will show us various ways to calculate this "MaxDate" value.
This is a great question for our blog series because there are so many ways to tackle this question. In fact, the forum post included his original solution using a CASE statement, but he was looking for additional ways to solve the problem. Let's take a look at four additional ways to accomplish this.
Solution #1 and Solution #2 - Provided by Jingyang Li
Code Snippet
create table Dates (id int identity(1,1), date1 datetime, date2 datetime, date3 datetime)
insert into Dates (date1,date2,date3)
values ('1/1/2012','1/2/2012','1/3/2012'),
('1/1/2012',null,'1/13/2012'),
('1/1/2012','1/2/2012',null),
('8/30/2012','9/10/2012','1/1/2013'),
(null,null,null)
--Solution #1 - UNPIVOT approach
SELECT a.id, b.latestdate FROM Dates a LEFT JOIN (SELECT id, Max(date123) AS latestdate
FROM (SELECT id,date1,date2, date3 FROMDates) src
UNPIVOT (date123
FOR dates IN ([date1],
[date2],
[date3])) unpvt GROUP BY id ) b on a.id=b.id
--Solution #2 - Using the new SQL Server 2012 IIF function
SELECT
NULLIF(IIF (date1>date2,
IIF(date1>date3,date1,date3),IIF(date2>date3,date2,date3)),
'1/1/1900') latestdate
FROM (
SELECT id,Isnull(date1, '1/1/1900') date1, Isnull(date2, '1/1/1900') date2, Isnull(date3, '1/1/1900') date3 FROM Dates) t
Explanation of Jingyang's solutions:
Solution #1: My first thought was to unpivot these columns and use max function to get the latest date for each row. Since you cannot easily take the max value of multiple columns in a table, we can use the unpivot function to accomplish this. To include all NULL dates, we can use a LEFT JOIN to generate the result set.
Solution #2: Later on I found that we can use SQL Server 2012 inline IIF function to compare all cases to get the latest date. Within the query, we use ISNULL and NULLIF to handle NULL values.
Solution #3 - Provided by Steven Wang
Code Snippet
--Test Data Preparation
Declare @MyTable Table (ID Int Identity(1,1), Date1 Date, Date2 Date, Date3 Date);
Insert Into @MyTable
Values ('2012-09-12','2012-07-25','2012-06-06')
, ('2013-03-22','2012-06-03','2013-03-11')
, (NULL,NULL,NULL)
, ('2012-10-22','2012-12-30','2012-05-29')
, (NULL,'2012-08-07',NULL)
, ('2012-05-05','2012-08-03','2012-09-16')
--Actual Solution
Select ID, Date1, Date2, Date3, MaxDate
from @MyTable A
Cross Apply
( Select Max(MyDate) As MaxDate
From (Values (Date1), (Date2), (Date3)) B(MyDate)
) C
Explanation of Steven's solution:
As we need to find the latest date among 3 columns, the method using the value comparison between 2 different columns is a little clumsy. How about if we need to do the same when we have
more than 3 columns to compare?
One logical way to perform this comparison is to UNPIVOT the 3 columns’ values into one column and then use the MAX aggregation function to get the latest date value.
As Jingyang showed, it is easy to use the UNPIVOT clause to transpose the multiple column data into one row. However, UNPIVOT normally has performance issues so we will look for a different solution.
Here, I will use a table value constructor to perform the UNPIVOT. In SQL Server 2008 or later, we can use VALUES clause in a standard way as a table value constructor to create a derived table.
The idea is to pass three base table date columns into a derived table constructed by VALUES clause by using the CROSS APPLY clause. Each column passed into the derived table creates a row. We can then perform an MAX aggregation to get the latest date for each row from the base table inside the derived table.
This technically is fun and performance-wise will be better than using the UNPIVOT clause to transpose columns and perform aggregation.
Solution #4 - Provided by Kent Waldrop
Code Snippet
declare @test table
(test_Id int, date1 date, date2 date, date3 date);
insert into @test
select 1, null, null, null union all
select 2, null, '20120102', null union all
select 3, '20120101', '20120102', '20120103' union all
select 4, '20130322', '20120603', '20130311' union all
select 5, '20120912', '20120725', '20120606'
;
select
test_Id,
case when date12 >= coalesce(date3, date12) then date12 else date3
end as greatest_Date
from @test
cross apply
( select case when date1 >= coalesce(date2, date1) then date1 else date2
end as date12
) as xa1;
Explanation of Kent's solution:
My solution uses a CROSS APPLY to provide re-use of a derived column "date12". Date12 is the greater of date1 and date2.
A key trick that is involved with this query is the following two comparisons:
date1 >= coalesce(date2, date1)
date12 >= coalesce(date3, date12)
These two comparisons above are logically equivalent to these two comparisons:
date1 >= date2 or date1 is not null and date2 is null
date12 >= date3 or date12 is not null and date3 is null
The greatest date is selected by fetching the greater of date1 and date2 in the cross apply and then selecting the greater of date3 and the results of the date that was selected in the cross apply.
The main ideas driving this particular form of the query are (1) to make my additional operator a "Compute Scalar" operation rather than something else and (2) aim at code that is small in size to make the code easier to understand.
Other alternatives:
• Use a CTE or a derived table instead of the APPLY operator since the APPLY operator is not ANSI standard
• Use the expanded OR / AND comparisons rather than COALESCE
Conclusion:
As you can see, all four of the above solutions provide the result we were looking for, but do so in creatively different styles. Each of these solutions leverages different SQL Server language constructs and includes different considerations in the final solutions, including one solution that leverage new constructs from SQL Server 2012. I hope that you are able to learn a lot by trying out the problem yourself and then reading through the additional solutions.
Special thanks to Jingyang, Steven, and Kent for their valuable forums contribution and for contributing to this series!
Hope that helps,
Sam Lester (MSFT)
Contributor Bios:
Steven Wang has worked with SQL server for more than 10 years. Steven is an active SQL server community participant and speaks at events such as TechEd, CodeCamp, SQL User Group etc.
Blog: www.MSBICOE.com | LinkedIn: Steven Wang | MSDN Profile: Steven Wang - Shangzhou
Jingyang Li has been working with SQL Server since he began his IT career as an ASP.NET developer in 2001. He enjoys working with T-SQL and recently took a full time job as a DBA. He has been participating in the Microsoft forums with alias Limno.
Kent Waldrop started working with Sybase Transact SQL in 1989 as an application developer and continued working with Sybase until 1995 when SQL Server 6 was released. At that time, he became a Microsoft SQL Server database administrator and has continued to work with Microsoft SQL Server ever since. Currently he is a database architect working with Microsoft SQL Server, Oracle and UDB/DB2.
Comments
Anonymous
September 10, 2012
The comment has been removedAnonymous
September 11, 2012
Hi Adrian, this is great, thanks for the additional solution. Where would be the best place to handle the NULL dates? In the CTE or in the XQuery SELECT statement? Thanks again, Sam Note: To run Adrian's solution on your own, you can use the table creation and sample data from Steven's solution, solution #3.Anonymous
September 11, 2012
How about: select test_id, ( select max(dDate) from ( select date1 [dDate] union all select date2 union all select date3 ) as [d] ) as [greatest_Date] from @testAnonymous
September 11, 2012
This is yet another good way to solve the problem. What I like about this particular problem is that there seem to be many good ways to approach this problem and I especially like the pooling of ideas in the source post question and discussion in the MSDN forums. For instance, I am definitely a fan of the IIF function that became available in SQL Server 2012. I really liked Jingyang's solution using the IIF function when I read it in the forum. Consequently, I posted another potential solution that blended Jingyang's IIF functions with my COALESCE functions. It has been a fun discussion so far. :)Anonymous
September 13, 2012
It's close, but I think Steven had the shortest solution. =^) This is a great problem... a lot of different solutions.Anonymous
September 15, 2012
Yet another way to do it...blog back in 2008 www.texastoo.com/.../TSQL-UNPIVOT.aspxAnonymous
June 26, 2013
Another good example. SELECT Max(v) FROM (VALUES ('2010-10-13 00:00:00.000'), ('2013-06-08 00:00:00.000'), ('2079-06-06 00:00:00.000')) AS value(v)Anonymous
August 06, 2015
Great Post!Anonymous
March 02, 2017
Thanks for sharing this, I was struggling with temporary tables.