I
have to admit that I am totally hooked on the The Rozenshtein Method. My
buddy, Richard Campbell
showed it to me a year or two ago and I have been hooked ever since. I
recently demoed it at TechED in Dallas, a recent
WebCast, VSLive in New York and will be showing it off at
TechEd in
Malaysia next week. I have gotten lots of email and positive feedback
so I decided to blog it here.
Here
is how it works. You need a crosstab query. You have to move rows
into columns. You also need ANSI 92 SQL that will
run in any database.Well there are several ways to do this, but the most
generic and one of the most powerful ways is called the Rozenshtein
Method, which was developed by the Russian mathematician David
Rozenshtein. This technique was taken from his book: Optimizing
Transact-SQL : Advanced Programming Techniques.
First
let’s look at the desired results. We want to take the orders data from
Northwind and pivot the sales date (aggregated by month) as columns with
the sum of the total sales in the row grouped by customer. It would look
something like this:
CompanyName
TotalAmount Jan Feb Mar…(etc)
Company1
100
25 33 10
Company2 467
76 62
87
(etc)
The TSQL query to do this
is, go ahead and run it in Northwind in SQL Server:
SELECT CompanyName,
SUM((UnitPrice*Quantity)) As TotalAmt,
SUM((UnitPrice*Quantity)*(1-ABS(SIGN(DatePart(mm,OrderDate)-1))))
AS Jan,
SUM((UnitPrice*Quantity)*(1-ABS(SIGN(DatePart(mm,OrderDate)-2))))
AS Feb,
SUM((UnitPrice*Quantity)*(1-ABS(SIGN(DatePart(mm,OrderDate)-3))))
AS Mar,
SUM((UnitPrice*Quantity)*(1-ABS(SIGN(DatePart(mm,OrderDate)-4))))
AS Apr,
SUM((UnitPrice*Quantity)*(1-ABS(SIGN(DatePart(mm,OrderDate)-5))))
AS May,
SUM((UnitPrice*Quantity)*(1-ABS(SIGN(DatePart(mm,OrderDate)-6))))
AS Jun,
SUM((UnitPrice*Quantity)*(1-ABS(SIGN(DatePart(mm,OrderDate)-7))))
AS Jul,
SUM((UnitPrice*Quantity)*(1-ABS(SIGN(DatePart(mm,OrderDate)-8))))
AS Aug,
SUM((UnitPrice*Quantity)*(1-ABS(SIGN(DatePart(mm,OrderDate)-9))))
AS Sep,
SUM((UnitPrice*Quantity)*(1-ABS(SIGN(DatePart(mm,OrderDate)-10))))
AS Oct,
SUM((UnitPrice*Quantity)*(1-ABS(SIGN(DatePart(mm,OrderDate)-11))))
AS Nov,
SUM((UnitPrice*Quantity)*(1-ABS(SIGN(DatePart(mm,OrderDate)-12))))
AS Dec
FROM Customers INNER
JOIN
Orders ON Customers.CustomerID = Orders.CustomerID INNER JOIN
[Order Details] ON Orders.OrderID = [Order Details].OrderID
Group By
Customers.CompanyName
So how does this work?
This method uses Boolean
aggregates, so that each column has a numeric expression that resolves
each row as a zero or one and that value (0 or 1) is multiplied by your
numeric expression (Like TotalSales or (UnitPrice*Quantity). That is all
there is to it, quite simple. But wait, there’s more to explain:
We want to create columns
for each Month in our data. To find a month use DatePart. But we need to
subtract the DatePart value (1-12) from the amount you’re looking for (1
for Jan, 2 for Feb, etc) as shown here for January:
DatePart(mm,OrderDate)-1
So that true = zero, false
> 0 or < 0. For example if the month you were looking for was
January and the DatePart was 1 and you subtract 1 from that value you get
0, which is true. If you are looking for March you would get -2 and that
would be false.
Next you have to compute
the sign of the expression and get the absolute value like so:
ABS(SIGN(DatePart(mm,OrderDate)-1)))
This will give us a
positive value. Remember 0 is still true. Now subtract the value computed
from 1 in order to get a 0 or 1 from the value of your expression (the
Boolean aggregate). The code is:
(1-ABS(SIGN(DatePart(mm,OrderDate)-1))))
For example if you had
March return 3 from the Datepart, 3-1=2 and 1-2 =-1. The absolute value is
1. This will always return 0 or 1. If your expression was zero, the value
is now one. If was one, the value is zero.
Last step. Taking the SUM
of the Boolean values will give you a count of the values that qualify. So
you can find out how many sales you made in Jan, Feb, etc. So now multiply
the value by the price and quantity, but remember its now one = true. Take
a look here:
SUM((UnitPrice*Quantity)*(1-ABS(SIGN(DatePart(mm,OrderDate)-1))))
AS Jan
If its zero, nothing gets
added, if its one, you get the value of the sale. The sum of the total
expression is the total of sales for the month. If you have a DatePart
that is evaluated to 0 then ((UnitPrice*Quantity)*0) is 0 and those
results are ignored in the SUM. If you have a month that matches your
expression resolves to 1 and ((UnitPrice*Quantity)*1) is the value of the
sale.
How easy!
But wait, there’s more!
Suppose you wanted two values combined? Compute each value down to zero or
one separately. Now you can use AND by multiplying, OR
by adding (and reduce to 1 or 0 using SIGN).
Ok, have fun!!!