In mathematics, discretization concerns the process of transferring continuous models and equations into discrete counterparts.
In other words, when someone talks about how a television show did great in the all-important, 25-44 market, they are practicing discretization. Of course in the BI world, the harder and more complex we make things sound, the better.
Or, to put it another way, discretization = GROUP BY. Yeah, I know, it’s used in SSAS, and it’s really much more involved than this, but, for the purposes of this article, and a few that follow it, it’s a GROUP BY.
So lets discretize some data (tell me that doesn’t sound cooler than putting data into categories) by date range.
>USE adventureworks2012 GO SELECT YEAR(orderdate) AS orderDate, CAST(SUM(totaldue) AS DECIMAL(18,2)) AS totalDue, COUNT(*) AS recordCount, CAST(AVG(totaldue) AS DECIMAL(18,2)) AS averageSale FROM sales.salesorderheader GROUP BY YEAR(orderdate) ORDER BY orderDate orderDate totalDue recordCount averageSale 2005 12693250.63 1379 9204.68 2006 34463848.44 3692 9334.74 2007 47171489.55 12443 3791.01 2008 28888197.51 13951 2070.69
Much to my surprise, this discretization was useful, because as the number of orders increased the average sales volume decreased, substantially. If you didn’t know that the numbers were completely made up you could reasonably conclude that they sold themselves out of business.
Enough about discretization. I just wanted to use it. I’ll go back to speaking plain GROUP BY in my next post.