03 Feb

Summing Data And Ordering NULL’s

I had planned to write a lazy post today.   I was going to take the military spending data from my last post, add a total to the bottom of it, and call it a day but then I decided to make it a bit harder by ordering the data.

Here is the initial query that I was going to work with.

SELECT region, SUM(spending) as totSpending
FROM spending
GROUP BY region

region		totSpending
Africa		120606
Asia		428477
Australia	28559
Europe		312410
North America	743119
South America	65976

Then all I was going to do was add a total to the data which you can do by using the WITH ROLLUP operator as follows:

SELECT region, SUM(spending) as totSpending
FROM spending
GROUP BY region WITH ROLLUP

region		totSpending
Africa		120606
Asia		428477
Australia	28559
Europe		312410
North America	743119
South America	65976
NULL		1699147

Bang, we have a total. But suppose we want to ORDER the data by the totSpending column from the largest to the smallest yet keep the grand total column at the end? If we add an ORDER BY clause on the totSpending column the grand total column will appear at the top of the data.

SELECT region, SUM(spending) as totSpending
FROM spending
GROUP BY region WITH ROLLUP
ORDER BY totSpending DESC

region		totSpending
NULL		1699147
North America	743119
Asia		428477
Europe		312410
Africa		120606
South America	65976
Australia	28559

I suppose you could get away with that but my guess is that you would still want the grand total column at the end. In some implementations of SQL you have options to change the sort order of NULLS so that they appear last rather than first. While that is handy in some cases it wouldn’t fix this problem. We need a different solution, which in this case is IIF.

SELECT region, SUM(spending) as totSpending
FROM spending
GROUP BY region WITH ROLLUP
ORDER BY IIF(region IS NULL, 1, 0), totSpending DESC

region		totSpending
North America	743119
Asia		428477
Europe		312410
Africa		120606
South America	65976
Australia	28559
NULL		1699147

What this does is order the result set in a fashion that sorts NULL values last, and everything else, in no particular order, before that. Then all we need to do is add a second column to the ORDER BY clause, totSpending, and we have the sort we’re looking for.

This could also be written using the CASE statement.  In SQL Server IIF is converted to a CASE statement so it’s functionally the same thing.  In MySQL you could use IF or CASE.  I do not know if MySQL converts IF statements to CASE statements for processing like SQL Server does.

Finally, we clean up the query by aliasing the columns and using COALESCE to rename the NULL value so that we know it’s a total column.

SELECT COALESCE(region,'Total Spending') as Continent, SUM(spending) as 'Military Spending'
FROM spending
GROUP BY region WITH ROLLUP
ORDER BY IIF(region IS NULL, 1, 0), 'Military Spending' DESC

Continent	Military Spending
North America	743119
Asia		428477
Europe		312410
Africa		120606
South America	65976
Australia	28559
Total Spending	1699147

The only other thing is that when I aliased the Military Spending column I also have to refer to it in the Order by clause with single-quotes.

27 Jan

ANSI_NULLS setting

Back in my Cyberquery days we used to search for NULL values with colname = NULL. I always thought it was strange but that’s how it worked. Well, ou can do the same thing with SQL Server by using the ANSI_NULLS setting.

SET ANSI_NULLS ON; — This is the default setting.
with cte as
(
select null as col1
union
select 1
)
select * from cte where col1 = NULL;

Returns no rows.

However, the same query with ANSI_NULLS OFF returns a different result.

SET ANSI_NULLS OFF; — Turns ANSI_NULLS off so the = operator will work.
with cte as
(
select null as col1
union
select 1
)
select * from cte where col1 = NULL;

Returns

col1
NULL

You’ll probably never run into it but if you see “= NULL”, and it works, this is what’s going on.

Note: At some point in the future this will be turned off so that this setting is always on and trying to turn it off will result in an error.

MSDN: http://msdn.microsoft.com/en-us/library/ms188048.aspx

27 Jan

COALESCE And NULL Ordering

NULL’s are a strange concept in that they equal nothing so how do you order them? In the case of SQL Server they come first. Other databases have options to sort them first or last and the ANSI standard says either is OK as long as you pick one. Here is a quick example,

SELECT null AS col1
UNION
SELECT 1
UNION
SELECT 2
ORDER BY col1; — This could be written as ORDER BY 1

col1
NULL
1
2

And that’s that. Actually the most relevant parts of this are that the alias can only apply to the first SELECT and the ORDER BY, just like in a regular query must appear last. This is all pretty standard stuff.

But, what if you wanted to put the NULL into the middle of this result set because, well, just because.

The easiest way to do that would be to ORDER BY a COALESCE statement as follows:

SELECT null AS col1
UNION 
SELECT 1 as 
UNION
SELECT 2
ORDER BY COALESCE(col1,1.5);
go

Except for the small problem that it returns the following errors,

Msg 207, Level 16, State 1, Line 6
Invalid column name ‘col1’.
Msg 207, Level 16, State 1, Line 6
Invalid column name ‘col1’.
Msg 104, Level 16, State 1, Line 6
ORDER BY items must appear in the select list if the statement contains a UNION, INTERSECT or EXCEPT operator.

Ok, that is kind of obvious. What’s interesting about that is outside of a UNION, INTERSECT or EXCEPT you can do exactly what I just tried to do and it will work. And, the ORDER BY item does, in fact, appear in the SELECT List. Fine, be that way!!!!!

So, what do we do? We use a Common Table Expression and call it a post.

WITH cte AS
(
SELECT null
UNION
SELECT 1
UNION
SELECT 2
)
SELECT *
FROM cte
ORDER BY col1

col1
1
NULL
2

I can’t really imagine using this, at least not with a NULL, but it does show how to use a SET operator with an alias or an ORDER BY and it also shows that you can order a SET by a column that isn’t in the result set, unless, you are using a UNION, INTERSECT or EXCEPT.

20 Jan

< NULL

This is from my Cyberquery days and it involves a dead inventory report. The idea was to produce a report that tracked inventory that had not been sold in the last 365 days. The system tracked the last_sale date in the sku file so I’m sure it looked easy to do. Here’s the Cyberquery code,

WHERE
last_sale < todaysdate – 365

I wound up working on the report because they wanted to change the date to 270 days. So, I changed the code to 270, looked at it for a bit, and did a “wait a second”. What happens if we never sold the item?

Well, since a NULL value can’t be less than anytihng, if the item never sold it would never show up on the report. The solution was actually pretty easy in this case because we also had a last_receipt field so all we had to do was modify the code to look something like

WHERE
last_sale < todaysdate – 270
OR (last_sale = NULL and last_receipt < todaysdate – 270)

Note: it’s been a long time since I’ve written CQ code but according to a sample that I still have that is how it tests for NULL values.

Anyways, there are two morals to this story. The first moral is be aware of NULL values in your data. They bite!

The second moral? There isn’t one. The guy who wrote the report wound up a VP in the company.

19 Jan

ISNULL

ISNULL is a pretty straight-forward function except that it doesn’t always work the same in different databases. As I tend to focus on MySQL and SQL Server here, those are the two that I’ll be looking at.

In SQL Server ISNULL tests a value to see if it’s NULL, and if it is, then it replaces the NULL value with the second parameter in the function. For Instance,

SELECT ISNULL(NULL,’I Am A Null’); returns
I Am A Null

Alternatively,

SELECT ISNULL(‘I am a value’,’I Am Not A Null’); returns
I am a value

So, its really clean, test if the first parameter is a NULL, if it is return the second, else return the first parameter.

MySQL is a bit different.

mysql> SELECT ISNULL('tttt');
+----------------+
| ISNULL('tttt') |
+----------------+
|              0 |
+----------------+
1 row in set (0.00 sec)

mysql> SELECT ISNULL(NULL);
+--------------+
| ISNULL(NULL) |
+--------------+
|            1 |
+--------------+
1 row in set (0.00 sec)

In other words, it only takes a single parameter and if the parameter is a NULL value then it returns True, else it returns False.

The corresponding function in MySQL is actually IFNULL and it works exactly like the NULLIF does in SQL Server.

mysql> SELECT IFNULL(NULL,'I Am A Null');
+----------------------------+
| IFNULL(NULL,'I Am A Null') |
+----------------------------+
| I Am A Null                |
+----------------------------+
1 row in set (0.00 sec)

mysql> SELECT IFNULL('I am a value','I Am Not A Null');
+------------------------------------------+
| IFNULL('I am a value','I Am Not A Null') |
+------------------------------------------+
| I am a value                             |
+------------------------------------------+
1 row in set (0.00 sec)

This is actually gets worse if you decide to divide by 0 in the first parameter,

SQL SERVER
SELECT ISNULL(1/0,’I Am Not A Null’) returns
Msg 8134, Level 16, State 1, Line 1
Divide by zero error encountered.

Which makes perfect sense, however, MySQL does the funky monkey dance,

mysql> SELECT IFNULL(1/0,'Am I really a NULL?');
+-----------------------------------+
| IFNULL(1/0,'Am I really a NULL?') |
+-----------------------------------+
| Am I really a NULL?               |
+-----------------------------------+
1 row in set (0.00 sec)

Sheesh!

Finally, if you want to do this with Oracle, the function you need is NVL.

Enjoy and three cheers for consistency.

11 Jan

Explaining NULL values

I haven’t written much in a very long time so I thought I should toss something this way even if it was originally written somewhere else.

The content below is my attempt at explaining a NULL value to a SQL beginner. Some of the semantics aren’t 100% beautiful but I think it mostly conveys the concepts and how to work with a NULL in a basic SELECT.

=============================

A field in SQL can consist of two things, a value, or an unknown.

A value is something like a letter, number, combination of those, an image, even a space, and even, just to make it confusing, an empty field.

An unknown, designated with a NULL, means that the field has no value that we can know. NULL is a marker for that kind of field. The reason an empty field is not a NULL is because we know that the field is empty.

When comparing values and the NULL designation there are three possible combinations: a value vs a value, a value vs a NULL and a NULL vs a NULL. Lets look at each of these.

Value vs. Value (Can return true or false)
1 = 1 — True
1 = 2 — False
1 <> 1 — False
1 <> 2 — True
” = 1 — False

Value vs. Null (Always false)
1 = NULL — False
1 <> NULL — False
” <> NULL — False as empty does not, not equal NULL
” = NULL — False as empty does not equal NULL

NULL vs. NULL (Always false)
NULL = NULL — False
NULL <> NULL — False

In other words, comparing an unknown, to anything else whether it be known, or unknown, will always have a result of unknown.

If we try to add a NULL, to either a value or another unknown, the result will also always be unknown. For example:

1 + NULL = NULL
NULL – 1 = NULL
1 * NULL = NULL
1 / NULL = NULL
NULL / 1 = NULL
NULL + NULL = NULL

A NULL, because it’s unknowable, is always unknowable even if you add a known value to it. So how do you deal with a NULL in data? By Testing if a field “IS NULL” (unknown), or “IS NOT NULL” (known/value).

1 IS NULL — False
1 IS NOT NULL — True
NULL IS NULL — True
NULL IS NOT NULL — False

In a way, you have two things going on with the data: known values, which work one way, and unknowns, which work another way and are tested for in a different way.

=============================

Also, something I didn’t include in the post, is what happens with SET operators. SET operators work in an opposite fashion so that a NULL will equal a NULL. For example in SQL Server:

SELECT NULL
UNION
SELECT NULL

will return a single NULL value, however, in theory, it should return two NULL values because a NULL cannot equal a NULL.

SELECT NULL
INTERSECT
SELECT NULL

Should return nothing but it will, in fact, return a single NULL value.

I don’t know why this is done this way, I’ll have to do some research one of these days, but as far as NULL values go, just know that it is so.