18 May

GROUP BY, Again, This Time It’s Final, No Really It Is.

I think this will be the last of these.  I’ve gotten four posts out of GROUP BY and this time I’m going to double-dip, with the HAVING clause.  So, lets start with SQL Server and blow things up.

SQL Server

SELECT actlevel AL, SUM(height) AS 'sumHeight'
FROM junk
GROUP BY AL
HAVING SUM(height) > 400;

Msg 207, Level 16, State 1, Line 3
Invalid column name 'AL'.

SELECT actlevel AL, SUM(height) AS 'sumHeight'
FROM junk
GROUP BY actlevel
HAVING sumHeight > 400;

Msg 207, Level 16, State 1, Line 4
Invalid column name 'sumHeight'.

Both of these errors are saying that you can’t use an alias in a GROUP BY or HAVING clause. Honestly, this is kind of the expected behavior if you understand how SQL theoretically works (another post).

However, the other two guys I write about are just fine with it.

MySQL and SAS

mysql> SELECT actlevel AL, sum(height) as mysum
    -> FROM junk
    -> GROUP BY AL
    -> HAVING mysum > 400;
+------+-------+
| AL   | mysum |
+------+-------+
| HIGH |   491 |
| LOW  |   465 |
| MOD  |   477 |
+------+-------+
3 rows in set (0.00 sec)

proc sql;
SELECT actlevel AL, sum(height) AS mysum
FROM mysql.admit
GROUP BY AL
HAVING mysum > 400;
quit;

ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, (, *, **, +, ',', -, '.', 
/, <, <=, <>, =, >, >=, ?, AND, AS, CONTAINS, EQ, EQT, FROM, GE, GET, GT, GTT, LE, LET, LIKE, LT, 
LTT, NE, NET, OR, ^=, |, ||, ~=.

Ah, SAS, the 4th grade comedian that drove the teacher crazy. SAS requires the AS keyword.

proc sql;
SELECT actlevel AS AL, sum(height) as mysum
FROM mysql.admit
GROUP BY AL
HAVING mysum > 400;
quit;

AL       mysum
--------------
HIGH       491
LOW        465
MOD        477

So, that’s all good. I can end now, right. I mean, I wouldn’t be tempted to do something exotic, just to see what SAS, or maybe even MySQL, does with it, would I? Nope, not me.

mysql> SELECT actlevel 'AL', sum(height) as mysum
    -> FROM junk
    -> GROUP BY 'AL'
    -> HAVING mysum > 400;
+------+-------+
| AL   | mysum |
+------+-------+
| HIGH |  1433 |
+------+-------+
1 row in set (0.00 sec)

Err, OK. Maybe this will be better.

mysql> SELECT actlevel AL, sum(height) as 'mysum'
    -> FROM junk
    -> GROUP BY AL
    -> HAVING 'mysum' > 400;
Empty set, 1 warning (0.00 sec)

Oops! MySQL, ever the professionals at making it clear what went wrong. But, I wonder how SAS will handle this code.

proc sql;
SELECT actlevel 'AL', sum(height) as mysum
FROM mysql.admit
GROUP BY 'AL'
HAVING mysum > 400;
quit;

AL       mysum
--------------
HIGH      1433
HIGH      1433
LOW       1433
MOD       1433
LOW       1433
HIGH      1433
MOD       1433
MOD       1433
LOW       1433
LOW       1433
HIGH      1433
LOW       1433
MOD       1433
HIGH      1433
MOD       1433
LOW       1433
MOD       1433
HIGH      1433
LOW       1433
HIGH      1433
MOD       1433
Okey, dokey.

And,
proc sql;
SELECT actlevel AL, sum(height) as 'mysum'
FROM junk
GROUP BY AL
HAVING 'mysum' > 400;
quit;

ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, (, *, **, +, ',', -, 
'.', /, <, <=, <>, =, >, >=, ?, AND, AS, CONTAINS, EQ, EQT, FROM, GE, GET, GT, GTT, LE, LET, 
LIKE, LT, LTT, NE, NET, OR, ^=, |, ||, ~=.

Lets try this, instead:

proc sql;
SELECT actlevel AL, sum(height) as 'mybum'
FROM junk
GROUP BY AL
HAVING 'mybum' > 400;
quit;

ERROR: Expression using greater than (>) has components that are of different data types.

Really? No, Really? SAS is just a giant box of endless posts.

Well, I guess it has to be this way.

proc sql;
SELECT actlevel AS AL, sum(height) AS mybum
FROM mysql.admit
GROUP BY AL
HAVING mybum > 400;
quit;

AL       mysum
--------------
HIGH       491
LOW        465
MOD        477

Or, I suppose, you could just write it like you are supposed to.

proc sql;
SELECT actlevel AS AL, sum(height) AS mybum
FROM mysql.admit
GROUP BY actlevel
HAVING sum(height) > 400;
quit;

AL       mysum
--------------
HIGH       491
LOW        465
MOD        477

I just wish I could trust SAS to write the SAS Advanced certification that cleanly but after the Base I simply don’t trust them not to use all of this stuff.

I’m out!

03 Mar

HAVING Without GROUP BY

Yes, it’s possible.

This is something that not a lot of people know about, but, it turns out that you can, in fact, use HAVING without a GROUP BY.

From the official documention,

“When GROUP BY is not used, HAVING behaves like a WHERE clause.”

http://msdn.microsoft.com/en-us/library/ms180199.aspx

And, that is it for the documentation. Essentially from what I can tell it only works in situations where you have aggregate data. For example.

SELECT SUM(spending) as totSpending
FROM militaryspending
HAVING SUM(spending) > 200000;

totSpending
1699154.3

And,

SELECT SUM(spending) as totSpending
FROM militaryspending
HAVING SUM(spending) < 200000;

totSpending

If I were to change the SELECT to region, SUM(spending) the query fails.

SELECT region, SUM(spending) as totSpending
FROM militaryspending
HAVING SUM(spending) < 200000;

Msg 8120, Level 16, State 1, Line 5
Column ‘militaryspending.region’ is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

Finally, you can have multiple aggregates in the function.

SELECT SUM(spending) AS totSpending, AVG(spending) AS AvgSpending
FROM militaryspending
HAVING SUM(spending) > 200000;

totSpending	AvgSpending
1699154.3	13485.3515873016

This also works in MySQL.

mysql> SELECT SUM(spending)
    -> FROM militaryspending
    -> HAVING SUM(Spending) > 100000;
+---------------+
| SUM(spending) |
+---------------+
|     1699254.3 |
+---------------+
1 row in set (0.03 sec)

Really, while the official manual says it acts like a WHERE that’s really just another way of saying it acts like a HAVING clause with a single row in the result set.

Enjoy.