12 May

SAS: PROC SQL Operators

This is more a note than a post but in PROC SQL you get to do things a bit differently here.  Here are some of the differences.

You can use SAS operators

For example, ge, lt, ^=, and whatever I’m forgetting, are valid in a WHERE statement.

proc sql;
	SELECT *
	FROM mysql.admit
    WHERE sex ne 'M'
    AND age ge 30
    and age lt 40;
quit;

                                                                    Act
ID    Name            Sex       Age      Date    Height    Weight  Level      Fee
---------------------------------------------------------------------------------
2462  Almers, C       F          34         3        66       152  HIGH    124.80
2501  Bonaventure, T  F          31        17        61       123  LOW     149.75
2552  Reberson, P     F          32         9        67       151  MOD     149.75

IS NULL and IS MISSING

According to the manual these are the same thing. As SAS does not use NULL values, in exactly the same sense as SQL, I assume they represent these as either a period or empty space as they did in SAS Base.

CONTAINS (?)

If I remember my Cyberquery, and I probably don’t, this was part of that language as well. In a sense, it’s a LIKE with percentage symbols on both ends of it. For example.

proc sql;
	SELECT *
	FROM mysql.admit
	WHERE name CONTAINS 'er'
	or name ? 'Pi';
quit;

                                                                    Act
ID    Name            Sex       Age      Date    Height    Weight  Level      Fee
---------------------------------------------------------------------------------
2462  Almers, C       F          34         3        66       152  HIGH    124.80
2552  Reberson, P     F          32         9        67       151  MOD     149.75
2563  Pitts, D        M          34        22        73       154  LOW     124.80
2568  Eberhardt, S    F          49        27        64       172  LOW     124.80
2572  Oberon, M       F          28        17        62       118  LOW      85.20
2574  Peterson, V     M          30         6        69       147  MOD     149.75
2578  Cameron, L      M          47         5        72       173  MOD     124.80
2579  Underwood, K    M          60        22        71       191  LOW     149.75
2586  Derber, B       M          25        23        75       188  HIGH     85.20

As you can see with the ‘Pi’ it is case-sensitive.

=* SOUNDEX

I’ve never had to use SOUNDEX but, in effect, it’s used to select results that are like the search target, or more accurately, sound like the search target. I don’t know the algorithm that SAS, or even the other databases use. Here’s a, not so good, example of SOUNDEX in action.

proc sql;
	SELECT *
	FROM mysql.admit
    WHERE actlevel =* 'hig';
quit;

                                                                    Act
ID    Name            Sex       Age      Date    Height    Weight  Level      Fee
---------------------------------------------------------------------------------
2458  Murray, W       M          27         1        72       168  HIGH     85.20
2462  Almers, C       F          34         3        66       152  HIGH    124.80
2544  Jones, M        M          29         6        76       193  HIGH    124.80
2571  Nunnelly, A     F          44        19        66       140  HIGH    149.75
2575  Quigley, M      F          40         8        69       163  HIGH    124.80
2586  Derber, B       M          25        23        75       188  HIGH     85.20
2589  Wilcox, E       F          41        16        67       141  HIGH    149.75

Most databases use SOUNDEX but I don’t remember the =* syntax anywhere else.

08 Apr

GROUP BY, SAS vs SQL Server vs MySQL

Anyone who isn’t confused really doesn’t understand the situation.
Edward R. Murrow

It looks like I’m going to talk myself into doing the SAS ADVANCED certification.  I was going through the sample questions for the first chapter and I ran into the following: What happens when you use a GROUP BY without an aggregate function..  Well, unfortunately, this depends on the database but I’m pretty sure that SAS expects a specific answer.  So, I’m going to go through SAS, MySQL and SQL Server in regards to ordering with a GROUP BY.

SQL Server

SELECT * INTO myjunk 
FROM
(
  SELECT 'A' AS Name, 1 AS Number
  UNION
  SELECT 'C', 3
  UNION 
  SELECT 'B', 2
) newTable;

SELECT Name, SUM(number) AS Number
FROM myjunk
GROUP BY name;

Name	Number
A	1
B	2
C	3

SELECT Name, Number
FROM myjunk
GROUP BY Name;

Msg 8120, Level 16, State 1, Line 1
Column 'myjunk.Number' is invalid in the select list because it is not contained in
either an aggregate function or the GROUP BY clause.

As you can see SQL Server doesn’t like it when you use a GROUP BY without an Aggregate.

MySQL

mysql> CREATE TABLE myjunk AS
    -> SELECT *
    -> FROM
    -> (
    ->   SELECT 'A' AS Name, 1 AS Number
    ->   UNION
    ->   SELECT 'C', 3
    ->   UNION
    ->   SELECT 'B', 2
    -> );
ERROR 1248 (42000): Every derived table must have its own alias
mysql> CREATE TABLE myjunk AS
    -> SELECT *
    -> FROM
    -> (
    ->   SELECT 'A' AS Name, 1 AS Number
    ->   UNION
    ->   SELECT 'C', 3
    ->   UNION
    ->   SELECT 'B', 2
    -> ) as newTable;
Query OK, 3 rows affected (0.10 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> SELECT Name, SUM(number) AS Number
    -> FROM myjunk
    -> GROUP BY name;
+------+--------+
| Name | Number |
+------+--------+
| A    |      1 |
| B    |      2 |
| C    |      3 |
+------+--------+
3 rows in set (0.00 sec)

mysql> SELECT Name, Number
    -> FROM myjunk
    -> GROUP BY Name;
+------+--------+
| Name | Number |
+------+--------+
| A    |      1 |
| B    |      2 |
| C    |      3 |
+------+--------+
3 rows in set (0.00 sec)

mysql> SELECT Name, Number
    -> FROM myjunk
    -> GROUP BY Name DESC;
+------+--------+
| Name | Number |
+------+--------+
| C    |      3 |
| B    |      2 |
| A    |      1 |
+------+--------+
3 rows in set (0.00 sec)

Ah, MySQL is much more fun. Not only does it turn the GROUP BY into an ORDER BY, but, you can use the DESCENDING keyword, and it works. That’s good stuff.

SAS

The Study Guide says that SAS will treat the GROUP BY as an ORDER BY, so here goes.

LIBNAME mysql "....";

ods listing;

PROC SQL;
      SELECT age, height, weight
      FROM mysql.admit
      GROUP BY height;
QUIT;

     Age    Height    Weight
----------------------------
      31        61       123
      28        62       118
      43        63       137
      22        63       139
      49        64       172
      43        65       123
      44        66       140
      34        66       152
      32        67       151
      41        67       141
      40        69       163
      30        69       147
      35        70       173
      51        71       158
      54        71       183
      60        71       191
      27        72       168
      47        72       173
      34        73       154
      25        75       188
      29        76       193

So, it does work. I tried the DESC keyword but it didn’t work. So, once again, 3 completely different approaches to the language.

  • SQL Server hard stops a GROUP BY without an aggregate function.
  • MySQL lets you not only order with a GROUP BY, but, you can use the DESC keyword.
  • SAS lets you order with it but you can’t use the DESC keyword.

Interesting side note, and it means maybe I missed something DESC comes up in the auto-complete drop down for SAS in the GROUP BY.  So, maybe it works?  Or, maybe since I’m working with a .dat file in the sample data, and not, say, SQL Server, the behavior is different?