20 May

SET Operators: EXCEPT

I was thinking this would be hard going into it, but, it’s actually really easy, except, that EXCEPT doesn’t exist in SQL Server.

EXCEPT is very simple.  What it does is return every unique row in the first result set, that is not in the second result set.  Here are two very simple examples:

SELECT 1 AS r1,2 AS r2,3 AS r3
EXCEPT
SELECT 3,4,5;

r1	r2	r3
1	2	3

SELECT 1 AS r1,2 AS r2,3 AS r3
EXCEPT
SELECT 1,2,3;

r1	r2	r3

Those are simple queries but they show how the EXCEPT operator works. Now, what happens if we have records that are duplicates in the first table? Well, this happens:

SELECT 1 AS r1,2 AS r2,3 AS r3
UNION ALL
SELECT 1,2,3
r1	r2	r3
1	2	3
1	2	3

(
SELECT 1 AS r1,2 AS r2,3 AS r3
UNION ALL
SELECT 1,2,3
)
EXCEPT
SELECT 3,4,5

r1	r2	r3
1	2	3

That’s really clean compared to most of my posts, and, it’s going to stay that way. The reason? I’m not going to mess with SAS and I’m going to let MySQL go right ahead not doing it. If you decide to do it in MySQL, and one of these days, I’ll do it, you need to get a bit creative with a JOIN or a couple of other options such as a correlated subquery and NOT EXISTS.

OK, I talked myself into it, but only because I literally just did this the other day.

First, I’ll create a new table that has just the records where sex = ‘M’

mysql> CREATE TABLE allMale AS SELECT * FROM junk WHERE sex = 'M';
Query OK, 10 rows affected (0.11 sec)
Records: 10  Duplicates: 0  Warnings: 0

mysql> SELECT name, sex, age FROM allMale;
+--------------+------+------+
| name         | sex  | age  |
+--------------+------+------+
| Murray, W    | M    |   27 |
| LaMance, K   | M    |   51 |
| Jones, M     | M    |   29 |
| King, E      | M    |   35 |
| Pitts, D     | M    |   34 |
| Peterson, V  | M    |   30 |
| Cameron, L   | M    |   47 |
| Underwood, K | M    |   60 |
| Derber, B    | M    |   25 |
| Warren, C    | M    |   54 |
+--------------+------+------+
10 rows in set (0.00 sec)

Maybe I shouldn’t have called these tables junk?

Anyways, here is a query in SQL Server using the EXCEPT operator.

SELECT name, sex, age
FROM junk
EXCEPT
SELECT name, sex, age
FROM allMale

name	        sex	age
Almers, C	F	34
Bonaventure, T	F	31
Eberhardt, S	F	49
Ivan, H	        F	22
Johnson, R	F	43
Nunnelly, A	F	44
Oberon, M	F	28
Quigley, M	F	40
Reberson, P	F	32
Takahashi, Y	F	43
Wilcox, E	F	41

Now, for the MySQL version.

mysql> SELECT DISTINCT name, sex, age
    -> FROM JUNK j
    -> WHERE NOT EXISTS (SELECT name FROM allMale WHERE j.name = name AND j.
age = age AND j.weight = weight);
+----------------+------+------+
| name           | sex  | age  |
+----------------+------+------+
| Almers, C      | F    |   34 |
| Bonaventure, T | F    |   31 |
| Johnson, R     | F    |   43 |
| Reberson, P    | F    |   32 |
| Eberhardt, S   | F    |   49 |
| Nunnelly, A    | F    |   44 |
| Oberon, M      | F    |   28 |
| Quigley, M     | F    |   40 |
| Takahashi, Y   | F    |   43 |
| Ivan, H        | F    |   22 |
| Wilcox, E      | F    |   41 |
+----------------+------+------+
11 rows in set (0.00 sec)

I had to write that about 10 times because I kept forgetting to change the SELECT clause. Yeah, for me, SELECT = ‘HARD’, EXISTS = ‘Easy’.

The only part of this query that really matters is the WHERE clause. Essentially it uses a correlated subquery to compare the two data sets and where the record NOT EXISTS in allMale, it is returned to the result set. It’s not very clean, and all things being equal, I’d rather use EXCEPT.

OK, the SELECT matters too.  I have to use the DISTINCT there so the query won’t return duplicate values, or, as the documentation I’ve read for EXCEPT says “unique values only”.

And just for kicks, Oracle doesn’t have an EXCEPT operator either. Over on that part of the ranch, they roll with MINUS.

Fortunately, I’m done with EXCEPT, except, until I do the “SAS is special” operator article.