06 Mar

ENUM In SQL Server, NOT!

I’ve already forgotten what it was I was looking up but I happened to stumble across someone who was looking for an ENUM data type in SQL Server. Well, you can’t do that, not really, not really at all, but you can maybe get sort of close enough to fake it.

Enumeration, according to Wikipedia is,

An enumeration of a collection of items is a complete, ordered listing of all of the items in that collection.

http://en.wikipedia.org/wiki/Enumeration

ENUM, is a string data type with a restricted list of allowed values that are stored based on their position in the creation string, just like an enumeration.  It offers some advantages for situations where you have limited string types and it “may” be faster in some cases, according to what I’ve read, but that is out of the scope of my expertise.

The bid disadvantages with ENUM are that ORDER BY can behave in unexpected ways and it’s not number friendly. In fact, the MySQL manual recommends that you do not use numbers because it will do the funky monkey, get confused, and think you are storing index values rather than values. Yes, it’s just as confusing as I wrote it.

Anyways, lets look at an ENUM.

mysql> CREATE TABLE junk
    -> (
    -> item ENUM ('Alpha','Zeta','Alpha1','Zeta1')
    -> );
Query OK, 0 rows affected (0.05 sec)

mysql> INSERT INTO junk VALUES ('Alpha1'),('Alpha'),('Zeta'),('Zeta1');
Query OK, 4 rows affected (0.02 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM junk;
+--------+
| item   |
+--------+
| Alpha1 |
| Alpha  |
| Zeta   |
| Zeta1  |
+--------+
4 rows in set (0.01 sec)

mysql> SELECT * FROM junk ORDER BY item;
+--------+
| item   |
+--------+
| Alpha  |
| Zeta   |
| Alpha1 |
| Zeta1  |
+--------+
4 rows in set (0.00 sec)

mysql> INSERT INTO junk VALUES ('joe');
ERROR 1265 (01000): Data truncated for column 'item' at row 1

As you can see, ENUM is a data type, that behaves strangely, unless you are enumerating. The CREATE TABLE statement limits the values allowed in the ENUM in this case to 6 items (the 4 in the ENUM, empty and NULL). When we run a SELECT against them we get them back in INSERT order, however, when we ORDER BY they are returned in the order they were entered in the ENUM creation statement. For enumeration, this makes sense, for some bloke running an ORDER BY against a column created this way, it would be some seriously bad magic.

So, can SQL Server do this? Nope, not at all. There is no functionality in SQL Server to sort data this way. I can’t even think of a good way to do it, but, I can think of a way, so here goes.

Note: I’m certain someone has done this is some better, and smarter, fashion than I’m showing.

First, we create the base table, as above with a CHECK constraint. The check constraint will limit the items allowed in a check constraint.

CREATE TABLE junk
(
	item varchar(20) NULL,
	CONSTRAINT ck_junk CHECK  (item IN ('Alpha','Zeta','Alpha1','Zeta1',''))
);

Well, this is a mess, and I haven’t even started. Here is what I’ve done.

  1. Created the item column with a varchar(20) type that allows NULL values.  ENUM allows NULL values.
  2. Create a constraint limiting the values in the item column to 5 values including the empty string.

But, that doesn’t provide a way to sort the data. According to the official documentation, our data would be sorted as follows in a ENUM.

Value    Sort
NULL     NULL
''       0
Alpha    1
Zeta     2
Alpha1   3
Zeta1    4

This means we need a second table to store the ENUM order. Here is that table.

CREATE TABLE junkSort
(
    item VARCHAR(20) NULL,
	sortorder INT,	
	CONSTRAINT ck_junkSort UNIQUE (sortorder),
	CONSTRAINT ck_junkSort1 UNIQUE (item)
);

You are probably wondering why there are two constraints here or why I didn’t use a foreign key constraint?  The reason is that a foreign key requires that the reference be unique, but ENUM columns do not have to be unique, so wave goodbye to that idea.  I put a unique constraint on both of these columns because, well, I want them both to be unique.

So, why not a primary key?  Because the ENUM accepts NULL values and sorts on  columns that don’t contain NULL values.  So, the best idea I had, in the couple of hours I’m taking to write this, was to unique both columns.

Now lets insert some data into the tables and run a query against them to match the original ENUM.

INSERT INTO junk VALUES
('Alpha'),('Alpha1'),('Zeta'),('Zeta1'),(''),(NULL);

INSERT INTO junkSort VALUES
('Alpha',30),('Alpha1',40),('Zeta',50),('Zeta1',60),('',20);

Note the lack of a NULL in the junkSort table. It’s not needed as the query will sort fine without it.

Now, the query.

SELECT j.item
FROM junk j
LEFT JOIN junkSort js
ON j.item = js.item
ORDER BY js.sortorder

item
NULL

Alpha
Alpha1
Zeta
Zeta1

And there we go, almost.

query1

Oops, lets get rid of the table scans.

CREATE CLUSTERED INDEX idx_junk ON junk (item);
CREATE CLUSTERED INDEX idx_junkSort ON junkSort (item);

query2

That’s better, especially those crystal clear JPG images I’ve just inserted.  Nothing but the best for my readers.

And, there you go, ENUM converted to SQL Server, kind of.

You can read more about the ENUM type at the MySQL manual:

https://dev.mysql.com/doc/refman/5.5/en/enum.html