31 May

SAS: Indexes

This site needed a photo..

The site needed a photo..

This is a very broad overview of indexes in SAS.  Honestly, for the most part, indexes seem to work like you would expect indexes to work and they aren’t complicated at least in the sense that I’m looking at them.

And, just a reminder, I write these as Study Guides for me, that means that I won’t, usually, go step-by-step in these things.  That will, especially, be the case here.

SAS has a handful of guidelines for creating indexes.  They are:

  • Use fewer indexes to save on storage and update costs.
  • Don’t create indexes for small tables.  In those cases, sequential access, otherwise known as a table scan, is faster.
  • Create indexes on columns that have a large number of distinct values.
  • Use indexes when a relatively small number of rows will be returned.  They specifically state “less than 15%”.

Creating indexes is, almost, straight-forward.  The syntax is

CREATE  INDEX  ON table (column(s))

The only hiccup with indexes is that an index must be named the same as the column, if, you only index a single column. In other words, you can name composite indexes but not a simple index. For example:

CREATE INDEX idx_myjunk ON work.myjunk (number);

ERROR: For a simple index, the index name must be the same as the column name.

On the other hand, here are 3 successfully created indexes.  It’s very, very, very, unlikely that you would create all 3 of these indexes on a table.  However, I haven’t been able to determine if SAS will use a partial composite index, say, if you queried the number1 column.  My guess, based on what I’m reading, and the way I think things work, is that it would not.  But, don’t take that as gospel.  I guess all of that means that I could see you doing it in a specific scenario of some sort.

proc sql;
CREATE TABLE work.myjunk
(
	number int,
	number1 int
);
CREATE INDEX number ON work.myjunk (number);
CREATE UNIQUE INDEX number1 ON work.myjunk(number1);
CREATE INDEX idx_composite ON work.myjunk(number, number1);
DESCRIBE TABLE work.myjunk;
quit;

From the log.

28         DESCRIBE TABLE work.myjunk;
NOTE: SQL table WORK.MYJUNK was created like:

create table WORK.MYJUNK( bufsize=131072 )
  (
   number num,
   number1 num
  );
create index idx_composite on WORK.MYJUNK(number,number1);
create unique index number1 on WORK.MYJUNK(number1);
create index number on WORK.MYJUNK(number);

That is a big values-add on my part.  It’s also OK to create a composite index with the same name as a simple index, however, you cannot name a composite index with the same name as a column in the index.

You can tell if SAS is using an index by using the MSGLEVEL= option. You use it as follows:

options msglevel= i or n;

Yes, it’s a global option and it will stay in effect for the session. There are two setting for message level:

  • N: Displays notes, errors and warnings.  This is the default.
  • I:  Displays additional information regarding index usage, sorts and merges.

You can force SAS to use a specific index or to process indexes in a specific fashion.  To tell SAS to use, or not use an index by using IDXWHERE.  It has two options:

  • IDXWHERE=No: Tells SAS to not use an index and instead to do a sequential scan.
  • IDXWHERE=Yes: Tells SAS to use an index and to not consider a sequential scan.

You use IDXWHERE as follows:

proc sql;
SELECT *
FROM work.myjunk (idxwhere=no)
WHERE number < 5;
quit;

You can force SAS to use a specific index with the IDXNAME= option. You simply name the index and use it the same place you use IDXWHERE as follows:

proc sql;
SELECT *
FROM work.myjunk (idxname=number)
WHERE number < 5;
quit;

And, once again, I’d like to point out that I bring incredible value to the process because the WHERE clause would pretty much guarantee that SAS used that index.

One other note on IDXNAME is that the index you specify must exist, and, it must have its first, or only, column match the condition in the WHERE clause.

SAS indexes are like a clustered index, in a very loose sense, in that they are sorted, in ascending order. However, you cannot create a SAS index in descending order, and unlike a true clustered index, because you can have multiple indexes on a table, the underlying data is not sorted as you would have in a clustered index. Frankly, they aren’t a clustered index, I just wanted to mention them in this article because I think the cool kids would do that.

Finally, you can delete an index as follows:

28         DROP INDEX number, number1 ON myjunk;
NOTE: Index number has been dropped.
NOTE: Index number1 has been dropped.

I figured it would be easier to post the log on this one. As you can see it’s acceptable to drop multiple indexes on a table. You just separate each index name with a comma.

I probably skipped a few things but it looks like, at least at a basic level, that indexes are pretty straight-forward in SAS.

Leave a Reply

Your email address will not be published. Required fields are marked *