01 Jun

SAS: Views

This post will be similar to the earlier one on indexes.  Views look fairly clean in SAS and, for the most part, I think I know what is going on here, so I’m only going to hit parts where I feel some doubt, however small.  As such, this won’t be the most useful post for someone, who isn’t me, to read.

Creating a view in SAS is pretty much like every other view you’ve created.

libname mysql '....';

ods listing;

proc sql;
CREATE VIEW myview AS
	SELECT name, age, sex, height, weight
        FROM mysql.admit;
quit;

proc sql outobs=5;
SELECT * FROM myview;
quit;

Name                 Age  Sex    Height    Weight
-------------------------------------------------
Murray, W             27  M          72       168
Almers, C             34  F          66       152
Bonaventure, T        31  F          61       123
Johnson, R            43  F          63       137
LaMance, K            51  M          71       158

You can display the definition of a view by using DESCRIBE VIEW which will write the view’s definition to the log.

proc sql;
DESCRIBE VIEW myview;
quit;

28         DESCRIBE VIEW myview;
NOTE: SQL view WORK.MYVIEW is defined as:

        select name, age, sex, height, weight
          from MYSQL.ADMIT;

SAS offers up a handful of suggestions on how, and when to use a view.

  • They recommend against using an ORDER BY (you can do it) because other users of the view may want to sort it differently.
  • If the data is reused, a lot, they recommend creating a table instead of a view.
  • If the table structure changes a view can become unusable.

You may have noticed that I did not use a LIBREF in the view examples I created earlier.  SAS, actually, recommends that you do that if the view resides in the same resides in the same library as the table.  In my case, they’re both in WORK.

You can assign a LIBREF in the view and not use a separate LIBREF statement.  For example:

proc sql;
CREATE VIEW myview AS
	SELECT name, age, sex, height, weight
    FROM mysql.admit
	USING LIBNAME mysql '....';
quit;

I think just knowing that, that can be done, would be enough for the test.

You can, just like in a regular database, UPDATE, DELETE or INSERT into a view, in the right situations. Per SAS,

  • You can only update a single table in a view.  You cannot update a query that has a join or a subquery.
  • You cannot update a derived column.
  • You can update a view with a WHERE clause.  The book doesn’t state this but I would assume that inserts would also be restricted by the where clause, which will act like a constraint in a view.
  • You cannot update a view that has a HAVING, ORDER BY or GROUP BY clause.

You get rid of a view the old-fashioned way.

25         DROP VIEW myview;
NOTE: View WORK.MYVIEW has been dropped.

One last note, you cannot join a view.

I’m kind of scared when it comes to views because the book’s chapter on them is really short. I’m afraid that I’m missing things. At the same time, this book has a different feel to it compared to the SAS Base exam. Base felt really petty, even in the book’s sample questions, but I get the sense that the scope here is large enough that they “may” not hammer as much minutiae on this exam.

I guess I’ll have to take it to find out.

01 Mar

Creating An Aggregate View With A Clustered Index

An aggregate view can be an excellent alternative to summary tables, especially with a clustered index. However, there are a few quirks that show up in SQL Server so I thought I’d create a simple one and plow through a litany of errors.

SELECT column_name, data_type, character_maximum_length
FROM information_schema.columns
WHERE table_name='militaryspending';

column_name	data_type	character_maximum_length
country	        varchar	        50
spending	float	        NULL
gdpperc	        float	        NULL
region	        varchar	        30

Our first try.

CREATE VIEW vw_spending AS
SELECT region, SUM(spending)
FROM militaryspending
GROUP BY region;

Msg 4511, Level 16, State 1, Procedure vw_spending, Line 2
Create View or Function failed because no column name was specified for column 2.

Alright, we need an alias for the aggregate.

IF object_id('vw_spending','V') IS NOT NULL
	DROP VIEW vw_spending;
GO
CREATE VIEW vw_spending AS
SELECT region, SUM(spending) as totSpending
FROM militaryspending
GROUP BY region;
GO
SELECT *
FROM vw_spending

region			totSpending
Africa			120608
Asia			428477.7
Australia		28559.1
Europe			312412.2
North America	        743120.4
South America	        65976.9

OK, that part worked, but we also need to add a clustered index if we want all the benefits of an aggregate view.

CREATE CLUSTERED INDEX idx_vwspending ON vw_spending (region);

Msg 1939, Level 16, State 1, Line 9
Cannot create index on view 'vw_spending' because the view is not schema bound.

Oops, OK, time for some schemabinding and recreating the view.

IF object_id('vw_spending','V') IS NOT NULL
	DROP VIEW vw_spending;
GO
CREATE VIEW vw_spending 
WITH SCHEMABINDING AS
SELECT region, SUM(spending) as totSpending
FROM militaryspending
GROUP BY region;
GO
CREATE CLUSTERED INDEX idx_vwspending ON vw_spending (region);

Msg 4512, Level 16, State 3, Procedure vw_spending, Line 3
Cannot schema bind view 'vw_spending' because name 'militaryspending' is invalid for schema 
binding. Names must be in two-part format and an object cannot reference itself.
Msg 1088, Level 16, State 12, Line 1
Cannot find the object "vw_spending" because it does not exist or you do not have 
permissions.

Two errors!!!! Come on!!!!!

THis is really easy to fix, despite it being two errors, we just need to add the schema to the table name.

IF object_id('vw_spending','V') IS NOT NULL
	DROP VIEW vw_spending;
GO
CREATE VIEW vw_spending 
WITH SCHEMABINDING AS
SELECT region, SUM(spending) as totSpending
FROM dbo.militaryspending
GROUP BY region;
GO
CREATE CLUSTERED INDEX idx_vwspending ON vw_spending (region);

Msg 1941, Level 16, State 1, Line 1
Cannot create nonunique clustered index on view 'vw_spending' because only unique 
clustered indexes are allowed. Consider creating unique clustered index instead.

Alright, alright, I’ll add make the index unique.

IF object_id('vw_spending','V') IS NOT NULL
	DROP VIEW vw_spending;
GO
CREATE VIEW vw_spending 
WITH SCHEMABINDING AS
SELECT region, SUM(spending) as totSpending
FROM dbo.militaryspending
GROUP BY region;
GO
CREATE UNIQUE CLUSTERED INDEX idx_vwspending ON vw_spending (region);

Msg 10138, Level 16, State 1, Line 1
Cannot create index on view 'WingTip.dbo.vw_spending' because its select list 
does not include a proper use of COUNT_BIG. Consider adding COUNT_BIG(*) to select list.

Seriously? We have to add a made up aggregate, to add an aggregate? Yep!

I did a quick Google on this but I couldn’t determine why it’s necessary. The best explanation I found, and it could certainly be true, is that the COUNT_BIG aggregate helps the optimizer keep track of the number of rows in the view. Why not a regular COUNT? I don’t know. The only different between COUNT, besides this example, is that one returns a bigint and the other a regular old int. The msdn for BIG_COUNT doesn’t say a thing about it.

Anyways, one more time, with feeling.

IF object_id('vw_spending','V') IS NOT NULL
	DROP VIEW vw_spending;
GO
CREATE VIEW vw_spending 
WITH SCHEMABINDING AS
SELECT region, SUM(spending) as totSpending, COUNT_BIG(*) AS countBig
FROM dbo.militaryspending
GROUP BY region;
GO
CREATE UNIQUE CLUSTERED INDEX idx_vwspending ON vw_spending (region);

Msg 8662, Level 16, State 0, Line 1
Cannot create the clustered index "idx_vwspending" on view "WingTip.dbo.vw_spending" because 
the view references an unknown value (SUM aggregate of nullable expression). Consider 
referencing only non-nullable values in SUM. ISNULL() may be useful for this.

This can be fixed in a couple of ways. First, you can set the spending column to not accept NULL values. I was lazy bringing it over so I allowed it to accept NULL. Or, you could do what the error says and add ISNULL to the query which is what I’m going to do here.

IF object_id('vw_spending','V') IS NOT NULL
	DROP VIEW vw_spending;
GO
CREATE VIEW vw_spending 
WITH SCHEMABINDING AS
SELECT region, SUM(ISNULL(spending,0)) as totSpending, COUNT_BIG(*) AS countBig
FROM dbo.militaryspending
GROUP BY region;
GO
CREATE UNIQUE CLUSTERED INDEX idx_vwspending ON vw_spending (region);
GO
SELECT *
FROM vw_spending;

region			totSpending	countBig
Africa			120608		39
Asia			428477.7	28
Australia		28559.1		3
Europe			312412.2	36
North America	        743120.4	9
South America	        65976.9		11

And there ya go. Finally, we have an aggregate view with a clustered index off of a casually designed table.