There is a “Slowly Changing Dimension” transform in SSIS that I will write about at some point but this article is about the concept of slowly changing dimensions from the BI / Data warehouse angle.
A slowly changing dimension is a dimensional attribute that can change over time. The first time I ran into a SCD was with our customer data (we’d built a big pile of sales data). I ran a GROUP BY on the customer number and customer name and the same customer showed up twice. The customer had changed its name and the GROUP BY (technically, this was a Cyberquery, so it was actually a SUM) was returning two results for the same customer. I made a relatively quick decision, the last record in, wins, and unbeknownst to us I had implemented a Type 1 SCD.
Examples of SCD’s include
- A customer moves to a new location.
- A sales representative changes territories.
- Supplier information changes.
- A product description is updated.
Slowly changing dimensions are maintained in a numbers ways. The approach to each is known as a type and for this article there will be 5 (0 to 4). Lets look at each one.
This is the Kim Kardashian of slowly changing dimensions in that it just sits there, looks pretty, and doesn’t do a thing. In other words, you don’t change or account for any changes in the dimension.
This is the current trend in Hollywood, or so the purists whine, you don’t have a new idea, so you reboot the old one. In other words, when the data changes, you overwrite the old data with the new data. For instance,if a customer name changes then you change their name in the dimension to the new name.
This approach is, somewhat, akin to a movie sequel. You have the first movie, then a second and then a third. In a data warehouse sense, it means that you create a new record every time a dimension changes. Typically you would track that information with a start date and end date column so you have a history of when a dimension was a specific value. Type 2 SCD’s are usually maintained with a start and ending date but can also be versioned.
A type 2 SCD example:
|1||150004||Way Out BI||2004-01-01||2004-12-31|
|2||150004||BI Done Right||2005-01-01|
This is a like a time travel movie where our hero meets future and past versions of himself. The sequel and the prior movie all take place in a single movie. That’s a lousy example!
A type 3 SCD uses columns in the same table to track changes. In my earlier example, instead of two rows you have a single row with a current, and at least one past record. It would look something like this:
|1||150004||BI Done Right||2005-01-01||Way Out BI|
This is not an approach that I would be happy to see.
This is your museum. With this approach you keep the current record in the dimension table and any earlier records are kept in a history table with a date field to track changes.
There are, apparently, still more approaches to SCD’s but for the purposes of the 70-463 test I can’t imagine them popping up. If they do, well, here’s to beating the odds.
This article was very helpful.