Slowly Changing Dimension Type 2 is used for tracking historical data by maintaining a version of records. CD type 2 tracks the data history by creating multiple versions of records and using either date or flag to identify the active record. This method allows tracking any number of histories as each time new record is inserted in table with its version identifier.
Lets take an example as we discussed earlier in Type 1
| CompanyID | CompanyName | CompanyLocatino | is_active |
| 1 | ABC | Maharashtra | 0 |
| 2 | ABC | Karnataka | 1 |
In the same example, if the Company moves to Karnataka, the table would look like this:
| CompanyID | CompanyName | CompanyLocation | Start_Date | End_Date |
| 1 | ABC | Maharashtra | 1-Jan-00 | 21-Dec-04 |
| 2 | ABC | Karnataka | 22-Dec-04 |
in above example is_active is the version identifier which helps to identify the current Location, however instead of using flag dates can be used which can also help to find out number of period the record was active.
NULL end date can be used to identify the current active record, instead of using NULL if some standard date is used e.g. 1111-11-11 this column can be used for an Index
SCD Type 2 Advantages:
This allows to keep all historical information.
Disadvantages:
Maintaining huge history with SCD 2 can make your datawarehouse huge and it should be used after evaluating business use cases and data requirement for reporting.
SCD 2 ETL transformations are bit complicated to develop however now a days most of the commercial tool are coming up with wizards to develop SCD Type 2
Did you enjoy this article? Please subscribe to Email or RSS to receive all the FREE updates!
Related posts: