When dimension changes over a period of time, it is called as Slowly Changing Dimension and it is very popularly known as SCD. It is used for tracking history of dimension attributes.There are mainly six types of SCD which are TYPE 0, TYPE 1, TYPE 2, TYPE 3, TYPE 4 and TYPE 6.
EmployeeDeatil table mentioned below is used as an illustration, contains columns Emp_Key, EmployeeID, Name and Designation. Emp_key is a Surrogate Key.
TYPE 0 (Retains original value): This is a passive method. Value remains exactly the same as that of when inserted for the first time. This is not often used.
TYPE 1 (Overwriting old value): In this type, old value is replaced by a new value in the dimension table. Old data is lost in this process.
Let us assume Employee has been promoted from Test Engineer to Test Lead. So, Designation column has been replaced with new value as shown below. Thus we are not maintaining any history of changes that has taken place.
Old Value
EmployeeDeatil table mentioned below is used as an illustration, contains columns Emp_Key, EmployeeID, Name and Designation. Emp_key is a Surrogate Key.
TYPE 0 (Retains original value): This is a passive method. Value remains exactly the same as that of when inserted for the first time. This is not often used.
TYPE 1 (Overwriting old value): In this type, old value is replaced by a new value in the dimension table. Old data is lost in this process.
Let us assume Employee has been promoted from Test Engineer to Test Lead. So, Designation column has been replaced with new value as shown below. Thus we are not maintaining any history of changes that has taken place.
Old Value
Emp_Key
|
EmpID
|
Name
|
Designation
|
111
|
1001
|
Sunil
|
Test Engineer
|
New Value
TYPE 2 (Adding new row): In this type, a complete
history is maintained by adding extra row to the dimension table. Whenever a new row is added, a new surrogate key (Emp_key) is assigned. Type 2 can be implemented by following two ways.
(a.) By adding Effective Date Columns i.e. Effective Start Date and Effective End Date .
Let us assume Employee has been promoted from Test Engineer to Test Lead.
Here all old records are marked with start date in 'Effective Start Date ' Column and end date in 'Effective End Date' Column. The most recent or active record will have 'Effective End Date ' as NULL or very high date i.e 31/12/9999
Old Value
New Value
(b.) By adding Flag Column i.e. ActiveFlag
Let us assume Employee has been promoted from Test Engineer to Test Lead.
Here all old records in the dimension table is marked with Flag =0 in 'ActiveFlag' column while one recent or active record is marked with Flag=1 in 'ActiveFlag' column as shown below.
Old Value
New Value
TYPE 3 (Adding new columns): In this type history is maintained by adding attributes/columns to the dimension table. Here only current and previous values are retained as history. New value is provided in 'New' column while old value is stored in 'old' column.
Let us assume Employee has been promoted from Test Engineer to Test Lead.
You can see 'New Designation' column is provided with new values while and 'Old Designation' column is provided with old values.
Old Value
New Value
TYPE 4 (Adding history Table): In this type, a new history table is added to keep track of all the changes of attributes in the dimension table. The main dimension table contains latest or recent records.
Let us assume Employee has been promoted from Test Engineer to Test Lead.
EmployeeDetail (Original Table)
EmployeeDetail_History (History Table)
TYPE 6 (Hybrid): It combines the approach or method used in TYPE 1, TYPE 2 and TYPE 3 (1+2+3=6).
Example of EmployeeDetail Table with one record
We see that Current_Flag value is overwritten in with the new information, as in Type 1 processing. New records are created to track the changes, as in Type 2 processing. And we store the history in Historical Value column , which incorporates Type 3 processing.
Emp_Key
|
EmpID
|
Name
|
Designation
|
111
|
1001
|
Sunil
|
Test Lead
|
TYPE 2 (Adding new row): In this type, a complete
history is maintained by adding extra row to the dimension table. Whenever a new row is added, a new surrogate key (Emp_key) is assigned. Type 2 can be implemented by following two ways.
(a.) By adding Effective Date Columns i.e. Effective Start Date and Effective End Date .
Let us assume Employee has been promoted from Test Engineer to Test Lead.
Here all old records are marked with start date in 'Effective Start Date ' Column and end date in 'Effective End Date' Column. The most recent or active record will have 'Effective End Date ' as NULL or very high date i.e 31/12/9999
Old Value
Emp_Key
|
EmpID
|
Name
|
Designation
|
111
|
1001
|
Sunil
|
Test Engineer
|
New Value
Emp_Key
|
EmpID
|
Name
|
Year
|
Designation
|
Effective Start Date
|
Effective End Date
|
111
|
101
|
Sunil
|
2012
|
Test Engineer
|
1/1/2012
|
31/12/2015
|
112
|
101
|
Sunil
|
2016
|
Test Lead
|
1/1/2016
|
31/12/9999
|
(b.) By adding Flag Column i.e. ActiveFlag
Let us assume Employee has been promoted from Test Engineer to Test Lead.
Here all old records in the dimension table is marked with Flag =0 in 'ActiveFlag' column while one recent or active record is marked with Flag=1 in 'ActiveFlag' column as shown below.
Old Value
Emp_Key
|
EmpID
|
Name
|
Designation
|
111
|
1001
|
Sunil
|
Test Engineer
|
New Value
Emp_Key
|
EmpID
|
Name
|
Designation
|
ActiveFlag
|
111
|
101
|
Sunil
|
Test Engineer
|
0
|
112
|
101
|
Sunil
|
Test Lead
|
1
|
TYPE 3 (Adding new columns): In this type history is maintained by adding attributes/columns to the dimension table. Here only current and previous values are retained as history. New value is provided in 'New' column while old value is stored in 'old' column.
Let us assume Employee has been promoted from Test Engineer to Test Lead.
You can see 'New Designation' column is provided with new values while and 'Old Designation' column is provided with old values.
Old Value
Emp_Key
|
EmpID
|
Name
|
Designation
|
111
|
1001
|
Sunil
|
Test Engineer
|
New Value
Emp_Key
|
EmpID
|
Name
|
Old Designation
|
New Designation
|
111
|
1001
|
Sunil
|
Test Engineer
|
Test Lead
|
TYPE 4 (Adding history Table): In this type, a new history table is added to keep track of all the changes of attributes in the dimension table. The main dimension table contains latest or recent records.
Let us assume Employee has been promoted from Test Engineer to Test Lead.
EmployeeDetail (Original Table)
Emp_Key
|
EmpID
|
Name
|
Designation
|
112
|
1001
|
Sunil
|
Test Lead
|
EmployeeDetail_History (History Table)
Emp_Key
|
EmpID
|
Name
|
Designation
|
Effective Start Date
|
Effective End Date
|
111
|
101
|
Sunil
|
Test Engineer
|
1/1/2012
|
31/12/2015
|
112
|
101
|
Sunil
|
Test Lead
|
1/1/2016
|
31/12/9999
|
TYPE 6 (Hybrid): It combines the approach or method used in TYPE 1, TYPE 2 and TYPE 3 (1+2+3=6).
Example of EmployeeDetail Table with one record
Emp_Key
|
EmpID
|
Name
|
Current Value
|
Historical Value
|
Effective Start Date
|
Effective End Date
|
ActiveFlag
|
111
|
1001
|
Sunil
|
Test Engineer
|
Test Engineer
|
1/1/2012
|
31/12/9999
|
Y
|
Let us assume Employee gets promotion from Test Engineer to Test Lead , we add a new record, as shown below.
Emp_Key
|
EmpID
|
Name
|
Current Value
|
Historical Value
|
Effective Start Date
|
Effective End Date
|
Current_Flag
|
111
|
1001
|
Sunil
|
Test Lead
|
Test Engineer
|
1/1/2012
|
31/12/2015
|
N
|
112
|
1001
|
Sunil
|
Test Lead
|
Test Lead
|
1/1/2016
|
31/12/9999
|
Y
|
Let us assume again Employee gets promotion from Test Test Lead to Manager, we add a new record, as shown below.
Emp_Key
|
EmpID
|
Name
|
Current Value
|
Historical Value
|
Effective Start Date
|
Effective End Date
|
Current_Flag
|
111
|
1001
|
Sunil
|
Manager
|
Test Engineer
|
1/1/2012
|
31/12/2015
|
N
|
112
|
1001
|
Sunil
|
Manager
|
Test Lead
|
1/1/2016
|
31/12/2016
|
N
|
113
|
1001
|
Sunil
|
Manager
|
Manager
|
1/1/2017
|
31/12/9999
|
Y
|
We see that Current_Flag value is overwritten in with the new information, as in Type 1 processing. New records are created to track the changes, as in Type 2 processing. And we store the history in Historical Value column , which incorporates Type 3 processing.
You put really very helpful information.DATA WAREHOUSE
ReplyDelete