Data Validation in Data Warehouse has always been challenging. One of the most frequently way of data validation is by using MINUS/ EXCEPT operator. MINUS operator is used in ORACLE database while EXCEPT is used in SQL SERVER database. Before proceeding to testing part, first let us understand the functionality of MINUS operator.
MINUS operator gives the differences between the two table. Suppose we have two tables. Lets consider table A and B. When we will do MINUS operation between table A and table B (ie. A - B), it will return all those records which are present in table A but not in B. Similarly when we do MINUS operation between table B and table A (ie. B - A), it will return all those records which are present in table B but not in A.
One point to remember while doing MINUS operation between first table and second table is that every row in first table is searched in second table and if not found , that row is returned as a result set. Similarly, while doing MINUS operation between second table and first table is that every row in second table is searched in first table and if not found , that row is returned as a result set.
If all the rows present in first table is present in second table as well then output or result of first table MINUS second table will be empty result set or in other words nothing will be returned. Similarly, if all the rows present in second table is also present first table, then output or result of second table MINUS first table will be empty result set or in other words nothing will be returned.
Let us understand the usage of MINUS operator in data validation. Below SourceTable and TargetTable is used for illustration.
SourceTable TargetTable
ID
|
Name
|
ID
|
Name
|
|
1
|
ABC
|
1
|
ABC
|
|
2
|
DEF
|
2
|
DEF
|
|
3
|
GHI
|
3
|
GHI
|
|
4
|
JKL
|
4
|
JKL
|
|
5
|
LMN
|
5
|
LMN
|
|
6
|
OPQ
|
8
|
OPQ
|
SourceTable Minus TargetTable:
SELECT ID, NAME
|
FROM SourceTable
|
MINUS
|
SELECT ID, NAME
|
FROM TargetTable;
|
ID
|
Name
|
6
|
OPQ
|
In the above query, we are applying MINUS operator between SourceTable and Target Table. 6th row data in SourceTable is not present in TargetTable so this record will be returned from SourceTable.
TargetTable Minus SourceTable :
SELECT ID, NAME
|
FROM TargetTable
|
MINUS
|
SELECT ID, NAME
|
FROM SourceTable;
|
ID
|
Name
|
8
|
OPQ
|
In the above query, we are applying MINUS operator between TargetTable and SourceTable. 6th row data in TargetTable is not present in SourceTable so this record will be returned fromTargetTable.
Now let us consider below SourceTable and TargetTable for illustration. Here both SourceTable and Target Table is same.
Source Table TargetTable
ID
|
Name
|
ID
|
Name
|
|
1
|
ABC
|
1
|
ABC
|
|
2
|
DEF
|
2
|
DEF
|
|
3
|
GHI
|
3
|
GHI
|
|
4
|
JKL
|
4
|
JKL
|
|
5
|
LMN
|
5
|
LMN
|
|
6
|
OPQ
|
6
|
OPQ
|
SourceTable Minus TargetTable:
SELECT ID, NAME
|
FROM SourceTable
|
MINUS
|
SELECT ID, NAME
|
FROM TargetTable;
|
ID
|
Name
|
In the above query, we are applying MINUS operator between SourceTable and Target Table.Each row data in SourceTable is present in TargetTable so empty result set is returned.
TargetTable Minus SourceTable :
SELECT ID, NAME
|
FROM TargetTable
|
MINUS
|
SELECT ID, NAME
|
FROM SourceTable;
|
ID
|
Name
|
In the above query, we are applying MINUS operator between TargetTable and SourceTable. Each row data in TargetTable is present in SourceTable so empty result set is returned.
CONCLUSION: For Data validation to PASS,
(a.) Source Table MINUS Target Table should always return empty result set.
(b.) Target Table MINUS Target Table should always return empty result set.
If you like this post, don't forget to share and post comment.
Usefull
ReplyDeleteI have to say thanks for this site helping for me.DATA WAREHOUSE
ReplyDeletedesc sourse ;
ReplyDelete