Fuzzy Lookup Transform
Introduction
Real-world data is "dirty" because of misspellings, truncations, missing or inserted tokens, null fields, unexpected abbreviations, and other irregularities. Fuzzy lookup enable us to match input records with clean, standardize records in a reference table. To understand it properly let's take an example. Suppose we have customer information like customer name and address. During the sales transaction we take the input for customer name and address which may not be matched exactly with records in the customer reference table because of typographical or others error in the input data. Fuzzy lookup returns the best matching records from the customer reference table even if no exact match exists.
So the fizzy lookup is a very useful transform for every SSIS developer in the real-world environment. In this article we are going to learn about it.
How we use the Fuzzy Lookup Transform
Step-1 [ Create the Fuzzy Lookup Reference Table ]
-- Fuzzy Lookup Reference table
IF OBJECT_ID(N'tbl_FUZZYREFERENCES', N'U') IS NOT NULL
BEGIN
DROP TABLE tbl_FUZZYREFERENCES;
END
GO
CREATE TABLE tbl_FUZZYREFERENCES
(FIRSTNAME VARCHAR(50) NOT NULL,
LASTNAME VARCHAR(50) NOT NULL,
DOB DATETIME);
GO
--Inserting Records
INSERT INTO tbl_FUZZYREFERENCES
(FIRSTNAME, LASTNAME, DOB)
VALUES ('Joydeep', 'Das', '12-17-1974'),
('Shipra', 'Roy Chowdhury', '09-22-1974'),
('Deeepasree', 'Das', '01-31-2003');
GO
SELECT * FROM tbl_FUZZYREFERENCES;
FIRSTNAME LASTNAME DOB
Joydeep Das 1974-12-17 00:00:00.000
Shipra Roy Chowdhury 1974-09-22 00:00:00.000
Deeepasree Das 2003-01-31 00:00:00.000
Step-2 [ The Source Flat File ]
The flat file name is "FuzzySourceRecords.txt"
Step-3 [ Data Flow ]
Step-4 [ Fuzzy Lookup Transform Editor ]
Step-5 [ Derived Column Transform Editor ]
Step-6 [ Union All Transform Editor ]
Step-7 [ OLE DB Destination Create Table ]
Step-8 [ Running the Package ]
Step-8 [ Final Destination Table ]
Hope you like it.