Introduction
The matchIT API is a component written by Syniti for state-of-the-art fuzzy matching, data formatting and data cleansing. Its most common uses are duplicate prevention, inquiry, deduplication and merge/purge. The matchIT API splits and cases names and addresses, generates match keys, and grades matching records. The component provides a compact and efficient solution to the problems of data quality and duplication on any Windows-based system. The matchIT API is also available for common variants of Unix and Linux, as shared libraries.
This is the help file for the COM version of the matchIT API. It assumes that you are familiar with at least one Windows-based programming language. Experience of using COM components from within programs would be an advantage, but is not essential. If you have any questions, please contact us and we will be glad to help you.
Overview
There are two fundamental parts to the matchIT API:
- record generation
- record comparison.
These can be used in five different scenarios:
- data entry not connected to the target database, e.g. in web forms
- on-line lookup, e.g. customer inquiry
- data capture incorporating duplicate prevention
- single file matching, e.g. deduplication
- cross file matching, e.g. data load.
The first scenario, data entry not connected to the target database, does not require record comparison. To format, standardise and screen new data independently of the data that you already hold, you therefore only need to incorporate record generation in your application. This allows you to, for example, parse names into their component parts, relocate floating data to fixed fields, standardise abbreviations and expansions, and screen for garbage and abusive entries.
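To give a feel for what record generation involves, the following Python sketch parses a name into its component parts and expands a few address abbreviations. It is illustrative only: `parse_name`, `standardise_address` and the two lookup tables are hypothetical helpers invented for this example, not part of the matchIT API, and real name and address handling is considerably more sophisticated.

```python
# Illustrative only: hypothetical helpers, not the matchIT API.
TITLES = {"mr", "mrs", "ms", "miss", "dr"}
ABBREVIATIONS = {"rd": "Road", "st": "Street", "ave": "Avenue"}


def parse_name(full_name):
    """Split a free-text name into title, forenames and surname."""
    words = full_name.replace(".", " ").split()
    title = ""
    if words and words[0].lower() in TITLES:
        title = words.pop(0).capitalize()
    # Assumption: the last remaining word is the surname.
    surname = words.pop().capitalize() if words else ""
    forenames = " ".join(w.capitalize() for w in words)
    return {"title": title, "forenames": forenames, "surname": surname}


def standardise_address(line):
    """Expand known abbreviations and apply simple casing to the rest."""
    return " ".join(
        ABBREVIATIONS.get(w.lower().rstrip("."), w.capitalize())
        for w in line.split()
    )


print(parse_name("mr. john SMITH"))
print(standardise_address("12 high st."))
```

The sketch deliberately ignores cases such as "St" meaning "Saint" or surnames with internal capitals; handling those is exactly the kind of work that a dedicated component takes on.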
The remaining scenarios require both record generation and record comparison to be effective. Record generation allows you to group records intelligently for comparison, by generating phonetic and non-phonetic match keys for each record. Record comparison grades each pair of matching records with a match score, which allows you to fine-tune the matching and gives the user control over the level of matching applied.
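The interplay between match keys and match scores can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the matchIT algorithm: a basic Soundex key stands in for the phonetic match key, the standard library's `difflib.SequenceMatcher` stands in for the record comparison, and the record layout (`id`, `forenames`, `surname`), the function names and the threshold of 80 are all hypothetical.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Soundex digit for each coded consonant; other letters have no code.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return a basic four-character Soundex key (stand-in match key)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    key = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        if code and code != prev:
            key += code
        if ch not in "HW":  # H and W do not separate repeated codes
            prev = code
    return (key + "000")[:4]


def match_score(a, b):
    """Grade a pair of records from 0 to 100 (stand-in comparison)."""
    text_a = f"{a['forenames']} {a['surname']}".lower()
    text_b = f"{b['forenames']} {b['surname']}".lower()
    return round(100 * SequenceMatcher(None, text_a, text_b).ratio())


def find_matches(records, threshold=80):
    """Compare only records that share a match key; keep high scorers."""
    groups = defaultdict(list)
    for rec in records:
        groups[soundex(rec["surname"])].append(rec)
    matches = []
    for group in groups.values():
        for a, b in combinations(group, 2):
            score = match_score(a, b)
            if score >= threshold:
                matches.append((a["id"], b["id"], score))
    return matches


records = [
    {"id": 1, "forenames": "John", "surname": "Smith"},
    {"id": 2, "forenames": "Jon", "surname": "Smyth"},
    {"id": 3, "forenames": "Jane", "surname": "Jones"},
]
print(find_matches(records))
```

Here "Smith" and "Smyth" share a key, so only that pair is compared; "Jones" falls into a different group and is never scored against them. Raising or lowering `threshold` is the kind of control over the level of matching that a match score provides.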
Because an understanding of match keys and match scores is fundamental to using the matchIT API effectively in the last four scenarios listed above, we will next describe how and why they are used.