Fact #6 - 'Conventional' Matchkeys Lie - Don't ever trust one!
Single Persistent Matchkeys
With conventional solutions, matchkeys are tied to the ‘specific’ matching rule.
In other words, your ENTIRE matching logic (finding, indexing, and scoring) is tied to your matchkey! The matchcode that you create determines your results!
On the surface this seems logical, right?
Wrong. Matchkeys fail because all of the fuzziness of the matching and scoring is built into the matchkey. Either you found it with that matchkey or you didn’t. Plus, it’s only grading similarity between your matchkey values - not the actual data. Bad search criteria (aka a bad matchcode) yield poor results.
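To make the "found it or didn't" problem concrete, here is a minimal Python sketch. It assumes a hypothetical matchcode built from fixed-length prefixes of ZIP, last name, street number and street name; the field names, the `matchkey` helper and both sample records are invented for illustration and are not taken from any vendor's product.

```python
# Hypothetical matchcode: field name plus the number of leading characters kept.
MATCHCODE = [("zip", 5), ("last_name", 5), ("street_no", 4), ("street_name", 4)]


def matchkey(record: dict) -> str:
    """Build one key by joining upper-cased, truncated field values."""
    return "|".join(record[name].strip().upper()[:width] for name, width in MATCHCODE)


# Two made-up records for the same person; the second has a transposed letter.
a = {"zip": "92618", "last_name": "Johnson", "street_no": "123", "street_name": "Main St"}
b = {"zip": "92618", "last_name": "Jonhson", "street_no": "123", "street_name": "Main St"}

key_a = matchkey(a)
key_b = matchkey(b)

# The only question a matchkey can answer: are the keys identical?
is_match = key_a == key_b
print(key_a, key_b, is_match)
```

One transposed letter inside the keyed portion of the last name is enough to produce two different keys, so the pair is simply never found. Nothing in the result says how close the two records actually were.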
Additionally, with conventional matchkeys, you’re locked into your matchcode.
For data analysts and business intelligence applications, this means creating new matchcodes and matchkeys to analyze different dimensions of your data. Individual relationships, multi-buyer relationships, and household relationships all require a new Matchcode and regenerated keys!
Change your data - change your Matchkeys. Change your Matchcode - change your matchkeys! The matchkey should be only one part of the equation. Matchkeys should be used to identify records for detailed comparison and grading - and not used for the ‘determination’ of a match.
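Here is a rough Python sketch of that division of labor: a loose key only decides which records get compared, and the grade comes from the actual field data. Everything in it is an assumption made for illustration. The ZIP-only `candidate_key`, the equal field weighting and the sample records are invented, and the standard library's `difflib.SequenceMatcher` stands in for whatever comparison a real matching engine would use.

```python
from difflib import SequenceMatcher
from itertools import combinations

SCORED_FIELDS = ("last_name", "street_no", "street_name")


def candidate_key(record: dict) -> str:
    """Loose key used only to decide which records get compared at all."""
    return record["zip"][:5]


def similarity(a: str, b: str) -> float:
    """Similarity of two field values, from 0.0 (nothing shared) to 1.0 (identical)."""
    return SequenceMatcher(None, a.upper(), b.upper()).ratio()


def grade(rec_a: dict, rec_b: dict) -> float:
    """Average the field-level similarities of the actual data (equal weights assumed)."""
    return sum(similarity(rec_a[f], rec_b[f]) for f in SCORED_FIELDS) / len(SCORED_FIELDS)


def graded_pairs(records: list) -> dict:
    """Group records by candidate key, then grade every pair inside each group."""
    groups = {}
    for rec in records:
        groups.setdefault(candidate_key(rec), []).append(rec)
    scores = {}
    for group in groups.values():
        for rec_a, rec_b in combinations(group, 2):
            scores[(rec_a["id"], rec_b["id"])] = grade(rec_a, rec_b)
    return scores


# Made-up sample data.
records = [
    {"id": 1, "zip": "92618", "last_name": "Johnson", "street_no": "123", "street_name": "Main St"},
    {"id": 2, "zip": "92618", "last_name": "Jonhson", "street_no": "123", "street_name": "Main St"},
    {"id": 3, "zip": "92618", "last_name": "Smith", "street_no": "9", "street_name": "Oak Ave"},
    {"id": 4, "zip": "10001", "last_name": "Johnson", "street_no": "123", "street_name": "Main St"},
]

scores = graded_pairs(records)
for pair, value in sorted(scores.items()):
    print(pair, round(value, 2))
```

Records 1 and 2 land in the same candidate group and receive a high but imperfect grade, while record 3 shares the group yet grades far lower. The key narrowed the search; it did not decide the match.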
Matchkeys are Binary {Y/N}
The matchcode that you create determines your results. It’s all or nothing. As you can see from the example above - it doesn’t work. Unless you visually inspect EVERY match - you either accept all the matches or none of them.
Most data quality vendors would rather not put themselves in the “precarious position of grading matchcodes”. These platform providers “would prefer to let their users evaluate their matchcodes and determine their own level of confidence, and to figure out for themselves what matchcode combinations are best”.
That statement is straight out of one vendor’s documentation. Talk about passing the buck!
When grading is provided, conventional solutions measure only the similarity of the matchkey values - not the data in the contact record. Without scoring, matchkeys are black or white, with no shades of gray.
Multiple Iterative (aka Concurrent) Matchkeys
There is no question that a Matchcode with properly extracted, transformed and standardized data will find many correct matches. But because there is no ability to overcome all of the errors and variations in data, single persistent matchkeys will fail and will miss a lot of good matches.
Understanding the limitations of using a single matchkey, conventional solutions sought another way around this.
The solution was to have users develop a Matchcode strategy that uses up to 16 or so different Matchcode combinations, ranked from tightest to loosest.
Simplified Concurrent Matchcode
Condition #1 ZIP (5) + Last Name (5) + Street # (4) + Street Name (4)
Condition #2 ZIP (5) + Last Name (5) + PO Box (10)
The match rules are executed from top to bottom. If a record is not matched by the first rule, the second rule is executed. If it is not matched by the second rule, the third rule is executed, and so on. When the job has run all of the rules, it returns a status code stating which matchcode combinations a record hit on.
In this scenario “any one rule” that meets your matchkey criteria for any two records will return as a match. These records all match each other!
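The top-to-bottom cascade can be sketched in a few lines of Python using the two conditions listed above. This is only an illustration under stated assumptions: the field names, the `build_key` and `first_rule_hit` helpers, the choice to report the first rule that hits (with 0 meaning no hit), and the decision that a rule cannot fire on a blank component are all invented here, not taken from any vendor's implementation.

```python
# The two conditions from the simplified concurrent matchcode, tightest first.
RULES = [
    [("zip", 5), ("last_name", 5), ("street_no", 4), ("street_name", 4)],  # Condition #1
    [("zip", 5), ("last_name", 5), ("po_box", 10)],                        # Condition #2
]


def build_key(record: dict, rule: list):
    """Build the matchkey for one rule, or None if any component is blank (assumption)."""
    parts = []
    for field, width in rule:
        value = record.get(field, "").strip().upper()
        if not value:
            return None
        parts.append(value[:width])
    return "|".join(parts)


def first_rule_hit(rec_a: dict, rec_b: dict) -> int:
    """Run the rules top to bottom; return the number of the first rule that hits, else 0."""
    for number, rule in enumerate(RULES, start=1):
        key_a = build_key(rec_a, rule)
        if key_a is not None and key_a == build_key(rec_b, rule):
            return number
    return 0


# Made-up sample records.
street = {"zip": "92618", "last_name": "Johnson", "street_no": "123",
          "street_name": "Main St", "po_box": "PO Box 42"}
same_street = {"zip": "92618", "last_name": "Johnson", "street_no": "123",
               "street_name": "Main Street", "po_box": ""}
box_only = {"zip": "92618", "last_name": "Johnson", "street_no": "",
            "street_name": "", "po_box": "PO Box 42"}
stranger = {"zip": "92618", "last_name": "Smith", "street_no": "9",
            "street_name": "Oak Ave", "po_box": ""}

print(first_rule_hit(street, same_street))  # hits Condition #1
print(first_rule_hit(street, box_only))     # falls through to Condition #2
print(first_rule_hit(street, stranger))     # no rule hits
```

Notice what comes back: a rule number. It tells you which key happened to line up, not how similar the two records really are.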
Iterative Matchkeys are Still Binary {Y/N}
Why this Fails…
There is no indication of confidence between one record and another. Instead, it is simply saying “hey, I found something!” and “I found more of it here, here and here.” Think about the issue with just one matchkey - then imagine increasing that problem by a factor of 16!
You have more indicators - but the matchkey results are still binary Y/N. You are still left in the position of trying to determine if they are correct.
On the surface, iterative matchkeys seem like a really good idea. I mean, as my father would say, “if at first you don’t succeed, try, try again”. Now that I think about it, the other thing my father said was “isn’t doing the same thing over and over and expecting a different result the definition of insanity?”
Before we go further, we need to dive into Fuzzy Matching Algorithms, as they are an important component of matching and key to understanding why matchkeys aren’t getting the job done.
360Science INTELLIGENTLY Scores Matches!
Conventional solutions do not!
Accuracy matters!
226%
Independent testing by one of our partners reported that the 360Science Matching Engine delivered a match rate up to 226% more accurate on customer CRM data than competing solutions.