mSQL - msp_FindOverlap - overlap matches and large_clusters tables – Software Support

Overlap matches table

This table contains the matching pairs that have been identified as a result of the fuzzy matching process.

Column	Description
ID	Record ID for each matching pair.
Record1	Reference ID of the first record in the matching pair.
Record2	Reference ID of the record that is the second record in the matching pair. This record belongs to the second datasource specified in the parameters.
Level	The Level column indicates the matching level(s) at which a match was found. If it contains a 1 then the two records match at the Individual level; if 2, then Family level; if 4, then Household level; and if 8, then Business level. Multiple levels are indicated by summing values – for example, 9 would indicate a match at both Individual and Business levels (1+8), and 15 a match at all four levels (1+2+4+8). By default, the Level column is followed by the total score for the four matching levels. These columns are fully configurable within a configuration file. Component scores (for name, organization, address, etc.) can also be output for any level(s).
IndividualScore	Individual level total match score.
FamilyScore	Family level total match score.
HouseholdScore	Household level total match score.
BusinessScore	Business level total match score.
These following columns relate to either master record identification or bridging prevention and shouldn’t be used for any other purpose; they are subject to change in future versions of mSQL.
MatchFlags	The MatchFlags column is only used when Bridging Prevention is enabled, which is only applicable to an internal dedupe, so disregard this column when running findoverlap.
MasterPriority1	Used for Master Record Identification.
MasterPriority2	Used for Master Record Identification.
AddressLength1	Used for Master Record Identification.
AddressLength2	Used for Master Record Identification.
Key	Indicates through which match key (as specified in your configuration file) this matching pair was found.

large_clusters table

This table lists the clusters that contain too many records (i.e. the Maximum Cluster Size has been exceeded). Processing the cluster will therefore be skipped to avoid the stored procedure potentially requiring a significant amount of processing time.

Column	Description
ID	Record ID
KeyIndex	The composite key being processed.
Search	The current composite key value that identifies the current cluster (for example, the value of mkPostOut+mkName1 – note that a pipe character separates each key value).
Records	The total number of records in the cluster.
MaxRecords	The maximum cluster size constant.

Related articles