Previous Article | matchIT SQL Index | Next Article |
4.8.2.1 msp_OutputOverlapMatchingPairs
Input Parameters:
- Configuration file – the file path of the configuration file to be used when this procedure is run.
- Main datasource ID – specifies the data source to be used within the configuration file, which contains database connection string and table/column mappings.
- Overlap datasource ID – specifies the second data source to be used within the configuration file, which contains database connection string and table/column mappings.
After running msp_GroupOverlap, this will output all record pairs that match between the two datasources to the ‘matching_pairs’ table, ordered by the Score column.
Setting |
Description |
outputSettings->matchingPairsTable |
The Overlap attribute specifies the name of the overlap matching pairs table that will be produced. If the Overlap attribute is empty, then the name attribute will be used. |
outputSettings->matchesTable |
The Overlap attribute specifies that name of the overlap matches table that will be required to generate the matching pairs table. If the Overlap attribute is empty, then the name attribute will be used. |
dataSources |
Specifies the datasource containing the specification for the source data that will be combined with the matches table to produce the overlap matching_pairs table. |
4.8.2.2 Overlap Matching_pairs table
Column |
Description |
Score |
Match Score for the matching pair. |
ID_1 |
Reference ID of the first record in the matching pair. |
ID_2 |
Reference ID of the second record in the matching pair which will be a record from the second datasource. |
MatchRef |
|
Note that the overlap matching_pairs table also contains the source fields for each record mapped in the datasource within the configuration file, allowing you to view the actual data that has matched.
4.8.2.3 msp_OutputOverlapMatchingGroups
Input Parameters:
- Configuration file – the file path of the configuration file to be used when this procedure is run.
- Main datasource ID – specifies the data source to be used within the configuration file, which contains database connection string and table/column mappings.
- Overlap datasource ID – specifies the second data source to be used within the configuration file, which contains database connection string and table/column mappings.
After running msp_GroupOverlap, this will output all groups of matching records that match between the two datasources to the ‘matching_groups’ table, ordered by the MatchRef column.
Setting |
Description |
outputSettings->matchingGroupsTable |
The Overlap attribute specifies the name of the matching groups table that will be produced. If the Overlap attribute is empty, then the name attribute will be used. |
outputSettings->groupedMatchesTable |
The Overlap attribute specifies that name of the matches_grouped table that will be required to generate the matching_groups table. If the Overlap attribute is empty, then the name attribute will be used. |
dataSources |
Specifies the datasource containing the specification for the source data that will be combined with the matches_grouped table to produce the matching_pairs table. |
4.8.2.4 Overlap Matching_groups table
Column |
Description |
MatchRef |
Reference ID for the matching group; this will reference a record in the second datasource that is considered to be the master record for the matching group. |
ID_1 |
ID of record from the first datasource. |
ID_2 |
ID of record from the second datasource. |
Note that the overlap matching_groups table also contains the source fields for each record mapped in the datasources within the configuration file, allowing you to view the actual data that has matched.
4.8.2.5 msp_OutputOverlapDuplicates
Input Parameters:
- Configuration file – the file path of the configuration file to be used when this procedure is run.
- Main datasource ID – specifies the data source to be used within the configuration file, which contains database connection string and table/column mappings.
- Overlap datasource ID – specifies the second data source to be used within the configuration file, which contains database connection string and table/column mappings.
After running msp_GroupOverlap, this will output all duplicate records from datasource2 that overlap with datasource1.
Setting |
Description |
outputSettings->duplicatesTable |
The Overlap attribute specifies the name of the duplicates table that will be produced. If the Overlap attribute is empty, then the name attribute will be used. |
outputSettings->groupedMatchesTable |
The Overlap attribute specifies that name of the matches_grouped table that will be required to generate the duplicates table. If the Overlap attribute is empty, then the name attribute will be used. |
dataSources |
Specifies the datasource containing the specification for the source data that will be combined with the matches_grouped table to produce the duplicates table. |
4.8.2.6 Overlap Duplicates table
This table contains the non master duplicate records following the matching process. The structure of the table is determined by the fields mapped in the datasource (i.e. it contains an ID field, but also the fields that you have mapped in the datasource).
4.8.2.7 msp_OutputOverlapDedupedTable
Input Parameters:
- Configuration file – the file path of the configuration file to be used when this procedure is run.
- Main datasource ID – specifies the data source to be used within the configuration file, which contains database connection string and table/column mappings.
- Overlap datasource ID – specifies the second data source to be used within the configuration file, which contains database connection string and table/column mappings.
This effectively produces the opposite of msp_OutputOverlapDuplicates. All records from datasource2 are output, except for the duplicate records.
Setting |
Description |
outputSettings->dedupedTable |
The Overlap attribute specifies the name of the deduped table that will be produced. If the Overlap attribute is empty, then the name attribute will be used. |
outputSettings->groupedMatchesTable |
The Overlap attribute specifies that name of the matches_grouped table that will be required to generate the deduped table. If the Overlap attribute is empty, then the name attribute will be used. |
dataSources |
Specifies the datasource containing the specification for the source data that will be combined with the matches_grouped table to produce the deduped table. |
4.8.2.8 Overlap Deduped table
This table contains the master records following the matching process. The structure of the table is determined by the fields mapped in the datasource (i.e. it contains an ID field, but also the fields that you have mapped in the datasource).
4.8.2.9 msp_TagOverlapMatchingResultsWithGroupLevel
Input Parameters:
- Configuration file – the file path of the configuration file to be used when this procedure is run.
- Datasource ID – specifies the data source to be used within the configuration file, which contains database connection string and table/column mappings.
- Level – can be Individual, Family, Household, Business or Custom.
This procedure renames any of the grouping related tables produced by the preceding four procedures and the msp_GroupOverlap Procedure. Each table is renamed by giving it a suffix matching the value passed in the level parameter.
Setting |
Description |
outputSettings->duplicates |
Overlap attribute specifies the name of the duplicates table to be renamed. If the Overlap attribute is empty then the name attribute will be used. |
outputSettings->dedupedTable |
Overlap attribute specifies the name of the deduped table to be renamed. If the Overlap attribute is empty then the name attribute will be used. |
outputSettings->matchingGroupsTable |
Overlap attribute specifies the name of the matching_groups table to be renamed. If the Overlap attribute is empty then the name attribute will be used. |
outputSettings->groupedMatchesTable |
Overlap attribute specifies the name of the matches_grouped table to be renamed. If the Overlap attribute is empty then the name attribute will be used. |
dataSources |
Specifies the datasource containing the connection string to the database. |
Previous Article | matchIT SQL Index | Next Article |