Previous Article | matchIT SQL Index | Next Article |
Clustered environments are diverse, in terms of their configuration, purpose, and usage, and so we have outlined a recommended setup that should be viewed as a best practice rather than definitive. For comparison, two alternative setups are also outlined.
Recommended
Summary
Because clustered environments should provide mission-critical services, we do not recommend installing matchIT SQL within such clusters.
Therefore, our recommendation is to use SQL Server Replication to process data outside of the cluster.
Setup
Configure the clustered SQL Server database to replicate to a processing machine (PROC) outside of the cluster.
Install matchIT SQL on PROC. All SQL scripts and XML files – and/or SSIS packages – are stored on PROC.
All connection strings should use the local database as the Data Source, they should not reference the SQL Server cluster itself: e.g. “Data Source=(local); Initial Catalog=contacts; Integrated Security=SSPI“.
Pros
matchIT SQL has no impact on the live database.
Cons
Additional hardware required.
First Alternative
Summary
Run matchIT SQL on a machine outside of the cluster, but connect to the live database.
Setup
Install matchIT SQL on a machine outside of the cluster. All SQL scripts and XML files – and/or SSIS packages – are stored on this machine.
All connection strings should use the SQL Server cluster as the Data Source: e.g. “Data Source=sql2012cluster; Initial Catalog=contacts; Integrated Security=SSPI”.
Pros
Simplest setup.
Cons
Utilization of network bandwidth.
Some impact on the live database.
Additional hardware required.
Second Alternative
Summary
Install matchIT SQL on each machine within the cluster.
At any one time, matchIT SQL should be run on one machine within the cluster.
Use a separate machine (or NAS drive) for storing the matchIT SQL config files and SQL scripts.
Setup
XML config files and SQL scripts for matchIT SQL will be stored on a machine (or NAS drive) outside of the cluster.
Firstly, install matchIT SQL on a machine outside of the cluster (DEV). Create a new folder on the DEV machine or NAS drive and share it, giving read/write access to a domain account (ACC). Copy the default config and scripts folders (including subfolders) from the matchIT SQL installation to the shared folder.
On the DEV machine, create either SQL scripts and XML config files or SSIS packages. Modify the paths in the config files to use the share – for example, replace ‘C:\Program Files\matchIT SQL\config\’ with ‘\\machineName\shareName\config\’. (Note that the tempFileDirectory path within the generalSettings node can continue to use the local disk.)
Connection strings should use the cluster name as the Data Source:
e.g. “Data Source=sql2012cluster; Initial Catalog=contacts; Integrated Security=SSPI“
Install matchIT SQL on each machine in the cluster (CLU). On each machine in the cluster:
Install matchIT SQL. Delete everything in C:\Program Files\matchIT SQL except Uninstall.exe, the bin folder, and the temp folder (to help prevent any possible confusion).
Modify the matchIT SQL Service to run using the above domain account (ACC):
cd /d C:\Program Files\matchIT SQL\bin
matchitsql /stop
matchitsql /uninstall
matchitsql /install:DOMAIN\USERNAME:PASSWORD
matchitsql /start
(Note that this only needs to be done once on each machine, upgrading matchIT SQL doesn’t revert the service to the Local Service account.)
SQL scripts and SSIS packages should be run from within the cluster. Only one machine should run the deduplication process at any one time. Alternatively, use multiple config files so that each machine processes a distinct part of the database.
Pros
No additional hardware required outside of the existing setup.
Cons
More complicated setup.
Which machine in the cluster to use for running matchIT SQL?
Some impact on the live database.
Previous Article | matchIT SQL Index | Next Article |