1470 - Replicating IDENTITY Fields – Software Support

Product:	Syniti Data Replication (Syniti DR, fka DBMoto)
Version:	All
ID:	1470
Summary:	Replicating IDENTITY Fields

IDENTITY fields are numeric fields that are auto-generated by the DBMS when a new record is inserted. The following two conditions usually characterize an identity field:

(a) it is not writable;
(b) its value is unknown until the new record has been inserted.

These two properties can make identity fields quite difficult to use as part of a replication. The complications can be divided into several different cases.

Identity fields defined only in the source table (in refresh or mirroring replications)
Identity fields defined also in the target table (in refresh or mirroring replications)
Identity fields defined also in the target table (in synchronization replications)

Case 1: Identity fields defined only in the source table (in refresh or mirroring replications)

The identity field, usually defined to represent a primary key column in a source table that does not have any other primary key, can be used in a replication like any other field. You can decide whether to add a matching column in the target table (where the identity will also be a primary key), or simply not to map the identity to the target. Remember that, in mirroring replications, a primary key is needed in the target table and generally this means creating a matching primary key column that will receive the values of the identity in the source.

Because the identity is specified only in the source table, the conditions (a) and (b) above don’t affect the replication logic.

Source Table		Target Table
[PK] [IDENTITY] ID	>	[PK] ID
COL1	>	COL1
COL2	>	COL2

Case 2: Identity fields defined also in the target table (in refresh or mirroring replications)

For one-way replications (refresh or mirroring), you can decide to exclude the identity field from the mapping to the target table so that transactions will not to insert or update any value in that field, thereby avoiding a violation of condition (a) above. This means that you will need to select another field as the primary key field in the target table, and this field will have to be mapped to a field in the source table.

Source Table		Target Table
[PK] [IDENTITY] ID	>	[PK] [IDENTITY] ID
COL1	>	[PK] COL1
COL2	>	COL2

Case 3: Identity fields defined also in the target table (in synchronization replications)

This is the most complex situation, because usually identity fields (which generally have different seeds in the source and in the target table and generate different values) need to match in a two-way replication (synchronization). The decision to define an identity field to be used as primary key both in the source and in the target table can be brought about from the need to include an identity column without consideration for its value. Replicating an identity field in a synchronization replication would violate condition (a). A possible scenario to avoid violating condition (a) is:

Source Table		Target Table
[PK] [IDENTITY] ID	<>	[PK] [IDENTITY] ID
[PK] COL1	<>	[PK] COL1
COL2	<>	COL2

where no mapping is applied between the identity fields in either direction. This solution can be adopted if 1) another column (COL1), or set of columns defined or identifiable as a primary key, exists in both tables and 2) the identity fields do not need to be kept synchronized. This solution basically ignores the identity fields in each table and they will assume valid and non matching values.

As an alternative to this solution (also an alternative to the solution proposed for case 2), many databases offer the possibility to temporarily disable constraint (a) on identity fields, making them available for writing. This option allows you to map a column to an identity field and to force the value of the identity field to match the one in the source table:

Source Table		Target Table
[PK] [IDENTITY] ID	<Identity off >	[PK] [IDENTITY] ID
COL1	<>	COL1
COL2	<>	COL2

However, you should consider that, because the identity field is automatically generated, there would not be full control over its value in new records eventually being inserted. This solution could trigger a high number of conflicts when the synchronization attempts to resolve record insertions in the source and in the target table. Also, the format and range of data of the identity type in one DBMS does not always match those of the other DBMS.

In general, it is not advisable to use identity fields in synchronization replications, relegating to the DBMS the task of uniquely identifying a record with the goal of synchronizing a source and a target table because it does not provide a sure way to keep source and target tables synchronized on a matching primary key column.

Related articles