Solution: DSP
Version: 6.6
Component: Collect and Assemble
I have recently worked in a request from the client that I would like to share with you as it has a few small functionality within Collect that are very useful in certain scenarios.
The requirement was:
The client has 3 tables (the same structure and similar data) in the legacy system. They want to merge the 3 tables while we collect the data into a single table in SQL.
How would you do this?
If you are thinking to create 3 process in collect, using the same Target Table but different Source table, bear in mind that the Target table has to be unique in Collect, so it will not allow you to add 3 processes with the same Target Table name.
Let’s go step by step and see how to do it.
For this example I’ve created 3 txt files;
First step is to create the Assemble packages:
Three in total:
- sdbFlatfile_Foreign_Suppliers_1.imp
- Source File Name = Foreign_Suppliers_1.txt
- Target Table Name = Foreign_Suppliers
- sdbFlatfile_Foreign_Suppliers_2.imp
- Source File Name = Foreign_Suppliers_2.txt
- Target Table Name = Foreign_Suppliers
- sdbFlatfile_Foreign_Suppliers_3.imp
- Source File Name = Foreign_Suppliers_3.txt
- Target Table Name = Foreign_Suppliers
Next create the Collect process:
To create the process you need to specify the Target Table name and the Package Type (manual cranport as it is coming from a flat file):
Next, you specify the package name:
Now, when we try to add a new process that use the same Target Table name an error occur:
This is because the Target Table name has to be unique.
Then, how do you add multiple sources into a single table? We need to use one of the advance properties, Table Rename. This displays the table name of the renamed table in target database. New table name must contain [TableName]. At run time, [TableName] is replaced with processing table name.
This allows us to create multiple process with different names and different assemble packages but all of them working with the same table in SQL.
But you have probably notice that this approach has a big issue. Didn’t you?
This is the result of Collect:
What is the total number of records in the output table Foreign_Suppliers in SQL?
- A: 95
- B: 112
- C: 3
The correct answer is C, 3. this is because Collect delete the data and insert the new one in every process. We could have selected to do not delete the data, but them, we will be appending data every day.
What do we do?
There is one more advance property that will helps us, Target Delete Where Clause (in the assemble package). If Target Delete Where Clause is not blank and Target Delete Records is checked, the where clause will be used when deleting records from the target table. "WHERE" should not be entered.
In our text file we have a column that specify whether is from the table 1, 2 or 3:
If we change our configuration in the Assemble Package to delete only the data from the corresponding file:
Will do the same for TableName=2 and TableName=3.
Now, when we execute Collect we see the 112 rows that we were expecting:
Please note that even though in Foreign_Suppliers_1 has 95 entries, we will see 112 that is the sum of the 3 of them (95+14+3).
I hope you have found this post interesting.
Please do not hesitate to ask any questions.
Comments
0 comments