Previous Article | matchIT SQL Index | Next Article |
Power BI User Guide
The Power BI Report can be configured in both SSIS and stored procedures. Please click here for details on using Power BI in stored procedures
To enable your processing results in Power BI for a job, you will need to use the MISQL PowerBIReport component. This component must be run before all other MISQL SSIS tasks, for example:
In this example the PowerBIReport component will be run before all the MISQL SSIS tasks.
The next step is to configure the PowerBIReport component, you can do this by double clicking on the component or right clicking and choosing the Edit option.
Below you can find information on all the options:
Connection String: This is the location of the table that will store all the information from your Job (We recommend you use the matchIT_SQL database since this is already configured in the report)
Power BI Job Name: This is where you can set the name of your Job, or add to a previously run job. This is the name that will appear in Power BI.
Make Power BI Job Name Unique: If you enable this option then each time the PowerBIReport component is run it will append the date time to your Job Name.
Delete Job: Here you can delete previous jobs that you have processed
Once you have configured the PowerBIReport component, you can now run your package and a new table will be created in the database you specified in the connection string. This table contains the information from your job that Power BI will then read.
Using the Power BI Reports
Once you have successfully run a job with a PowerBIReport component you are now ready to view the results in Power BI. You first need to load up the Power BI report. Once this has loaded you will need to press the “Refresh” button on the tool bar, we advise selecting the (default) Job first, this is so that Power BI doesn’t load in data from a previous Job.
Once the refresh has completed you should see your Job in the Job Name slicer. If you select this Job you will get a brief summary of the job, how many GenerateKeys task, how many GenerateCorrectedAddresses tasks and so on.
If you now press the refresh button again, the data for this job will begin to load into Power BI, the time this takes will obviously depend on how much you are importing in. Once the refresh has completed you are ready to view the data in Power BI.
Below you can find a description of all the pages currently available. We recommend you read the “Power BI Tips and Tricks” section below so that you can get the full Power BI experience.
In the top left of every report there is an option to select the table you want to view. There should be an entry in the slicer for each task of a certain type you have run. You can click on the different entries to see the results for that tasks, or you can click a combination of them (ctrl and left click) to see the combined results for the selected tasks. This can be great if you have data from multiple sources, with different data structures and hence have run multiple, for example generate keys tasks, but want to see the combined results. (If you do not select any tasks then it will show the results from all the tasks combined)
Data Quality Report
This page will give you information about the quality of your source data. This includes the breakdown of your postcode/zip (e.g. invalid postcodes, breakdown of the forename data). The breakdown of prefix (e.g. using the supplied prefix, generated prefix). The breakdown of gender (e.g. Female, Unknown). And finally the breakdown of the forename (Empty, not found).
Quality Scores
This page will give you the information about the quality of your source data using the quality scores outputted from the MIAPI.
Quality scores give you information about how “good” your data is. Currently there are quality scores for Name, Address and Email. If you not have any of these fields mapped up then every record will get a score of 0. Please note that in order to get the Quality scores you must select to output them in the generate keys task.
Data Extraction and Potential Data Errors
This page will give you the Data Extraction results and any potential Data Errors. Data extraction includes Premise, Town, County and Country. Potential data errors include, records excluded (this is because the record contains words from our names table, for example contains deceased in the record), street populated, address populated and name populated.
Address Verification
This page will give you the Address Verification results from running the GenerateCorrectedAddressing task. With these results you can see the breakdown of the Pafflag field (this is how well the Addressing task managed to validate the records)
This page also allows you to see your addresses on a map. You can then select options at the top (e.g. addresses that have changed address1/street, changed Postcode/Zip) to see that subset of data on the map. The points on the map are done at postcode/zip level. So addresses with the same postcode/zip will occupy the same point, but the point will be larger to show there are multiple addresses there. The points also are pie charts to show the breakdown of the Pafflag field for when multiple addresses occupy the same spot on the map but have different Pafflag values. For example:
This point is in fact 28 different records that have the same zip code. But the point is a pie chart that contains the pafflag information. So by hovering over the partial section of the point we can find out that there is 15 records that got a partial result when being addressed at this zip.
Please note that Power BI has a cap on the number of addresses that cap be displayed on a map, so you might not be able to see every address.
Internal Matching Report
This page will give you internal matching result from running our FindExactMatches, GroupExactMatches, FindMatches and GroupFindMatches tasks. It allows you to filter the matching results by keys and level.
On the left hand side you can select your matching level (if you have run multiple levels of matching, and have run multiple levels of grouping), changing this will give you the different matching pairs and groups for that level. For example I have selected Household so I get the household results:
And then when I change to Individual I get the Individual results:
Please note Exact Matching Pairs will not change when you change the matching level because exact matches are independent of matching level.
In the bottom right of the page there is a bar chart showing the matching pairs by score. The Count of Pair column illustrates how many matching pairs there are at that score. You may find that you have to use the scroll bar in the matching pairs visual when you have a high number of distinct scores.
In the top left of the report there is a slicer for Key, you can use this to look at your matching pairs by the different matching keys you have used. This allows you to see the matches that are being found by each of the matching keys.
In the bottom left hand side of this page there is a visual showing you the matching groups from this find matches task. The Count of Group column illustrates how many records there are in each group. Please note if you have a lot of matching groups you will need to use the scroll bar as explained above.
In the top middle of this report there is a visual that shows the exact matching pairs from this matching. You can filter this visual by your different exact keys using the slicer on the left hand side.
In the top right hand side of the page there is a visual giving you the duplication rate. The number in the middle is how many records matchIT has determined to be duplicates and the number on the right hand side is the number of total records. This number of duplicates is based on the matching level selected on the matching level slicer.
Overlap Matching Report
This page is identical to the Internal Matching Report. Please refer to the Internal Matching Report.
Here you have the option to see the number of duplicates in each file. For example above shows the number of duplicates found in example1 that are in example2.
Suppression Report
This page allows you to see the suppression results for this Job. In the middle of the screen there is a suppression rate so you can see how many record in this file were suppressed. On the bottom left hand side of the screen there is a bar chart visual where you can see the breakdown of the suppression task and see what records hit against what file.
On the top right hand side of the screen there is a quote that has been generated for this suppression task, the prices used here are from the MISQL_SuppressionPriorities table, which are set to default values when you install matchIT SQL.
In the bottom right hand side of the screen there is map that allows you to track the new address returned. Of course this map will only be available if you have used a file which outputs new addresses. The map shows all of the locations of all of the new addresses, you can then right click on an address and drill down to see the original address. This can be great for looking at new addresses that have returned as non-UK addresses.
Power BI Tips and Tricks
Power BI is a powerful product and there are a lot of additional features that you can read about at https://powerbi.microsoft.com
However below are some brief Tips and Tricks that will help you to get the most out of Power BI.
Drill Down
Drill down is a powerful feature that allows you to see the raw data that creates the visual. For example below we have the pie chart visual for Gender from the Data Quality Report.
While it is great to a have a graphical representation of the gender, we might want to see the source data that makes this chart. For example we might want to see the records that are “Inconsistent”. You can do this by right clicking on the pie chart section and selecting see records:
This will when bring up a new screen where you can see the source data records that are “Inconsistent”
On this screen you can see all of the columns that make up this table. You can remove/add columns by clicking on the fields on the right hand side. You can then export all of the data from this screen to a csv file by clicking on the 3 dots at the top right hand side of the view:
Include/Exclude
Within a visual you can include or exclude groups so you can see exactly what you want. For example below we have the Forename bar chart visual from the Data Quality Report.
We might not want to see the forename found group since this doesn't help you identify issues in your data. You can remove this group by right clicking on it and selected “Exclude”, as in the example below shows:
Once this has been pressed this group will be excluded from the visual:
You can get it the group back by going to the filter options for the visual on the fields tab, and then removing the Excluded filter:
The above can be also done with the include option, resulting in the groups kept and all others removed. You can press and hold ctrl to select multiple groups.
Focus Mode
Sometimes a visual may be too small or a group in the visual might be too small to see. Thankfully you can maximize visuals, you can do this click on the visual and then pressing the Focus Mode button in the top right corner of the visual:
After pressing this button the visual becomes full screen:
Publish
Power BI has the ability to publish reports to the web. Once published you can share reports in your domain or even create an html link you can then send to people.
In order to make use of this functionality you will need to create (or use your office 365) account:
https://powerbi.microsoft.com/en-us/get-started/
Please note that while Power BI Desktop is free, Power BI service has certain restrictions, more details below:
https://powerbi.microsoft.com/en-us/pricing/
Once you have created your account you can then publish your report to the web. You can do this by pressing the “Publish” button on the home tab in Power BI desktop, where you will be prompted to log in:
The report then be uploaded to the Power BI service, this may take some time depending on how much data you have. Please note that when you upload a report, only the current Job will be uploaded.
You can then log in to the Power BI service in your browser and see the report:
You can now send this report to other people in your organisation that have a Power BI account. They will then be able to view the report and interact with it. More information can be found here:
https://powerbi.microsoft.com/en-us/documentation/powerbi-service-share-unshare-dashboard/
You can even publish your report to the web, by going File and then Publish to web. This will give you an html link that you can then use to send to people so that they can view the report.
Previous Article | matchIT SQL Index | Next Article |