- Country
- Format
- Geocode
- GKRInfo
- Match
- Parse
- Query
- Report
- Search
- Verify
- AMAS
- CASS
- SERP
Country Process
Overview
The Country process attempts to identify a country from the supplied input data. It performs the following steps, in order:
- Check the ForceCountry option. If set, use this.
- Check each of the fields specified in the CountryFields option. If a valid country is found, use this.
- Check the DefaultCountry option. If set, use this as a candidate.
- Check the address fields for potential candidates and duplicates identified through the previous processes.
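The precedence of the first three checks can be sketched as follows. This is a minimal illustrative model, not the Loqate implementation: the class name, the KNOWN_COUNTRIES set, and the assumption that CountryFields is a comma-separated list of field names are all hypothetical, and the final address-field check is not modelled. DefaultCountry is treated here as a plain fallback, whereas the process itself only uses it as a candidate.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Illustrative model of the option precedence only; not the Loqate implementation.
final class CountryResolutionSketch {

    // Hypothetical stand-in for the engine's own country recognition.
    static final Set<String> KNOWN_COUNTRIES = Set.of("US", "GB", "DE", "FR");

    static Optional<String> resolve(Map<String, String> options, Map<String, String> record) {
        // Step 1: ForceCountry, when set, is used as-is.
        String forced = options.get("ForceCountry");
        if (forced != null && !forced.isBlank()) {
            return Optional.of(forced.trim());
        }
        // Step 2: the first valid country found in the CountryFields
        // (assumed here to be a comma-separated list of field names).
        for (String field : options.getOrDefault("CountryFields", "").split(",")) {
            String value = record.get(field.trim());
            if (value != null) {
                String code = value.trim().toUpperCase(Locale.ROOT);
                if (KNOWN_COUNTRIES.contains(code)) {
                    return Optional.of(code);
                }
            }
        }
        // Step 3: DefaultCountry, modelled here as a simple fallback.
        String fallback = options.get("DefaultCountry");
        if (fallback != null && !fallback.isBlank()) {
            return Optional.of(fallback.trim());
        }
        // Step 4 (scanning the address fields) is not modelled in this sketch.
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, String> options = Map.of("CountryFields", "Country", "DefaultCountry", "US");
        Map<String, String> record = Map.of("Address1", "10 Downing Street", "Country", "gb");
        System.out.println(resolve(options, record).orElse("(unresolved)")); // prints GB
    }
}
```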
Relevant Options
- ForceCountry
- CountryFields
- DefaultCountry
Format Process
Overview
The Format process is designed to output fields using the specified casing and output script.
Relevant Options
- UseSymbolicTransliteration
- OutputCasing
- OutputScript
Geocode Process
Overview
The Geocode process adds latitude, longitude, and GeoAccuracy where possible, and is generally used with either Verify or Search.
Relevant Options
- GeocodeCountryList
GKRInfo Tool
Overview
Each GKR file has a set of parameters and corresponding values. The GKRInfo process is designed to provide information about the GKR files’ parameters. The GKR files are read from the data folder provided.
API Usage
The following code walks through API usage.
Initialize the server
lqtServer srv = lqtServer.create();
srv.init("C:\\loqate\\data");
lqtProcessList lst = lqtProcessList.create();
lqtProcessOptions opts = lqtProcessOptions.create();
Set the tool
lst.add("GKRInfo", opts);
Execute the process
lqtInputRecord rec = lqtInputRecord.create();
lqtProcessResult res = lqtProcessResult.create();
srv.process(rec, lst, res);
Dump the output from GKRInfo
// Requires java.io.FileWriter and java.io.PrintWriter.
FileWriter outputFile = new FileWriter("C:\\output.txt");
PrintWriter out = new PrintWriter(outputFile);
for (int record = 0; record < res.getCount(); record++) {
    out.print("\n"); // blank line between records
    for (long field = 0; field < res.getFieldCount(record); field++) {
        out.print(res.getFieldName(record, field) + ": " + "\t");
        out.print(res.getField(record, field) + "\n");
    }
}
out.close(); // flush buffered output and release the file
Output information
The GKRInfo tool is used by specifying GKRInfo as the required tool in the API. The output is retrieved by processing the Records inside the ProcessResult structure.
The tool enumerates the data files available in the data folder, collects all the GKRInfo parameters and their corresponding values, and makes them available in the ProcessResult. The information for each data file is stored in a separate Record inside the ProcessResult structure. The most common parameters output for the various data files are:
- Name: Name of the data file
- DataVersion: Release version
- GKRVersion: Version number of data format
- Reference.FieldCount: The number of fields that have available data in the Reference data file for the specific country.
- Reference.ContentType.(): These are the actual fields.
- Reference.FieldBlank.(): The number of records in which the particular field is blank. This is output for each relevant field counted in FieldCount.
- Reference.FieldDistinct.(): The number of records in which the particular field is unique and not blank. This is output for each relevant field counted in FieldCount.
For example, the following are the parameters and the corresponding values for context.lfs:
- Name context.lfs
- CompressionType 0
- Context.MaxNodeSize 256
- Context.NodeLookupPos 2036
- Context.RecordLookupPos 167805
- File.Build.Context 1357346307
- File.Version.Context 1
- DataVersion 2013Q1.0
- GKRVersion 1
Match Process
Overview
The Match process is designed to find the most closely matching record or records from the relevant reference data source to the supplied componentized input data.
Relevant Options
- SuppressAdditionFields
- MaxResults
- RangeDecompose
- VerifyMatchRules
Parse Process
Overview
The Parse process is designed to transform fully or partially unstructured address data into correctly componentized information.
Relevant Options
- ToolInfo
- ConfidenceThreshold
- SuppressFields
Query Process
Overview
The Query process is designed to perform SQL-like custom queries on the Loqate GKR. It enables searching for and retrieving multiple results of specific address components that meet user-definable conditions. The query conditions are expressed in our proprietary query language, whose syntax and semantics are described below.
Query Syntax and Semantics
A query has the following syntax:
…<(><(>QUERY-EXPRESSION <LOGICAL-OPERATOR QUERY-EXPRESSION><)> <LOGICAL-OPERATOR QUERY-EXPRESSION><)>…
The terminology used in the query syntax is described below. Parentheses are optional and may be used to nest simple queries and string them into compound queries:
PREFIX-TAG (optional): ~ means phonetic similarity
SUFFIX-TAG: % means auto-complete word, * means auto-complete phrase
The tags apply at the word level and qualify the search criteria:
QUERY-KEYWORD: <PREFIX-TAG(s)>word<SUFFIX-TAG(s)>
CONDITION-OPERATORS: LIKE (ordered matching), CONTAINS (unordered matching)
QUERY-EXPRESSION: FieldName CONDITION-OPERATOR “QUERY-KEYWORD(s)”
LOGICAL-OPERATORS: || (logical OR), && (logical AND), used to form compound queries by stringing together multiple QUERY-EXPRESSIONs
Example Queries:
- Thoroughfare LIKE “Bayhill Dr” matches Bayhill Dr but does not match Dr Bayhill or Bayhill Drive
- Thoroughfare CONTAINS “Bayhill Dr” matches Bayhill Dr and Dr Bayhill but does not match Bayhill Drive
- Thoroughfare LIKE “~Payhill Dr” matches Bayhill Dr but does not match Bayhill Tr
- Thoroughfare CONTAINS “Bay% Dr” matches Dr Bayhill, Bayhill Dr and Bay Dr, but does not match Bay Drive or TromBay Dr
- Thoroughfare LIKE “Bay Dr%” matches Bay Dr and Bay Drive but does not match Bayhill Dr or Bay Drive Ct
- Thoroughfare LIKE “Bay Dr%*” matches Bay Dr, Bay Dr Ct, Bay Drive and Bay Drive Ct
- (Thoroughfare CONTAINS “Bay%”) && (Locality LIKE “San Bruno”) matches Bayhill Drive San Bruno, Bay Dr San Bruno and Bayhill Dr San Bruno
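The difference between LIKE and CONTAINS, and the effect of the % and * suffix tags, can be modelled with a small matcher. This is a simplified sketch inferred from the example queries in this section, not the Loqate query engine: the class and method names are hypothetical, the ~ phonetic tag and the logical operators are omitted, and it assumes that LIKE must account for every word of the field value (unless the phrase ends in *) while CONTAINS only needs each keyword to match some distinct word.

```java
import java.util.Locale;

// Simplified model of the query semantics, inferred from the documented examples.
final class QuerySemanticsSketch {

    // Lower-cases the text and splits it into words.
    private static String[] words(String text) {
        return text.trim().toLowerCase(Locale.ROOT).split("\\s+");
    }

    // One keyword against one word: a trailing % accepts any completion of the word.
    private static boolean wordMatches(String keyword, String word) {
        if (keyword.endsWith("%")) {
            return word.startsWith(keyword.substring(0, keyword.length() - 1));
        }
        return word.equals(keyword);
    }

    // LIKE: keywords match the value's words in order; a trailing * allows extra trailing words.
    static boolean like(String keywords, String value) {
        boolean openEnded = keywords.endsWith("*");
        String[] k = words(openEnded ? keywords.substring(0, keywords.length() - 1) : keywords);
        String[] v = words(value);
        if (openEnded ? v.length < k.length : v.length != k.length) {
            return false;
        }
        for (int i = 0; i < k.length; i++) {
            if (!wordMatches(k[i], v[i])) {
                return false;
            }
        }
        return true;
    }

    // CONTAINS: every keyword matches a distinct word of the value, in any order.
    // The * suffix tag is not handled here.
    static boolean contains(String keywords, String value) {
        String[] k = words(keywords);
        String[] v = words(value);
        return k.length <= v.length && assign(k, 0, v, new boolean[v.length]);
    }

    // Backtracking search: pair keyword i with any unused word that it matches.
    private static boolean assign(String[] k, int i, String[] v, boolean[] used) {
        if (i == k.length) {
            return true;
        }
        for (int j = 0; j < v.length; j++) {
            if (!used[j] && wordMatches(k[i], v[j])) {
                used[j] = true;
                if (assign(k, i + 1, v, used)) {
                    return true;
                }
                used[j] = false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(like("Bayhill Dr", "Dr Bayhill"));     // ordered: no match
        System.out.println(contains("Bayhill Dr", "Dr Bayhill")); // unordered: match
        System.out.println(like("Bay Dr%*", "Bay Drive Ct"));     // word and phrase completion
    }
}
```

Run against the example queries listed above, this model gives the same match and non-match results, which is the only sense in which it has been checked.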
Process Options
The query tool uses the following process options:
- Table (required): Reference (GKR) table to be queried.
- QueryString (required): The query, expressed in the query language.
- OutputFields (optional): Comma-separated list of the output field names desired. Field names are case sensitive. The reserved keyword ALL (or null) returns all fields in the output.
- MaxResults (optional): Maximum number of output records desired. Defaults to 10; valid values are 1-1000.
- QueryClause (optional): DISTINCT, UNIQUE, or null. Specifies whether the output address component set is distinct, unique, or unconstrained across the output records, respectively.
- AliasPreference (optional): For non-range fields, specifies what to return:
  - FIRST: the first alias of each field desired in the output
  - EXHAUSTIVE: all combinations of every alias
  - UNPROCESSED (null): the field value as stored in the GKR
- RangefieldPreference (optional): For range fields, specifies what to return:
  - MATCH: the queried value verified in the range
  - RANGE: the range containing the queried value after verification
  - FULL: every value in the range containing the queried value
  - RAW (null): the range field containing the queried value as stored in the table
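The required options and the MaxResults constraints described above can be checked before a query is submitted. The following is a minimal sketch under stated assumptions: the options are held in a plain map rather than in the API's own options object, the class name and the table name EXAMPLE_TABLE are hypothetical placeholders, and the query string is written with straight double quotes, which this document does not confirm.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collects Query process options and applies the documented constraints.
final class QueryOptionsSketch {

    static Map<String, String> build(String table, String queryString, Integer maxResults) {
        // Table and QueryString are the two required options.
        if (table == null || table.isBlank()) {
            throw new IllegalArgumentException("Table is required");
        }
        if (queryString == null || queryString.isBlank()) {
            throw new IllegalArgumentException("QueryString is required");
        }
        // MaxResults defaults to 10 and must lie in the range 1-1000.
        int max = (maxResults == null) ? 10 : maxResults;
        if (max < 1 || max > 1000) {
            throw new IllegalArgumentException("MaxResults must be between 1 and 1000");
        }
        Map<String, String> options = new LinkedHashMap<>();
        options.put("Table", table);
        options.put("QueryString", queryString);
        options.put("MaxResults", Integer.toString(max));
        return options;
    }

    public static void main(String[] args) {
        // EXAMPLE_TABLE is a placeholder, not a real GKR table name.
        Map<String, String> options =
                build("EXAMPLE_TABLE", "Thoroughfare LIKE \"Bay Dr%\"", null);
        options.forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

How the resulting name/value pairs are handed to the process options object depends on the API binding in use and is not shown here.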
Output
The results of the query are returned in the ProcessResult object.
Report Process
Overview
The Report process is designed to generate reports during batch processing.
Supported reports:
- Data Quality Report (DQR)
- CASS Report
- PS3553 Report
- SERP Report
Usage
The Report process can be invoked from lqtBatch by adding the following options to the command line:
-r <list of reports to generate, delimited by the pipe character>
-ri <optional. Report info file passed to the report generator>
-ro <optional. Report output file name>. If no file name is specified, a default file name will be used.
| Supported Report | Definition |
| --- | --- |
| DQR | Data Quality Report |
| CASS | CASS Stage File Report |
| PS3553 | CASS summary report that may be submitted to USPS together with the mailing list that was processed using the CASS Process |
| SERP | SERP address accuracy report |
For example:
lqtBatch.exe -p v -r DQR -i c:\test\input.txt -d c:\test\data -ro myreport.txt
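When several reports are requested, the report names are joined with the pipe character. Typed directly into a command shell, an unquoted pipe is interpreted as a shell pipeline, so the list usually needs quoting there. The sketch below sidesteps that by assembling the argument list in Java, where no shell is involved. The class and method names are hypothetical; the flags and the "-p v" argument are copied from the example above, and the sketch only builds the arguments without launching anything.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: assembles an lqtBatch argument list with a pipe-delimited report list.
final class BatchCommandSketch {

    static List<String> command(String input, String dataDir, List<String> reports, String reportOut) {
        List<String> cmd = new ArrayList<>(List.of("lqtBatch.exe", "-p", "v"));
        // -r takes the report names delimited by the pipe character.
        cmd.add("-r");
        cmd.add(String.join("|", reports));
        cmd.add("-i");
        cmd.add(input);
        cmd.add("-d");
        cmd.add(dataDir);
        // -ro is optional; when omitted, a default file name is used.
        if (reportOut != null) {
            cmd.add("-ro");
            cmd.add(reportOut);
        }
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = command("c:\\test\\input.txt", "c:\\test\\data",
                List.of("DQR", "CASS"), "myreport.txt");
        // The list could be passed to new ProcessBuilder(cmd); here it is only printed.
        System.out.println(String.join(" ", cmd));
    }
}
```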
Verify Process
Overview
The Verify process is designed for batch environments in which address information is supplied as address lines, address components, or a combination of both, and cleansed address data is returned. Internally, the Verify process runs the Country, Parse, Match, and Format processes.
Relevant Options
- VerifyCountryList
- MinimumVerificationLevel
- MinimumMatchscore
- MinimumPostcode
- MaxResults
- ConfidenceThreshold
- SuppressFields
- SuppressAdditionFields