DTFSQLite cartridge overview
DTFSQLite cartridge allows to read DTF and AddressBase Premium gazetteers, including full uploads and change only updates.
It supports and requires the following parameters provided in the connection string:
- Data Source – which points to the folder with a collection of DTF or AddressBase Premium csv files followed by the asterisk *, e.g. data source=”C:\GazetteerToUpload\*”
- SQLite Database – which indicates the location of a SQLite database which will be used during merging process. The merging process combines multiple tables into a single gazetteer table.
Please note that:
- It is expected that SQLite Database file does not exist but its directory is already created.
- If the file already exists, it will be deleted.
Format type (DTF or ABP) is automatically determined based on version encoded in the first line of csv files.
Source file upload order
The source files (.csv) are processed in the following order:
- All full upload files are identified
- All COU files older then full uploads are ignored
- Only the newest upload per custodian code is taken for further processing with exception of 9999 custodian code which marks extracts from National Hub. All uploads from National Hub are processed.
- All records from selected files (Full uploads + COUs) are then segregated into appropriate tables.
- All but the newest records are removed from the aforementioned tables.
DTF specific comments
SQLite Database directory will be also used to store missingUprns.txt and missingUsrns.txt which contain lists of UPRNs and USRNs existing in LPI table but missing from Street Descriptors or BLPUs tables.
In addition to fields described in DTF specification, the final gazetteer table contains the following columns:
- CROSS_REFERENCE – contains aggregated XREF_KEY, END_DATE, CROSS_REFERENCE, SOURCE. Individual fields are separated by comma characters and multiple entries are separated by semicolon.
- LOGICAL_STATUS_DESCRIPTION – contains logical status of LPI record as text
- BLPU_CLASS_1ST – contains the first letter of BLPU_CLASS (useful for thematic mapping)
- BLPU_CLASS_2ND – contains the first two letters of BLPU_CLASS (useful for thematic mapping)