Introduction
The migration program reads the source recruitment data and transforms it into a migrated ClinSpark dataset, which is used to initialize the customer’s ClinSpark instance.
Customers need to review the conversion process to confirm that it operates correctly and that its output is valid. While the source code for the migration program itself is not available for customer review, the migration process produces a report specifically designed to give customers visibility into the program's internal operation and outputs.
Sheets in the QC Import Report
The Import Report is an Excel file containing several sheets that support review.
Conversion Logic - An itemization of all volunteer data fields in ClinSpark which receive migrated data, and details about the logic used to populate them
Conversion Notes - Informational notes about the migration of specific data fields
Conversion Warnings - Any unexpected data issues, such as source data not meeting an expected format or data which the program does not know how to handle, are recorded here
Load Report - A listing of aggregate counts of various kinds of data imported into ClinSpark
Import Report - A listing of aggregate counts of various kinds of data extracted from the source recruitment database
Conversion Logic
This sheet is designed to give customers insight into the specific rules and conversion logic used to transform and load their source data into ClinSpark. Each ClinSpark volunteer data field which receives migrated output gets a row in this table.
Target Table - The ClinSpark DB table containing the column
Target Column - The ClinSpark DB column receiving the data
Source Table - The customer DB table the value is taken from
Source Column - The customer DB column the value is taken from
Migration Type - The type of migration operation applied to this field
Notes - Clarifying notes if applicable
Migration Type typically contains the following values:
Copy - This field is copied without modification from the source field to the target
Default - This target field receives a default value which will be specified in the note
Ignored - This field is not populated
DateConversion - A date conversion is applied from source to target
Conversion - Conversion logic is applied to transform the source field to the target
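As an illustration of how a reviewer might use the Migration Type during spot checks, the sketch below states what each type implies about a field's source and target values. It is not part of the migration program; the function name `check_field` and the sample values are hypothetical.

```python
def check_field(migration_type, source_value, target_value, default=None):
    """Return True/False when the expectation is mechanical, None when it needs manual review.

    Illustrative only: the Migration Type names come from the Conversion Logic
    sheet, but this helper and its arguments are assumptions.
    """
    if migration_type == "Copy":
        # Copied without modification, so the values should be identical.
        return target_value == source_value
    if migration_type == "Default":
        # The expected default is the one stated in the row's Notes column.
        return target_value == default
    if migration_type == "Ignored":
        # The target field is not populated.
        return target_value is None
    # DateConversion and Conversion depend on field-specific logic,
    # so they are left for manual review against the Notes column.
    return None


print(check_field("Copy", "Smith", "Smith"))            # identical values expected
print(check_field("Default", None, "N/A", default="N/A"))
print(check_field("Conversion", "5 ft 10 in", "177.8"))  # needs manual review
```

Rows with type Copy, Default, or Ignored can be checked mechanically, which leaves reviewer time for the DateConversion and Conversion rows.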
Conversion Notes
As the migration program processes source data, any record with an operation or fact potentially of interest to a QC reviewer receives an entry here. For instance, BMI calculations and conversions between English (imperial) and metric values produce records here. This is a fine-grained level of detail, so this sheet is often quite large, but it gives a high degree of visibility into the operation of nontrivial transformations.
Note that rows here do not indicate an issue. These entries are to provide transparency into the operation of the migration program.
Entity - The business meaning of the subject of this note. Could be a field name or higher level concept.
Source ID - The primary key of the volunteer record. Note this key will be the same in BOTH the source DB and ClinSpark
Source Value - The value of the field in the source, without any conversion
Target Value - The migrated value of the field in ClinSpark
Notes - Notes where applicable
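To make the BMI and unit conversion example concrete, the sketch below shows how a reviewer could recompute a value independently and compare it with the Target Value in a note. The standard formula BMI = weight in kg / (height in m)^2 and the exact definitions 1 lb = 0.45359237 kg and 1 in = 0.0254 m are used. The function name, the rounding to one decimal place, and the layout of the note are assumptions, not the migration program's actual behavior.

```python
LB_TO_KG = 0.45359237  # exact definition of the pound
IN_TO_M = 0.0254       # exact definition of the inch


def bmi_note(source_id, weight_lb, height_in):
    """Build a hypothetical Conversion Notes row for a BMI calculation."""
    weight_kg = weight_lb * LB_TO_KG
    height_m = height_in * IN_TO_M
    # Rounding to one decimal place is an assumption for this sketch.
    bmi = round(weight_kg / height_m ** 2, 1)
    return {
        "Entity": "BMI",
        "Source ID": source_id,
        "Source Value": f"{weight_lb} lb, {height_in} in",
        "Target Value": bmi,
        "Notes": "BMI = kg / m^2 after converting from imperial units",
    }


note = bmi_note(1001, 154, 70)  # 1001 is a made-up volunteer key
print(note["Target Value"])
```

If a recomputed value differs from the Target Value by more than rounding, the corresponding row is worth raising during QC.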
Conversion Warnings
As the migration program processes source data, any data field with an issue that may be of interest to a reviewer gets a row in this sheet. These entries merit careful review during the QC process, as they show how the migration program handles issues such as duplicate phone numbers, unexpected formatting of source data, and unexpected source values.
Entity - The business meaning of the subject of this note. Could be a field name or higher level concept.
Source ID - The primary key of the volunteer record. Note this key will be the same in BOTH the source DB and ClinSpark
Source Value - The value of the field in the source, without any conversion
Target Value - The migrated value of the field in ClinSpark
Notes - Notes where applicable
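Because this sheet can contain many rows, one way to organize review is to count warnings per Entity and start with the most frequent. The sketch below assumes the sheet has already been exported to a list of dictionaries keyed by the column names above; the sample rows and entity names are invented for illustration.

```python
from collections import Counter


def warnings_by_entity(rows):
    """Count warning rows per Entity, most frequent first."""
    return Counter(row["Entity"] for row in rows).most_common()


# Hypothetical rows shaped like the Conversion Warnings sheet.
sample_rows = [
    {"Entity": "Phone", "Source ID": 17, "Source Value": "555-0100",
     "Target Value": None, "Notes": "Duplicate phone number"},
    {"Entity": "Date of Birth", "Source ID": 23, "Source Value": "13/45/1980",
     "Target Value": None, "Notes": "Unexpected date format"},
    {"Entity": "Phone", "Source ID": 42, "Source Value": "555-0100",
     "Target Value": None, "Notes": "Duplicate phone number"},
]

for entity, count in warnings_by_entity(sample_rows):
    print(entity, count)
```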
Import Report
This sheet provides a set of aggregate count values giving an overview of what the migration program found in the source data. It is intended to provide QC reviewers with “sanity check” values. For instance, it will report the number of medical conditions it found, which may assist a reviewer who can compare that to the number of conditions they know to exist in the source database.
Values here can be customized at the request of QC reviewers. If particular facts about the migration program’s view of the source data would assist in review, additional rows can often be added to provide visibility into them. A good example in the graphic below is the number of source volunteers excluded for various business reasons. Each of these will typically have a corresponding record in the Conversion Warnings sheet, supporting in-depth review during QC.
Aggregate Count Label
Count - The number of records matching this row's criteria
Load Report
This sheet provides a set of aggregate count values showing what was imported into ClinSpark. This differs from the Import Report sheet in that these counts are purely in terms of ClinSpark tables and entities actually imported.
This sheet is intended to support an additional level of sanity checks. A QC reviewer with an expectation that X medical conditions would be imported would be able to spot a discrepancy directly by reviewing this table.
This table, combined with the Import Report, is intended to support some level of accounting review. For instance, the Import Report will show how many total source volunteers were found, along with separate counts for volunteers excluded for various reasons. A reviewer should be able to take the total count found, subtract the excluded counts, and arrive at the total number of volunteers imported as shown in the Load Report.
Aggregate Count Label
Count - The number of records matching this row's criteria
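The accounting review described above can be sketched as a simple reconciliation: total found minus excluded should equal total imported. The function name, the exclusion labels, and the counts below are hypothetical; the real labels are whatever appears in the customer's Import Report and Load Report sheets.

```python
def reconcile(found, excluded_counts, imported):
    """Compare (found - excluded) from the Import Report with the Load Report count.

    Returns (matches, difference), where difference is expected minus imported.
    """
    expected = found - sum(excluded_counts.values())
    return expected == imported, expected - imported


# Made-up example values for illustration only.
found = 12000
excluded = {"Excluded: duplicate record": 150, "Excluded: inactive": 350}
imported = 11500

matches, difference = reconcile(found, excluded, imported)
print(matches, difference)  # True 0
```

A nonzero difference indicates volunteers that were neither imported nor counted under an exclusion reason, which is a discrepancy to raise during QC.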