Presented at the CIP program at KSU on January 01, 2020 by Prof. Omar Hasan Kasule Sr., Professor of Epidemiology and Bioethics, King Fahad Medical City.
DATA ENTRY:
- Self-coding or pre-coded questionnaires are preferable.
- Data is input as text, multiple choice, numeric, date and time, and yes/no responses.
- In double entry, two data entry clerks enter the same data independently, and the computer checks the items on which they differ.
- Data in the computer can be checked manually against the original questionnaire.
- Interactive data entry enables detection and correction of logical and entry errors immediately.
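The double entry check described above can be sketched as follows. This is a minimal illustration, not a prescribed method: the function name `compare_double_entry`, the record layout (a dictionary keyed by record ID), and the sample values are all assumptions made for the example.

```python
def compare_double_entry(entry_a, entry_b):
    """Return (record_id, field, value_a, value_b) for every item that differs.

    entry_a and entry_b are hypothetical structures: dictionaries that map a
    record ID to a dictionary of field values, one per data entry clerk.
    """
    discrepancies = []
    for record_id in sorted(set(entry_a) | set(entry_b)):
        rec_a = entry_a.get(record_id, {})
        rec_b = entry_b.get(record_id, {})
        for field in sorted(set(rec_a) | set(rec_b)):
            value_a = rec_a.get(field)
            value_b = rec_b.get(field)
            if value_a != value_b:
                discrepancies.append((record_id, field, value_a, value_b))
    return discrepancies


# Illustrative data: the second clerk transposed the digits of an age.
clerk_1 = {1: {"age": 34, "sex": "F"}, 2: {"age": 51, "sex": "M"}}
clerk_2 = {1: {"age": 34, "sex": "F"}, 2: {"age": 15, "sex": "M"}}

for item in compare_double_entry(clerk_1, clerk_2):
    print(item)
```

Each flagged item would then be resolved by checking the original questionnaire.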
DATA REPLICATION:
- Data replication is a copy management service that involves copying the data and also managing the copies.
- Synchronous data replication updates the copies at the same time as the original, so there is no latency in data consistency.
- In asynchronous data replication, updating is not immediate, so the copies may be temporarily inconsistent with the original.
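The difference between the two replication modes can be illustrated with a toy in-memory sketch. The class `ReplicatedStore` and its methods are hypothetical and stand in for a real replication service, which would also handle networks, failures, and conflict resolution.

```python
from collections import deque


class ReplicatedStore:
    """Toy model of one primary copy and one replica (illustrative only)."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.primary = {}
        self.replica = {}
        self.pending = deque()  # updates not yet applied to the replica

    def write(self, key, value):
        self.primary[key] = value
        if self.synchronous:
            # Synchronous: the replica is updated as part of the same write.
            self.replica[key] = value
        else:
            # Asynchronous: the update is queued and applied later.
            self.pending.append((key, value))

    def flush(self):
        """Apply all queued updates to the replica."""
        while self.pending:
            key, value = self.pending.popleft()
            self.replica[key] = value


sync_store = ReplicatedStore(synchronous=True)
sync_store.write("patient_1", "enrolled")
print(sync_store.replica == sync_store.primary)    # copies agree immediately

async_store = ReplicatedStore(synchronous=False)
async_store.write("patient_1", "enrolled")
print(async_store.replica == async_store.primary)  # copies differ until flush()
```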
DATA EDITING:
- Data editing is the process of correcting data collection and data entry errors.
- The data is 'cleaned' using logical, statistical, range, and consistency checks.
- All values should be recorded at the same level of precision (number of decimal places) to keep computations consistent and reduce rounding errors.
- The kappa statistic is used to measure inter-rater agreement.
- Data editing identifies and corrects errors such as invalid or inconsistent values.
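As an illustration of the kappa statistic mentioned above, the sketch below computes Cohen's kappa for two raters, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance. The choice of the two-rater (Cohen) form, the function name `cohen_kappa`, and the sample ratings are assumptions made for the example.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who classified the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(ratings_a)
    # Observed agreement: proportion of items given the same category.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed
    # over categories (Counter returns 0 for categories a rater never used).
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Illustrative ratings: the raters agree on 3 of 4 items.
rater_1 = ["yes", "yes", "no", "no"]
rater_2 = ["yes", "no", "no", "no"]
print(cohen_kappa(rater_1, rater_2))
```

A value of 1 indicates perfect agreement and a value of 0 indicates agreement no better than chance.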
DATA VALIDATION: MAIN DATA PROBLEMS:
- Missing data,
- Coding and entry errors,
- Inconsistencies,
- Irregular patterns,
- Digit preference,
- Outliers,
- Rounding-off / significant figures,
- Questions with multiple valid responses,
- Record duplication.
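Several of the problems listed above can be screened for automatically. The sketch below is a minimal example under stated assumptions: records are dictionaries, missing values are stored as `None`, and the field names, plausibility bounds, and sample data are hypothetical.

```python
from collections import Counter


def find_missing(records, fields):
    """Return (row_index, field) for every missing (None or absent) value."""
    return [(i, f) for i, rec in enumerate(records)
            for f in fields if rec.get(f) is None]


def find_out_of_range(records, field, low, high):
    """Range check: row indices whose value lies outside [low, high]."""
    return [i for i, rec in enumerate(records)
            if rec.get(field) is not None and not (low <= rec[field] <= high)]


def find_duplicates(records, key_field):
    """Return key values that occur in more than one record."""
    counts = Counter(rec.get(key_field) for rec in records)
    return [key for key, count in counts.items() if count > 1]


def terminal_digit_counts(values):
    """Count last digits; a heavy excess of 0s and 5s suggests digit preference."""
    return Counter(int(v) % 10 for v in values)


# Illustrative records with one missing age, one implausible age,
# and one duplicated ID.
records = [
    {"id": 1, "age": 34, "sbp": 120},
    {"id": 2, "age": None, "sbp": 130},
    {"id": 3, "age": 210, "sbp": 140},
    {"id": 3, "age": 45, "sbp": 125},
]

print(find_missing(records, ["age", "sbp"]))
print(find_out_of_range(records, "age", 0, 120))
print(find_duplicates(records, "id"))
print(terminal_digit_counts(rec["sbp"] for rec in records))
```

Flagged values are candidates for review against the original questionnaire, not automatic deletions.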
DATA TRANSFORMATION:
- Data transformation is the process of creating new derived variables preliminary to analysis.
- Simple transformations use ordinary mathematical operations such as division, multiplication, addition, or subtraction.
- Mathematical transformations include logarithmic, trigonometric, power, and z-transformations.
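A few of these transformations can be sketched in code. The examples are illustrative assumptions: body mass index is used as a derived variable built by division, the logarithm is the natural logarithm, and the z-transformation standardizes with the sample standard deviation.

```python
import math
from statistics import mean, stdev


def body_mass_index(weight_kg, height_m):
    """Derived variable from simple arithmetic: weight / height squared."""
    return weight_kg / height_m ** 2


def log_transform(values):
    """Natural logarithm of each value (values must be positive)."""
    return [math.log(v) for v in values]


def z_scores(values):
    """z-transformation: (value - mean) / sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # requires at least two values that are not all equal
    return [(v - m) / s for v in values]


print(body_mass_index(70, 1.75))
print(log_transform([1, 10, 100]))
print(z_scores([2, 4, 6]))
```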