201229L - DATA COLLECTION TECHNIQUES

Lecture at a program “Research Methodology in Health Sciences Course” held at Northern Area Armed Forces Hospital, Hafr Al Batin on 28 December 2020 at 4.20pm. By Professor Omar Hasan Kasule MB ChB (MUK), MPH (Harvard), DrPH (Harvard) Chairman, Institutional Review Board - KFMC

 

SOURCES OF SECONDARY DATA:

} Secondary data is obtained from decennial censuses, vital statistics, routinely collected data, epidemiological studies, and special health surveys. Census data is reliable. It is wide in scope, covering demographic, social, economic, and health information. Vital events are births, deaths, marriages, divorces, and some disease conditions.

} Routinely collected data are cheap but may be unavailable or incomplete. They are obtained from medical facilities, life and health insurance companies, institutions (like prisons, the army, and schools), disease registries, and administrative records.

} Observational epidemiological studies are of 3 types: cross-sectional, case-control, and follow-up/cohort studies.

} Special surveys cover a larger population than epidemiological studies and may be health, nutritional, or socio-demographic surveys.

 

PRIMARY DATA COLLECTION BY QUESTIONNAIRE:

} Questionnaire design involves content, wording of questions, format and layout.

} The reliability and validity of the questionnaire as well as practical logistics should be tested during the pilot study.

} Informed consent and confidentiality must be respected.

} A protocol sets out data collection procedures.

} Questionnaire administration by face-to-face interview is the best but is expensive. Questionnaire administration by telephone is cheaper.

} Questionnaire administration by mail is very cheap but has a lower response rate.

} A computer-administered questionnaire is associated with more honest responses.

 

QUESTIONNAIRE DESIGN: CONTENT:

} Correct decisions must be made about what items to include in the questionnaire guided by the hypothesis under study and knowledge of potential confounding factors.

} A start is made by reviewing the questionnaires of previous similar studies.

} The content of a question may be one of the following: knowledge, attitude, belief, experience, behavior, and attributes.

 

QUESTIONNAIRE DESIGN: WORDING OF QUESTIONS:

} Main objectives in selecting questionnaire items: clarity, comprehensibility, neutrality and scaling.

} The questions must be worded properly to make sure they are easy to understand.

} The wording of the questionnaire items should leave no room for ambiguity.

} The words must be easy. Technical jargon must be avoided.

} The wording should be neutral, neither positive nor negative.

} Biased questions are leading, threatening, value-laden, or based on assumptions.

} Each question should contain only one concept or item of information; questions should not be double-barreled.

} The responses must be scaled appropriately.

} Double negatives should be avoided.

 

FORMAT AND LAYOUT:

} The order of the questions must be logical moving from the superficial to the more detailed.

} Skip patterns should be worked out carefully and exhaustively.

} Embarrassing questions should be kept towards the end because they may spoil the whole interview.

} Closed questions are preferred to open questions.

} Questions should not be too long.

} The total number of questions must be appropriate.

} The questionnaire should be designed for easy reading. Use of boxes and different colors helps.

} The font size must be readable.

 

ETHICAL AND CONFIDENTIALITY ISSUES:

} Informed consent must be obtained.

} The information provided could be subpoenaed by a court of law and the investigator cannot refuse to release it.

} In the course of the interview the investigator may get information that requires taking life-saving measures. Taking these measures will however compromise the confidentiality. Such a situation may arise in case of an interviewee who informs the interviewer that he is planning to commit suicide later that day. Such information may have to be conveyed immediately to the authorities concerned.

 

PREPARATION FOR DATA COLLECTION - 1:

} Data collection processes must be clearly defined in a written protocol which is the operational document of the study.

} The protocol should include the initial version of the questionnaire. This can be updated and improved after the pilot study.

} If a paper questionnaire is used, the data will have to be transferred into electronic form. The need for this can be obviated by direct on-line entry of data.

} The objectives of the data collection must be defined clearly. Operational decisions and planning depend on the definition of objectives. It is wrong to collect more data than what is necessary to satisfy the objectives.

} It is also wrong to collect data just in case it may turn out to be useful.

} The study population is identified. The method of sampling and the size of the sample are determined.

 

PREPARATION FOR DATA COLLECTION - 2:

} Staff to be used must be trained. The training should go beyond telling them what they will do. They must have sufficient understanding of the study that they can detect serious mistakes and deviations.

} A pilot study to test methods and procedures should be carried out. However well a study is planned, things could go wrong once field work starts. A pilot study helps detect and correct such pitfalls.

} A quality control program must be part of the protocol from the beginning.

} Proxy or surrogate respondents must be used when the subject is handicapped or is not available. The next of kin is usually selected for this. Sometimes the subject and the proxy may disagree. In some case-control studies, dead controls are selected for dead cases and proxies are interviewed for both series.

} Response can be increased by obtaining sponsorship from the government or some other official body that lends credibility to the study.

} Participation is also maximized by short follow up periods, regular feedback, and causing the participants as little inconvenience as possible.

 

QUESTIONNAIRE ADMINISTRATION BY FACE-TO-FACE INTERVIEW:

} In a face-to-face interview, the interviewer reads out questions to the interviewee and completes the questionnaire. The interview may be structured or unstructured.

} The interviewer should make sure that circumstances of the interview are optimal in terms of place and time.

} The interviewers should be selected carefully and adequately trained. They should be given an interviewer’s manual to guide them. It is important that interviewers are continuously monitored.

 

ADVANTAGES OF FACE-TO-FACE QUESTIONNAIRE ADMINISTRATION:

} The interviewer can establish the identity of the respondent. In a mailed questionnaire the answers may come from a person other than the intended respondent.

} There are fewer item non-responses because of the presence of the interviewer who will encourage and may coax the respondent to answer all items.

} The interviewer can clarify items that the respondent does not understand or is likely to misunderstand.

} There is flexibility in the sequence of the items.

} Open-ended questions are possible.

} Items irrelevant to the particular interviewee can be dropped thus saving time.

 

DISADVANTAGES OF FACE-TO-FACE QUESTIONNAIRE ADMINISTRATION:

} It costs more in terms of time and money. The interviewer has to travel, search for, and spend time with the respondent.

} A prior appointment is needed to ensure that the respondent will be available at the place and time of the proposed interview.

} Personal chemistry may not work well. The interviewee may resent the interviewer on the basis of gender, ethnicity, or any other personal and behavioral characteristic.

} The presence of the interviewer may influence interviewee responses in a subtle way. The interviewee may try to give responses that he thinks are acceptable to the interviewer on the basis of the interviewer's gender, race, socio-economic status (SES), and suggestive questioning.

} The common errors in face-to-face interviews are omitting a question, too much or too little probing, failure to record information, and cheating by the interviewer.

 

ADVANTAGES OF QUESTIONNAIRE ADMINISTRATION BY TELEPHONE:

} Considerable savings in time and money. It is possible to conduct a nationwide survey sitting in one office.

} Has fewer item non-responses because of the personal contact involved.

} Skip patterns can be followed to save time.

} Difficult questions can be explained.

} Interviewer bias is less than in face-to-face interviews.

} Telephone interviews must also be supervised for optimal results. The supervisor should listen in as the interview is conducted.

 

DISADVANTAGES OF QUESTIONNAIRE ADMINISTRATION BY TELEPHONE:

} Selection bias may operate when the study sample includes only those who have telephones and the telephone numbers are listed. The problem of unlisted numbers can be overcome by use of random digit dialing.

} Selection bias may arise due to the day and time of day that the telephone call is placed. Office workers will be missed in early morning calls. Workers on night shifts will be missed in evening calls.

} It is not possible to be sure whether the person at the other end of the line is the actual intended respondent.
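
Random digit dialing can be pictured with the short sketch below. It appends random digits to known telephone prefixes, so that unlisted numbers have the same chance of selection as listed ones. The function name, the example prefixes, and the 4-digit suffix length are illustrative assumptions and are not taken from the lecture.

```python
import random

def random_digit_numbers(prefixes, n, suffix_digits=4, seed=None):
    """Generate n distinct telephone numbers by adding random digits to prefixes.

    prefixes: hypothetical area/exchange codes known to be in use.
    """
    if n > len(set(prefixes)) * 10 ** suffix_digits:
        raise ValueError("n exceeds the number of possible telephone numbers")
    rng = random.Random(seed)
    numbers = set()
    while len(numbers) < n:
        prefix = rng.choice(prefixes)
        suffix = "".join(rng.choice("0123456789") for _ in range(suffix_digits))
        numbers.add(prefix + suffix)  # the set discards repeated numbers
    return sorted(numbers)

# Hypothetical prefixes, for illustration only.
sample = random_digit_numbers(["013555", "013556"], n=5, seed=1)
```

A real survey would still need to screen out non-working and business numbers, which this sketch does not attempt.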

 

IMPROVEMENT OF TELEPHONE INTERVIEWS BY USE OF COMPUTERS:

} Computer-assisted telephone interview can make the process quicker when the interviewer is prompted by the computer.

} The computer will work out the skip patterns and will alert the interviewer to responses that are inappropriate or contradictory.
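
The two computer functions described above, working out skip patterns and flagging inappropriate or contradictory responses, might look roughly like the sketch below. The question names, the age range, and the rules are hypothetical examples rather than part of any particular interviewing system.

```python
QUESTION_ORDER = ["smokes", "cigarettes_per_day", "exercise"]  # hypothetical items

def next_question(current, answers):
    """Return the next question id, applying one skip pattern."""
    if current == "smokes" and answers["smokes"] == "no":
        return "exercise"  # non-smokers skip the cigarettes_per_day item
    idx = QUESTION_ORDER.index(current)
    return QUESTION_ORDER[idx + 1] if idx + 1 < len(QUESTION_ORDER) else None

def check_answers(answers):
    """Return warnings for out-of-range or contradictory responses."""
    warnings = []
    age = answers.get("age")
    if age is not None and not 0 <= age <= 120:
        warnings.append("age out of range")
    if answers.get("smokes") == "no" and answers.get("cigarettes_per_day", 0) > 0:
        warnings.append("non-smoker reports cigarettes per day")
    return warnings

alerts = check_answers({"age": 150, "smokes": "no", "cigarettes_per_day": 10})
```

The interviewer would see the warnings while still on the line and could resolve them with the respondent immediately.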

 

QUESTIONNAIRE ADMINISTRATION BY MAIL - 1:

} In the method of questionnaire administration by mail, a questionnaire is mailed to the respondent's address. The respondent completes and returns the questionnaire in a pre-addressed and stamped envelope.

} Questionnaire administration by mail has 2 main advantages: (a) it is the cheapest method of data collection, and (b) there is no bias due to interviewer involvement.

} The disadvantages of mail questionnaires are: (a) low overall response, (b) higher item non-response, and (c) delays in returning the questionnaire.

 

QUESTIONNAIRE ADMINISTRATION BY MAIL - 2: Measures To Improve Response

} Sending the questionnaire with a personalized cover letter.

} Promising a token of appreciation for return of the questionnaire.

} Making the questionnaire anonymous by not including any information on the returned questionnaire that can be used to identify a particular individual.

} Providing a self-addressed and stamped envelope for the response, and using pre-coded questionnaires so that all the respondent has to do is to select responses.

} Following up by letter those who delay in returning the questionnaires.

 

COMPUTER-ADMINISTERED QUESTIONNAIRE: Advantages

} It frees the interviewer's time.

} There are no transcription errors because information is entered on-line.

} No items are missed because the computer will not allow the respondent to move to the next item before answering the previous one.

} The respondent can give more honest responses when facing an anonymous computer than when faced by a human interviewer.

 

COMPUTER-ADMINISTERED QUESTIONNAIRE: Disadvantages

} The disadvantage of computer-administered questionnaires is that the respondent does not have the opportunity to vary the order of questions to his convenience.

 

PHYSICAL PRIMARY DATA COLLECTION:

} Data can be obtained by clinical examination, standardized psychological/psychiatric evaluation, measurement of environmental or occupational exposure, and assay of biological specimens and laboratory experiments.

} Pharmacological experiments involve bioassay, quantal dose-effect curves, dose-response curves, and studies of drug elimination.

} Physiology experiments involve measurements of parameters of the various body systems.

} Microbiology experiments involve bacterial counts, immunoassays, and serological assays.

} Biochemical experiments involve measurements of concentrations of various substances.

} Statistical and graphical techniques are used to display and summarize this data.

 

DATA MANAGEMENT - 1:

} Self-coding or pre-coded questionnaires are preferable.

} Data is input as text, multiple choice, numeric, date and time, and yes/no responses.

} In double entry techniques, 2 data entry clerks enter the same data and a check is made by computer on items on which they differ.

} Data in the computer can be checked manually against the original questionnaire.

} Interactive data entry enables detection and correction of logical and entry errors immediately.

} Data replication is a copy management service that involves copying the data and also managing the copies. Synchronous data replication is instantaneous updating with no latency in data consistency. In asynchronous data replication the updating is not immediate and consistency is loose.
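
The double entry check mentioned above can be sketched as follows. Two clerks' files are compared record by record and every field on which they differ is listed for manual review. The record layout (a dictionary keyed by record id) and the function name are assumptions made for illustration.

```python
def compare_double_entry(entry_a, entry_b):
    """Compare two data-entry passes keyed by record id.

    Returns a list of (record_id, field, value_a, value_b) for each mismatch.
    """
    discrepancies = []
    for record_id in sorted(set(entry_a) | set(entry_b)):
        rec_a = entry_a.get(record_id, {})
        rec_b = entry_b.get(record_id, {})
        for field in sorted(set(rec_a) | set(rec_b)):
            if rec_a.get(field) != rec_b.get(field):
                discrepancies.append(
                    (record_id, field, rec_a.get(field), rec_b.get(field))
                )
    return discrepancies

# Hypothetical entries: the second clerk transposed the digits of one age.
clerk_1 = {1: {"age": 34, "sex": "F"}, 2: {"age": 51, "sex": "M"}}
clerk_2 = {1: {"age": 34, "sex": "F"}, 2: {"age": 15, "sex": "M"}}
diffs = compare_double_entry(clerk_1, clerk_2)
```

Each discrepancy is then resolved by going back to the original questionnaire, since the comparison alone cannot tell which clerk was right.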

 

DATA MANAGEMENT - 2:

} Data editing is the process of correcting data collection and data entry errors.

} The data is 'cleaned' using logical, statistical, range, and consistency checks.

} All values should be recorded at the same level of precision (number of decimal places) to make computations consistent and decrease rounding-off errors.

} The kappa statistic is used to measure inter-rater agreement.

} Data editing identifies and corrects errors such as invalid or inconsistent values.

} Data is validated and its consistency is tested.
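
As an illustration of the kappa statistic, the sketch below computes Cohen's kappa for two raters, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance. The lecture does not specify which form of kappa is meant, so the two-rater version and the example ratings are assumptions.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters classifying the same subjects."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of subjects")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the raters' marginal proportions, per category.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of 50 subjects: 20 yes/yes, 5 yes/no, 10 no/yes, 15 no/no.
rater_a = ["yes"] * 25 + ["no"] * 25
rater_b = ["yes"] * 20 + ["no"] * 5 + ["yes"] * 10 + ["no"] * 15
kappa = cohen_kappa(rater_a, rater_b)  # observed 0.7, expected 0.5
```

In this example the raters agree on 70% of subjects, but half of that agreement is expected by chance, which is why kappa is lower than the raw agreement.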

 

DATA MANAGEMENT - 3: Main Data Problems

} Missing data,

} Coding and entry errors,

} Inconsistencies,

} Irregular patterns,

} Digit preference,

} Outliers,

} Rounding-off / significant figures,

} Questions with multiple valid responses,

} Record duplication.
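
Some of the problems listed above can be screened for with very simple checks. The sketch below looks for duplicated record ids, counts missing values, and tabulates terminal digits, where a heavy excess of readings ending in 0 or 5 suggests digit preference. The field names and example records are hypothetical.

```python
from collections import Counter

def find_duplicates(records, key="id"):
    """Return the ids that appear in more than one record."""
    counts = Counter(r[key] for r in records)
    return sorted(k for k, c in counts.items() if c > 1)

def count_missing(records, fields):
    """Count records in which each field is absent or None."""
    return {f: sum(1 for r in records if r.get(f) is None) for f in fields}

def terminal_digit_counts(values):
    """Tabulate the last digit of whole-number readings."""
    return Counter(abs(int(v)) % 10 for v in values)

# Hypothetical systolic blood pressure records.
records = [
    {"id": 1, "sbp": 120},
    {"id": 2, "sbp": 135},
    {"id": 2, "sbp": 135},   # duplicated record
    {"id": 3, "sbp": None},  # missing value
    {"id": 4, "sbp": 140},
]
duplicates = find_duplicates(records)
missing = count_missing(records, ["sbp"])
digits = terminal_digit_counts(r["sbp"] for r in records if r["sbp"] is not None)
```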

 

DATA MANAGEMENT: Data Transformation

} Data transformation is the process of creating new derived variables preliminary to analysis.

} It uses mathematical operations such as division, multiplication, addition, or subtraction, and mathematical transformations such as logarithmic, trigonometric, power, and z-transformations.
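
A few derived variables of this kind are sketched below: a ratio built by division (body mass index is used purely as an illustration), a logarithmic transformation, and a z-transformation. The function names and example values are assumptions, not part of the lecture.

```python
import math
import statistics

def body_mass_index(weight_kg, height_m):
    """Derived variable by division: weight divided by height squared."""
    return weight_kg / height_m ** 2

def log_transform(values):
    """Natural-log transformation; all values must be positive."""
    return [math.log(v) for v in values]

def z_scores(values):
    """z-transformation: (value - mean) / sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

bmi = body_mass_index(70, 1.75)
logged = log_transform([1, 10, 100])
standardized = z_scores([2, 4, 6])  # mean 4, standard deviation 2
```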

 

DATA ANALYSIS:

} Data analysis consists of data summarization, estimation and interpretation.

} Simple manual inspection of the data is needed before statistical procedures.

} Preliminary examination consists of looking at tables and graphics.

} Descriptive statistics are used to detect errors, ascertain the normality of the data, and know the size of cells.

} Missing values may be imputed or incomplete observations may be eliminated.
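
The two options for missing values can be contrasted in a small sketch: dropping incomplete observations (complete-case analysis) or filling the gap, here with the mean of the observed values. Mean imputation is only one of several imputation methods and is chosen for brevity; the record layout is hypothetical.

```python
import statistics

def complete_cases(records, fields):
    """Keep only records in which every listed field has a value."""
    return [r for r in records if all(r.get(f) is not None for f in fields)]

def impute_mean(records, field):
    """Replace missing values of one field with the mean of observed values."""
    observed = [r[field] for r in records if r.get(field) is not None]
    fill = statistics.mean(observed)
    return [
        {**r, field: fill if r.get(field) is None else r[field]}
        for r in records
    ]

# Hypothetical records with one missing weight.
records = [
    {"id": 1, "weight": 60.0},
    {"id": 2, "weight": None},
    {"id": 3, "weight": 80.0},
]
kept = complete_cases(records, ["weight"])
filled = impute_mean(records, "weight")
```

Elimination reduces the sample size, while mean imputation keeps every record but understates the variability of the data, so the choice should be reported with the results.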

 

DATA ANALYSIS: Statistical Tests

} Tests for association, effect, or trend involve construction and testing of hypotheses.

} The tests for association are the t, chi-square, linear correlation, and logistic regression tests or coefficients.

} The common effect measures are the odds ratio, the risk ratio, and the rate difference.

} Measures of trend can discover relationships that are not picked up by association and effect measures.

} The probability, likelihood, and regression models are used in analysis.

} Analytic procedures and computer programs vary for continuous and discrete data, for person-time and count data, for simple and stratified analysis, for univariate, bivariate and multivariate analysis, and for polytomous outcome variables.

} Procedures are different for large samples and small samples.
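
The common effect measures named above can be computed from a 2 x 2 table and from person-time data as in the sketch below. The cell labels (a, b, c, d), the function names, and the counts are illustrative assumptions; confidence intervals and hypothesis tests are left out.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2 x 2 table.

    a: exposed with disease      b: exposed without disease
    c: unexposed with disease    d: unexposed without disease
    """
    return (a * d) / (b * c)

def risk_ratio(a, b, c, d):
    """Risk in the exposed divided by risk in the unexposed."""
    return (a / (a + b)) / (c / (c + d))

def rate_difference(cases_exposed, pt_exposed, cases_unexposed, pt_unexposed):
    """Difference between incidence rates, using person-time denominators."""
    return cases_exposed / pt_exposed - cases_unexposed / pt_unexposed

# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed develop disease.
or_value = odds_ratio(20, 80, 10, 90)
rr_value = risk_ratio(20, 80, 10, 90)
# Hypothetical follow-up of 1000 person-years in each group.
rd_value = rate_difference(20, 1000, 10, 1000)
```

With these counts the odds ratio (2.25) is larger than the risk ratio (2.0), as expected when the outcome is not rare.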