Presented at a workshop on enhancing
research and publications in universities in Eastern Africa held at Umma
University Kenya 19th December 2015 by Professor Omar Hasan Kasule
Sr MB ChB(MUK), MPH (Harvard), DrPH (Harvard)
IMPORTANCE
OF RESEARCH IN EDUCATION
- The purpose of research is not only to discover new knowledge
- Research complements the learning process
- The teacher engaged in research is intellectually active and up to date
- A student engaged in research has a thirst for knowledge
QUR’AN
AND RESEARCH 1: CONCEPTS OF RESEARCH IN THE QUR’AN
- Basic Qur’anic concepts relating to research are: intellect (aql), knowledge (ilm), understanding (fiqh), thinking (fikr), innovation, and creativity. The Qur’an is not a textbook of science. However, it contains many verses that train the mind to observe, analyze, think, and act in a scientific manner.
- Intellect is correlated with signs and with knowledge. Failure to use the intellect and blind following are condemned. Knowledge is supreme. It removes blind following.
- Human knowledge is limited. Knowledge is acquired by study. Humans were ordained to read.
- The Qur’an has used the term fiqh to refer to understanding which is deeper than knowing.
- The Qur’an puts emphasis on thinking. Thinking is based on empirical observation. The Qur’an emphasizes freedom of thought in the form of freedom of belief.
- Innovations in religion are prohibited but creativity is encouraged.
QUR’AN
AND RESEARCH 2: DESCRIPTIVE KNOWLEDGE IN THE QUR’AN
- Descriptive: describe things as they are
- The Qur’an described mountains, the barrier between two oceans, metal, the wind, plants, the sky, honey, and water.
- The Qur’an described the motion of the earth, boats, the sun, the moon, water, and the wind.
- The Qur’an described processes such as making of iron, armor, dams, and boats.
- It described the creation of the human from dust. It describes the constant laws of nature, sunan al laah fi al kawn. The laws are fixed and stable and operate in various situations. Order is a law of nature. Recording of observations is emphasized.
QUR’AN
AND RESEARCH 3: ANALYTIC KNOWLEDGE IN THE QUR’AN
- Analytic: discover the relationship (sababiyat) among events or phenomena
- Tarbiyat qur’aniyat
- The Qur’an calls for evidence. It rejects false evidence and condemns non evidence-based knowledge such as sorcery, consulting fortune tellers, speculation or conjecture.
- Human thought is a tool and not an end in itself. It operates on the basis of empirical observations and revelation, both objective sources of information. Thought that is not based on empirical observation or revelation is speculative and leads to wrong conclusions.
- The Qur’an calls for objectivity. It condemns following subjective feelings and turning away from the truth. Reliance is on observation and not speculation.
- The Qur’an calls upon humans to observe Allah’s signs in the universe and in humans. The Qur’an however made it clear that human senses have limitations.
- Rational thinking and logical operations were described. In many prohibitions the Qur’an provides logical reasons.
- The use of similitude, tashbiih, between two things or phenomena is seen in several verses. The Qur’an also employed many examples, mithl, to illustrate concepts. Prudence in reaching conclusions is emphasized.
QUR’AN
AND RESEARCH 4: ETIQUETTE OF SCIENTIFIC DISCOURSE IN THE QUR’AN
- Questions can be for finding out information.
- The opposing opinion should be respected. Differences on scientific matters can arise and are natural. Discussion and exchange of views is a necessity for humans.
- Discussion has its own etiquette: the truth must be revealed. Contradictions must be avoided. Arrogance is condemned.
- Attributes of good discussion: objectivity, truthfulness, asking for evidence, and knowledge. Purposeless disputation is frowned upon. False premises should be abandoned once discovered. Fear of people should be no reason for not revealing the truth. Deception is condemned. The truth of any assertion must be checked. Yaqeen is the basis of ‘ilm but dhann is not.
THE
RESEARCH PROCESS
- Identifying and describing a problem
- Using the scientific method to formulate and test hypotheses
- Interpreting findings.
PHILOSOPHY
OF RESEARCH
- Health research, following the scientific method, is empirical, inductive, and refutative.
- Empirical: Health research relies on and respects only empirical findings. Empiricism refers to reliance on physical proof.
- Inductive: Induction is building a theory on several individual observations.
- Refutative: Refutation is basically refusal of a supposition until it is proved otherwise. Deterministic vs relative
TYPES
OF RESEARCH 1
- Theoretical
- Epidemiological
- Clinical
- Laboratory
ETHICO-LEGAL
ISSUES IN HEALTH RESEARCH
- A study involving humans must get approval from a recognized body. For approval the study must fulfill certain criteria. It must be scientifically valid. It is unethical to waste resources (time and money) on a study that will give invalid conclusions.
- Among ethical considerations are: individual vs. community rights, benefits vs. risks, informed consent, privacy and confidentiality, and conflict of interest.
HYPOTHESES
AND THE SCIENTIFIC METHOD
- The scientific method consists of hypothesis formulation, experimentation to test the hypothesis, and drawing conclusions.
- Hypotheses are statements of prior belief. They are modified by results of experiments to give rise to new hypotheses. The new hypotheses then in turn become the basis for new experiments.
- The null hypothesis, H0, states that there is no difference between two comparison groups and that the apparent difference seen is due to sampling error.
- The alternative hypothesis, HA, disagrees with the null hypothesis.
- H0 and HA are complementary and exhaustive. Together they cover all the possibilities.
- A hypothesis can be rejected but cannot be proved.
- Although a hypothesis cannot be proved in a conclusive way, an objective measure of the evidence against H0 can be given in the form of a p-value: the probability of observing a difference at least as large as the one seen if H0 were true.
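To illustrate testing H0 against HA, the following is a minimal sketch of a two-sided z-test for the difference between two proportions. The function name and the counts are hypothetical and chosen only for illustration; the test itself is one common choice, not one prescribed by these notes.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(events_a, n_a, events_b, n_b):
    """Two-sided z-test of H0: both groups share the same proportion."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    # Pooled proportion: the estimate of the common proportion under H0.
    pooled = (events_a + events_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # p-value: probability, if H0 were true, of a |z| at least this large.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: 30 of 100 exposed and 15 of 100 non-exposed have the disease.
z, p_value = two_proportion_z_test(30, 100, 15, 100)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value leads to rejection of H0; a large one means only that H0 could not be rejected, not that it has been proved.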
SAMPLE
SIZE DETERMINATION
- The size of the sample depends on the hypothesis, the budget, the study duration, and the precision required.
- If the sample is too small the study will lack sufficient power to answer the study question. A sample bigger than necessary is a waste of resources.
- Power is the ability to detect a difference when one exists. The bigger the sample size the more powerful the study. Beyond an optimal sample size, the increase in power does not justify the costs of a larger sample.
- There are procedures, formulas, and computer programs for determining sample sizes for different study designs.
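As one example of such a formula, the sketch below uses the usual normal-approximation formula for comparing two proportions, n per group = (z for alpha + z for power)^2 x [p1(1-p1) + p2(1-p2)] / (p1-p2)^2. The function name, default values, and proportions are assumptions made for illustration; other study designs need other formulas.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Subjects needed per group to detect p1 versus p2 with a two-sided test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = NormalDist().inv_cdf(power)           # value corresponding to the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return ceil(n)


# Hypothetical planning values: disease in 30% of the exposed and 15% of the non-exposed.
print(sample_size_two_proportions(0.30, 0.15))
print(sample_size_two_proportions(0.30, 0.15, power=0.90))
```

Raising the required power or narrowing the difference to be detected increases the required sample size, which is the trade-off between power and cost described above.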
HEALTH
RESEARCH: RESEARCH IN NURSING AND MEDICINE 2: STUDY DESIGN
CROSS
SECTIONAL DESIGN
- Study of the status of disease and its causes at a point in time
- Cross-sectional studies are used in community diagnosis, preliminary study of disease etiology, assessment of health status, disease surveillance, public health planning, and program evaluation.
- Cross-sectional studies have the advantages of simplicity, and rapid execution to provide rapid answers.
- The disadvantages are: inability to study etiology because the time sequence between exposure and outcome is unknown, inability to study diseases with low prevalence, high respondent bias, poor documentation of confounding factors, and over-representation of diseases of long duration.
HEALTH
SURVEYS
- Surveys involve more subjects than the usual epidemiological sample and are used for measurement of health and disease, assessment of needs, and assessment of service utilization and care.
- Planning of surveys includes: literature survey, stating objectives, identifying and prioritizing the problem, formulating a hypothesis, defining the population, defining the sampling frame, determining sample size and sampling method, training study personnel, considering logistics (approvals, manpower, materials and equipment, finance, transport, communication, and accommodation), preparing and pre-testing the study questionnaire. Surveys may be cross sectional or longitudinal.
- The household is the usual sampling unit.
- Existing data may be used or new data may be collected using a questionnaire (postal, telephone, diaries, and interview), physical examinations, direct observation, and laboratory investigations. The structure and contents of the survey report are determined by potential readers. The report is used to communicate information and also to apply for funding.
CASE-CONTROL
DESIGN
- The case-control study is popular because of its low cost, rapid results, and flexibility. It uses a small number of subjects.
- Cases are sourced from clinical records, hospital discharge records, disease registries, data from surveillance programs, employment records, and death certificates.
- Controls must be from the same population base as the cases and must be like cases in everything except having the disease being studied.
FOLLOW-UP
DESIGN
- A follow-up study (also called a cohort, incidence, prospective, or longitudinal study) compares disease in exposed to disease in non-exposed groups after a period of follow-up.
- Follow-up can be prospective (forward), retrospective (backward), or ambispective (both forward and backward).
RANDOMIZED
DESIGN: COMMUNITY TRIALS
- A community intervention study targets the whole community and not individuals.
- The Salk vaccine trial carried out in 1954 had 200,000 subjects in the experimental group and a similar number in the control group.
- The aspirin-myocardial infarction study was a therapeutic intervention that randomized 4524 men to two groups. The intervention group received 1.0 gram of aspirin daily whereas the reference group received a placebo.
- The Women’s Health Study involved randomization of 40,000 healthy women into two groups to study prevention of cancer and cardiovascular disease. One group received vitamin E and low dose aspirin. The other group received a placebo.
- The alpha tocopherol and beta carotene cancer prevention trial randomized 29,133 middle-aged men who were cigarette smokers.
RANDOMIZED
DESIGN: CLINICAL TRIALS
- The aim of randomization in controlled clinical trials is to make sure that there is no selection bias and that the two series are as alike as possible by randomly balancing confounding factors.
- Patients are allocated randomly either to the new drug or the old drug and rates of cure or improvement are compared.
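Random allocation to the new or the old drug can be sketched as follows. This sketch uses permuted-block randomization, which is only one of several schemes and is an assumption here since the notes do not name a method; the function name, arm labels, and block size are hypothetical.

```python
import random


def block_randomize(n_patients, block_size=4, arms=("new drug", "old drug"), seed=None):
    """Allocate patients to arms in shuffled blocks so group sizes stay balanced."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    allocation = []
    while len(allocation) < n_patients:
        # Each block contains every arm equally often, in random order.
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        allocation.extend(block)
    return allocation[:n_patients]


allocation = block_randomize(20, seed=2015)
print(allocation.count("new drug"), allocation.count("old drug"))
```

Because allocation is decided by chance and not by the investigator, known and unknown confounding factors tend to be balanced between the two series.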
HEALTH
RESEARCH: RESEARCH IN NURSING AND MEDICINE 3: DATA COLLECTION
SOURCES
OF INFORMATION 1
- Existing data or Studies (observational or experimental).
- Existing data come from censuses, medical facilities, government and the private sector, health surveys, and vital statistics.
- Experimental studies, natural or true experiments, involve deliberate human action or intervention whose outcome is then observed.
- The advantage of experimental studies is that of controlled conditions
- The disadvantage of experimental studies is the ethical problems of experimenting on humans.
SOURCES
OF INFORMATION 2
- Observational studies allow nature to take its course and just record the occurrences of disease and describe the what, where, when, and why of a disease.
- Observational epidemiological studies are of 3 types: cross-sectional, case-control, and follow-up/cohort studies. Special surveys cover a larger population than epidemiological studies and may be health, nutritional, or socio-demographic surveys.
- The advantage of observational studies is low cost and fewer ethical issues.
- They suffer from 3 disadvantages: disease etiology is not studied directly because the investigator does not manipulate the exposures, unavailability of information, and confounding.
SOURCES
OF SECONDARY / EXISTING DATA
- Secondary data is from decennial censuses, vital statistics, routinely collected data, epidemiological studies, and special health surveys.
- Census data is reliable. It is wide in scope covering demographic, social, economic, and health information.
- The census describes population composition by sex, race/ethnicity, residence, marriage, and socio-economic indicators.
- Vital events are births, deaths, marriage and divorce, and some disease conditions.
- Routinely collected data are cheap but may be unavailable or incomplete. They are obtained from medical facilities, life and health insurance companies, institutions (like prisons, army, schools), disease registries, and administrative records.
PRIMARY
DATA COLLECTION BY QUESTIONNAIRE
- Questionnaire design involves content, wording of questions, format and layout.
- The reliability and validity of the questionnaire as well as practical logistics should be tested during the pilot study.
- Informed consent and confidentiality must be respected.
- A protocol sets out data collection procedures.
- Questionnaire administration by face-to-face interview is the best but is expensive.
- Questionnaire administration by telephone is cheaper.
- Questionnaire administration by mail is very cheap but has a lower response rate.
- Computer-administered questionnaire is associated with more honest responses.
PHYSICAL
PRIMARY DATA COLLECTION
- Data can be obtained by clinical examination, standardized psychological/psychiatric evaluation, measurement of environmental or occupational exposure, and assay of biological specimens (endobiotic or xenobiotic) and laboratory experiments.
- Pharmacological experiments involve bioassay, quantal dose-effect curves, dose-response curves, and studies of drug elimination.
- Physiology experiments involve measurements of parameters of the various body systems.
- Microbiology experiments involve bacterial counts, immunoassays, and serological assays.
- Biochemical experiments involve measurements of concentrations of various substances.
- Statistical and graphical techniques are used to display and summarize this data.
DATA
ENTRY INTO THE COMPUTER
- Self-coding or pre-coded questionnaires are preferable.
- Data is input as text, multiple choice, numeric, date and time, and yes/no responses.
- In double entry techniques, 2 data entry clerks enter the same data and a check is made by computer on items on which they differ.
- Data in the computer can be checked manually against the original questionnaire.
- Interactive data entry enables detection and correction of logical and entry errors immediately.
- Data editing is the process of correcting data collection and data entry errors.
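The double entry check described above can be sketched as a comparison of two independently keyed copies of the same questionnaires. The data layout (a dictionary of records keyed by questionnaire number) and all names are assumptions made for illustration.

```python
def compare_double_entry(entry_1, entry_2):
    """Return (record_id, field, value_1, value_2) for every item on which the clerks differ."""
    discrepancies = []
    for record_id in sorted(set(entry_1) | set(entry_2)):
        row_1 = entry_1.get(record_id, {})
        row_2 = entry_2.get(record_id, {})
        for field in sorted(set(row_1) | set(row_2)):
            if row_1.get(field) != row_2.get(field):
                discrepancies.append((record_id, field, row_1.get(field), row_2.get(field)))
    return discrepancies


# Hypothetical entries by two clerks; the second clerk transposed the digits of one age.
clerk_1 = {1: {"age": 34, "sex": "F"}, 2: {"age": 51, "sex": "M"}}
clerk_2 = {1: {"age": 43, "sex": "F"}, 2: {"age": 51, "sex": "M"}}
for item in compare_double_entry(clerk_1, clerk_2):
    print(item)
```

Each discrepancy listed is then resolved by checking the original questionnaire.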
DATA
EDITING / MANAGEMENT
- The data is 'cleaned' using logical, statistical, range, and consistency checks. All values should be recorded at the same level of precision (number of decimal places) to make computations consistent and decrease rounding off errors.
- Data editing identifies and corrects errors such as invalid or inconsistent values. Data is validated and its consistency is tested.
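A minimal sketch of range and consistency checks on a single record follows. The field names, codes, and limits are hypothetical; a real study would take them from its own protocol and code book.

```python
def check_record(record):
    """Return a list of problems found in one questionnaire record."""
    problems = []
    age = record.get("age")
    if age is None:
        problems.append("age is missing")
    elif not 0 <= age <= 120:
        # Range check: the value lies outside plausible limits.
        problems.append(f"age out of range: {age}")
    if record.get("sex") not in ("M", "F"):
        problems.append(f"invalid sex code: {record.get('sex')!r}")
    if record.get("sex") == "M" and record.get("pregnant") == "yes":
        # Consistency check: two fields contradict each other.
        problems.append("inconsistent: male recorded as pregnant")
    return problems


print(check_record({"age": 34, "sex": "F", "pregnant": "no"}))
print(check_record({"age": 150, "sex": "M", "pregnant": "yes"}))
```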
THE
MAIN DATA PROBLEMS
- Missing data
- Coding and entry errors
- Inconsistencies
- Irregular patterns
- Digit preference
- Outliers
- Rounding-off / significant figures
- Questions with multiple valid responses
- Record duplication.
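Two of the problems listed above, record duplication and digit preference, can be screened for with simple tallies. The sketch below assumes each record carries an "id" field and that readings are whole numbers; both assumptions, like the function names, are for illustration only.

```python
from collections import Counter


def find_duplicate_ids(records):
    """Return the identifiers that appear in more than one record."""
    counts = Counter(record["id"] for record in records)
    return sorted(record_id for record_id, count in counts.items() if count > 1)


def terminal_digit_counts(values):
    """Tally the last digit of whole-number readings.

    A marked excess of readings ending in 0 or 5 suggests digit preference.
    """
    counts = Counter(abs(int(value)) % 10 for value in values)
    return {digit: counts.get(digit, 0) for digit in range(10)}


# Hypothetical records and blood pressure readings.
print(find_duplicate_ids([{"id": 1}, {"id": 2}, {"id": 1}]))
print(terminal_digit_counts([120, 130, 125, 140, 118]))
```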
DATA
TRANSFORMATION
- Data transformation is the process of creating new derived variables preliminary to analysis
- Data transformation includes mathematical operations such as division, multiplication, addition, or subtraction; mathematical transformations such as logarithmic, trigonometric, power, and z-transformations.
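The sketch below shows three such derived variables: one obtained by division, one by logarithmic transformation, and one by z-transformation. Body mass index is used as the example of a variable derived by division; it and the function names are illustrative choices not taken from these notes.

```python
from math import log
from statistics import mean, stdev


def derive_bmi(weight_kg, height_m):
    """Derived variable by division: weight divided by height squared."""
    return weight_kg / height_m ** 2


def log_transform(values):
    """Natural logarithm of each value (all values must be positive)."""
    return [log(value) for value in values]


def z_transform(values):
    """Express each value as its distance from the mean in standard deviations."""
    centre = mean(values)
    spread = stdev(values)
    return [(value - centre) / spread for value in values]


print(round(derive_bmi(70, 1.75), 1))
print(z_transform([2, 4, 6]))
```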
DATA
ANALYSIS
- Data analysis consists of data summarization, estimation and interpretation.
- Simple manual inspection of the data is needed before statistical procedures.
- Preliminary examination consists of looking at tables and graphics.
- The tests for association are the t, chi-square, linear correlation, and logistic regression tests or coefficients.
- The common effect measures are the odds ratio, the risk ratio, and the rate difference.
- Measures of trend can discover relationships that are not picked up by association and effect measures.
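The effect measures named above can be computed from a 2 x 2 table of exposure by disease. This is a sketch under assumptions: the cell labels a, b, c, d and the counts are hypothetical, and because only counts (not person-time) are used, the difference computed is a risk difference and not a rate difference.

```python
def effect_measures(a, b, c, d):
    """Effect measures from a 2 x 2 table.

    a: exposed with disease      b: exposed without disease
    c: non-exposed with disease  d: non-exposed without disease
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return {
        "odds_ratio": (a * d) / (b * c),
        "risk_ratio": risk_exposed / risk_unexposed,
        "risk_difference": risk_exposed - risk_unexposed,
    }


# Hypothetical table: 30 of 100 exposed and 15 of 100 non-exposed have the disease.
for name, value in effect_measures(30, 70, 15, 85).items():
    print(f"{name}: {value:.2f}")
```

The odds ratio is the measure available from a case-control study, whereas the risk ratio and risk difference require a follow-up or randomized design.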
HEALTH
RESEARCH: RESEARCH IN NURSING AND MEDICINE 4: READING AND WRITING SCIENTIFIC
LITERATURE
LITERATURE SEARCH
- The best source of on-line documents is pubmed.gov.
- Articles are searched for by key words
- The search can be limited by subject matter, language, type of publication, and year of publication.
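A keyword search with limits can also be expressed as a query to the NCBI E-utilities ESearch service, which is one programmatic route to PubMed. The sketch below only builds the query URL and does not send it; the function name and the example keywords are hypothetical, and the language ([la]) and publication type ([pt]) tags are used on the assumption that a simple AND-combined query is wanted.

```python
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_search_url(keywords, language=None, publication_type=None,
                            year_from=None, year_to=None, max_results=20):
    """Build an ESearch URL that combines keywords with optional limits."""
    terms = [" AND ".join(keywords)]
    if language:
        terms.append(f"{language}[la]")          # limit by language
    if publication_type:
        terms.append(f"{publication_type}[pt]")  # limit by type of publication
    params = {"db": "pubmed", "term": " AND ".join(terms), "retmax": max_results}
    if year_from and year_to:
        # Limit by year of publication.
        params.update({"datetype": "pdat", "mindate": year_from, "maxdate": year_to})
    return f"{ESEARCH_URL}?{urlencode(params)}"


print(build_pubmed_search_url(["malaria", "bed nets"], language="english",
                              publication_type="review", year_from=2010, year_to=2015))
```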
CRITICAL READING OF A JOURNAL
ARTICLE
For critical
reading of scientific literature, the reader must be equipped with tools to
analyze the methodology and data analysis of a study critically before accepting
its conclusions. Common problems in published studies are incomplete
documentation, design deficiencies, and improper significance testing and
interpretation.
The main problem
of the title is irrelevance to the body of the article. Problems of the
abstract are failure to show the focus of the study and to provide sufficient
information to assess the study (design, analysis, and conclusions).
Problems of the
introduction are failures of the following: stating the reason for the study,
reviewing previous studies, indicating potential contribution of the present
study, giving the background and historical perspective, stating the study
population, and stating the study hypothesis.
Problems of
study design are the following: going on a fishing expedition without a prior
hypothesis, study design not appropriate for the hypothesis tested, lack of a
comparison group, use of an inappropriate comparison group, the Berkson's
fallacy, selection of cases and controls from different populations, and sample
size not big enough to answer the research questions.
The following terms are often confused with one another. ‘Measurement’ is
using instruments. ‘Calculation’ deals with numbers and formulas. ‘Estimation’
is used in two senses: as an approximation in measurements or as computation of
statistical parameters. ‘Determination’ is a general term for getting to a
conclusion by use of the methods above. The term ‘study’ is generic and can
be confused with ‘experiment’, which refers to only some types of studies.
Problems in
data collection are: missing data due to incomplete coverage, loss of
information due to censoring and loss to follow-up, poor documentation of data
collection, and methods of data
collection inappropriate to the study design.
Problems of data
analysis are: failure to state the type of hypothesis testing (p
value or confidence interval), use of the wrong statistical tests or formulas,
drawing inappropriate conclusions, use of parametric tests for non-normal data,
multiple comparisons or multiple significance testing, failure to assess
errors, failure to assess the normality of the data, use of inappropriate
scales and tests, and confusing continuous and discrete scales.
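One simple guard against the multiple significance testing problem mentioned above is the Bonferroni correction, in which each of m tests is judged against alpha / m instead of alpha. The notes do not name a correction method, so this choice, the function name, and the p-values are assumptions made for illustration.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag which p-values remain significant after a Bonferroni correction."""
    threshold = alpha / len(p_values)  # stricter cut-off shared by all tests
    return [p <= threshold for p in p_values]


# Hypothetical p-values from four tests on the same data set.
print(bonferroni_significant([0.01, 0.04, 0.03, 0.005]))
```

With four tests the cut-off becomes 0.0125, so results that look significant at 0.05 when taken one at a time may no longer be so.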
Problems in
reporting results are: selective reporting of favorable results, numerators
without denominators, inappropriate denominators, numbers that do not add up,
tables not labeled properly or completely, numerical inconsistency (rounding,
decimals, and units), stating results as mean +/- 2SD for non-normal data, stating
p values as inequalities instead of the exact values, and missing degrees of
freedom and confidence limits.
Problems of the
conclusion are: repeating the results section, extrapolating beyond the data,
and failing to do the following: discuss the consistency of the conclusions with
the data and the hypothesis, discuss the shortcomings and limitations of the
study, evaluate statistical conclusions in view of testing errors,
assess bias (misclassification, selection, and confounding), assess
precision (lack of random error), and assess validity (lack of
systematic error).
Internal
validity is achieved when the study is internally consistent and the results
and conclusions reflect the data. External validity is generalizability (i.e.
how far the findings of the present study are applicable to other
situations) and is achieved by several independent studies showing the same
result. A study may be unable to detect the outcome
of interest because of an insufficient period of follow-up, inadequate sample size,
or inadequate power.
ABUSE or MISUSE
OF STATISTICS
Statistics can be abused by incomplete and
inaccurate documentation of results as well as selection of a favorable rate
and ignoring unfavorable ones. This is done by 'playing' either with the
numerator or the denominator. The scales of numerators and denominators can be
made artificially wider or narrower giving false and misleading impressions.
Statistical results are misleading in the following situations: (a) violating
the principle of parsimony, (b) study objective unclear and not reflected in
the study hypothesis, (c) fuzzy, inconsistent, and subjective definitions (of
cases, non-cases, the exposed, the non-exposed, comparison groups, exposure,
method of measurement), (f) incomplete information on response rates and
missing data.
SCIENTIFIC
WRITING
The goal of scientific writing is clarity. The
following should be observed about sentences: keep sentences short and concise,
use personal pronouns, check subject-verb agreement (a common mistake), prefer
the active to the passive voice, organize parallel ideas properly, and use
parentheses properly.
A paragraph must start with a short and simple topic
sentence that is an overview of the message contained in that paragraph. Each
paragraph should convey only one message. The sentences following the topic
sentence provide details and support for the topic sentence. Ideas in a
paragraph should be presented in the right order with no missing steps using
one of the following alternatives: least to most important, most to least
important, concise to detailed, chronological order, problem followed
by solution, or solution followed by problem. Links and
transitions such as ‘which is’ should
be used when moving from one group of ideas to another to ensure continuity in
the paragraph. There must be consistency in the order in which information is
mentioned. If certain objects were mentioned in a certain order in the
introduction, they must be mentioned in the same order all through the writing.
The writer should maintain a consistent viewpoint all through the paper and not
appear to be jumping from point to point. Important messages must be given
emphasis.
The purpose of the title is to identify the main
topic or message of the paper so as to attract readers. A good title is
unambiguous, concise, and contains important words. It should contain the
following: independent variable(s), dependent variable(s), the study subjects
or materials, and statement of the main message like ‘to study the effect of’,
‘to determine’ etc.
The abstract is an overview of the report with a few
significant details. It should be written to be read by both those who read the
full paper and those who do not read the full paper. Normally the abstract
should not exceed 250 words. The abstract should mirror the sections of the
paper: introduction, materials & methods, results, and discussion. The
present tense is used to state the research hypothesis and the answer. The past
tense is used for the experiment. An abstract is accompanied by keywords that
are used for indexing.
The introduction should be short. It should start
with stating the research question or research hypothesis and then go on to
elaborate. The transition should be from the known to the unknown and from the
big picture to the detail. The introduction should mention the type of study
and the study subjects or materials (substances, animals, and persons). In some
cases the introduction may briefly mention the proposed experimental approach
to answering the research question. Results should not be mentioned in the
introduction. The introduction should
state whether the work is new or original.
The aim of the materials and methods section is to
describe the experimental techniques in detail sufficient for another trained
scientist to replicate the procedures. The order of presentation is different
for animal and clinical studies. For animal studies the order is: materials and
animals, preparation, study design, interventions, methods of measurement, calculations,
and data analysis. For clinical studies the order is: study subjects, inclusion
criteria, exclusion criteria, study design, interventions, method of
measurement, calculations, and data analysis. Independent and dependent
variables should be identified. Intermediate results can be put in the
materials and methods section. Final results should be put only in the results
section. Details of sample size determination should be provided.
The results section presents the findings of the
procedures carried out in the methods section. It should be brief and to the
point. A distinction must be made between results and data. Result refers to
summary information obtained from data analysis. Results of hypothesis-based
studies should be in the past tense. Data of descriptive studies should be in
the present tense. Data is the actual numerical information often presented in
a summarized form. The result is presented followed by presentation of
supporting data. Data are presented in the form of tables and diagrams
(figures, bar diagrams, graphs, pie-charts, maps etc). Presentation of
numerical data in text should be kept to a minimum. Only results relevant to
the research hypothesis should be presented. Both negative and positive results
are presented. It is considered scientific fraud to present only those results
that the author thinks favor a particular hypothesis. The results section is
written in chronological order. The most important results are presented before
the least important. Magnitude of change should be presented as a summary
statistic such as percentage change instead of presenting the raw data. Summary
statistics normally used are the mean, the median, and the proportion. The
mean should be presented properly as mean +/- standard deviation or standard
error of the mean (SD or SE) with units of measure indicated. Test statistics
normally reported are the chi-square and the t statistics. Actual p values should
be given instead of indicating <0.05 or >0.05. When specifying the sample
size the type of sample should be explained, for example ‘the sample was 20
rats’ instead of ‘the sample size was 20’. Emphasis can be put on some results
and not others. Not all the data from the study need be reported. Citing data
in the text takes less space but is more difficult to read. A topic sentence is
used to give an overview. Important results are put first.
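Reporting a mean as mean +/- SD with its unit and the standard error can be sketched as follows. The function name, the output wording, and the readings are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def summarize(values, unit):
    """Format a sample as mean +/- SD with unit, sample size, and standard error."""
    n = len(values)
    average = mean(values)
    sd = stdev(values)   # sample standard deviation
    se = sd / sqrt(n)    # standard error of the mean
    return f"{average:.1f} +/- {sd:.1f} {unit} (mean +/- SD, n = {n}; SE = {se:.1f})"


# Hypothetical systolic blood pressure readings.
print(summarize([120, 130, 125, 135, 140], "mmHg"))
```

Stating whether the figure after +/- is the SD or the SE matters, because the SE is always smaller and the two are easily mistaken for each other.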
Figures used to present results must have a strong
visual impact and must be simple. The following types of figures are used: line
graph, scatter-gram, bar graph, histogram, and the frequency polygon. The title
of the figure should reflect its contents. It must be labeled correctly.
Symbols must be defined. The names of variables and units of measurement must
be labeled appropriately. Tables must be properly titled and column headings
clearly indicated. Footnotes, subscripts, and superscripts can be used.
The discussion section states the research
hypothesis, answers it, and supports the answers using data from the current
study and other studies. It provides reasons to show that the answer to the
question is reasonable. It explores and explains possible sources of error and
bias. It also identifies and explains differences between the study results and
published results. As part of intellectual honesty it discusses the strengths
as well as the weaknesses of the study and how they impact on the
interpretation of the results. Issues of validity and precision are also
addressed. Also discussed is whether the result is new and how important it is.
References are used to acknowledge information
obtained from others. The references must be the most recent and most easily
available on the subject. Review articles are better than original articles.
They may be journal articles, books, PhD theses, abstracts of meetings, or
conference proceedings. The reference should be put immediately after the
relevant text. If there are several references in a sentence, cite each
reference at the relevant point and do not wait to put all of them at the end
of the sentence. References should be written using the Vancouver style which
is: Author. Title. Journal Year; Volume (number): starting page – ending page.
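The pattern above can be applied consistently with a small formatting helper. This sketch follows the pattern exactly as stated in these notes; the function name is hypothetical and the reference shown is a placeholder, not a real publication.

```python
def format_reference(authors, title, journal, year, volume, number, first_page, last_page):
    """Format one reference as: Author. Title. Journal Year; Volume (number): pages."""
    author_text = ", ".join(authors)
    return (f"{author_text}. {title}. {journal} {year}; "
            f"{volume} ({number}): {first_page}-{last_page}.")


# Placeholder values only; replace with the details of the article being cited.
print(format_reference(["Author AB", "Author CD"], "Title of the article",
                       "Journal Name", 2015, 10, 2, 100, 110))
```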