1005P- RESEARCH QUESTIONS AND HYPOTHESES

Presented at a Training Workshop on Research Methodology held at the Faculty of Medicine King Fahad Medical City 18th May 2010 by Dr Omar Hasan Kasule Sr MB ChB (MUK), MPH (Harvard), DrPH (Harvard) Professor of Epidemiology and Bioethics omarkasule@yahoo.com.

1.0 PURPOSES OF RESEARCH
The main purposes of epidemiological research can be listed as exploration, description, explanation, and prediction. Exploratory studies are preliminary and have the objective of obtaining basic information about a disease and its potential causes in order to enable formulation of causal hypotheses that can be tested in more sophisticated studies. Descriptive studies characterize a disease in terms of place, time, and person. Explanatory studies seek to establish causal relations between a disease and its risk factors. Prediction uses existing epidemiological profiles of a disease and its exposures to predict future disease patterns.

2.0 STEPS OF AN EPIDEMIOLOGIC INVESTIGATION
An epidemiological investigation proceeds in several stages. It starts by identifying the problem, i.e., a decision has to be made that a public health or medical problem exists. This is followed by a description of the extent and distribution of the problem. Hypotheses are then formulated about the causes of the problem using the procedures of the scientific method. Appropriate studies are designed to test the hypotheses. Epidemiological information is sourced from existing data or from studies (observational or experimental). An epidemiologic study involves data collection, data analysis, and data interpretation. Biostatistics is the technology of the scientific method that enables sophisticated data analysis and interpretation.

Epidemiological methodology, following the scientific method, is empirical, inductive, and refutative. Epidemiology relies on and respects only empirical findings. Empiricism refers to reliance on physical proof. Induction is building a theory from several individual observations. Refutation means withholding acceptance of a supposition until attempts to disprove it have failed. Epidemiological investigation is not as deterministic as laboratory investigation, but it is comparatively cheap and easy to carry out.

3.0 RESEARCH QUESTIONS & CONCLUSIONS: STATISTICAL vs SUBSTANTIVE
An investigator starts with a substantive question. This is then formulated as a statistical question. Data are collected and analyzed to answer the statistical question. The answer to the statistical question is the statistical conclusion. The investigator uses the statistical conclusion, together with other available knowledge, to reach a substantive conclusion. Statistics therefore gives statistical and not substantive answers.

A substantive question is the subject matter stated in ordinary language. Technical terminology may or may not be used. The less technical the formulation, the better, so that statisticians who are not specialists in the subject matter can understand it. Care must be taken that accuracy and exactness are not sacrificed for the sake of simplification.

A statistical question is the substantive question restated in statistical language. Since the language of statistics is mathematical, the statistical question is stated in terms of numbers, parameters, relations of equality, and relations of inequality.

A statistical conclusion is the result of mathematical manipulation of parameters or data. Statistical conclusions are made about groups and not individuals. Any inference to the individual is to a hypothetical individual. In other words, the statistical conclusion is depersonalized.

A substantive conclusion is the translation of the statistical conclusion back to normal language to answer the substantive question that was posed at the start.
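The chain from substantive question to substantive conclusion can be sketched in code. The example below is only an illustration: the counts are invented, the 0.05 threshold is an assumed convention, and the pooled two-proportion z-test is one of several tests that could be used for such a question.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(events_a, n_a, events_b, n_b):
    """Two-sided z-test of H0: p_a = p_b, using the pooled proportion."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_a - p_b, z, p_value


# Substantive question (hypothetical): is the disease more common
# among exposed people than among unexposed people?
# Statistical question: H0: p_exposed = p_unexposed
#                       HA: p_exposed != p_unexposed
# Hypothetical data: 30 cases among 200 exposed, 15 among 200 unexposed.
diff, z, p_value = two_proportion_z_test(30, 200, 15, 200)

ALPHA = 0.05  # assumed significance level
statistical_conclusion = "reject H0" if p_value < ALPHA else "do not reject H0"

print(f"difference in proportions = {diff:.3f}, z = {z:.2f}, p = {p_value:.4f}")
print("Statistical conclusion:", statistical_conclusion)
# The substantive conclusion is not computed: the investigator reaches it
# by combining the statistical conclusion with other knowledge.
```

The code stops at the statistical conclusion on purpose, since the step back to ordinary language depends on subject-matter judgment rather than on arithmetic.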

4.0 FORMULATION OF HYPOTHESES
Epidemiology, as a scientific discipline, uses the procedures of the scientific method. The procedures require stating a hypothesis, collecting and analyzing data to test the hypothesis, and reaching conclusions about the hypothesis.

Knowledge improves continuously through generating and testing hypotheses. A hypothesis is based on previous knowledge or data. It may also be based on purely theoretical or intuitive considerations.

An epidemiological hypothesis is formulated to relate two phenomena: the disease and the putative cause of the disease (the exposure or risk factor). Four methods are generally used in such a formulation: difference, agreement, concomitant variation, and analogy.

In the method of difference, a causal hypothesis is generated by looking for the difference between two situations, one that leads to disease and one that does not. The difference(s) between the two situations may be the putative cause of the disease.

The method of agreement looks for similarities between different situations that lead to the same disease. The common factor on which the similarity or agreement is based could be the putative cause of the disease.

The method of concomitant variation looks at how disease occurrence changes with the putative causal factor. If variation in the factor is always associated with concomitant variation in disease occurrence, then that factor is likely to be etiologically important.

The method of analogy is used to generate a causal hypothesis by looking at a causal relation between two phenomena and understanding its mechanism. If that mechanism is relevant to another disease situation, then the cause in the first instance could also be the cause in the second instance.
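The methods of difference, agreement, and concomitant variation can be illustrated with a minimal sketch. The function names and the food-borne outbreak data below are hypothetical. The concomitant variation check is also simplified: it only tests whether disease occurrence rises at every step up in exposure. The method of analogy relies on understanding a mechanism and is not reduced to code here.

```python
def method_of_difference(with_disease, without_disease):
    """Factors present in the situation with disease but absent from the other."""
    return set(with_disease) - set(without_disease)


def method_of_agreement(situations_with_disease):
    """Factors shared by every situation that led to the disease."""
    factor_sets = [set(s) for s in situations_with_disease]
    return set.intersection(*factor_sets)


def varies_concomitantly(rates_by_increasing_exposure):
    """True if disease occurrence rises at every step up in exposure level."""
    rates = list(rates_by_increasing_exposure)
    return all(later > earlier for earlier, later in zip(rates, rates[1:]))


# Hypothetical outbreak: foods eaten at a meal followed by illness
# and at a meal that was not.
outbreak_meal = {"rice", "chicken", "salad", "custard"}
uneventful_meal = {"rice", "chicken", "salad"}
print(method_of_difference(outbreak_meal, uneventful_meal))

# Hypothetical: three different meals, each followed by the same illness.
outbreak_meals = [
    {"rice", "custard", "juice"},
    {"bread", "custard", "salad"},
    {"chicken", "custard"},
]
print(method_of_agreement(outbreak_meals))

# Hypothetical attack rates for 0, 1, 2 and 3 servings of the suspect food.
print(varies_concomitantly([0.02, 0.15, 0.40, 0.70]))
```

Each function only produces a candidate cause for a hypothesis. Whether the candidate is truly causal still has to be tested in a properly designed study.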

Hypotheses must be specific and testable. The empirical data used to test them come from experimentation and observation. Testing a hypothesis can end in rejection or non-rejection, but never in acceptance. A new hypothesis is generated from the conclusion and the process is repeated. Use of the scientific method implies, among other things, that epidemiological knowledge is never stable. It keeps changing and getting nearer the truth as new information is discovered.

5.0 TYPES OF HYPOTHESES
A hypothesis is a statement of belief in something. Unlike other types of beliefs, scientific beliefs are subject to experimental verification. Two hypotheses are always stated for proper scientific investigation: the null and the alternative hypotheses.

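The null hypothesis, H0, states that there is no difference between the two comparison groups and that any apparent difference is due to sampling error. The research hypothesis, i.e., the relation the investigator expects to find, is usually expressed by the alternative hypothesis rather than by H0.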

The alternative hypothesis, HA, disagrees with the null hypothesis and states that there is a real difference not explained by sampling error. H0 and HA are complementary and exhaustive in that between them they cover all the possibilities. HA could be vague. When H0 is rejected, we do not accept HA; we only fail to reject it.

The aim of hypothesis testing is to reach a conclusion about H0. The conclusion takes the form of rejecting or not rejecting the hypothesis. If H0 is rejected, HA becomes the new working hypothesis. A hypothesis cannot be proved; the test only gives an objective measure of how compatible the observed data are with it.
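The idea that H0 attributes an apparent difference to sampling error can be made concrete with a permutation test. This is a minimal sketch under stated assumptions, not a prescribed method: the measurements are invented, the function name is hypothetical, and the 0.05 threshold is an assumed convention. Group labels are shuffled many times to see how often chance alone produces a difference at least as large as the one observed.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Estimate a two-sided p-value for H0: no difference between group means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical measurements in two comparison groups.
exposed = [5.1, 5.6, 6.0, 6.3, 6.8, 7.0]
unexposed = [4.0, 4.2, 4.5, 4.9, 5.0, 5.3]

ALPHA = 0.05  # assumed significance level
p_value = permutation_p_value(exposed, unexposed)
# The decision is always about H0, and is never phrased as acceptance.
decision = "reject H0" if p_value < ALPHA else "do not reject H0"
print(f"p = {p_value:.4f}: {decision}")
```

The only two outcomes the code can report are "reject H0" and "do not reject H0", which mirrors the rule that a hypothesis is never accepted or proved.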

6.0 IMPLICATIONS OF STATISTICAL SIGNIFICANCE
Implications of a statistically significant result
· H0 is false
· H0 is rejected
· Observations are not compatible with H0
· Observations are not due to sampling variation
· Observations reflect a real biological phenomenon

Implications of a result that is not statistically significant
· H0 is not false (we do not say true)
· H0 is not rejected
· Observations are compatible with H0
· Observations are due to sampling variation or random errors of measurement
· Observations are artificial or apparent and do not reflect a real biological phenomenon

Statistical and practical significance
A statistically significant result may have no clinical or practical importance. This may be because (a) other factors are involved that were not studied, or (b) the measurements are not valid. Conversely, a clinically important difference may fail to reach statistical significance for two main reasons: (a) a small sample size, or (b) measurements that are not discriminating enough.
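The effect of sample size on this distinction can be shown with a short sketch. All numbers are hypothetical: a blood pressure difference in mmHg, an assumed known standard deviation of 15 in both groups, an assumed clinically important difference of 5 mmHg, and a simple two-sample z-test chosen for brevity.

```python
from math import sqrt
from statistics import NormalDist

ALPHA = 0.05                 # assumed significance level
CLINICALLY_IMPORTANT = 5.0   # hypothetical threshold, in mmHg


def two_sided_p_value(mean_difference, sd, n_per_group):
    """Two-sample z-test p-value, assuming a known common SD and equal group sizes."""
    standard_error = sd * sqrt(2 / n_per_group)
    z = mean_difference / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


def assess(mean_difference, sd, n_per_group):
    """Report statistical and clinical significance side by side."""
    p_value = two_sided_p_value(mean_difference, sd, n_per_group)
    return {
        "p_value": p_value,
        "statistically_significant": p_value < ALPHA,
        "clinically_important": abs(mean_difference) >= CLINICALLY_IMPORTANT,
    }


# Very large study, trivial difference of 1 mmHg.
large_study = assess(mean_difference=1.0, sd=15.0, n_per_group=10_000)
# Very small study, large difference of 10 mmHg.
small_study = assess(mean_difference=10.0, sd=15.0, n_per_group=8)

print("large study:", large_study)
print("small study:", small_study)
```

Under these assumptions the large study is statistically significant without being clinically important, while the small study shows a clinically important difference that does not reach statistical significance.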