Integrated Medical Education Resources: 210429P

Presented at a workshop on Module II: Study Design - Clinical Research Program, held at King Fahad Medical City via Zoom on 29 April 2021 (10:00 am - 12:00 pm), by Professor Omar Hasan Kasule Sr., MB ChB (MUK), MPH (Harvard), DrPH (Harvard), Professor of Epidemiology and Bioethics.

LEARNING OBJECTIVES

Follow-up studies: Definition and types
Follow-up studies: Design and analysis
Follow-up studies: Strengths and weaknesses
Determination of the sample size

KEYWORDS and TERMS

Definition
Cohort: study, closed cohort, open cohort
Cumulative incidence
Follow-up: bias, study, ambispective, retrospective, prospective
Loss to follow-up

DEFINITION

A follow-up study (also called a cohort study, incidence study, prospective study, or longitudinal study) compares disease occurrence in exposed and non-exposed groups after a period of follow-up.
A follow-up study can be prospective (forward), retrospective (backward), or ambispective (both forward and backward).
In a nested case-control design, a case-control study is carried out within a larger follow-up study.
The follow-up cohorts may be closed (fixed cohort) or open (dynamic cohort).
Analysis of fixed cohorts is based on the cumulative incidence (CI) and that of open cohorts on the incidence rate (IR).
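
The distinction between the two measures can be sketched in a few lines of Python. This is only an illustration: the function names and all the numbers are hypothetical and do not come from the workshop material.

```python
def cumulative_incidence(new_cases, at_risk_at_start):
    """Closed (fixed) cohort: new cases divided by persons at risk at baseline."""
    if at_risk_at_start <= 0:
        raise ValueError("at_risk_at_start must be positive")
    return new_cases / at_risk_at_start


def incidence_rate(new_cases, person_time):
    """Open (dynamic) cohort: new cases divided by total person-time at risk."""
    if person_time <= 0:
        raise ValueError("person_time must be positive")
    return new_cases / person_time


# Hypothetical closed cohort: 1000 people enrolled together, 50 new cases.
ci = cumulative_incidence(50, 1000)

# Hypothetical open cohort: members enter and leave at different times,
# so the denominator is the person-years actually observed (4200 here).
ir = incidence_rate(50, 4200.0)

print(f"CI = {ci:.3f} over the follow-up period")
print(f"IR = {ir * 1000:.1f} per 1000 person-years")
```

The cumulative incidence is a proportion and needs a stated follow-up period, whereas the incidence rate carries units of cases per unit of person-time.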

STUDY DESIGN

The study population is divided into the exposed and unexposed populations.
A sample is taken from the exposed and another sample is taken from the unexposed.
Both the exposed and unexposed samples are followed for the appearance of disease.
The study may include matching (one-to-one or one-to-many), pre- and post-comparisons, multiple control groups, and stratification.

SOURCES of COHORTS

Special exposure groups such as factory workers.
Groups offering special resources such as health insurance subscribers.
Institutionalized groups such as the army and the police.

SOURCES of EXPOSURE INFORMATION

Existing records,
Interviews/questionnaires,
Medical examinations,
Laboratory tests for biomarkers,
Testing or evaluation of the environment.

OUTCOME ASSESSMENT

The time of occurrence of the outcome must be defined precisely.
The ascertainment of the outcome event must be standardized with clear criteria.
Follow-up can be achieved by letter, by telephone, and by surveillance of death certificates and hospitals.
Care must be taken to make sure that surveillance, follow-up, and ascertainment are the same for the two groups.

PROBLEMS of NON-RESPONSE in FOLLOW UP STUDIES

In non-random non-response on exposure, the risk ratio is valid but the distribution of exposure in the community is not valid.
In non-random non-response on outcome, the odds ratio is valid but the disease incidence rate is not valid.
There is a more complex situation when there is non-response on both exposure and outcome.
In general, random non-response is better than non-random or differential non-response.

PROBLEM of LOSS to FOLLOW UP

Loss to follow-up can be related to the outcome, the exposure, or both.
The consequences of loss to follow-up are similar to those of non-response.
In cases of regular follow-up, it is assumed that the loss occurred immediately after the last follow-up.
If the loss to follow-up is related to an event such as death, it can be assumed that the loss was halfway between the last observation and the death.
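
The two timing assumptions above determine how much person-time a lost subject contributes to the denominator. A minimal sketch follows, assuming time is measured in years since entry into the study; the function names and values are hypothetical.

```python
def person_time_if_lost(last_visit_time):
    """Regular follow-up: loss is assumed to occur right after the last visit,
    so the subject contributes time only up to that visit."""
    if last_visit_time < 0:
        raise ValueError("last_visit_time must be non-negative")
    return last_visit_time


def person_time_if_event(last_visit_time, event_time):
    """Loss related to an event such as death: the loss is assumed to fall
    halfway between the last observation and the event."""
    if event_time < last_visit_time:
        raise ValueError("event_time cannot precede last_visit_time")
    return last_visit_time + (event_time - last_visit_time) / 2.0


# Hypothetical subject last seen at 3.0 years who then dropped out.
lost = person_time_if_lost(3.0)

# Hypothetical subject last seen at 3.0 years whose death was recorded at 4.0 years.
died = person_time_if_event(3.0, 4.0)

print(f"Lost subject contributes {lost} person-years")
print(f"Event-related loss contributes {died} person-years")
```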

FIVE TYPES of BIAS CAN ARISE in FOLLOW UP STUDIES

Selection bias arises when the sample is not representative of the population.
Follow-up bias arises when loss to follow-up is unequal between the exposed and the unexposed, when disease occurrence leads to loss to follow-up, when people move out of the study area because of the exposure being studied, and when the observation of the two groups is unequal.
Information/misclassification bias arises due to measurement error or misdiagnosis.
Confounding bias usually arises from age and smoking, because both are associated with many diseases.
Post-hoc bias arises when cohort data is used to make observations that were not anticipated before.

STATISTICAL PARAMETERS

Both incidence and risk statistics can be computed.
The incidence statistics are the incidence rate and the cumulative incidence.
The risk statistics are either the risk difference or the various ratio statistics (the risk ratio, the rate ratio, the relative risk, or the odds ratio).
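
As an illustration, the risk statistics for count data can be computed as in the sketch below. It assumes a closed cohort with complete follow-up; the function name and the counts are hypothetical.

```python
def risk_statistics(exposed_cases, exposed_noncases,
                    unexposed_cases, unexposed_noncases):
    """Risk difference, risk ratio, and odds ratio from cohort counts."""
    if min(exposed_cases, exposed_noncases,
           unexposed_cases, unexposed_noncases) <= 0:
        raise ValueError("all four counts must be positive for this sketch")
    risk_exposed = exposed_cases / (exposed_cases + exposed_noncases)
    risk_unexposed = unexposed_cases / (unexposed_cases + unexposed_noncases)
    return {
        "risk_exposed": risk_exposed,
        "risk_unexposed": risk_unexposed,
        "risk_difference": risk_exposed - risk_unexposed,
        "risk_ratio": risk_exposed / risk_unexposed,
        "odds_ratio": (exposed_cases * unexposed_noncases)
                      / (exposed_noncases * unexposed_cases),
    }


# Hypothetical cohort: 30 cases among 100 exposed, 10 cases among 100 unexposed.
stats = risk_statistics(30, 70, 10, 90)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

The odds ratio moves away from the risk ratio as the outcome becomes more common, which is one reason a follow-up study usually reports the risk ratio directly.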

ADVANTAGES of the FOLLOW-UP DESIGN

True risk ratio based on incidence rates,
The time sequence is clear since exposure precedes disease,
Incidence rates can be determined directly,
Several outcomes of the same exposure can be studied simultaneously.

DISADVANTAGES of the FOLLOW-UP DESIGN

Loss of subjects and interest due to long follow-up,
Inability to compute the prevalence rate of the risk factor,
Use of large samples to ensure enough cases of the outcome,
High cost: cost can be decreased by using existing monitoring/surveillance systems, historical cohorts, general population information instead of studying the unexposed population, and the nested case-control design.
Follow-up studies are not suitable for the study of diseases with low incidence.

2 x 2 TABLE

	Exposed	Unexposed	Total
Cases	a	b	m₁
Noncases	c	d	m₀
Person time	T₁	T₀	T

STATISTICAL COMPUTATIONS

The incidence rate can be computed separately for each of the 2 groups.
The incidence rate is defined as the number of cases of a disease divided by the total person-time of follow-up; thus IR = n/PT.
If the period of follow-up is long, IR can be computed for several time intervals; the interval incidence rate is defined as IR_j = n_j/PT_j.
The MH chi-square is computed as χ² = (a – m₁T₁/T)² / (m₁T₁T₀/T²).
The incidence rate difference, IRD, is computed as a/T₁ – b/T₀, with test-based 95% confidence limits of IRD +/- 1.96 {(IRD)²/χ²}^1/2 = IRD (1 +/- 1.96/χ), where χ is the square root of the MH chi-square.
The incidence rate ratio, IRR, is computed as (a/T₁) / (b/T₀), with 95% confidence limits computed as (IRR)^(1 +/- 1.96/χ).
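
The person-time computations can be sketched as follows. The sketch rests on two assumptions: that the MH chi-square takes the usual person-time form (a – m₁T₁/T)² / (m₁T₁T₀/T²), and that the confidence limits are test-based, using χ, the square root of that statistic. The function name and the example data are hypothetical.

```python
import math


def person_time_analysis(a, b, t1, t0, z=1.96):
    """a, b: cases among exposed / unexposed; t1, t0: their person-time."""
    if min(a, b) <= 0 or min(t1, t0) <= 0:
        raise ValueError("cases and person-time must be positive in both groups")
    m1 = a + b                      # total cases
    t = t1 + t0                     # total person-time
    ir1, ir0 = a / t1, b / t0       # group-specific incidence rates
    ird = ir1 - ir0                 # incidence rate difference
    irr = ir1 / ir0                 # incidence rate ratio
    expected_a = m1 * t1 / t        # expected exposed cases under the null
    variance_a = m1 * t1 * t0 / t ** 2
    chi_square = (a - expected_a) ** 2 / variance_a
    chi = math.sqrt(chi_square)
    if chi == 0:
        raise ValueError("test-based limits are undefined when chi is zero")
    # Assumed test-based limits; sorted so the lower limit always comes first.
    ird_limits = sorted((ird * (1 - z / chi), ird * (1 + z / chi)))
    irr_limits = sorted((irr ** (1 - z / chi), irr ** (1 + z / chi)))
    return {"ir1": ir1, "ir0": ir0, "ird": ird, "irr": irr,
            "chi_square": chi_square,
            "ird_limits": ird_limits, "irr_limits": irr_limits}


# Hypothetical data: 30 exposed cases and 10 unexposed cases,
# each group observed for 1000 person-years.
results = person_time_analysis(a=30, b=10, t1=1000.0, t0=1000.0)
print(f"IRD = {results['ird']:.4f}, limits = {results['ird_limits']}")
print(f"IRR = {results['irr']:.2f}, limits = {results['irr_limits']}")
print(f"MH chi-square = {results['chi_square']:.2f}")
```

Test-based limits are an approximation that works best near the null value. Other interval methods exist and may be preferred when the association is strong.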

HISTORICAL PROSPECTIVE FOLLOW UP^[1]

In 1951 the British Medical Association forwarded to all British doctors a questionnaire about their smoking habits, and 34 440 men replied. With few exceptions, all men who replied in 1951 have been followed for 20 years.
The certified causes of all 10 072 deaths and subsequent changes in smoking habits were recorded.
The ratio of the death rate among cigarette smokers to that among lifelong non-smokers of comparable age was, for men under 70 years, about 2:1, while for men over 70 years it was about 1.5:1.
These ratios suggest that between a half and a third of all cigarette smokers will die because of their smoking, if the excess death rates are actually caused by smoking.
To investigate whether this is the case, the relation of many different causes of death to age and tobacco consumption was examined, as were the effects of giving up smoking. Smoking caused death chiefly by heart disease among middle-aged men (and, with a less extreme relative risk, among old men), lung cancer, chronic obstructive lung disease, and various vascular diseases.
The distinctive features of this study were the completeness of follow-up, the accuracy of death certification, and the fact that the study population as a whole reduced its cigarette consumption substantially during the period of observation. As a result, lung cancer grew relatively less common as the study progressed, but other cancers did not, thus illustrating in an unusual way the causal nature of the association between smoking and lung cancer.
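
The statement that between a half and a third of smokers would die because of their smoking can be reproduced from death-rate ratios of 2 and 1.5. The sketch below assumes the reasoning corresponds to the standard attributable fraction among the exposed, (RR - 1)/RR; the cited abstract does not spell out the calculation, and the function name is hypothetical.

```python
def attributable_fraction_exposed(rate_ratio):
    """Share of deaths among the exposed attributable to the exposure,
    assuming the excess rate is entirely caused by the exposure."""
    if rate_ratio <= 0:
        raise ValueError("rate_ratio must be positive")
    return (rate_ratio - 1.0) / rate_ratio


under_70 = attributable_fraction_exposed(2.0)   # death-rate ratio of about 2:1
over_70 = attributable_fraction_exposed(1.5)    # death-rate ratio of about 1.5:1

print(f"Men under 70: {under_70:.2f} of smokers' deaths attributable to smoking")
print(f"Men over 70:  {over_70:.2f} of smokers' deaths attributable to smoking")
```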

HISTORICAL PROSPECTIVE FOLLOW UP^[2]

In order to investigate the nature of the association between involuntarily delayed first birth and risk of breast cancer, 1083 white women who had been evaluated and treated for infertility from 1945-1965 were followed prospectively through April 1978 to ascertain their breast cancer incidence.
These women were categorized as to the cause of infertility into two groups, those with endogenous progesterone deficiency (PD) and those with nonhormonal causes (NH).
Women in the PD group had 5.4 times the risk of premenopausal breast cancer compared to women in the NH group.
This excess risk could not be explained by differences between the two groups in ages at menarche or menopause, history of oral contraceptive use, history of benign breast disease or age at first birth.
Women in the PD group also experienced a 10-fold increase in deaths from all malignant neoplasms compared to the NH group.
The incidence of postmenopausal breast cancer did not differ significantly between the two groups.

CONTEMPORARY PROSPECTIVE FOLLOW UP^[3]

Postprandial hypotension (PPH) is common among the elderly. However, it is unknown whether the presence of PPH can predict the development of new cardiovascular disease (CVD) in the elderly over the long term.
This study aimed to prospectively evaluate the presence of PPH and the development of new CVD within a 36-month period in 94 community-dwelling elderly people without a history of CVD.
PPH was diagnosed in 47 (50.0%) participants at baseline and in 7 (7.4%) during the follow-up period. Thirty participants (31.9%) developed new CVD within 36 months.
The multivariate analysis indicated that the relationship between PPH and the development of new CVD remained even after controlling for other variables as covariates.
In conclusion, the presence of PPH can predict the development of new CVD. Elderly people with PPH may require close surveillance to prevent CVD.

CONTEMPORARY RETROSPECTIVE FOLLOW UP^[4]

The aim of this study was to quantify the risk of hair loss with different antidepressants. A retrospective cohort study design using a large health claims database in the USA from 2006 to 2014 was utilized.
A cohort of new users and mutually exclusive users of fluoxetine, fluvoxamine, sertraline, citalopram, escitalopram, paroxetine, duloxetine, venlafaxine, desvenlafaxine, and bupropion were followed to the first diagnosis of alopecia.
The cohort was comprised of 1 25 40 new users of fluoxetine, fluvoxamine, sertraline, citalopram, escitalopram, paroxetine, duloxetine, venlafaxine, desvenlafaxine, and bupropion, with sertraline the most commonly prescribed (N=190 27) and fluvoxamine (N=3010) the least prescribed.
Compared with bupropion, all other antidepressants had a lower risk of hair loss.
The results of this large population-based cohort study suggest an increase in the risk of hair loss with bupropion compared with selective serotonin reuptake inhibitors and selective norepinephrine reuptake inhibitors, whereas paroxetine had the lowest risk.

COMPARISON of RETROSPECTIVE vs PROSPECTIVE FOLLOW UP

Retrospective follow-up is based on information in old stored records or analysis of biological samples.
Prospective follow-up is based on new information generated during the study period.
The quality of data in retrospective follow-up is difficult to ascertain because the researcher was not present when the data were recorded.
The quality of data in prospective follow-up can be controlled and ascertained.
Confounders are more difficult to identify and control in retrospective follow-up.

COMPARISON of FOLLOW-UP and CASE-CONTROL STUDIES

	FOLLOW-UP STUDY	CASE-CONTROL STUDY
Source of data	New data collection	Uses existing data + additional information as needed
Sample size	Big numbers	Small numbers
Observer bias	Knowledge of exposure affects the report of or search for a diagnosis	Knowledge of disease affects the interview or report of exposure
Respondent bias	Unlikely: response on exposure given before illness	Likely: response on exposure given after illness leads to recall bias
Risk Estimates & comparisons	Pr(D+/E+) vs Pr(D+/E-)	Pr(E+/D+) vs Pr(E+/D-)
Study duration	Usually long (more than 1 year)	Usually brief
Disease incidence	Best for diseases with high incidence	Good for both low- and high-incidence diseases
Cost of study	Expensive	Cheap
Sampling	Probability sample easy to select	A probability sample of controls is difficult to select, so controls are chosen to resemble cases except for the disease
Study starting point	Exposure initiation	Disease diagnosis
Sample size determinant	Disease frequencies	Exposure frequencies

COMPARISON of FOLLOW-UP and EXPERIMENTAL STUDIES

	FOLLOW-UP STUDY	EXPERIMENTAL
Source of data	New data collection	New data collection
Observer bias	Knowledge of exposure affects the report of diagnosis	Knowledge of allocation group affects the assessment of the outcome
Respondent bias	Unlikely: response on exposure given before illness	Unlikely; the outcome is assessed and not asked about
Risk Estimates & comparisons	Pr(D+/E+) vs Pr(D+/E-)	Pr(D+/E+) vs Pr(D+/E-)
Study duration	Usually long (more than 1 year)	Usually long (more than 1 year)
Disease incidence	Best for diseases with high incidence	Best for situations with a high occurrence of the outcome
Cost of study	Expensive	Expensive
Sampling	Probability sample easy to select	Randomization and not probability sample used
Study starting point	Exposure initiation	Point of randomization
Ethical issues	Fewer	Many
Sample size determinant	Disease frequencies	Outcome frequency

REFERENCES:

[1] Doll R, Peto R. Mortality in relation to smoking: 20 years' observations on male British doctors. Br Med J. 1976 Dec 25;2(6051):1525-36.
[2] Cowan LD, Gordis L, Tonascia JA, Jones GS. Breast cancer incidence in women with a history of progesterone deficiency. Am J Epidemiol. 1981 Aug;114(2):209-17.
[3] Jang A. Postprandial Hypotension as a Risk Factor for the Development of New Cardiovascular Disease: A Prospective Cohort Study with 36 Month Follow-Up in Community-Dwelling Elderly People. J Clin Med. 2020 Jan 27;9(2).
[4] Etminan M, Sodhi M, Procyshyn RM, Guo M, Carleton BC. Risk of hair loss with different antidepressants: a comparative retrospective cohort study. Int Clin Psychopharmacol. 2018 Jan;33(1):44-48.

210429P - FOLLOW-UP (COHORT) DESIGN