220113P - DESCRIPTIVE STATISTICS FOR CONTINUOUS DATA

Presented at the Research Methodology Winter Camp of AlMaarefa University on January 11, 2022 at 1.00pm by Omar Hasan Kasule MB ChB (MUK), MPH (Harvard), DrPH (Harvard) Professor of Epidemiology and Bioethics

 

CONCEPT OF AVERAGES (measures of central location):

  • Biological phenomena vary around the average. The average represents what is normal by being the point of equilibrium.
  • The average is a representative summary of the data using one value.
  • Three averages are commonly used: the mean, the mode, and the median.
  • The arithmetic mean is considered the most useful measure of central tendency in data analysis.
  • The median is gaining popularity. It is the basis of some non-parametric tests as will be discussed later.
  • The mode has very little public health importance.

 

DEFINITIONS OF COMMONLY USED AVERAGES:

  • The arithmetic mean is the sum of the observations' values divided by the total number of observations.
  • The mode is the value of the most frequent observation.
  • The median is the value of the middle observation in a series ordered by magnitude.
  • Mean = median = mode for symmetrical data with a single peak.
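
The three definitions above can be sketched in a few lines of Python using the standard `statistics` module. The blood pressure readings and variable names are hypothetical, chosen only for illustration.

```python
from statistics import mean, median, multimode

# Hypothetical systolic blood pressure readings (mmHg), for illustration only.
readings = [118, 120, 120, 122, 125, 130, 141]

avg = mean(readings)         # sum of the values divided by the number of observations
mid = median(readings)       # value of the middle observation in the ordered series
modes = multimode(readings)  # most frequent value(s); a list, because ties are possible

# A small symmetrical, single-peaked series: all three averages coincide.
symmetric = [1, 2, 3, 3, 3, 4, 5]
sym_mean = mean(symmetric)
sym_median = median(symmetric)
sym_modes = multimode(symmetric)

print(avg, mid, modes)
print(sym_mean, sym_median, sym_modes)
```

In the first series the single high reading pulls the mean above the median, which is one reason the median is often preferred for skewed data.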

 

CONCEPT OF VARIATION:

  • Variation arises from biological differences, measurement error, or changes over time.
  • Biological variation is more common than measurement variation.
  • Measures of variation are based either on the mean, such as the standard deviation and the z score, or on quantiles, such as percentiles.
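
As a minimal sketch of the two families of measures, the snippet below computes z scores (mean-based) and quartiles (quantile-based) for a hypothetical data set. It assumes the usual definition of the z score, the observation minus the mean, divided by the standard deviation. Quartiles are used here for brevity; passing `n=100` to `quantiles` would give percentiles instead.

```python
from statistics import mean, quantiles, stdev

# Hypothetical systolic blood pressure readings (mmHg), for illustration only.
readings = [118, 120, 120, 122, 125, 130, 141]

m = mean(readings)
sd = stdev(readings)  # sample standard deviation (n-1 denominator)

# z score: how many standard deviations each observation lies from the mean.
z_scores = [(x - m) / sd for x in readings]

# Quartiles: the three cut points that divide the ordered data into four parts.
quartiles = quantiles(readings, n=4)

print([round(z, 2) for z in z_scores])
print(quartiles)
```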

 

DEFINITION OF THE STANDARD DEVIATION, a measure of variation based on the mean:

  • The variance is the sum of the squared deviations of each observation from the mean, divided by n-1 for a sample or by n for a whole population. For large samples the two give nearly the same result.
  • The standard deviation, the commonest measure of variation, is the square root of the variance.
  • For normally distributed data, about 68% of observations are covered by mean +/- 1 SD.
  • About 95% of observations are covered by mean +/- 2 SD.
  • Virtually 100% of observations are covered by mean +/- 4 SD.
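
The sketch below works through the variance and standard deviation by hand and then checks the coverage figures against the standard normal distribution. The data and names such as `coverage` are hypothetical, and the coverage calculation assumes the data are normally distributed.

```python
from math import sqrt
from statistics import NormalDist, pvariance, variance

# Hypothetical systolic blood pressure readings (mmHg), for illustration only.
data = [118, 120, 120, 122, 125, 130, 141]

n = len(data)
m = sum(data) / n
sum_sq = sum((x - m) ** 2 for x in data)  # sum of squared deviations from the mean

var_n = sum_sq / n         # divisor n
var_n1 = sum_sq / (n - 1)  # divisor n-1
sd = sqrt(var_n1)          # standard deviation = square root of the variance


def coverage(k):
    """Proportion of a normal distribution lying within mean +/- k SD."""
    std_normal = NormalDist()  # mean 0, SD 1
    return std_normal.cdf(k) - std_normal.cdf(-k)


print(round(var_n, 2), round(var_n1, 2), round(sd, 2))
for k in (1, 2, 4):
    print(k, round(100 * coverage(k), 2))
```

The hand-computed values match `statistics.pvariance` (divisor n) and `statistics.variance` (divisor n-1). With only seven observations the two divisors give visibly different results; the gap shrinks as n grows.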