200621P - VARIABLES AND STATISTICAL DISTRIBUTIONS

Presented in the Biostatistics module of the Clinical Research Coordinators Course on June 21, 2020, 13.00-15.00 by Professor Omar Hasan Kasule MB ChB (MUK), MPH (Harvard), DrPH (Harvard), Professor of Epidemiology and Bioethics, King Fahad Medical City


CONSTANTS AND VARIABLES

A constant has only one unvarying value under all circumstances, for example π and c (the speed of light).

A random variable can be qualitative (descriptive with no intrinsic numerical value) or quantitative (with intrinsic numerical value). 

A random quantitative variable results when numerical values are assigned to results of measurement or counting. It is called a discrete random variable if the assignment is based on counting. It is called a continuous random variable if the numerical assignment is based on measurement. 

The numerical continuous random variable can be expressed as fractions and decimals. The numerical discrete random variable can only be expressed as whole numbers. 


VARIABLES AND STATISTICAL DISTRIBUTIONS

Statistical distributions are graphical representations of mathematical functions of random variables. Each random variable has a corresponding statistical distribution that specifies all possible values of the variable with their corresponding probabilities.
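
As an illustration of a distribution listing every possible value with its probability, here is a minimal Python sketch. The fair six-sided die and the names `die_distribution`, `is_valid_distribution`, and `probability_of` are assumptions chosen for the example and are not part of the lecture.

```python
from fractions import Fraction

# Hypothetical example: the score on one roll of a fair six-sided die.
# Each possible value is paired with its probability.
die_distribution = {face: Fraction(1, 6) for face in range(1, 7)}


def is_valid_distribution(dist):
    """Return True if every probability is non-negative and they sum to 1."""
    return all(p >= 0 for p in dist.values()) and sum(dist.values()) == 1


def probability_of(dist, event):
    """Add up the probabilities of all values for which event(value) is true."""
    return sum(p for value, p in dist.items() if event(value))


if __name__ == "__main__":
    print(is_valid_distribution(die_distribution))
    print(probability_of(die_distribution, lambda v: v % 2 == 0))
```

Exact fractions are used so that the probabilities add up to exactly 1 without rounding error.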

Each statistical distribution is associated with a specific statistical analytic technique.

Choice of the technique of statistical analysis depends on the type of statistical distribution. The statistical distribution depends on the type of random variable.


QUALITATIVE (NON NUMERICAL) RANDOM VARIABLES

Qualitative variables (nominal, ordinal, and ranked) are attribute or categorical with no intrinsic numerical value. 

The nominal has no ordering.

The ordinal has ordered categories.

The ranked has observations arrayed in ascending or descending order of magnitude.


QUANTITATIVE (NUMERICAL) DISCRETE RANDOM VARIABLES

The discrete random variables are the Bernoulli, the binomial, the multinomial, the negative binomial, the Poisson, the geometric, the hypergeometric, and the uniform. Each random variable is associated with a statistical distribution.

The Bernoulli is the number of successes in a single unrepeated trial with only 2 outcomes. [Line graph of a Bernoulli distribution]

The Binomial is the number of successes in 2 or more independent trials each with a dichotomous outcome. [Line graph of a binomial distribution]
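
The probability of each possible number of successes can be computed directly. The sketch below uses the standard binomial formula; the function name `binomial_pmf` and the parameters `k` (successes), `n` (trials), and `p` (probability of success per trial) are names chosen for this example, and the trials are assumed to be independent with the same `p`.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k), where X is the number of successes in n independent trials."""
    if not 0 <= k <= n:
        return 0.0
    # Number of ways to place k successes among n trials, times the
    # probability of any one such arrangement.
    return comb(n, k) * p**k * (1 - p) ** (n - k)


if __name__ == "__main__":
    # Distribution of the number of successes in 4 trials with p = 0.5.
    for k in range(5):
        print(k, binomial_pmf(k, 4, 0.5))
```

Setting `n = 1` gives the Bernoulli case of a single trial.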

The Multinomial is the number of times each outcome occurs in several independent trials, with each trial having more than 2 possible outcomes. [Line graph of a multinomial distribution]

The Negative binomial is the total number of repeated trials until a given number of successes is achieved. [Line graph of a negative binomial distribution]

The Poisson is the number of events for which no upper limit can be assigned a priori. It has to do with rare events in a given time or space. [Line graph of a Poisson distribution]
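
A short sketch of the Poisson probabilities follows, using the standard formula. The function name `poisson_pmf` and the parameter `mean_rate` (the average number of events per unit of time or space) are names assumed for this example.

```python
from math import exp, factorial


def poisson_pmf(k, mean_rate):
    """P(X = k) when events occur at an average of mean_rate per interval."""
    if k < 0:
        return 0.0
    return exp(-mean_rate) * mean_rate**k / factorial(k)


if __name__ == "__main__":
    # Hypothetical rare event averaging 2 occurrences per interval.
    for k in range(6):
        print(k, poisson_pmf(k, 2.0))
```

Because no upper limit applies, `k` can be any non-negative whole number, although the probabilities become very small for large `k`.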

The Geometric is the number of trials until the first success is achieved. [Line graph of a geometric distribution]
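
The Geometric and the Negative binomial are closely related: the Geometric is the special case of waiting for 1 success. The sketch below assumes independent trials with the same success probability `p`; the names `negative_binomial_pmf`, `geometric_pmf`, `n` (total trials), and `r` (required successes) are chosen for this example.

```python
from math import comb


def negative_binomial_pmf(n, r, p):
    """P(N = n), where N is the total number of trials needed for r successes."""
    if n < r or r < 1:
        return 0.0
    # The last trial must be a success; the other r - 1 successes
    # fall anywhere among the first n - 1 trials.
    return comb(n - 1, r - 1) * p**r * (1 - p) ** (n - r)


def geometric_pmf(n, p):
    """P(N = n), where N is the number of trials until the first success."""
    return negative_binomial_pmf(n, 1, p)


if __name__ == "__main__":
    print(geometric_pmf(3, 0.5))             # first success on trial 3
    print(negative_binomial_pmf(3, 2, 0.5))  # second success on trial 3
```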

The Hypergeometric is the number of items of a given type in a sample drawn without replacement from a finite population, for example the number of males in a sample of n persons drawn from a population of N. [Line graph of a hypergeometric distribution]
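
The example of males in a sample of n persons from a population of N can be sketched as follows. The sampling is assumed to be without replacement; the function name `hypergeometric_pmf`, its parameter names, and the numbers in the usage lines are hypothetical.

```python
from fractions import Fraction
from math import comb


def hypergeometric_pmf(k, population_size, males_in_population, sample_size):
    """P(exactly k males in a sample drawn without replacement)."""
    females = population_size - males_in_population
    lowest = max(0, sample_size - females)
    highest = min(sample_size, males_in_population)
    if not lowest <= k <= highest:
        return Fraction(0)
    # Ways to choose k males and the remaining sample members from the females,
    # divided by all possible samples of this size.
    ways = comb(males_in_population, k) * comb(females, sample_size - k)
    return Fraction(ways, comb(population_size, sample_size))


if __name__ == "__main__":
    # Hypothetical numbers: 10 persons of whom 4 are male, sample of 3.
    for k in range(4):
        print(k, hypergeometric_pmf(k, 10, 4, 3))
```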

The Uniform takes each of its possible values with the same probability at repeated trials.


QUANTITATIVE (NUMERICAL) CONTINUOUS RANDOM VARIABLES

The continuous random variables can be natural such as the normal, the exponential, and the uniform, or artificial such as chi-square, t, and F variables. 

The Normal represents the result of a measurement on the continuous numerical scale such as height and weight. [Line graph of a normal distribution]

The Exponential is the time until the first occurrence of the event of interest. [Line graph of an exponential distribution]
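
A minimal sketch of the Exponential follows, using the standard formula for the probability that the first event has occurred by time `t`. The names `exponential_cdf`, `exponential_mean`, and `rate` (average number of events per unit time) are assumptions made for this example.

```python
from math import exp


def exponential_cdf(t, rate):
    """P(T <= t): probability that the first event has occurred by time t."""
    if t <= 0:
        return 0.0
    return 1.0 - exp(-rate * t)


def exponential_mean(rate):
    """Average waiting time until the first event."""
    return 1.0 / rate


if __name__ == "__main__":
    # Hypothetical event occurring on average twice per unit of time.
    print(exponential_cdf(1.0, 2.0))
    print(exponential_mean(2.0))
```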

The Uniform represents the result of a measurement that is equally likely to fall anywhere within a fixed range at repeated trials.


SCALES OF QUANTITATIVE (NUMERICAL) CONTINUOUS RANDOM VARIABLES

The continuous random variable can be measured on either the interval or the ratio scale. Only 2 measurements are made on the interval scale: the calendar and the thermometer. The rest of the measurements are on the ratio scale.

Properties of the interval scale: the difference between 2 readings has a meaning, the magnitude of the difference between 2 readings is the same at all parts of the scale, the ratio of 2 readings has no meaning, zero is arbitrary with no biological meaning, and both negative and positive values are allowed. 

Properties of the ratio scale: zero has a biological significance, values can only be positive, the difference between 2 readings has a meaning, the ratio of 2 readings has a meaning and can be interpreted, and intervals between 2 readings have the same meaning at different parts of the scale.
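
The difference between the two scales can be checked numerically. In the sketch below the temperature and weight readings are hypothetical, and the kilogram-to-pound factor is an approximation. A ratio of 2 thermometer readings changes when the unit changes, so it has no meaning, while a ratio of 2 weights stays the same.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a thermometer reading (interval scale, arbitrary zero)."""
    return celsius * 9 / 5 + 32


def kg_to_lb(kilograms):
    """Convert a weight (ratio scale, true zero); the factor is approximate."""
    return kilograms * 2.20462


if __name__ == "__main__":
    # Interval scale: the ratio depends on the unit chosen.
    print(20 / 10)
    print(celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10))

    # Ratio scale: the ratio is the same in either unit.
    print(80 / 40)
    print(kg_to_lb(80) / kg_to_lb(40))
```

Equal differences between readings remain equal after conversion on both scales, which is the property the interval and ratio scales share.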


SIX PROPERTIES OF A RANDOM VARIABLE (DISCRETE OR CONTINUOUS)

The expectation of a random variable is a central value around which it hovers most of the time, for example the mean (average).

The variations of the random variable around the expectation are measured by its variance. 

Covariance measures the co-variability of two random variables.

Correlation measures the linear relation between two random variables. [line graph showing correlation]

Skewness measures the asymmetry of the distribution of the random variable about its center. [figure showing positive skew and figure showing negative skew]

Kurtosis measures how peaked the distribution of the random variable is at the point of its expectation. [Figure showing kurtosis]
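
The six properties can be estimated from a set of observed values. The sketch below is one possible way to do this and makes several assumptions: the observations stand in for the random variable, every sum is divided by the number of observations (the population convention), and skewness and kurtosis are taken as the third and fourth standardized moments. All function names are chosen for this example.

```python
from math import sqrt


def mean(xs):
    """Expectation estimated as the arithmetic average."""
    return sum(xs) / len(xs)


def central_moment(xs, order):
    """Average of the deviations from the mean raised to the given power."""
    m = mean(xs)
    return sum((x - m) ** order for x in xs) / len(xs)


def variance(xs):
    return central_moment(xs, 2)


def covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def correlation(xs, ys):
    """Covariance rescaled to lie between -1 and +1."""
    return covariance(xs, ys) / sqrt(variance(xs) * variance(ys))


def skewness(xs):
    """Zero for a symmetric set of values, positive for a long right tail."""
    return central_moment(xs, 3) / variance(xs) ** 1.5


def kurtosis(xs):
    """Fourth standardized moment; a normal distribution has the value 3."""
    return central_moment(xs, 4) / variance(xs) ** 2


if __name__ == "__main__":
    xs = [1, 2, 3, 4, 5]    # hypothetical observations
    ys = [3, 5, 7, 9, 11]   # exactly 2 * x + 1, so perfectly correlated
    print(mean(xs), variance(xs), covariance(xs, ys))
    print(correlation(xs, ys), skewness(xs), kurtosis(xs))
```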


TRANSFORMATIONS OF RANDOM VARIABLES

Quantitative variables can be transformed into qualitative ones. 

Qualitative variables can be transformed into quantitative ones but this is less desirable. 

The continuous variable can be transformed into the discrete variable. 

Transformation of the discrete into the continuous may be misleading.
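
Two of these transformations can be sketched in a few lines. The age cut-points, category labels, and function names below are hypothetical and serve only to show the idea. In both cases the transformation discards information, since the original value cannot be recovered from the result.

```python
from math import floor


def age_category(age_years):
    """Quantitative to qualitative: map an age to an ordinal category."""
    if age_years < 18:
        return "child"
    if age_years < 65:
        return "adult"
    return "older adult"


def completed_years(age_years):
    """Continuous to discrete: keep only the whole number of years."""
    return floor(age_years)


if __name__ == "__main__":
    for age in (7.5, 42.25, 80.9):
        print(age, completed_years(age), age_category(age))
```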