Presentation at a Research Methodology Winter Camp, AlMaarefa University, on 13 January 2022 at 1.00 pm. By: Prof. Omar Hasan Kasule Sr., MB ChB (MUK), MPH (Harvard), DrPH (Harvard), Professor of Epidemiology and Bioethics, King Fahad Medical City
LEARNING OBJECTIVES:
- Concepts of dependent and independent variables in a scatter-gram.
- Difference between correlation and regression.
- The linear regression line and the linear regression equation (y = a + bx).
- Interpreting the intercept, a (+ve and -ve).
- Interpreting the regression coefficient, b (+ve, -ve, range -infinity to +infinity).
- Assumptions of linear regression for y: normality and homoscedasticity.
- Use of the t-test for significance testing of b.
- Use of the regression line/equation for testing of association.
- Use of regression for prediction (interpolation and extrapolation).
DEFINITION OF SIMPLE LINEAR REGRESSION:
- Regression to the mean, first described by Francis Galton (1822-1911), is one of the basic laws of nature, sunan al llah fi al kawn.
- Regression relates independent variables to dependent variables.
- The variables may be raw data (height and weight), dummy indicator variables (1=male, 2=female), or scores (GPA).
- The simple linear regression equation is y = a + bx, where:
▻ y is the dependent/response variable,
▻ a is the intercept,
▻ b is the slope/regression coefficient,
▻ x is the independent/predictor variable.
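As a minimal sketch of how a and b can be estimated, the block below uses ordinary least squares, which is the usual method but is not named in the slides. The function name fit_line and the height/weight numbers are hypothetical; the data were constructed to lie exactly on the line y = 80 + 2x, so they are not real measurements.

```python
def fit_line(xs, ys):
    """Estimate (a, b) in y = a + b*x by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-products of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx            # slope / regression coefficient
    a = mean_y - b * mean_x  # intercept
    return a, b


# Hypothetical, made-up data lying exactly on y = 80 + 2x
heights = [60, 62, 64, 66, 68, 70]
weights = [200, 204, 208, 212, 216, 220]

a, b = fit_line(heights, weights)
print(f"intercept a = {a}, slope b = {b}")
```

With real data the points scatter around the line, so the fitted a and b are estimates, not exact values.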
FIGURE OF A SIMPLE LINEAR REGRESSION:
[Figure: simple linear regression line, with the worked example Weight = 80 + 2 × 70 = 220 lbs (a = 80, b = 2, x = 70)]
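The worked example from the figure can be read as a = 80, b = 2 and x = 70. The short sketch below reproduces that arithmetic and shows how the intercept and slope are interpreted. The helper name predict is hypothetical, and the unit of x is not stated in the slides.

```python
def predict(a, b, x):
    """Predicted y on the regression line y = a + b*x."""
    return a + b * x


a, b = 80, 2  # intercept and slope taken from the figure's worked example

weight_at_70 = predict(a, b, 70)                         # the figure's example
value_at_zero = predict(a, b, 0)                         # intercept: predicted y when x = 0
change_per_unit = predict(a, b, 71) - predict(a, b, 70)  # slope: change in y per unit of x

print(weight_at_70, value_at_zero, change_per_unit)
```

A negative b would mean that y decreases as x increases, and a negative a would place the line below zero at x = 0.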
ASSUMPTIONS OF SIMPLE LINEAR REGRESSION:
- Linearity of the x-y relation.
- Normal distribution of the y variable for any given value of x.
- Homoscedasticity (constant y variance for all x values).
- The y values are independent for each value of x.
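One informal way to look at the constant-variance assumption is to inspect the residuals (actual y minus fitted y). The sketch below is only a rough numerical check under stated assumptions, not a formal test: it compares the residual variance in the upper half of the x range with that in the lower half. The function names and the data are hypothetical, and the data were made up so that the spread grows with x.

```python
from statistics import pvariance


def fit_line(xs, ys):
    """Ordinary least squares estimates (a, b) for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def residuals(xs, ys):
    """Actual y minus fitted y for every observation."""
    a, b = fit_line(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]


def spread_ratio(xs, ys):
    """Residual variance for the upper half of x divided by the lower half.

    A ratio far from 1 hints that the y variance is not constant across x.
    Assumes the lower-half residuals are not all identical (non-zero variance).
    """
    pairs = sorted(zip(xs, residuals(xs, ys)))
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[half:]]
    return pvariance(high) / pvariance(low)


# Hypothetical, made-up data whose scatter widens as x increases
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 7.8, 10.5, 11.4, 14.8, 15.1]

res = residuals(xs, ys)
ratio = spread_ratio(xs, ys)
print("residuals:", [round(r, 2) for r in res])
print("upper/lower residual variance ratio:", round(ratio, 1))
```

In practice a plot of residuals against x is the more common diagnostic; a fan shape suggests the variance is not constant, and a curved pattern suggests the relation is not linear.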
USE OF THE T-TEST TO TEST SIGNIFICANCE OF ‘B’ AND TESTING OF ASSOCIATION:
- The t-test can be used to test the significance of the regression coefficient.
- The t-test can be used to compare regression coefficients of 2 lines.
- A significant ‘b’ indicates a significant relation between ‘x’ and ‘y’.
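The test of b can be sketched with the standard formula t = b / SE(b), on n - 2 degrees of freedom, where SE(b) is the square root of the residual mean square divided by the sum of squared deviations of x. The function name slope_t_statistic and the data are hypothetical. The critical value 2.447 is the two-sided 5% value for 6 degrees of freedom from a standard t table, and it is an assumption that a 5% level is wanted.

```python
import math


def slope_t_statistic(xs, ys):
    """Return (t, df) for testing the null hypothesis b = 0 in y = a + b*x."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two equal-length samples with at least 3 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Residual sum of squares and the standard error of the slope
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    se_b = math.sqrt(sse / (n - 2) / sxx)
    if se_b == 0:
        raise ValueError("perfect fit: the t statistic is undefined")
    return b / se_b, n - 2


# Hypothetical, made-up data
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 7.8, 10.5, 11.4, 14.8, 15.1]

T_CRIT_DF6 = 2.447  # two-sided 5% critical value for 6 df, from a t table

t_stat, df = slope_t_statistic(xs, ys)
significant = abs(t_stat) > T_CRIT_DF6
print(f"t = {t_stat:.2f} on {df} df, significant at 5%: {significant}")
```

When |t| exceeds the critical value, b differs significantly from zero, which is read as a significant linear association between x and y.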
USE OF REGRESSION FOR PREDICTION (INTERPOLATION AND EXTRAPOLATION):
- Simple linear regression is used in the prediction of y for given values of x.
- Both linear extrapolation and linear interpolation are possible; the latter is more reliable.
- Extrapolation is prediction for x values outside the range of the data on which the regression model was built.
- Interpolation is prediction within the range of the data.
- A hidden extrapolation is when an attempt is made to predict y within the range of the data but that value of y cannot occur in nature.
- The difference between the actual and predicted values of y is the residual (prediction error).
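The distinction between interpolation and extrapolation can be sketched as a check of whether the new x lies inside the range of the x values used to fit the line. The function names are hypothetical, the coefficients a = 80 and b = 2 are borrowed from the figure's worked example, and the fitted x range of 60 to 70 and the observed value 215 are assumptions made only for illustration.

```python
def predict_with_flag(a, b, x_new, x_min, x_max):
    """Predict y = a + b*x_new and say whether x_new is inside the fitted range."""
    kind = "interpolation" if x_min <= x_new <= x_max else "extrapolation"
    return a + b * x_new, kind


def residual(actual_y, predicted_y):
    """Difference between the actual and the predicted value of y."""
    return actual_y - predicted_y


a, b = 80, 2           # coefficients from the figure's worked example
x_min, x_max = 60, 70  # assumed range of x in the fitting data

inside = predict_with_flag(a, b, 65, x_min, x_max)
outside = predict_with_flag(a, b, 80, x_min, x_max)
error = residual(215, inside[0])  # 215 is a made-up observed value

print(inside, outside, error)
```

A range check of this kind cannot detect a hidden extrapolation, because that depends on whether the predicted y is possible in nature and not on where x lies.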