
220113P - SIMPLE LINEAR REGRESSION


Presentation at a Research Methodology Winter Camp, AlMaarefa University, on 13 January 2022 at 1.00 pm. By: Prof. Omar Hasan Kasule Sr., MB ChB (MUK), MPH (Harvard), DrPH (Harvard), Professor of Epidemiology and Bioethics, King Fahad Medical City.

 

LEARNING OBJECTIVES:

  • Concepts of dependent and independent variables in a scatter-gram.
  • Difference between correlation and regression.
  • The linear regression line and the linear regression equation (y = a + bx).
  • Interpreting the intercept, a (+ve and -ve).
  • Interpreting the regression coefficient, b (+ve, -ve, range -infinity to +infinity).
  • Assumptions of linear regression for y: normality and homoscedasticity.
  • Use of the t-test for significance testing of b.
  • Use of the regression line/equation for testing of association.
  • Use of regression for prediction (interpolation and extrapolation).

 

DEFINITION OF SIMPLE LINEAR REGRESSION:

  • Regression to the mean, first described by Francis Galton (1822-1911), is one of the basic laws of nature, sunan al llah fi al kawn (God's laws in the universe).
  • Regression relates independent with dependent variables.
  • The variables may be raw data (height and weight), dummy indicator variables (1 = male, 2 = female), or scores (GPA).
  • The simple linear regression equation is y=a + bx where:

      y is the dependent/response variable,

      a is the intercept,

      b is the slope/regression coefficient,

      x is the independent/predictor variable.
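The intercept a and slope b are usually estimated from data by ordinary least squares. The sketch below is one minimal way to do that in plain Python; the function name and the height/weight numbers are hypothetical and chosen only so that the points fall exactly on a known line.

```python
def fit_simple_linear_regression(x, y):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired, with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squares of x and sum of cross-products of x and y
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    b = sxy / sxx              # slope / regression coefficient
    a = mean_y - b * mean_x    # intercept
    return a, b


# Hypothetical data (not from the presentation) lying on y = 80 + 2x
heights = [60, 65, 70, 75]
weights = [200, 210, 220, 230]
a, b = fit_simple_linear_regression(heights, weights)
print(f"y = {a:.1f} + {b:.1f}x")
```

With real data the points scatter around the line, and a and b are the values that minimise the sum of squared vertical distances from the points to the line.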

 

FIGURE OF A SIMPLE LINEAR REGRESSION:

Worked example from the figure: Weight = a + b × x = 80 + 2 × 70 = 220 lbs (a = 80, b = 2, x = 70).

 

ASSUMPTIONS OF SIMPLE LINEAR REGRESSION:

  • Linearity of the x-y relation.
  • Normal distribution of the y variable for any given value of x.
  • Homoscedasticity (constant variance of y for all values of x).
  • y values are independent for each value of x.
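A rough way to look at the constant-variance assumption is to compare the spread of the residuals at low and high values of x. The sketch below is only an informal heuristic, not a formal test; the helper names, the fitted values a = 0 and b = 1, and the data are all hypothetical.

```python
from statistics import pvariance


def residuals(x, y, a, b):
    """Observed y minus the value predicted by y = a + b*x."""
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def spread_by_half(x, res):
    """Residual variance in the lower-x half and in the upper-x half."""
    pairs = sorted(zip(x, res))          # order the residuals by x
    mid = len(pairs) // 2
    low = [r for _, r in pairs[:mid]]
    high = [r for _, r in pairs[mid:]]
    return pvariance(low), pvariance(high)


# Hypothetical data scattered evenly around the line y = 0 + 1x
x = [1, 2, 3, 4, 5, 6]
y = [1.1, 1.9, 3.1, 3.9, 5.1, 5.9]
res = residuals(x, y, a=0.0, b=1.0)
low_var, high_var = spread_by_half(x, res)
print(f"low-x variance = {low_var:.4f}, high-x variance = {high_var:.4f}")
```

Similar variances in the two halves are consistent with homoscedasticity; a much larger variance in one half suggests the assumption may not hold. A plot of the residuals against x gives the same information visually.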

 

USE OF THE T-TEST TO TEST SIGNIFICANCE OF ‘B’ AND TESTING OF ASSOCIATION:

  • The t-test can be used to test the significance of the regression coefficient.
  • The t-test can be used to compare regression coefficients of 2 lines.
  • A significant ‘b’ indicates a significant relation between ‘x’ and ‘y’.
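The test of the regression coefficient can be sketched as follows, assuming the usual null hypothesis that the slope is zero: t = b / SE(b) on n - 2 degrees of freedom. The function name and the five data points are hypothetical.

```python
import math


def slope_t_statistic(x, y):
    """Return (b, se_b, t, df) for testing H0: slope = 0."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need paired x and y with at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Residual sum of squares around the fitted line
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    df = n - 2
    se_b = math.sqrt((sse / df) / sxx)   # standard error of the slope
    if se_b == 0:
        raise ValueError("perfect fit: t statistic is undefined")
    t = b / se_b
    return b, se_b, t, df


# Hypothetical data (not from the presentation)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b, se_b, t, df = slope_t_statistic(x, y)
print(f"b = {b:.3f}, SE(b) = {se_b:.3f}, t = {t:.3f}, df = {df}")
```

For these numbers t is about 2.12 on 3 degrees of freedom, which is below the two-sided 5% critical value of 3.182, so this slope would not be declared significant.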

 

USE OF REGRESSION FOR PREDICTION (INTERPOLATION AND EXTRAPOLATION):

  • Simple linear regression is used in the prediction of y for given values of x.
  • Both linear extrapolation and linear interpolation are possible; interpolation is the more reliable.
  • Extrapolation is prediction for x values outside the range of the data on which the regression model was built.
  • Interpolation is prediction within the range of the data.
  • A hidden extrapolation is when an attempt is made to predict y within the range of the data but that value of y cannot occur in nature.
  • The difference between the actual and predicted values of y is the residual (prediction error).
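Prediction, and the distinction between interpolation and extrapolation, can be illustrated with the line y = 80 + 2x from the figure. In this sketch the function name, the observed x range of 60 to 75, and the observed value of 225 are hypothetical assumptions.

```python
def predict(a, b, x_new, x_min, x_max):
    """Predict y at x_new and say whether x_new is inside the data range."""
    y_hat = a + b * x_new
    inside = x_min <= x_new <= x_max
    kind = "interpolation" if inside else "extrapolation"
    return y_hat, kind


A, B = 80, 2            # intercept and slope from the figure
X_MIN, X_MAX = 60, 75   # hypothetical range of x in the fitted data

y_in, kind_in = predict(A, B, 70, X_MIN, X_MAX)
y_out, kind_out = predict(A, B, 90, X_MIN, X_MAX)
print(f"x = 70: predicted y = {y_in} ({kind_in})")
print(f"x = 90: predicted y = {y_out} ({kind_out})")

# Difference between an actual (hypothetical) value and the prediction
observed = 225
residual = observed - y_in
print(f"residual at x = 70: {residual}")
```

The prediction at x = 90 is computed in the same way as the one at x = 70, but it relies on the line continuing to hold beyond the observed data, which the data themselves cannot confirm.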