
250216 - 3.3 DISTRIBUTIONS OF CONTINUOUS VARIABLES


Presentation at the Course ‘Biostatistics MSN823’, Level 2, Year 1, Faculty of Nursing, Princess Noura bint Abdulrahman University. By Prof. Omar Hasan Kasule Sr. MB ChB (MUK), MPH (Harvard), DrPH (Harvard)

 

THE NORMAL DISTRIBUTION

  • The normal distribution approximates the binomial distribution when n is large.
  • If many samples are drawn from a population, the statistics of these samples, such as the mean, follow a sampling distribution.
  • This distribution, according to the central limit theorem, becomes normal for a large enough sample size.
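
As a rough illustration of the first point, the sketch below compares an exact binomial probability with its normal approximation. The values n = 100, p = 0.3, and k = 35 are illustrative assumptions, not figures from the lecture, and the continuity correction of 0.5 is a common convention rather than something stated above.

```python
import math
from statistics import NormalDist


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Exact P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_approx_cdf(k: int, n: int, p: float) -> float:
    """Normal approximation to P(X <= k), with a continuity correction of 0.5."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return NormalDist(mean, sd).cdf(k + 0.5)


# Illustrative values only (assumed, not taken from the lecture).
n, p, k = 100, 0.3, 35
exact = binomial_cdf(k, n, p)
approx = normal_approx_cdf(k, n, p)
print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```

With a large n the two printed values should be close; with a small n (say n = 5) the gap is usually wider.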

 

OTHER CONTINUOUS DISTRIBUTIONS: exponential, T, Chi-square

  • The exponential represents decay.
  • T distribution.
  • Chi-square distribution.
  • Each distribution has a formula with parameters that enables theoretical computation. For the normal distribution the parameters are μ (mean) and σ (standard deviation).
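
To make "a formula with parameters" concrete, here is a minimal sketch of two density formulas written out in Python. The function names and the `rate` parameter of the exponential are naming assumptions made for this example; the formulas themselves are the standard textbook densities.

```python
import math
from statistics import NormalDist


def normal_pdf(y: float, mu: float, sigma: float) -> float:
    """Normal density: exp(-(y - mu)^2 / (2 sigma^2)) / (sigma * sqrt(2 pi))."""
    return math.exp(-((y - mu) ** 2) / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))


def exponential_pdf(y: float, rate: float) -> float:
    """Exponential density: rate * exp(-rate * y) for y >= 0, else 0 (decay curve)."""
    return rate * math.exp(-rate * y) if y >= 0 else 0.0


# The hand-written formula should agree with the standard library's version.
print(normal_pdf(1.0, 0.0, 1.0), NormalDist(0.0, 1.0).pdf(1.0))

# The exponential density falls steadily as y grows.
print([round(exponential_pdf(y, 1.0), 4) for y in (0.0, 1.0, 2.0, 3.0)])
```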

 

Figure: Normal Distribution (Bell Curve)

Figure: Exponential Distribution

Figure: T Distribution-1

Figure: T Distribution-2

Figure: Chi-Square Distribution

Figure: F Distribution-1

Figure: F Distribution-2

 

AREAS UNDER THE NORMAL CURVE

  • z score, also called the critical ratio, is defined as (y - μ) / σ, where μ is the mean and σ is the standard deviation.
  • Area = probability.
  • Area can be computed if we know the z score: cumulative area from the left, area between two boundaries, area in the extremes, and p-value.
  • Critical value separates significant from insignificant z scores.
  • If the z score of an observation is more than +3 or less than -3, the observation is usually treated as an outlier.
  • If the z score of skewness (skewness divided by its standard error) is more than +2 or less than -2, the distribution is unlikely to be normal.
  • A normal distribution has a kurtosis of 3; a kurtosis well above 3 indicates heavier tails than normal and suggests that the distribution is not normal.
  • Computation of probability using the area under the curve: the total area (probability) under the curve = 1.0.
  • Area before a point, area after a point, and area between two points can be calculated using methods of integration or by using SPSS to generate the CDF.
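
The areas listed above can also be computed outside SPSS. The sketch below uses Python's standard-library `statistics.NormalDist` as a stand-in for the SPSS CDF; the helper names are assumptions made for this example, and the z values are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)  # standard normal: mean 0, SD 1


def area_left(z: float) -> float:
    """Cumulative area from the left up to z."""
    return std_normal.cdf(z)


def area_between(z_low: float, z_high: float) -> float:
    """Area between two boundaries."""
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)


def area_extremes(z: float) -> float:
    """Area in both tails beyond -|z| and +|z| (a two-sided p-value)."""
    return 2 * (1 - std_normal.cdf(abs(z)))


# Critical value that leaves 2.5% in the upper tail (5% in both tails).
critical_value = std_normal.inv_cdf(0.975)

print(f"area left of z = 1.0:        {area_left(1.0):.4f}")
print(f"area between -1.96 and 1.96: {area_between(-1.96, 1.96):.4f}")
print(f"area in the extremes, z = 2: {area_extremes(2.0):.4f}")
print(f"two-sided 5% critical value: {critical_value:.3f}")
```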

 

COMPUTING Z SCORES AND FINDING AREA UNDER THE NORMAL CURVE USING SPSS

  • We assume that the data is normally distributed.
  • We compute a z score for each data score.
  • We compute a CDF for each data score.
  • The CDF gives us the probability (area) at or below the given score.
  • Alternatively, we can use the z score to look up the area in the Standard Unit Normal Table. The table in this section gives the area between the mean (z = 0) and z, so for a positive z the CDF is 0.5 plus the tabulated area.

 

Exercise:

  1. Convert the raw scores into z scores: Analyze > Descriptive Statistics > Descriptives > move the variable (here Height) into the variables box > Options: tick Mean, Std. deviation, and S.E. mean > tick Save standardized values as variables > OK. A new variable (named ZHeight for a variable called Height) will be added at the end of the database; it will have a mean of zero and a standard deviation of 1.0. Then compute the CDF: Transform > Compute Variable > type CDF as the target variable name and set its type to numeric > in the Numeric Expression window type CDF.NORMAL(ZHeight,0,1), that is, the function name, then the standardized variable, a comma, zero, a comma, and 1 inside the brackets > OK. A new variable CDF will be added to the data set.
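
For readers without SPSS, the same two steps (standardize, then take the normal CDF) can be sketched in Python. The heights below are made-up values used only for illustration. The sample standard deviation (n - 1 denominator) is used on the assumption that this matches what SPSS Descriptives reports.

```python
from statistics import NormalDist, mean, stdev

# Made-up heights in cm, for illustration only.
heights_cm = [150.0, 155.0, 158.0, 160.0, 162.0, 165.0, 170.0]

m = mean(heights_cm)
s = stdev(heights_cm)  # sample SD with n - 1 in the denominator (assumed to match SPSS)

# Step 1: z score for each observation, z = (y - mean) / SD.
z_scores = [(h - m) / s for h in heights_cm]

# Step 2: CDF of each z score under the standard normal (mean 0, SD 1).
standard_normal = NormalDist(0.0, 1.0)
cdf_values = [standard_normal.cdf(z) for z in z_scores]

for h, z, c in zip(heights_cm, z_scores, cdf_values):
    print(f"height = {h:6.1f}  z = {z:6.3f}  CDF = {c:.4f}")
```

Each CDF value is the estimated proportion of the population at or below that height, provided the normality assumption holds.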

 

FIGURE: THE STANDARD UNIT NORMAL TABLE

Z     0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
0.0   0.0000  0.0040  0.0080  0.0120  0.0160  0.0199  0.0239  0.0279  0.0319  0.0359
0.1   0.0398  0.0438  0.0478  0.0517  0.0557  0.0596  0.0636  0.0675  0.0714  0.0753
0.2   0.0793  0.0832  0.0871  0.0910  0.0948  0.0987  0.1026  0.1064  0.1103  0.1141
0.3   0.1179  0.1217  0.1255  0.1293  0.1331  0.1368  0.1406  0.1443  0.1480  0.1517
0.4   0.1554  0.1591  0.1628  0.1664  0.1700  0.1736  0.1772  0.1808  0.1844  0.1879
0.5   0.1915  0.1950  0.1985  0.2019  0.2054  0.2088  0.2123  0.2157  0.2190  0.2224
0.6   0.2257  0.2291  0.2324  0.2357  0.2389  0.2422  0.2454  0.2486  0.2517  0.2549
0.7   0.2580  0.2611  0.2642  0.2673  0.2704  0.2734  0.2764  0.2794  0.2823  0.2852
0.8   0.2881  0.2910  0.2939  0.2967  0.2995  0.3023  0.3051  0.3078  0.3106  0.3133
0.9   0.3159  0.3186  0.3212  0.3238  0.3264  0.3289  0.3315  0.3340  0.3365  0.3389
1.0   0.3413  0.3438  0.3461  0.3485  0.3508  0.3531  0.3554  0.3577  0.3599  0.3621
1.1   0.3643  0.3665  0.3686  0.3708  0.3729  0.3749  0.3770  0.3790  0.3810  0.3830
1.2   0.3849  0.3869  0.3888  0.3907  0.3925  0.3944  0.3962  0.3980  0.3997  0.4015
1.3   0.4032  0.4049  0.4066  0.4082  0.4099  0.4115  0.4131  0.4147  0.4162  0.4177
1.4   0.4192  0.4207  0.4222  0.4236  0.4251  0.4265  0.4279  0.4292  0.4306  0.4319
1.5   0.4332  0.4345  0.4357  0.4370  0.4382  0.4394  0.4406  0.4418  0.4429  0.4441
1.6   0.4452  0.4463  0.4474  0.4484  0.4495  0.4505  0.4515  0.4525  0.4535  0.4545
1.7   0.4554  0.4564  0.4573  0.4582  0.4591  0.4599  0.4608  0.4616  0.4625  0.4633
1.8   0.4641  0.4649  0.4656  0.4664  0.4671  0.4678  0.4686  0.4693  0.4699  0.4706
1.9   0.4713  0.4719  0.4726  0.4732  0.4738  0.4744  0.4750  0.4756  0.4761  0.4767
2.0   0.4772  0.4778  0.4783  0.4788  0.4793  0.4798  0.4803  0.4808  0.4812  0.4817
2.1   0.4821  0.4826  0.4830  0.4834  0.4838  0.4842  0.4846  0.4850  0.4854  0.4857
2.2   0.4861  0.4864  0.4868  0.4871  0.4875  0.4878  0.4881  0.4884  0.4887  0.4890
2.3   0.4893  0.4896  0.4898  0.4901  0.4904  0.4906  0.4909  0.4911  0.4913  0.4916
2.4   0.4918  0.4920  0.4922  0.4925  0.4927  0.4929  0.4931  0.4932  0.4934  0.4936
2.5   0.4938  0.4940  0.4941  0.4943  0.4945  0.4946  0.4948  0.4949  0.4951  0.4952
2.6   0.4953  0.4955  0.4956  0.4957  0.4959  0.4960  0.4961  0.4962  0.4963  0.4964
2.7   0.4965  0.4966  0.4967  0.4968  0.4969  0.4970  0.4971  0.4972  0.4973  0.4974
2.8   0.4974  0.4975  0.4976  0.4977  0.4977  0.4978  0.4979  0.4979  0.4980  0.4981
2.9   0.4981  0.4982  0.4982  0.4983  0.4984  0.4984  0.4985  0.4985  0.4986  0.4986
3.0   0.4987  0.4987  0.4987  0.4988  0.4988  0.4989  0.4989  0.4989  0.4990  0.4990
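
The tabulated values start at 0.0000 for z = 0.00, so they are areas between the mean and z rather than cumulative areas from the left. The sketch below shows how an entry can be reproduced and how a tabulated area converts to a CDF. The helper names are assumptions made for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist(0.0, 1.0)


def table_area(z: float) -> float:
    """Area between the mean (z = 0) and a non-negative z, as tabulated above."""
    return standard_normal.cdf(z) - 0.5


def cdf_from_table(z: float, tabulated_area: float) -> float:
    """Convert the tabulated area for |z| into the cumulative area left of z."""
    return 0.5 + tabulated_area if z >= 0 else 0.5 - tabulated_area


# Reproduce a few entries: row 1.0 / column 0.00, row 1.9 / column 0.06, row 3.0 / column 0.00.
for z in (1.00, 1.96, 3.00):
    print(f"z = {z:.2f}  area from mean to z = {table_area(z):.4f}")

# Cumulative areas for z = +1.96 and z = -1.96, using the tabulated 0.4750.
print(cdf_from_table(1.96, 0.4750), cdf_from_table(-1.96, 0.4750))
```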

 

Figure: Normal Curve Showing Area Under The Curve


 

Figure: Normal Curve Showing The Critical Value


 

ASSESSMENT OF NORMALITY

  • Inspect a histogram
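  • Use a normal probability plot, which plots the ordered observations against the z scores expected under normality; points close to a straight line suggest normality
  • Use a log-normal plot, made by transforming the values into logs before plotting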
  • Ryan-Joiner test
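
The idea behind the normal probability plot, and behind correlation-based tests such as Ryan-Joiner, can be sketched as the correlation between the ordered data and their expected normal scores. This is only an illustration of the principle: the plotting positions (i - 3/8) / (n + 1/4) are an assumption, the data are simulated, and no critical values are supplied, so it is not a substitute for the test as implemented in statistical software.

```python
import math
import random
from statistics import NormalDist, mean


def normal_scores(n: int) -> list[float]:
    """Expected z scores for n ordered observations (assumed plotting positions)."""
    std = NormalDist(0.0, 1.0)
    return [std.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def probability_plot_correlation(data: list[float]) -> float:
    """Pearson correlation between the sorted data and their normal scores."""
    ordered = sorted(data)
    scores = normal_scores(len(ordered))
    mx, my = mean(ordered), mean(scores)
    sxy = sum((x - mx) * (y - my) for x, y in zip(ordered, scores))
    sxx = sum((x - mx) ** 2 for x in ordered)
    syy = sum((y - my) ** 2 for y in scores)
    return sxy / math.sqrt(sxx * syy)


# Simulated data for illustration: one roughly normal sample, one skewed sample.
rng = random.Random(1)
normal_sample = [rng.gauss(60.0, 10.0) for _ in range(200)]
skewed_sample = [rng.expovariate(1.0) for _ in range(200)]

r_normal = probability_plot_correlation(normal_sample)
r_skewed = probability_plot_correlation(skewed_sample)
print(f"normal sample: r = {r_normal:.4f}")
print(f"skewed sample: r = {r_skewed:.4f}")
```

A correlation close to 1 means the probability plot is close to a straight line; the skewed sample should give a visibly lower value.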

 

Exercises

  1. Test normality of age using a histogram: Graphs > Chart Builder > choose Histogram > drag a histogram into the preview window > drag the variable (age) onto the x axis > click Display normal curve > OK
  2. Test the normality of weight using the normal probability plot
  3. Test the normality of weight using the log-normal plot
  4. Test normality of weight by the Ryan-Joiner test
  5. Test normality by the Kolmogorov-Smirnov….

 

NORMALIZING TRANSFORMATIONS

  • Parameters (mean, standard deviation).
  • Area under the curve, 95% CI, and the p-values (5%).
  • The normal distribution has parameters μ and σ; for the standard normal, μ = 0 and σ = 1.
  • Skewness measures departure from symmetry.
  • Kurtosis measures proneness to outliers.
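
As a sketch of how a normalizing transformation changes skewness and kurtosis, the example below computes both for simulated right-skewed "weights" before and after a log transformation. The data are simulated from a log-normal distribution purely for illustration, and the simple moment formulas used here are an assumption: statistical packages often apply small-sample corrections and may report excess kurtosis (kurtosis minus 3) instead.

```python
import math
import random
from statistics import mean


def skewness(data: list[float]) -> float:
    """Moment-based skewness: m3 / m2^1.5 (0 for a symmetric distribution)."""
    m, n = mean(data), len(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m3 = sum((x - m) ** 3 for x in data) / n
    return m3 / m2**1.5


def kurtosis(data: list[float]) -> float:
    """Moment-based kurtosis: m4 / m2^2 (3 for a normal distribution)."""
    m, n = mean(data), len(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m4 = sum((x - m) ** 4 for x in data) / n
    return m4 / m2**2


# Simulated right-skewed weights (illustrative values, not real data).
rng = random.Random(0)
weights = [rng.lognormvariate(4.2, 0.4) for _ in range(2000)]
log_weights = [math.log(w) for w in weights]

print(f"raw: skewness = {skewness(weights):.2f}, kurtosis = {kurtosis(weights):.2f}")
print(f"log: skewness = {skewness(log_weights):.2f}, kurtosis = {kurtosis(log_weights):.2f}")
```

After the log transformation the skewness should be near 0 and the kurtosis near 3, which is what a normal distribution would give.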

 

THE CENTRAL LIMIT THEOREM:

  1. The mean of the sample means is equal to the population mean.
  2. The standard error of the mean = σ / √n.
  3. For a sufficiently large sample size, the sample means are normally distributed regardless of the distribution of the underlying population. (Fig 4-10 page 76)
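
The statements of the central limit theorem above can be checked with a small simulation. The sketch below draws repeated samples from a strongly skewed population (an exponential distribution with mean 1 and standard deviation 1). The sample size, number of samples, and seed are arbitrary assumptions chosen for illustration.

```python
import math
import random
from statistics import mean, stdev

rng = random.Random(42)

population_mean = 1.0  # exponential with rate 1 has mean 1
population_sd = 1.0    # and standard deviation 1
n = 50                 # size of each sample (assumed)
num_samples = 5000     # number of repeated samples (assumed)

# Draw many samples and keep the mean of each one.
sample_means = [
    mean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(num_samples)
]

mean_of_means = mean(sample_means)
observed_se = stdev(sample_means)
theoretical_se = population_sd / math.sqrt(n)

# If the sample means are roughly normal, about 95% fall within 1.96 SE of the mean.
within = sum(
    abs(x - population_mean) <= 1.96 * theoretical_se for x in sample_means
) / num_samples

print(f"mean of sample means = {mean_of_means:.3f} (population mean = {population_mean})")
print(f"observed SE = {observed_se:.4f}, sigma / sqrt(n) = {theoretical_se:.4f}")
print(f"proportion within 1.96 SE = {within:.3f}")
```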

 

EXERCISES:

  • Test normality of age using a histogram
  • Test the normality of weight using 3 different methods
  • Carry out a log transformation of weight and test for normality
  • Automatically draw graphs of age, weight-m0, and time by Graphs > Graphboard Template Chooser > click age > choose the type of graph desired > click OK.
  • Compare subgroups by graphing: Graphs > Compare subgroups > subgroup defined by sex > Variable to plot age > OK

 

FIGURES:

  1. Normal curve with critical values and area under the curve
  2. Normal curve with standard normal distributions.
  3. Finding areas under the curve using normal distribution, figure 4-7, page 72.