
130501P - DISCRETE DATA INFERENCE


Presentation at a Training Program on Biostatistics for physician managers working in Public Health Administration, Qassim Province, on May 1, 2013, by Professor Omar Hasan Kasule Sr., MB ChB (MUK), MPH (Harvard), DrPH (Harvard). EM: omarkasule@yahoo.com


APPROXIMATE vs EXACT METHODS 1 
  • Discrete data is analyzed as proportions. Inference on discrete data is based on the binomial/multinomial distribution. It uses 2 approximate methods (the z-statistic and the chi-square statistic) and one exact method (Fisher's Exact Method). The approximate methods are parametric and are valid for large sample sizes.
  • The exact methods are valid for both large and small sample sizes.
  • Approximate methods are based on the large-sample result that the distribution of a proportion approaches the normal distribution when the sample size is large enough. These methods are therefore not sufficiently accurate for small samples.

APPROXIMATE vs EXACT METHODS 2
  • The z and chi-square statistics are used for large samples. The exact methods are used for small samples. There is nothing to prevent exact methods from being used for large samples. Hypothesis testing for proportions is similar to that for means.
  • For a binomial count of n trials with success probability p, the mean is np and the variance is np(1-p). The sample proportion therefore has mean p and variance p(1-p)/n. We can then proceed to use the same formulas as we used for the z-test.
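As an illustration of the z-test for one proportion, the following is a minimal sketch. It assumes the standard binomial result that a sample proportion has mean p and variance p(1-p)/n under the null hypothesis. The function names and the example counts (60 events in 100 subjects, null proportion 0.5) are hypothetical and are not taken from the presentation.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def one_proportion_z_test(successes, n, p0):
    """Two-sided z-test of H0: population proportion = p0.

    Uses the large-sample normal approximation, so it is only
    appropriate when n is large enough.
    """
    p_hat = successes / n
    # Standard error of the sample proportion under H0
    se = math.sqrt(p0 * (1.0 - p0) / n)
    z = (p_hat - p0) / se
    p_value = 2.0 * (1.0 - normal_cdf(abs(z)))
    return z, p_value


# Hypothetical data: 60 events among 100 subjects, H0: p = 0.5
z, p_value = one_proportion_z_test(successes=60, n=100, p0=0.5)
print(f"z = {z:.3f}, two-sided p = {p_value:.4f}")
```

With these illustrative numbers the z statistic is 2.0, which falls just beyond the conventional 5% two-sided cut-off of 1.96.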

DATA CHARACTERISTICS
  • Gaussian Distribution Of The Data: The first step is to ascertain whether the data distribution follows an approximate Gaussian distribution. The approximate methods are most valid when the data is Gaussian. 
  • Equal Variances: It is possible to compute variances for proportions using the binomial distribution. The variances of proportions in the compared samples must be approximately equal for the statistical tests to be valid.
  • Adequacy of The Sample Size: For approximate methods to be valid, the sample size must be adequate. There are special statistical procedures for ascertaining sample size.

DATA LAY-OUT 
  • The data for approximate methods is laid out in the form of contingency tables: 2 x 2, 2 x k, m x n.
  • Visual inspection is recommended before application of statistical tests.

            Column 1    Column 2    Total
  Row 1     n11         n12         N1.
  Row 2     n21         n22         N2.
  Total     n.1         n.2         N

STATING THE HYPOTHESES 
  • The null hypothesis and the alternative hypotheses must be stated clearly. 
  • In inference on 1 proportion using z or chi-square tests, H0: sample proportion - population proportion = 0. HA: sample proportion > or < population proportion.
  • In inference on 2 sample proportions using the z or chi-square test, H0: sample proportion #1 - sample proportion #2 = 0. HA: sample proportion #1 > or < sample proportion #2.
  • The 2 sample hypothesis can alternatively be stated in terms of probability as H0: p11 = p21 and HA: p11 ≠ p21.
  • In inference on 3 or more sample proportions using the chi-square test: H0: sample proportion #1 = sample proportion #2 = sample proportion #3 = … = sample proportion #n.
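The two-sample hypothesis above can be sketched as a z-test for the difference between two proportions. This is a minimal sketch that assumes the common pooled-variance form of the test, which the presentation does not spell out. The function names and the counts (30/100 versus 20/100) are hypothetical.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test of H0: proportion #1 - proportion #2 = 0.

    Assumes a pooled estimate of the common proportion under H0
    and the large-sample normal approximation.
    """
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    p_value = 2.0 * (1.0 - normal_cdf(abs(z)))
    return z, p_value


# Hypothetical data: 30 of 100 events in sample 1, 20 of 100 in sample 2
z, p_value = two_proportion_z_test(x1=30, n1=100, x2=20, n2=100)
print(f"z = {z:.3f}, two-sided p = {p_value:.4f}")
```

A one-sided alternative (proportion #1 > proportion #2) would use only the upper tail, that is 1 - normal_cdf(z), instead of doubling the tail area.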

FIXING THE TESTING PARAMETERS and TEST STATISTIC 
  • For the p-value approach, the 5% or 0.05 level of significance is customarily used. There is nothing preventing using any other level like 2.5% or 10%. 
  • The chi-square test is popular because it is easy to compute. It is based on comparing expected with observed values. The chi-square statistic essentially measures how far the observed counts deviate from the counts expected under the null hypothesis.
  • The Pearson chi-square statistic is the sum over all cells of (observed - expected)² / expected. Each chi-square statistic has degrees of freedom computed as (number of rows - 1) x (number of columns - 1). The expected frequency of a cell is computed as (row total x column total) / grand total.
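The computation described above can be sketched as follows: expected frequencies from the row and column totals, the sum of (observed - expected)² / expected over all cells, and the degrees of freedom. The function name and the 2 x 2 counts are hypothetical. The p-value line is an assumption that applies only to 1 degree of freedom, where the chi-square tail area can be obtained from the complementary error function.

```python
import math


def pearson_chi_square(table):
    """Pearson chi-square statistic for an r x c table of counts.

    Returns the statistic, the degrees of freedom, and the table
    of expected frequencies.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # Expected frequency = (row total x column total) / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


# Hypothetical 2 x 2 table: rows are the two samples, columns are event / no event
observed = [[30, 70],
            [20, 80]]
chi2, df, expected = pearson_chi_square(observed)

# Tail area for 1 degree of freedom only (assumption: df == 1)
p_value = math.erfc(math.sqrt(chi2 / 2.0))
print(f"chi-square = {chi2:.4f}, df = {df}, p = {p_value:.4f}")
```

For a 2 x 2 table the statistic equals the square of the pooled two-proportion z statistic computed from the same counts, so the two approximate methods lead to the same conclusion.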

TESTING USING THE CHI SQUARE STATISTIC
  • Pearson Chi-Square Of Association For Two Proportions In 2 X 2 Table
  • Mantel-Haenszel Chi-Square Of Association For Stratified 2 X 2 Tables
  • Mantel-Haenszel Chi-Square Of Homogeneity In Stratified 2 X 2 Tables
  • Matched Analysis Of Stratified 2 X 2 Tables

EXACT ANALYSIS OF PROPORTIONS
  • Nature Of Exact Methods
  • Testing 2 Sample Proportions In 2 X 2 Table
  • Testing In More Complex Tables
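For testing 2 sample proportions in a 2 x 2 table, Fisher's exact method can be sketched as below. This is a minimal sketch, not necessarily the procedure used in the presentation: it assumes fixed row and column totals, uses the hypergeometric distribution for the first cell, and defines the two-sided p-value as the total probability of all tables that are no more likely than the observed one. The function name and the counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denominator = comb(total, col1)

    def probability(x):
        # Hypergeometric probability that the first cell equals x,
        # given the fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    p_observed = probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)

    # Sum the probabilities of all tables as likely as, or less likely
    # than, the observed table (small tolerance for rounding error)
    return sum(
        p
        for p in (probability(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1.0 + 1e-9)
    )


# Hypothetical small-sample table
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"two-sided exact p = {p_value:.4f}")
```

Because the probabilities are enumerated exactly rather than approximated by a normal or chi-square distribution, the result is valid for small cell counts such as these.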

ANALYSIS OF RATES
  • Unstratified Analysis For One Rate
  • Unstratified Analysis Of Two Rates In A Simple 2 X 2 Table
  • Stratified Analysis For Rate Ratio In 2 X 2 Tables
  • Stratified Analysis For Rate Difference In 2 X 2 Tables

ANALYSIS OF RATIOS: ODDS RATIO AND RISK RATIO / RELATIVE RISK
  • Odds ratio in a simple 2 x 2 table
  • Mantel-Haenszel odds ratio for stratified 2 x 2 tables
  • Methods for risk ratio and relative risk
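The ratio measures listed above can be sketched for a table laid out as [[a, b], [c, d]]. The layout is an assumption, since the presentation does not fix it: rows are taken as exposed and unexposed, columns as outcome present and absent. The function names and all counts are hypothetical. The stratified estimate uses the usual Mantel-Haenszel form, the sum of a·d/n over strata divided by the sum of b·c/n.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a simple 2 x 2 table: (a x d) / (b x c)."""
    return (a * d) / (b * c)


def risk_ratio(a, b, c, d):
    """Risk ratio: risk in row 1 divided by risk in row 2."""
    return (a / (a + b)) / (c / (c + d))


def mantel_haenszel_odds_ratio(strata):
    """Mantel-Haenszel pooled odds ratio over stratified 2 x 2 tables.

    `strata` is a list of (a, b, c, d) tuples, one per stratum.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical single table
or_simple = odds_ratio(30, 70, 20, 80)
rr_simple = risk_ratio(30, 70, 20, 80)

# Hypothetical data split into two strata
or_mh = mantel_haenszel_odds_ratio([(30, 70, 20, 80), (10, 40, 5, 45)])

print(f"odds ratio = {or_simple:.3f}")
print(f"risk ratio = {rr_simple:.3f}")
print(f"Mantel-Haenszel odds ratio = {or_mh:.3f}")
```

The odds ratio is undefined when b or c is zero, so a real analysis would need to handle empty cells before applying these formulas.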