Statistical Analysis of Discrete Variables: Chi-square Test and Fisher's Exact Test | Lecture notes Biostatistics

ANALYSIS OF DISCRETE VARIABLES / 25

CHAPTER FIVE

ANALYSIS OF DISCRETE VARIABLES

Discrete variables are those that can assume only certain fixed values. Examples include outcome
variables with results such as live vs. die, pass vs. fail, and extubated vs. reintubated. Analysis of data
obtained from discrete variables requires specific statistical tests that differ from those used to assess
continuous variables (such as cardiac output, blood pressure, or PaO2), which can assume an infinite
range of values. The analysis of continuous variables is discussed in the next chapter.

The two statistical tests most commonly used to analyze discrete variables are the chi-square
test (including the chi-square test with Yates’ correction) and Fisher’s exact test. Both tests are
based on 2 x 2 contingency tables (Figure 5-1), which classify patients as true positives,
true negatives, false positives, or false negatives according to their disease status and test outcome.

                 Disease Present    Disease Absent

Test Positive    True Positive      False Positive

Test Negative    False Negative     True Negative

Figure 5-1: 2 x 2 Contingency Table

To use these two tests, we must first carefully define the disease being studied as well as the criteria
that constitute a positive test, and then assign each patient to one of the four possible outcomes. Once
these results have been arranged in a 2 x 2 contingency table, the appropriate statistical test can be
performed by calculating the test statistic, which is used to judge whether a statistically significant
difference exists between the two groups of patients. The probability associated with this statistic (more
commonly referred to as the p-value) can then be obtained from a chi-square distribution table to quantify
the significance of the difference between the two groups.
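
The assignment of patients to the four outcomes can be sketched in a few lines of Python. This is only an illustration: the function name `contingency_table`, the record layout (a pair of booleans per patient), and the sample records are assumptions made for the example and do not come from the text.

```python
from collections import Counter

def contingency_table(patients):
    """Tally (has_disease, test_positive) pairs into the four cells of a 2 x 2 table."""
    counts = Counter((bool(disease), bool(test)) for disease, test in patients)
    return {
        "true_positive": counts[(True, True)],     # disease present, test positive
        "false_positive": counts[(False, True)],   # disease absent, test positive
        "false_negative": counts[(True, False)],   # disease present, test negative
        "true_negative": counts[(False, False)],   # disease absent, test negative
    }

# Made-up records for illustration only: (has_disease, test_positive)
patients = (
    [(True, True)] * 3
    + [(False, True)] * 1
    + [(True, False)] * 2
    + [(False, False)] * 4
)
table = contingency_table(patients)
```

Each patient lands in exactly one cell, so the four counts always sum to the number of patients.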

CHI-SQUARE

The chi-square test is a statistical method for estimating the approximate probability that the
results of an experiment could have arisen by chance alone. The test is performed by first creating a
2 x 2 contingency table of the observed disease and test outcome frequencies.

                 Disease    No Disease

Test Positive    a          b             (a + b)

Test Negative    c          d             (c + d)

                 (a + c)    (b + d)       n

where: a = true positives, b = false positives, c = false negatives,
d = true negatives, n = total patients

If the null hypothesis is true (the test does not discriminate between patients with the disease and
patients without the disease), disease status and test result are independent, and we would expect the
patients in each disease group to be distributed according to the overall probabilities of a positive and
a negative test result. Since the expected frequency of an event is given by the probability of the event
multiplied by the number of patients, the expected frequency of diseased patients with a positive
test result (i.e., true positives or the frequency in cell “a”) is:

expected true positives = probability of disease × probability of a positive test × n
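
As a rough sketch of this calculation, the marginal totals of the table give the two probabilities: probability of disease = (a + c)/n and probability of a positive test = (a + b)/n. The text gives the formula only for cell "a"; applying the same reasoning to the other three cells is an extension made here for illustration, and the function name `expected_frequencies` and the sample counts are hypothetical.

```python
def expected_frequencies(a, b, c, d):
    """Expected cell counts for a 2 x 2 table under the null hypothesis of independence."""
    n = a + b + c + d
    p_disease = (a + c) / n      # column total / n
    p_no_disease = (b + d) / n
    p_positive = (a + b) / n     # row total / n
    p_negative = (c + d) / n
    return {
        "a": p_disease * p_positive * n,      # expected true positives
        "b": p_no_disease * p_positive * n,   # expected false positives
        "c": p_disease * p_negative * n,      # expected false negatives
        "d": p_no_disease * p_negative * n,   # expected true negatives
    }

# Made-up counts for illustration only
expected = expected_frequencies(a=30, b=10, c=20, d=40)
```

With these sample counts, half of the 100 patients have the disease and 40% test positive, so the expected number of true positives is 0.5 × 0.4 × 100 = 20. The expected counts always sum to n.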


26 / A PRACTICAL GUIDE TO BIOSTATISTICS

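\[
\chi^2 = \frac{(O_a - E_a)^2}{E_a} + \frac{(O_b - E_b)^2}{E_b} + \frac{(O_c - E_c)^2}{E_c} + \frac{(O_d - E_d)^2}{E_d}
\]

where O = observed frequency and E = expected frequency in each of the four cells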
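
The chi-square statistic for a 2 x 2 table is the sum of (O - E)^2 / E over the four cells, where O is the observed and E the expected frequency. A minimal sketch follows, under stated assumptions: the function name `chi_square_2x2` and the sample counts are hypothetical, no Yates' correction is applied, and the p-value is computed directly for one degree of freedom (the value for a 2 x 2 table) using the complementary error function rather than being looked up in a distribution table.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Uncorrected chi-square statistic and p-value (1 degree of freedom) for a 2 x 2 table."""
    n = a + b + c + d
    observed = [a, b, c, d]
    # Expected count for each cell = row total x column total / n
    expected = [
        (a + b) * (a + c) / n,
        (a + b) * (b + d) / n,
        (c + d) * (a + c) / n,
        (c + d) * (b + d) / n,
    ]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Made-up counts for illustration only
statistic, p_value = chi_square_2x2(a=30, b=10, c=20, d=40)
```

The sketch assumes every row and column total is nonzero; otherwise an expected count of zero would cause a division error.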
