
Understanding Variability: Range, Standard Deviation, and Variance, Summaries of Statistics

This summary introduces the concept of variability in statistics, focusing on the range, the standard deviation, and the variance. It discusses why understanding variability is important, how to compute these measures, and how they are alike and different.

Typology: Summaries

2021/2022

Uploaded on 09/27/2022

magicphil


Vive la Différence

Understanding Variability

Difficulty Scale ☺☺☺☺ (moderately easy, but not a cinch)

What you’ll learn about in this chapter

  • Why variability is valuable as a descriptive tool
  • How to compute the range, standard deviation, and variance
  • How the standard deviation and variance are alike—and how they are different

WHY UNDERSTANDING VARIABILITY IS IMPORTANT

In Chapter 2, you learned about different types of averages, what they mean, how they are computed, and when to use them. But when it comes to descriptive statistics and describing the characteristics of a distribution, averages are only half the story. The other half is measures of variability.

In the simplest terms, variability reflects how scores differ from one another. For example, the following set of scores shows some variability:

7, 6, 3, 3, 1

The following set of scores has the same mean (4) and has less variability than the previous set:

3, 4, 4, 5, 4

The next set has no variability at all—the scores do not differ from one another—but it also has the same mean as the other two sets we just showed you.

4, 4, 4, 4, 4

Variability (also called spread or dispersion) can be thought of as a measure of how different scores are from one another. It’s even more accurate (and maybe even easier) to think of variability as how different scores are from one particular score. And what “score” do you think that might be? Well, instead of comparing each score to every other score in a distribution, the one score that could be used as a comparison is (that’s right) the mean. So, variability becomes a measure of how much each score in a group of scores differs from the mean. More about this in a moment.

Remember what you already know about computing averages: an average (whether it is the mean, the median, or the mode) is a representative score in a set of scores. Now, add your new knowledge about variability: it reflects how different scores are from one another. Each is an important descriptive statistic. Together, these two (average and variability) can be used to describe the characteristics of a distribution and show how distributions differ from one another.

Three measures of variability are commonly used to reflect the degree of variability, spread, or dispersion in a group of scores. These are the range, the standard deviation, and the variance. Let’s take a closer look at each one and how each one is used.
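To make the idea concrete, here is a minimal Python sketch (not part of the original chapter) that takes the three sets of scores shown above and compares each score to the mean of its set. The helper name `deviations_from_mean` is just an illustrative choice.

```python
from statistics import mean

# The three sets of scores from the text; all have a mean of 4.
score_sets = {
    "some variability": [7, 6, 3, 3, 1],
    "less variability": [3, 4, 4, 5, 4],
    "no variability": [4, 4, 4, 4, 4],
}


def deviations_from_mean(scores):
    """Return how far each score sits from the mean of its own set."""
    m = mean(scores)
    return [x - m for x in scores]


for label, scores in score_sets.items():
    print(label, "| mean:", mean(scores), "| deviations:", deviations_from_mean(scores))
```

The means match, but the deviations shrink from the first set to the third, which is exactly the difference that a measure of variability is meant to capture.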

COMPUTING THE RANGE

The range is the most general measure of variability. It gives you an idea of how far apart scores are from one another. The range is computed simply by subtracting the lowest score in a distribution from the highest score in the distribution. In general, the formula for the range is

r = h − l (3.1)

where

  • r is the range
  • h is the highest score in the data set
  • l is the lowest score in the data set
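As a quick illustration, Formula 3.1 can be sketched in a few lines of Python. The function name `score_range` is an assumption made for this example, and the scores are the three sets from the opening of the chapter.

```python
def score_range(scores):
    """Range r = h - l: the highest score minus the lowest score."""
    h = max(scores)  # highest score in the data set
    l = min(scores)  # lowest score in the data set
    return h - l


print(score_range([7, 6, 3, 3, 1]))  # 7 - 1 = 6
print(score_range([3, 4, 4, 5, 4]))  # 5 - 3 = 2
print(score_range([4, 4, 4, 4, 4]))  # 4 - 4 = 0
```

Notice that the range uses only two scores out of the whole set, which is why it gives only a general idea of spread.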

36 Part II ♦ Σigma Freud and Descriptive Statistics

That’s a good idea—you’ll end up with the average distance of each score from the mean. But it won’t work (see if you know why even though we’ll show you why in a moment). First, here’s the formula for computing the standard deviation:

$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} \qquad (3.2)$$

where

  • $s$ is the standard deviation
  • $\sum$ is sigma, which tells you to find the sum of what follows
  • $X$ is each individual score
  • $\bar{X}$ is the mean of all the scores
  • $n$ is the sample size

This formula finds the difference between each individual score and the mean ($X - \bar{X}$), squares each difference, and sums them all together. Then, it divides the sum by the size of the sample (minus 1) and takes the square root of the result. As you can see, and as we mentioned earlier, the standard deviation is an average deviation from the mean.

Here are the data we’ll use in the following step-by-step explanation of how to compute the standard deviation.

8, 8, 8, 7, 6, 6, 5, 5, 4, 3

  1. List each score. It doesn’t matter whether the scores are in any particular order.
  2. Compute the mean of the group.
  3. Subtract the mean from each score.

Here’s what we’ve done so far, where $X - \bar{X}$ represents the difference between the actual score and the mean of all the scores, which is 6.


| $X$ | $\bar{X}$ | $X - \bar{X}$ |
|---|---|---|
| 8 | 6 | 8 − 6 = +2 |
| 8 | 6 | 8 − 6 = +2 |
| 8 | 6 | 8 − 6 = +2 |
| 7 | 6 | 7 − 6 = +1 |
| 6 | 6 | 6 − 6 = 0 |
| 6 | 6 | 6 − 6 = 0 |
| 5 | 6 | 5 − 6 = −1 |
| 5 | 6 | 5 − 6 = −1 |
| 4 | 6 | 4 − 6 = −2 |
| 3 | 6 | 3 − 6 = −3 |

  4. Square each individual difference. The result is the column marked $(X - \bar{X})^2$.

| $X$ | $(X - \bar{X})$ | $(X - \bar{X})^2$ |
|---|---|---|
| 8 | +2 | 4 |
| 8 | +2 | 4 |
| 8 | +2 | 4 |
| 7 | +1 | 1 |
| 6 | 0 | 0 |
| 6 | 0 | 0 |
| 5 | −1 | 1 |
| 5 | −1 | 1 |
| 4 | −2 | 4 |
| 3 | −3 | 9 |
| Sum | 0 | 28 |

  5. Sum all the squared deviations about the mean. As you can see, the total is 28.
  6. Divide the sum by n − 1, or 10 − 1 = 9, so then 28/9 = 3.11.
  7. Compute the square root of 3.11, which is 1.76 (after rounding). That is the standard deviation for this set of 10 scores.
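The steps above can be followed almost line by line in Python. This is only a sketch for checking the arithmetic by hand; the variable names are illustrative, and the last line cross-checks the result against the standard library’s `statistics.stdev`, which also divides by n − 1.

```python
from math import sqrt
from statistics import stdev

# List each score (taken from the X column of the tables above).
scores = [8, 8, 8, 7, 6, 6, 5, 5, 4, 3]
n = len(scores)

# Compute the mean of the group.
mean_score = sum(scores) / n

# Subtract the mean from each score.
deviations = [x - mean_score for x in scores]

# Square each individual difference.
squared_deviations = [d ** 2 for d in deviations]

# Sum all the squared deviations about the mean.
sum_of_squares = sum(squared_deviations)

# Divide the sum by n - 1.
variance_estimate = sum_of_squares / (n - 1)

# Take the square root to get the standard deviation.
s = sqrt(variance_estimate)

print("mean:", mean_score)
print("sum of deviations:", sum(deviations))
print("sum of squared deviations:", sum_of_squares)
print("standard deviation:", round(s, 2))
print("statistics.stdev agrees:", round(stdev(scores), 2) == round(s, 2))
```

Printing the sum of the raw deviations alongside the sum of the squared ones shows why the squaring step matters: the first is 0, the second is 28.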

What we now know from these results is that each score in this distribution differs from the mean by an average of 1.76 points.

Let’s take a short step back and examine some of the operations in the standard deviation formula. They’re important to review and will increase your understanding of what the standard deviation is.

First, why didn’t we just add up the deviations from the mean? Because the sum of the deviations from the mean is always equal to zero.

Chapter 3 ♦ Vive la Différence 39

smaller denominator lets us do so. Thus, instead of dividing by 10, we divide by 9. Or instead of dividing by 100, we divide by 99.

Biased estimates are appropriate if your intent is only to describe the characteristics of the sample. But if you intend to use the sample as an estimate of a population parameter, then the unbiased statistic is best to calculate.

Take a look at the following table and see what happens as the size of the sample gets larger (and moves closer to the population in size). The n − 1 adjustment has far less of an impact on the difference between the biased and the unbiased estimates of the standard deviation (the last column in the table). All other things being equal, then, the larger the size of the sample, the less of a difference there is between the biased and the unbiased estimates of the standard deviation.


| Sample Size | Value of Numerator in Standard Deviation Formula | Biased Estimate of the Population Standard Deviation (dividing by n) | Unbiased Estimate of the Population Standard Deviation (dividing by n − 1) | Difference Between Biased and Unbiased Estimates |
|---|---|---|---|---|
| 10 | 500 | 7.07 | 7.45 | 0.38 |
| 100 | 500 | 2.24 | 2.25 | 0.01 |
| 1,000 | 500 | 0.7071 | 0.7075 | 0.0004 |

The moral of the story? When you compute the standard deviation for a sample, which is an estimate of the population, the closer to the size of the population the sample is, the more accurate the estimate will be.
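If you want to see the shrinking gap for yourself, the short Python sketch below recomputes both estimates from the numerator used in the table (500) for each sample size. The function names `biased_sd` and `unbiased_sd` are hypothetical helpers written for this illustration.

```python
from math import sqrt

SUM_OF_SQUARES = 500  # the numerator of the standard deviation formula in the table


def biased_sd(sum_of_squares, n):
    """Standard deviation when the sum of squares is divided by n."""
    return sqrt(sum_of_squares / n)


def unbiased_sd(sum_of_squares, n):
    """Standard deviation when the sum of squares is divided by n - 1."""
    return sqrt(sum_of_squares / (n - 1))


for n in (10, 100, 1000):
    biased = biased_sd(SUM_OF_SQUARES, n)
    unbiased = unbiased_sd(SUM_OF_SQUARES, n)
    print(n, round(biased, 4), round(unbiased, 4), round(unbiased - biased, 4))
```

Each printed row lists the sample size, the biased estimate, the unbiased estimate, and the gap between them. The unbiased estimate is always the larger of the two, and the gap narrows as n grows.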

What’s the Big Deal?

The computation of the standard deviation is very straightforward. But what does it mean? As a measure of variability, all it tells us is how much each score in a set of scores, on the average, varies from the mean. But it has some very practical applications, as you will find out in Chapter 4. Just to whet your appetite, consider this: The standard deviation can be used to help us compare scores from different distributions, even when the means and standard deviations are different. Amazing! This, as you will see, can be very cool.

THINGS TO REMEMBER

  • The standard deviation is computed as the average distance from the mean. So, you will need to first compute the mean as a measure of central tendency. Don’t fool around with the median or the mode in trying to compute the standard deviation.
  • The larger the standard deviation, the more spread out the values are, and the more different they are from one another.
  • Just like the mean, the standard deviation is sensitive to extreme scores. When you are computing the standard deviation of a sample and you have extreme scores, note that somewhere in your written report.
  • If s = 0, there is absolutely no variability in the set of scores, and the scores are essentially identical in value. This will rarely happen.

COMPUTING THE VARIANCE

Here comes another measure of variability and a nice surprise. If you know the standard deviation of a set of scores and you can square a number, you can easily compute the variance of that same set of scores. This third measure of variability, the variance, is simply the standard deviation squared. In other words, it’s the same formula you saw earlier but without the square root bracket, like the one shown in Formula 3.3:

$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} \qquad (3.3)$$

If you take the standard deviation and never complete the last step (taking the square root), you have the variance. In other words, $s^2 = s \times s$, or the variance equals the standard deviation times itself.
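Here is a small Python check of that relationship, using the same 10 scores from the standard deviation example. It relies on the standard library’s `statistics.stdev` and `statistics.variance`, both of which divide by n − 1; treat it as a sketch for verifying the arithmetic rather than as part of the original chapter.

```python
from statistics import stdev, variance

# The 10 scores used in the standard deviation example.
scores = [8, 8, 8, 7, 6, 6, 5, 5, 4, 3]

s = stdev(scores)             # standard deviation (divides by n - 1)
s_squared = variance(scores)  # variance (also divides by n - 1)

print("standard deviation:", round(s, 2))
print("standard deviation squared:", round(s * s, 2))
print("variance:", round(s_squared, 2))
```

Squaring the standard deviation and asking for the variance directly give the same value, 28/9, up to floating-point rounding.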


There is one variable in this data set:

| Variable | Definition |
|---|---|
| ReactionTime | Reaction time on a tapping task |

Here are the steps to compute the measures of variability that we discussed in this chapter.

  1. Open the file named Chapter 3 Data Set 1.
  2. Click Analyze → Descriptive Statistics → Frequencies.
  3. Double-click on the ReactionTime variable to move it to the Variable(s) box.
  4. Click Statistics, and you will see the Frequencies: Statistics dialog box. Use this dialog box to select the variables and procedures you want to perform.
  5. Under Dispersion, click Std. Deviation.
  6. Under Dispersion, click Variance.
  7. Under Dispersion, click Range.
  8. Click Continue.
  9. Click OK.

The SPSS Output

Figure 3.1 shows selected output from the SPSS procedure for ReactionTime. There are 30 valid cases with no missing cases, and the standard deviation is .70255. The variance equals .494 (or $s^2$), and the range is 2.60.


Statistics

| Statistic | ReactionTime |
|---|---|
| N Valid | 30 |
| N Missing | 0 |
| Std. Deviation | .70255 |
| Variance | .494 |
| Range | 2.60 |

Figure 3.1 Output for the Variable ReactionTime

Let’s try another one, titled Chapter 3 Data Set 2. There are two variables in this data set:

| Variable | Definition |
|---|---|
| MathScore | Score on a mathematics test |
| ReadingScore | Score on a reading test |

Follow the same set of instructions as given previously, only in Step 3, you select both variables. The SPSS output is shown in Figure 3.2, where you can see selected output from the SPSS procedure for these two variables. There are 30 valid cases with no missing cases, and the standard deviation for math scores is 12.36 with a variance of 152.7 and a range of 43. For reading scores, the standard deviation is 18.700, the variance is a whopping 349.689 (that’s pretty big), and the range is 76 (which is large as well, reflecting the similarly large variance).


Statistics

| Statistic | Math_Score | Reading_Score |
|---|---|---|
| N Valid | 30 | 30 |
| N Missing | 0 | 0 |
| Std. Deviation | 12.357 | 18.700 |
| Variance | 152.700 | 349.689 |
| Range | 43 | 76 |

Figure 3.2 Output for the Variables MathScore and ReadingScore
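If you do not have SPSS at hand, a similar summary can be sketched in Python with the standard library. Everything specific here is an assumption made for illustration: the function name `variability_summary` is hypothetical, and the scores are made-up values, not the contents of Chapter 3 Data Set 2.

```python
from statistics import stdev, variance


def variability_summary(scores):
    """Return the same measures the SPSS Frequencies output reports."""
    return {
        "n": len(scores),
        "std_deviation": stdev(scores),   # divides by n - 1
        "variance": variance(scores),     # divides by n - 1
        "range": max(scores) - min(scores),
    }


# Hypothetical scores for illustration only (not the real data set).
data = {
    "MathScore": [70, 85, 62, 91, 78],
    "ReadingScore": [55, 98, 40, 88, 73],
}

for name, scores in data.items():
    summary = variability_summary(scores)
    print(
        name,
        "| N:", summary["n"],
        "| Std. Deviation:", round(summary["std_deviation"], 3),
        "| Variance:", round(summary["variance"], 3),
        "| Range:", summary["range"],
    )
```

Because `stdev` and `variance` both use n − 1 in the denominator, this sketch reports unbiased estimates; swapping in `pstdev` and `pvariance` from the same module would give the biased (divide by n) versions.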

Summary

Measures of variability help us even more fully understand what a distribution of data points looks like. Along with a measure of central tendency, we can use these values to distinguish distributions from one another and effectively describe what a collection of test scores, heights, or measures of personality looks like. Now that we can think and talk about distributions, let’s explore ways we can look at them.

Time to Practice

  1. Why is the range the most convenient measure of dispersion, yet the most imprecise measure of variability? When would you use the range?
  2. The variance for a set of scores is 25. What is the standard deviation and what is the range?
  3. This practice problem uses the data contained in the file named Chapter 3 Data Set 3. There are two variables in this data set.

| Variable | Definition |
|---|---|
| Height | height in inches |
| Weight | weight in pounds |

     Using SPSS, compute all of the measures of variability you can for height and weight.

  4. How can you tell whether SPSS produces a biased or an unbiased estimate of the standard deviation?
