







This summary covers the concept of variability in statistics, focusing on the range, the standard deviation, and the variance: why understanding variability is important, how to compute each measure, and how the measures differ from and resemble one another.
Difficulty Scale ☺☺☺☺ (moderately easy, but not a cinch)
WHY UNDERSTANDING VARIABILITY IS IMPORTANT
In Chapter 2, you learned about different types of averages, what they mean, how they are computed, and when to use them. But when it comes to descriptive statistics and describing the characteristics of a distribution, averages are only half the story. The other half is measures of variability. In the most simple of terms, variability reflects how scores differ from one another. For example, the following set of scores shows some variability:
7, 6, 3, 3, 1
The following set of scores has the same mean (4) and has less variability than the previous set:
3, 4, 4, 5, 4
The next set has no variability at all—the scores do not differ from one another—but it also has the same mean as the other two sets we just showed you.
4, 4, 4, 4, 4
Variability (also called spread or dispersion) can be thought of as a measure of how different scores are from one another. It's even more accurate (and maybe even easier) to think of variability as how different scores are from one particular score. And what "score" do you think that might be? Well, instead of comparing each score to every other score in a distribution, the one score that could be used as a comparison is—that's right—the mean. So, variability becomes a measure of how much each score in a group of scores differs from the mean. More about this in a moment.

Remember what you already know about computing averages—that an average (whether it is the mean, the median, or the mode) is a representative score in a set of scores. Now, add your new knowledge about variability—that it reflects how different scores are from one another. Each is an important descriptive statistic. Together, these two (average and variability) can be used to describe the characteristics of a distribution and show how distributions differ from one another.

Three measures of variability are commonly used to reflect the degree of variability, spread, or dispersion in a group of scores. These are the range, the standard deviation, and the variance. Let's take a closer look at each one and how each one is used.
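To make the idea concrete, here is a minimal sketch in Python (our illustration, not part of the book, which uses SPSS) showing that the three score sets above share the same mean of 4 but spread around it by different amounts. The helper names are ours; we use the average absolute distance from the mean as a quick, informal spread measure before the formal ones are introduced:

```python
def mean(scores):
    """Arithmetic mean of a list of scores."""
    return sum(scores) / len(scores)

def mean_abs_deviation(scores):
    """Average absolute distance of each score from the mean."""
    m = mean(scores)
    return sum(abs(x - m) for x in scores) / len(scores)

some_spread = [7, 6, 3, 3, 1]   # some variability
less_spread = [3, 4, 4, 5, 4]   # same mean, less variability
no_spread   = [4, 4, 4, 4, 4]   # same mean, no variability at all

for s in (some_spread, less_spread, no_spread):
    print(mean(s), mean_abs_deviation(s))
```

All three print a mean of 4.0, but the spread values shrink from 2.0 to 0.4 to 0.0, exactly the ordering described in the text.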
COMPUTING THE RANGE
The range is the most general measure of variability. It gives you an idea of how far apart scores are from one another. The range is computed simply by subtracting the lowest score in a distribution from the highest score in the distribution. In general, the formula for the range is
r = h − l (3.1)
where

r is the range
h is the highest score in the data set
l is the lowest score in the data set
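Formula 3.1 is a one-liner in code. This small sketch (ours, for illustration) applies it to the first score set from earlier in the chapter:

```python
def score_range(scores):
    """Formula 3.1: range = highest score minus lowest score."""
    return max(scores) - min(scores)

print(score_range([7, 6, 3, 3, 1]))  # 7 - 1 = 6
```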
36 Part II ♦ Σigma Freud and Descriptive Statistics
That’s a good idea—you’ll end up with the average distance of each score from the mean. But it won’t work (see if you know why even though we’ll show you why in a moment). First, here’s the formula for computing the standard deviation:
s = √( Σ(X − X̄)² / (n − 1) )   (3.2)

where

s is the standard deviation
Σ is sigma, which tells you to find the sum of what follows
X is each individual score
X̄ is the mean of all the scores
n is the sample size

This formula finds the difference between each individual score and the mean (X − X̄), squares each difference, and sums them all together. Then, it divides the sum by the size of the sample (minus 1) and takes the square root of the result. As you can see, and as we mentioned earlier, the standard deviation is an average deviation from the mean. Here are the data we'll use in the following step-by-step explanation of how to compute the standard deviation.
X    X̄    X − X̄
8    6    8 − 6 = +2
8    6    8 − 6 = +2
8    6    8 − 6 = +2
7    6    7 − 6 = +1
6    6    6 − 6 = 0
6    6    6 − 6 = 0
5    6    5 − 6 = −1
5    6    5 − 6 = −1
4    6    4 − 6 = −2
3    6    3 − 6 = −3
X      (X − X̄)    (X − X̄)²
8      +2          4
8      +2          4
8      +2          4
7      +1          1
6      0           0
6      0           0
5      −1          1
5      −1          1
4      −2          4
3      −3          9
Sum    0           28
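The same steps can be traced in a short Python sketch (our illustration; the book works through them by hand). It reproduces each column of the tables above: the deviations, their squares, the sum of 28, and finally Formula 3.2:

```python
import math

scores = [8, 8, 8, 7, 6, 6, 5, 5, 4, 3]

x_bar = sum(scores) / len(scores)            # the mean: 6.0
deviations = [x - x_bar for x in scores]     # second table, middle column
squared = [d ** 2 for d in deviations]       # second table, right column
sum_sq = sum(squared)                        # 28, as in the Sum row

# Formula 3.2: divide by n - 1, then take the square root
s = math.sqrt(sum_sq / (len(scores) - 1))
print(round(s, 2))  # 1.76
```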
What we now know from these results is that each score in this distribution differs from the mean by an average of 1.76 points.

Let's take a short step back and examine some of the operations in the standard deviation formula. They're important to review and will increase your understanding of what the standard deviation is. First, why didn't we just add up the deviations from the mean? Because the sum of the deviations from the mean is always equal to zero (look at the Sum row in the table above).
Chapter 3 ♦ Vive la Différence 39
smaller denominator lets us do so. Thus, instead of dividing by 10, we divide by 9. Or instead of dividing by 100, we divide by 99.
Biased estimates are appropriate if your intent is only to describe the characteristics of the sample. But if you intend to use the sample as an estimate of a population parameter, then the unbiased statistic is best to calculate.
Take a look at the following table and see what happens as the size of the sample gets larger (and moves closer to the population in size). The n − 1 adjustment has far less of an impact on the difference between the biased and the unbiased estimates of the standard deviation (the bold column in the table). All other things being equal, then, the larger the size of the sample, the less of a difference there is between the biased and the unbiased estimates of the standard deviation. Check out the following table, and you'll see what we mean.
Sample Size    Value of Numerator in Standard Deviation Formula    Biased Estimate of the Population Standard Deviation (dividing by n)    Unbiased Estimate of the Population Standard Deviation (dividing by n − 1)    Difference Between Biased and Unbiased Estimates
10             500                                                 7.07                                                                     7.45                                                                            0.38
100            500                                                 2.24                                                                     2.25                                                                            0.01
1,000          500                                                 0.7071                                                                   0.7075                                                                          0.0004
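You can verify the table yourself with a short Python sketch (ours, not the book's). It holds the numerator fixed at 500, as the table does, and computes the biased and unbiased estimates for each sample size:

```python
import math

numerator = 500.0  # fixed sum of squared deviations, as in the table

for n in (10, 100, 1000):
    biased = math.sqrt(numerator / n)         # dividing by n
    unbiased = math.sqrt(numerator / (n - 1)) # dividing by n - 1
    print(n, round(biased, 4), round(unbiased, 4),
          round(unbiased - biased, 4))
```

As n grows, the printed difference shrinks toward zero, which is the moral of the table: with large samples, the n − 1 adjustment barely matters.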
The moral of the story? When you compute the standard devia- tion for a sample, which is an estimate of the population, the closer to the size of the population the sample is, the more accurate the estimate will be.
What’s the Big Deal?
The computation of the standard deviation is very straightforward. But what does it mean? As a measure of variability, all it tells us is how much each score in a set of scores, on the average, varies from the mean. But it has some very practical applications, as you will find out in Chapter 4. Just to whet your appetite, consider this: The standard deviation can be used to help us compare scores from different distributions, even when the means and standard deviations are different. Amazing! This, as you will see, can be very cool.
COMPUTING THE VARIANCE
Here comes another measure of variability and a nice surprise. If you know the standard deviation of a set of scores and you can square a number, you can easily compute the variance of that same set of scores. This third measure of variability, the variance, is simply the standard deviation squared. In other words, it's the same formula you saw earlier but without the square root bracket, like the one shown in Formula 3.3:
If you take the standard deviation and never complete the last step (taking the square root), you have the variance. In other words, s² = s × s; the variance equals the standard deviation times itself.

s² = Σ(X − X̄)² / (n − 1)   (3.3)
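Python's standard library makes the relationship easy to check. This sketch (our illustration, using the built-in `statistics` module rather than anything from the book) shows that squaring the sample standard deviation of the earlier data set gives the sample variance:

```python
import statistics

scores = [8, 8, 8, 7, 6, 6, 5, 5, 4, 3]

s = statistics.stdev(scores)       # sample standard deviation (n - 1 denominator)
var = statistics.variance(scores)  # sample variance

# Identical up to floating-point rounding: the variance is s squared
print(round(s ** 2, 4), round(var, 4))
```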
There is one variable in this data set:
Variable        Definition
ReactionTime    Reaction time on a tapping task
Here are the steps to compute the measures of variability that we discussed in this chapter.
The SPSS Output
Figure 3.1 shows selected output from the SPSS procedure for ReactionTime. There are 30 valid cases with no missing cases, and the standard deviation is .70255. The variance equals .494 (or s²), and the range is 2.60.
Statistics — ReactionTime
N Valid           30
N Missing         0
Std. Deviation    .70255
Variance          .494
Range             2.60
Figure 3.1 Output for the Variable ReactionTime
Let’s try another one, titled Chapter 3 Data Set 2. There are two variables in this data set:
Variable        Definition
MathScore       Score on a mathematics test
ReadingScore    Score on a reading test
Follow the same set of instructions as given previously, only in Step 3, you select both variables. The SPSS output is shown in Figure 3.2, where you can see selected output from the SPSS procedure for these two variables. There are 30 valid cases with no missing cases, and the standard deviation for math scores is 12.36 with a variance of 152.7 and a range of 43. For reading scores, the standard deviation is 18.700, the variance is a whopping 349.689 (that's pretty big), and the range is 76 (which is large as well, reflecting the similarly large variance).
Statistics        Math_Score    Reading_Score
N Valid           30            30
N Missing         0             0
Std. Deviation    12.357        18.700
Variance          152.700       349.689
Range             43            76
Figure 3.2 Output for the Variables MathScore and ReadingScore
Measures of variability help us even more fully understand what a distribution of data points looks like. Along with a measure of central tendency, we can use these values to distinguish distributions from one another and effectively describe what a collection of test scores, heights, or measures of personality looks like. Now that we can think and talk about distributions, let's explore ways we can look at them.
Variable    Definition
Height      height in inches
Weight      weight in pounds
Using SPSS, compute all of the measures of variability you can for height and weight.
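If you don't have SPSS handy, the same exercise can be sketched in plain Python. The height and weight values below are made up for illustration; the chapter's actual data set is not reproduced here:

```python
import statistics

# Hypothetical data, standing in for the chapter's height/weight data set
height = [60, 62, 65, 67, 70, 72]        # inches
weight = [110, 125, 140, 150, 170, 185]  # pounds

for name, data in (("Height", height), ("Weight", weight)):
    print(name,
          "range:", max(data) - min(data),
          "stdev:", round(statistics.stdev(data), 3),
          "variance:", round(statistics.variance(data), 3))
```

Note that `statistics.stdev` and `statistics.variance` divide by n − 1, matching the unbiased estimates discussed in this chapter (use `pstdev`/`pvariance` for the biased, divide-by-n versions).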