






An introduction to measures of center and spread in statistics, focusing on the mean, median, standard deviation, and quartiles. It explains how these measures help summarize a list of observations and their differences in handling outliers and skewed distributions. Examples are given to illustrate the concepts.
Typology: Study notes
Chapter 2 – Describing Distributions with Numbers
Example: Phyllis had 6 assignment grades in her Stat2331 class: 86 88 92 44 89 90.
Idea: We want a few numbers that can summarize a list of observations.
Measures of Center
I. Mean: Sum of all the observations divided by the number of observations in the list.
II. Median: The middle number when the observations are put in order. To calculate the median:
a) sort the observations from smallest to largest
b) if n is odd (n = number of observations), median = middle value of the sorted list = the (n + 1)/2-th observation up from the bottom of the list
c) if n is even, median = mean of the middle two observations
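The two definitions can be sketched in a few lines of Python. This is only an illustration of the steps a) to c), not part of the original notes; the function names `mean` and `median` are chosen here for convenience.

```python
# Phyllis' six assignment grades, taken from the example.
grades = [86, 88, 92, 44, 89, 90]

def mean(values):
    """Sum of all observations divided by the number of observations."""
    return sum(values) / len(values)

def median(values):
    """Middle value of the sorted list, following steps a) to c)."""
    ordered = sorted(values)               # a) sort smallest to largest
    n = len(ordered)
    if n % 2 == 1:                         # b) n odd: the (n + 1)/2-th observation
        return ordered[(n + 1) // 2 - 1]   # minus 1 because Python counts from 0
    upper = n // 2                         # c) n even: mean of the middle two
    return (ordered[upper - 1] + ordered[upper]) / 2

print(mean(grades))    # 81.5
print(median(grades))  # 88.5
```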
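Example: Phyllis' grades (cont'd)
Her mean grade is: (86 + 88 + 92 + 44 + 89 + 90)/6 = 489/6 = 81.5
Her median grade is: (88 + 89)/2 = 88.5, the mean of the middle two of the sorted grades 44 86 88 89 90 92
Issue: Which one to use?
Q: Does the mean, 81.5, give a good idea of her "typical" grade? What about the median, 88.5?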
Suppose we exclude the outlier 44. We have

          With the outlier 44    Without the outlier 44
Mean      81.5                   89
Median    88.5                   89

→ The mean is pulled by extreme observations or outliers. So it is not a resistant measure of center.
→ The median is not pulled by the outliers. So it is a resistant measure of center.
Q: In figuring Phyllis' grade, her instructor needs to know the total HW points she scored but lost this info. Which would be more useful to him:
a) her mean score
b) her median score
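As a quick check of how the outlier 44 affects each measure, the sketch below uses Python's standard `statistics` module on Phyllis' grades. The variable names are illustrative only. It also shows that multiplying the mean by the number of observations gives back the total, which the median cannot do.

```python
from statistics import mean, median

# Phyllis' grades from the example; 44 is the outlier.
grades = [86, 88, 92, 44, 89, 90]
without_outlier = [g for g in grades if g != 44]

for label, data in [("with 44", grades), ("without 44", without_outlier)]:
    print(f"{label:>10}: mean = {mean(data)}, median = {median(data)}")

# The total is recoverable from the mean (mean times n), not from the median.
total_from_mean = mean(grades) * len(grades)
print(total_from_mean, sum(grades))  # 489.0 489
```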
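→ When the distribution is skewed:
For a "typical" value, use the median.
But if interested in the total, use the mean (total = mean × number of observations).
Issue: How do they relate to each other on various distributions?
Symmetric: the mean and median are close together (equal if the distribution is exactly symmetric).
Right-skewed: the mean is usually larger than the median, pulled toward the long right tail.
Left-skewed: the mean is usually smaller than the median, pulled toward the long left tail.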
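A small illustration of how the mean and median compare for each shape. The three data sets below are hypothetical, made up only to show one symmetric list, one with a long right tail, and one with a long left tail.

```python
from statistics import mean, median

# Hypothetical small data sets, chosen only to illustrate each shape.
examples = {
    "symmetric":    [1, 2, 3, 4, 5],
    "right-skewed": [1, 2, 3, 4, 20],     # one large value stretches the right tail
    "left-skewed":  [1, 17, 18, 19, 20],  # one small value stretches the left tail
}

for shape, data in examples.items():
    print(f"{shape:>12}: mean = {mean(data)}, median = {median(data)}")
```

In these lists the mean equals the median in the symmetric case, exceeds it in the right-skewed case (6 vs 3), and falls below it in the left-skewed case (15 vs 18).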
Measures of Spread
I. Standard deviation (s): measures the "average" distance between each observation and the mean of the data.
To calculate s:
(a) Calculate the variance (s^2):
• Compute the squared distance between each observation and the mean.
• Add up the squared distances and divide the total by n − 1:
  s^2 = [(x1 − x̄)^2 + (x2 − x̄)^2 + … + (xn − x̄)^2] / (n − 1)
(b) Take the square root of the variance to get s.
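Steps (a) and (b) can be traced in Python as below. The sketch uses Phyllis' grades from the earlier example (the water bill data is not available here) and assumes the usual sample variance with divisor n − 1.

```python
from math import sqrt

grades = [86, 88, 92, 44, 89, 90]
n = len(grades)
x_bar = sum(grades) / n                      # mean of the data

# (a) squared distance of each observation from the mean, then the variance
squared_distances = [(x - x_bar) ** 2 for x in grades]
variance = sum(squared_distances) / (n - 1)  # assumption: divide by n - 1

# (b) the standard deviation is the square root of the variance
sd = sqrt(variance)

print(variance)      # 341.5
print(round(sd, 2))  # 18.48
```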
Ex: Water bills of two restaurants (cont'd).
TGIF's:
Total =
Variance (s^2) =
SD (s) =
Similarly, for Outback's we get s^2 = ______ and s = _____.
Behavior & Properties of Standard Deviation
Ex: Choose 4 numbers from the list 0, 1, …, 10, repeats allowed, such that
(a) they have the smallest possible SD.
(b) they have the largest possible SD.
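One way to check an answer to this exercise is to try every possible choice. The sketch below is a brute-force search, not the reasoning the exercise is asking for; it assumes the sample standard deviation (divisor n − 1) as computed by `statistics.stdev`.

```python
from itertools import combinations_with_replacement
from statistics import stdev

# Every way to choose 4 numbers from 0..10 with repeats allowed (order ignored).
candidates = list(combinations_with_replacement(range(11), 4))

smallest = min(candidates, key=stdev)  # first choice found with the smallest SD
largest = max(candidates, key=stdev)   # choice with the largest SD

print(smallest, stdev(smallest))
print(largest, round(stdev(largest), 4))
```

The smallest SD is 0 and is reached by any four identical numbers; the search simply reports the first such choice it meets. The largest SD comes from putting two numbers at each extreme.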
II. Five-Number Summary: Uses five numbers to divide the whole distribution into four equal parts: the minimum, the first quartile (Q1), the median (Q2), the third quartile (Q3), and the maximum. One-quarter of all the observations are covered between two consecutive numbers.
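A minimal sketch of computing the five numbers in Python. The notes do not spell out a quartile rule, so this assumes a common textbook convention: Q1 is the median of the observations below the position of the overall median, and Q3 is the median of those above it. Other conventions can give slightly different quartiles. The function name `five_number_summary` is illustrative.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max); needs at least two observations."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]          # observations below the median's position
    upper = ordered[half + n % 2:]  # observations above it (skips the middle one if n is odd)
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

grades = [86, 88, 92, 44, 89, 90]
print(five_number_summary(grades))  # (44, 86, 88.5, 90, 92)
```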
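We can graphically display the five-number summary by a boxplot. A boxplot consists of:
• a central box that spans the quartiles Q1 and Q3
• a line inside the box that marks the median
• lines extending from the box out to the minimum and the maximum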
Boxplots of Home run data:
Notes:
(1) Boxplots are best used for side-by-side comparison of more than one distribution.
(2) Make sure that you include a numerical scale when drawing a boxplot.
(3) For skewed distributions, use the five-number summary and boxplots.
(4) Boxplots give an indication of the symmetry or skewness of a distribution. In a symmetric distribution, the first and third quartiles are equally distant from the median.
Example: Consider the boxplot at the left, which summarizes the systolic blood pressures of 39 adult males.
Q: Estimate the 5-number summary from the boxplot.
Min =
Q1 =
Q2 =
Q3 =
Max =
The five-number summary and boxplot give us an idea about the whole distribution. To get an idea about the spread only, we can use the Inter-Quartile Range (IQR):
IQR = Q3 − Q1
Ex: Home Run data
IQR for Ruth =
IQR for Maris =
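The home run data is not reproduced here, so the sketch below uses Phyllis' grades instead to compare the IQR with the standard deviation when the outlier 44 is removed. It assumes the same quartile convention as a typical textbook (median of each half of the sorted data); the helper name `iqr` is illustrative.

```python
from statistics import median, stdev

def iqr(values):
    """Inter-quartile range Q3 - Q1, with quartiles as medians of each half."""
    ordered = sorted(values)
    n = len(ordered)
    q1 = median(ordered[: n // 2])         # lower half
    q3 = median(ordered[(n + 1) // 2 :])   # upper half
    return q3 - q1

grades = [86, 88, 92, 44, 89, 90]
without_outlier = [g for g in grades if g != 44]

print(iqr(grades), round(stdev(grades), 2))                    # 4 18.48
print(iqr(without_outlier), round(stdev(without_outlier), 2))  # 4.0 2.24
```

For this data the IQR stays at 4 whether or not 44 is included, while the standard deviation changes sharply.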
Use TI-83 to Calculate Numerical Summaries
We will use the list L1 for illustration; however, the following methods apply to any other list in the calculator.
Step 1. Clear the data list: press STAT → 4:ClrList → put L1 by pressing 2nd 1 (or 2nd STAT, and choose L1)
Step 2. Enter the data: press STAT → 1:Edit → enter the observations one-by-one into the list L1
Step 3. Obtain one-variable statistics: press STAT → press the right arrow key to highlight CALC → press ENTER for 1-Var Stats → enter the name of the list containing your data (press 2nd 1 for L1) → press ENTER
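To cross-check the calculator, the sketch below computes several of the quantities that 1-Var Stats reports, using Python's standard library and Phyllis' grades as the contents of L1. The dictionary keys are plain-text stand-ins for the calculator's labels (x̄, Σx, Σx², Sx, σx, n, minX, Med, maxX); the calculator also reports Q1 and Q3, which are left out here.

```python
from statistics import mean, median, pstdev, stdev

L1 = [86, 88, 92, 44, 89, 90]  # same data as entered in Step 2

one_var_stats = {
    "mean":    mean(L1),                # x-bar
    "sum_x":   sum(L1),                 # sum of the observations
    "sum_x2":  sum(x * x for x in L1),  # sum of the squared observations
    "Sx":      stdev(L1),               # sample SD (divisor n - 1), the s of these notes
    "sigma_x": pstdev(L1),              # population SD (divisor n)
    "n":       len(L1),
    "minX":    min(L1),
    "Med":     median(L1),
    "maxX":    max(L1),
}

for name, value in one_var_stats.items():
    print(f"{name:>8} = {value}")
```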