












This is a worksheet from a mathematics class
(a) Describe the important features of this distribution. What does this tell you about how the economic burden of tornadoes is distributed among the states?
(b) When asked for summary statistics, MINITAB produced the following output:

Variable  N   Mean   SE Mean  StDev  Minimum  Q1    Median  Q3     Maximum
Damage    50  22.39  3.60     25.45  0.00     2.23  12.66   41.63  88.

Give the five-number summary, and explain why you can see from these five numbers that the distribution is strongly skewed to the right.
(c) The histogram suggests that there may be outliers. Use the 1.5 × IQR rule of thumb to show that no values in this distribution meet this criterion for outliers.
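The 1.5 × IQR check in part (c) can be sketched in a few lines of Python. This is only an illustration: the variable names are made up here, and the quartiles and minimum are copied from the MINITAB output above. The maximum is cut off in the output (it reads "88."), so the sketch only reports the fences rather than assuming an exact maximum.

```python
# Quartiles and minimum copied from the MINITAB output for Damage.
q1, q3 = 2.23, 41.63
minimum = 0.00

# 1.5 x IQR rule: values outside the fences are flagged as outliers.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(f"IQR = {iqr:.2f}")
print(f"fences = ({lower_fence:.2f}, {upper_fence:.2f})")
print("minimum below lower fence:", minimum < lower_fence)
```

Any maximum that begins with 88 falls below the upper fence computed here, which is consistent with the claim that no value qualifies as an outlier.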
(a) Write the equation of the least-squares regression line.
(b) Interpret the value “R-sq = 27.7%.”
(c) Predict the fastest serve of a professional tennis player who is 1.7 meters tall. Comment on the reliability of this prediction.
Minimum  Q1     Median  Q3     Maximum
65.00    84.00  89.00   93.00  96.

(a) Write a brief description of the distribution of scores on this exam.
(b) Would you estimate that the mean is about the same as the median, higher than the median, or lower than the median? Justify your answer.
(c) Mr. Stouffer decides that because the test was quite difficult, he will add 5 points to the score of any low outliers. Do any test scores qualify for this bonus? Justify your answer with appropriate calculations.

Multiple Choice
Identify the choice that best completes the statement or answers the question.

____ 1. At the beginning of the school year, a high-school teacher asks every student in her classes to fill out a survey that asks for their age, gender, the number of years they have lived at their current address, their favorite school subject, and whether they plan to go to college after high school. Which of the following best describes the variables that are being measured?
a. four quantitative variables
b. five quantitative variables
c. two categorical variables and two quantitative variables
d. two categorical variables and three quantitative variables
e. three categorical variables and two quantitative variables

____ 2. The median age of five people in a meeting is 30 years. One of the people, a 50-year-old, leaves the room. The median age of the remaining four people in the room is
b. c. d.
e.

____ 4. A researcher reports that the participants in his study lost a mean of 10.4 pounds after two months on his new diet. A friend of yours comments that she tried the diet for two months and lost no weight, so clearly the report was a fraud. Which of the following statements is correct?
a. Your friend must not have followed the diet correctly, since she did not lose weight.
b. Since your friend did not lose weight, the report must not be correct.
c. The report gives only the mean. This does not imply that all participants in the study lost 10.4 pounds or even that all lost weight. Your friend’s experience does not necessarily contradict the study results.
d. In order for the study to be correct, we must now add your friend’s results to those of the study and recalculate the new average.
e. Your friend is an outlier.

____ 5. The following is a histogram showing the actual frequency of the closing prices of a particular stock on the New York Stock Exchange over a 50-day period. The class that contains the third quartile is
a. 10- b. 20-
a. The mean lead level in the water is about 10 ppm.
b. The mean lead level in the water is about 9 ppm.
c. The median lead level in the water is 7 ppm.
d. The median lead level in the water is 8 ppm.
e. Neither the mean nor the median can be computed because some values are unknown.

____ 10. Which of these variables is least likely to have a Normal distribution?
a. Annual income for all 150 employees at a local high school
b. Lengths of 50 newly hatched pythons
c. Heights of 100 white pine trees in a forest
d. Amount of soda in 60 cups filled by an automated machine at a fast-food restaurant
e. Weights of 200 of the same candy bar in a shipment to a local supermarket

____ 11. The proportion of observations from a standard Normal distribution that take values larger than is about
a. 0. b. 0. c. 0. d. 0. e. 0.

____ 12. The density curve shown to the right takes the value 0.5 on the interval 0 ≤ x ≤ 2 and takes the value 0 everywhere else. What percent of the observations lie between 0.5 and 1.2?
a. 25% b. 35% c. 50% d. 68% e. 70%

____ 13. The distribution of the heights of students in a large class is roughly Normal. Moreover, the average height is 68 inches, and approximately 95% of the heights are between 62 and 74 inches. Thus, the standard deviation of the height distribution is approximately equal to
a. 2 b. 3 c. 6 d. 9
e. 12

____ 14. If a store runs out of advertised material during a sale, customers become upset, and the store loses not only the sale but also goodwill. From past experience, a music store finds that the mean number of CDs sold in a sale is 845, the standard deviation is 15, and a histogram of the demand is approximately Normal. The manager is willing to accept a 2.5% chance that a CD will be sold out. About how many CDs should the manager order for an upcoming sale?
a. 1295 b. 1070 c. 935 d. 875 e. 860

____ 15. If your score on a test is at the 60th percentile, you know that your score lies
a. below the first quartile.
b. between the first quartile and the median.
c. between the median and the third quartile.
d. above the third quartile.
e. There is not enough information to say where it lies relative to the quartiles.

____ 16. In some courses (but certainly not in an intro stats course!), students are graded on a “Normal curve.” For example, students between 0 and 0.5 standard deviations above the mean receive a C+; between 0.5 and 1.0 standard deviations above the mean receive a B–; between 1.0 and 1.5 standard deviations above the mean receive a B; between 1.5 and 2.0 standard deviations above the mean receive a B+, etc. The class average on an exam was 60 with a standard deviation of 10. Which of the following gives the bounds for a B and the percent of students who will receive a B if the marks are actually Normally distributed?
a. (65, 75), 24.17%
b. (65, 75), 12.08%
c. (70, 75), 18.38%
d. (70, 75), 9.19%
e. (70, 75), 6.68%

____ 17. The ages (at inauguration) of all U.S. Presidents are approximately Normally distributed with a mean of 54.6. Barack Obama was 47 when he was inaugurated, which is the 11th percentile of the distribution. Which of the following is closest to the standard deviation of presidents’ ages?
a. -9. b. -6. c. 6. d. 7. e. 9.

____ 18. Which of the following is true about all Normal distributions?
a. There is no area under the curve for z-scores greater than 3.49.
b. The area underneath the curve is equal to 1.
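The Normal-distribution arithmetic behind questions 14, 16, and 17 can be checked with Python's standard-library `statistics.NormalDist`. The sketch below is a hedged illustration, not an official solution: the variable names are invented here, and question 14 is read as "order the 97.5th percentile of demand."

```python
from statistics import NormalDist

z = NormalDist()  # standard Normal: mean 0, standard deviation 1

# Question 14: demand ~ Normal(845, 15); a 2.5% sell-out chance
# corresponds to ordering the 97.5th percentile of demand.
cd_order = 845 + z.inv_cdf(0.975) * 15

# Question 16: a B covers 1.0 to 1.5 standard deviations above a mean of 60 (SD 10).
b_low, b_high = 60 + 1.0 * 10, 60 + 1.5 * 10
b_share = z.cdf(1.5) - z.cdf(1.0)

# Question 17: age 47 sits at the 11th percentile of Normal(54.6, sd),
# so sd = (47 - 54.6) / z-score of the 11th percentile.
sd_age = (47 - 54.6) / z.inv_cdf(0.11)

print(f"Q14 order quantity: about {cd_order:.1f}")
print(f"Q16 bounds: ({b_low:.0f}, {b_high:.0f}), share: {b_share:.2%}")
print(f"Q17 standard deviation: about {sd_age:.2f}")
```

Using the 68-95-99.7 rule instead of the exact percentile in question 14 gives 845 + 2(15) = 875, which matches one of the listed choices.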
about 116.
d. The correlation between amount of fat and calories is positive.
e. If one cereal has 140 calories and 5 g of fat, its residual is about 5 calories.

____ 23. A copy machine dealer has data on the number of copy machines x at each of 89 customer locations and the number of service calls in a month y at each location. Summary calculations give x̄ = 8.4, s_x = 2.1, ȳ = 14.2, s_y = 3.8, and r = 0.86. What is the slope of the least-squares regression line of number of service calls on number of copiers?
a. 0. b. 1. c. 0. d. 2. e. Can’t tell from the information given

____ 24. In the setting of the previous problem, about what percent of the variation in the number of service calls is explained by the linear relation between number of service calls and number of machines?
a. 86% b. 93% c. 74% d. 55% e. Can’t tell from the information given

____ 25. A study examined the relationship between the sepal length and sepal width for two varieties of an exotic tropical plant. Varieties X and O are represented by x’s and o’s, respectively, in the following scatterplot. Which of the following statements is true?
a. Considering Variety X only, there is a positive correlation between sepal length and width.
b. Considering Variety O only, the least-squares regression line for predicting sepal length from sepal width has a positive slope.
c. Considering both varieties together, there is a negative correlation between sepal length and width.
d. Considering each variety separately, there is a negative correlation between sepal length and width.
e. Considering both varieties together, the least-squares regression line for predicting sepal length from sepal width has a negative slope.

____ 26. Suppose we fit a least-squares regression line to a set of data. What is true if a plot of the residuals shows a curved pattern?
a. A straight line is not a good model for the data.
b. The correlation must be 0.
c. The correlation must be positive.
d. Outliers must be present.
e. The regression line might or might not be a good model for the data, depending on the extent of the curve.

____ 27. Mr. Nerdly asked the students in his AP Statistics class to report their overall grade point averages and their SAT Math scores. The scatterplot below provides information about his students’ data. The dark line is the least-squares regression line for the data, and its equation is. Which of the following statements about the circled point is true?
a. The standard score for this student’s GPA is positive.
b. If we used the least-squares line to predict this student’s SAT Math score, we would make a prediction that is too low.
c. This student’s residual is positive.
d. Removing this data point would not change the correlation between SAT Math score and GPA.
e. Removing this student’s data point would decrease the slope of the least-squares line.
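For the copy machine setting in questions 23 and 24, the least-squares slope and the percent of variation explained follow from the summary statistics alone. The sketch below assumes the four summary values given before r in question 23 are, in order, the mean and standard deviation of x followed by the mean and standard deviation of y; the variable names are illustrative only.

```python
# Summary statistics from question 23 (assumed order: x-bar, s_x, y-bar, s_y).
x_bar, s_x = 8.4, 2.1    # number of copy machines
y_bar, s_y = 14.2, 3.8   # number of service calls per month
r = 0.86

# Least-squares slope and intercept for predicting y from x.
slope = r * s_y / s_x
intercept = y_bar - slope * x_bar

# Question 24: fraction of variation in y explained by the linear relation.
r_squared = r ** 2

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"r^2 = {r_squared:.0%}")
```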
____ 30. A forester studying oak trees finds that the correlation between x = the ages (measured in years) and y = height (in feet) of a sample of trees is 0.78. Which of the following statements must be true?
a. 78% of the variability in tree heights can be explained by variation in the trees’ ages.
b. For every year a tree ages, its height increases, on average, by 78%.
c. If we let x = height of tree and y = age of tree, then the correlation would be the reciprocal of 0.78.
d. If we measure the height in meters instead of feet, the correlation would still be 0.78.
e. The unit for correlation in this context is foot-years.

The computer output below predicts air fare for a discount airline’s flights from Philadelphia to 13 different cities from the flight’s length in miles. A scatterplot of the data suggests that the relationship is roughly linear.

Predictor  Coef     SE Coef  T     P
Constant   101.24   13.49    7.50  0.
Distance   0.02977  0.01107  2.69  0.

S = 20.7237   R-Sq = 39.7%   R-Sq(adj) = 34.2%

____ 31. One flight (Philadelphia to West Palm Beach, FL) is 953 miles long and costs $110. Which of the following expressions correctly represents the residual for this data point?
a. b. c. d. e.

____ 32. Which of the following best describes what S = 20.7237 measures?
a. The standard deviation of air fares
b. The standard deviation of flight distances
c. The standard deviation of the residuals
d. The slope of the least squares regression line
e. The standard deviation of the slope

The scatterplot at right shows the relationship between carbohydrates and protein in one-cup servings of 15 different varieties of beans.
____ 33. The unusual point in the upper left part of the plot is for navy beans, with 15.8 grams of protein and 15.8 grams of carbohydrates. Which of the following best describes how correlation would change if we removed navy beans from the data set?
a. The correlation would be closer to 1, because the remaining data would have a stronger positive relationship.
b. The correlation would be closer to 1, because there would be fewer individuals in the data set.
c. The correlation would be closer to 0, because the data would more closely resemble a straight line.
d. The correlation would be closer to 0, because the standard deviation of the residuals would be smaller.
e. Correlation could no longer be calculated, because the remaining data would fall into two distinct groups.

____ 34. The protein content for the 15 bean varieties has a mean of 12.2 grams and a standard deviation of 5.3 grams. The mean carbohydrate content is 33.6 grams with a standard deviation of 15.7 grams. The correlation is 0.84. Which of the following expressions represents the slope of the least squares regression of y = protein content on x = carbohydrate content?
a. b. c. d. e.
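Two of the calculations above lend themselves to a quick numerical check: the residual in question 31 (observed fare minus predicted fare) and the slope in question 34 (correlation times the ratio of standard deviations). The following is a rough sketch with invented variable names; the coefficients come from the air fare regression output and the bean summary statistics quoted in the questions.

```python
# Question 31: residual = observed fare - predicted fare.
constant, distance_coef = 101.24, 0.02977   # from the regression output
miles, observed_fare = 953, 110

predicted_fare = constant + distance_coef * miles
residual = observed_fare - predicted_fare

# Question 34: slope of protein (y) on carbohydrates (x) is r * s_y / s_x.
r = 0.84
s_protein, s_carbs = 5.3, 15.7
bean_slope = r * s_protein / s_carbs

print(f"predicted fare = {predicted_fare:.2f}, residual = {residual:.2f}")
print(f"bean slope = {bean_slope:.4f} g protein per g carbohydrate")
```

A negative residual here means the actual fare is lower than the line predicts for a flight of that length.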
____ 39. 1200 tomatoes have a mean weight of 143 grams and a standard deviation of 35 grams. If the weights are Normally distributed, approximately how many tomatoes weigh between 73 grams and 178 grams?
a. 384 b. 600 c. 816 d. 978 e. 1140
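Question 39 can be approached either with the 68-95-99.7 rule or with an exact Normal calculation. The sketch below shows both, using Python's standard-library `statistics.NormalDist`; the variable names are illustrative, and the two results differ slightly because the rule rounds the areas.

```python
from statistics import NormalDist

n_tomatoes = 1200
weights = NormalDist(mu=143, sigma=35)

# 73 g is 2 SDs below the mean; 178 g is 1 SD above the mean.
# 68-95-99.7 rule: half of 95% lies below the mean, half of 68% above it.
rule_share = 0.95 / 2 + 0.68 / 2
rule_count = n_tomatoes * rule_share

# Exact area under the Normal curve between the same two weights.
exact_share = weights.cdf(178) - weights.cdf(73)
exact_count = n_tomatoes * exact_share

print(f"68-95-99.7 rule: about {rule_count:.0f} tomatoes")
print(f"exact Normal area: about {exact_count:.0f} tomatoes")
```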