
Mathematics Math/Math Stats, Cheat Sheet of Mathematics

This is a worksheet from a mathematics class

Typology: Cheat Sheet

2021/2022

Uploaded on 11/05/2022

JohnnyLTree



Introduction to Statistics and Probability Review problems

Other

  1. For each of the following situations involving sampling, identify, as precisely as possible, the population that the sample represents.
     (a) A business school researcher wants to know what factors affect the survival and success of small businesses. She selects a sample of 150 eating-and-drinking establishments from those listed in the telephone directory for a large city.
     (b) A member of Congress wants to know whether his constituents support proposed legislation on health care. His staff reports that 228 letters have been received on the subject, of which 193 oppose the legislation.
  2. A local radio talk-show host asks listeners to call in and vote for or against a proposed plan to raise the prices charged by municipal parking meters in a downtown shopping district. 75% of the respondents are opposed to the increase. Describe one possible source of error or bias that might arise in this poll and indicate the direction in which the estimate might be biased. What is the name for this kind of bias?
  3. The states differ greatly in the kinds of severe weather that afflict them. The histogram below shows the distribution of average annual property damage caused by tornadoes over the period from 1950 to 1999 in the 50 United States. (To adjust for the changing buying power of the dollar over time, all damages were restated in 1999 dollars.)

(a) Describe the important features of this distribution. What does this tell you about how the economic burden of tornadoes is distributed among the states?

(b) When asked for summary statistics, MINITAB produced the following output:

Variable    N    Mean   SE Mean   StDev   Minimum     Q1   Median      Q3   Maximum
Damage     50   22.39      3.60   25.45      0.00   2.23    12.66   41.63       88.

Give the five-number summary, and explain why you can see from these five numbers that the distribution is strongly skewed to the right.

(c) The histogram suggests that there may be outliers. Use the 1.5 x IQR rule of thumb to show that no values in this distribution meet this criterion for outliers.
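The arithmetic behind part (c) can be checked with a short Python sketch. It uses only the quartiles printed in the MINITAB output; the helper name `iqr_fences` is a hypothetical name introduced here for illustration and is not part of the worksheet.

```python
# Quartiles copied from the MINITAB output in problem 3(b).
Q1 = 2.23
Q3 = 41.63


def iqr_fences(q1, q3, k=1.5):
    """Return the (lower, upper) cutoffs of the k x IQR rule of thumb."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


lower, upper = iqr_fences(Q1, Q3)
print(f"IQR = {Q3 - Q1:.2f}")
print(f"lower fence = {lower:.2f}, upper fence = {upper:.2f}")
```

A value counts as an outlier under this rule only if it falls below the lower fence or above the upper fence. The reported minimum (0.00) and the reported maximum (printed as 88. in the output) both lie inside the fences.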

  1. Below are the lengths, in minutes, of the 25 most popular movies, as voted on by visitors to a web site devoted to movies.

     142 195 201 139 106 175  96 121 130 109 200 133 102
     178 165 161 124 146 115 118 154 152 207 112 136

     (a) Make a stemplot of these data.
     (b) The mean of this distribution is 144.7 and the median is 139. Explain why you could have predicted that the mean would be higher than the median by looking at the stemplot.
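As a rough check on the movie-length problem, the sketch below builds a text stemplot and recomputes the mean and median from the 25 listed values. The function name `stemplot` and the choice of stems (hundreds and tens digits) with one-digit leaves are assumptions made for this illustration; the worksheet does not prescribe a layout.

```python
from collections import defaultdict
from statistics import mean, median

# The 25 movie lengths (minutes) listed in the problem.
lengths = [142, 195, 201, 139, 106, 175, 96, 121, 130, 109, 200, 133, 102,
           178, 165, 161, 124, 146, 115, 118, 154, 152, 207, 112, 136]


def stemplot(values):
    """Return stemplot rows: stem = value // 10, leaf = ones digit."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    rows = []
    # Include empty stems so gaps in the data stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        row = "".join(str(d) for d in leaves.get(stem, []))
        rows.append(f"{stem:>2} | {row}")
    return rows


plot = stemplot(lengths)
for row in plot:
    print(row)
print("mean =", round(mean(lengths), 1), "median =", median(lengths))
```

The longer upper stems (the movies near 200 minutes) are what pull the mean above the median.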

(a) Write the equation of the least-squares regression line.
(b) Interpret the value “R-sq = 27.7%.”
(c) Predict the fastest serve of a professional tennis player who is 1.7 meters tall. Comment on the reliability of this prediction.

  1. The dotplot and computer output below show final exam scores for Mr. Stouffer’s 25 calculus students.

Minimum      Q1   Median      Q3   Maximum
  65.00   84.00    89.00   93.00       96.

(a) Write a brief description of the distribution of scores on this exam.
(b) Would you estimate that the mean is about the same as the median, higher than the median, or lower than the median? Justify your answer.
(c) Mr. Stouffer decides that because the test was quite difficult, he will add 5 points to the score of any low outliers. Do any test scores qualify for this bonus? Justify your answer with appropriate calculations.

Multiple Choice
Identify the choice that best completes the statement or answers the question.

____ 1. At the beginning of the school year, a high-school teacher asks every student in her classes to fill out a survey that asks for their age, gender, the number of years they have lived at their current address, their favorite school subject, and whether they plan to go to college after high school. Which of the following best describes the variables that are being measured?
a. four quantitative variables
b. five quantitative variables
c. two categorical variables and two quantitative variables
d. two categorical variables and three quantitative variables
e. three categorical variables and two quantitative variables

____ 2. The median age of five people in a meeting is 30 years. One of the people, a 50-year-old, leaves the room. The median age of the remaining four people in the room is

b. c. d. e.

____ 4. A researcher reports that the participants in his study lost a mean of 10.4 pounds after two months on his new diet. A friend of yours comments that she tried the diet for two months and lost no weight, so clearly the report was a fraud. Which of the following statements is correct?
a. Your friend must not have followed the diet correctly, since she did not lose weight.
b. Since your friend did not lose weight, the report must not be correct.
c. The report gives only the mean. This does not imply that all participants in the study lost 10.4 pounds or even that all lost weight. Your friend’s experience does not necessarily contradict the study results.
d. In order for the study to be correct, we must now add your friend’s results to those of the study and recalculate the new average.
e. Your friend is an outlier.

____ 5. The following is a histogram showing the actual frequency of the closing prices of a particular stock on the New York Stock Exchange over a 50-day period. The class that contains the third quartile is
a. 10-
b. 20-

a. The mean lead level in the water is about 10 ppm.
b. The mean lead level in the water is about 9 ppm.
c. The median lead level in the water is 7 ppm.
d. The median lead level in the water is 8 ppm.
e. Neither the mean nor the median can be computed because some values are unknown.

____ 10. Which of these variables is least likely to have a Normal distribution?
a. Annual income for all 150 employees at a local high school
b. Lengths of 50 newly hatched pythons
c. Heights of 100 white pine trees in a forest
d. Amount of soda in 60 cups filled by an automated machine at a fast-food restaurant
e. Weights of 200 of the same candy bar in a shipment to a local supermarket

____ 11. The proportion of observations from a standard Normal distribution that take values larger than is about
a. 0.
b. 0.
c. 0.
d. 0.
e. 0.

____ 12. The density curve shown to the right takes the value 0.5 on the interval 0 ≤ x ≤ 2 and takes the value 0 everywhere else. What percent of the observations lie between 0.5 and 1.2?
a. 25%
b. 35%
c. 50%
d. 68%
e. 70%

____ 13. The distribution of the heights of students in a large class is roughly Normal. Moreover, the average height is 68 inches, and approximately 95% of the heights are between 62 and 74 inches. Thus, the standard deviation of the height distribution is approximately equal to
a. 2
b. 3
c. 6
d. 9
e. 12

____ 14. If a store runs out of advertised material during a sale, customers become upset, and the store loses not only the sale but also goodwill. From past experience, a music store finds that the mean number of CDs sold in a sale is 845, the standard deviation is 15, and a histogram of the demand is approximately Normal. The manager is willing to accept a 2.5% chance that a CD will be sold out. About how many CDs should the manager order for an upcoming sale?
a. 1295
b. 1070
c. 935
d. 875
e. 860

____ 15. If your score on a test is at the 60th percentile, you know that your score lies
a. below the first quartile.
b. between the first quartile and the median.
c. between the median and the third quartile.
d. above the third quartile.
e. There is not enough information to say where it lies relative to the quartiles.

____ 16. In some courses (but certainly not in an intro stats course!), students are graded on a “Normal curve.” For example, students between 0 and 0.5 standard deviations above the mean receive a C+; between 0.5 and 1.0 standard deviations above the mean receive a B-; between 1.0 and 1.5 standard deviations above the mean receive a B; between 1.5 and 2.0 standard deviations above the mean receive a B+, etc. The class average on an exam was 60 with a standard deviation of 10. Which of the following gives the bounds for a B and the percent of students who will receive a B if the marks are actually Normally distributed?
a. (65, 75), 24.17%
b. (65, 75), 12.08%
c. (70, 75), 18.38%
d. (70, 75), 9.19%
e. (70, 75), 6.68%

____ 17. The ages (at inauguration) of all U.S. Presidents are approximately Normally distributed with a mean of 54.6. Barack Obama was 47 when he was inaugurated, which is the 11th percentile of the distribution. Which of the following is closest to the standard deviation of presidents’ ages?
a. -9.
b. -6.
c. 6.
d. 7.
e. 9.

____ 18. Which of the following is true about all Normal distributions?
a. There is no area under the curve for z-scores greater than 3.49.
b. The area underneath the curve is equal to 1.
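Problems 13, 14, 16 and 17 above all reduce to moving between values, z-scores and areas under a Normal curve. The sketch below works through each one with `statistics.NormalDist` from the Python standard library (3.8 or later). The variable names are illustrative only, and the exact areas can differ in the last digit from answers obtained with a printed z-table.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal: mean 0, standard deviation 1

# Problem 13: about 95% of heights lie within 2 standard deviations
# of the mean (68-95-99.7 rule), and 74 is 6 inches above the mean of 68.
sd_height = (74 - 68) / 2

# Problem 14: cover all but the top 2.5% of demand. By the same rule
# that is about 2 standard deviations above the mean.
cds_rule = 845 + 2 * 15
cds_exact = NormalDist(845, 15).inv_cdf(0.975)

# Problem 16: a B spans 1.0 to 1.5 standard deviations above the mean.
b_low, b_high = 60 + 1.0 * 10, 60 + 1.5 * 10
b_share = Z.cdf(1.5) - Z.cdf(1.0)

# Problem 17: 47 is the 11th percentile of a Normal with mean 54.6,
# so (47 - 54.6) / sd equals the z-score with 11% of the area below it.
sd_age = (47 - 54.6) / Z.inv_cdf(0.11)

print(sd_height, cds_rule, round(cds_exact, 1))
print((b_low, b_high), f"{b_share:.2%}")
print(round(sd_age, 2))
```

Because a standard deviation is never negative, the negative options in problem 17 can be ruled out before any calculation.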

about 116.
d. The correlation between amount of fat and calories is positive.
e. If one cereal has 140 calories and 5 g of fat, its residual is about 5 calories.

____ 23. A copy machine dealer has data on the number of copy machines x at each of 89 customer locations and the number of service calls in a month y at each location. Summary calculations give $\bar{x} = 8.4$, $s_x = 2.1$, $\bar{y} = 14.2$, $s_y = 3.8$, and $r = 0.86$. What is the slope of the least-squares regression line of number of service calls on number of copiers?
a. 0.
b. 1.
c. 0.
d. 2.
e. Can’t tell from the information given

____ 24. In the setting of the previous problem, about what percent of the variation in the number of service calls is explained by the linear relation between number of service calls and number of machines?
a. 86%
b. 93%
c. 74%
d. 55%
e. Can’t tell from the information given

____ 25. A study examined the relationship between the sepal length and sepal width for two varieties of an exotic tropical plant. Varieties X and O are represented by x’s and o’s, respectively, in the following scatterplot. Which of the following statements is true?
a. Considering Variety X only, there is a positive correlation between sepal length and width.
b. Considering Variety O only, the least-squares regression line for predicting sepal length from sepal width has a positive slope.
c. Considering both varieties together, there is a negative correlation between sepal length and width.
d. Considering each variety separately, there is a negative correlation between sepal length and width.
e. Considering both varieties together, the least-squares regression line for predicting sepal length from sepal width has a negative slope.

____ 26. Suppose we fit a least-squares regression line to a set of data. What is true if a plot of the residuals shows a curved pattern?
a. A straight line is not a good model for the data.
b. The correlation must be 0.
c. The correlation must be positive.
d. Outliers must be present.
e. The regression line might or might not be a good model for the data, depending on the extent of the curve.

____ 27. Mr. Nerdly asked the students in his AP Statistics class to report their overall grade point averages and their SAT Math scores. The scatterplot below provides information about his students’ data. The dark line is the least-squares regression line for the data, and its equation is. Which of the following statements about the circled point is true?
a. The standard score for this student’s GPA is positive.
b. If we used the least-squares line to predict this student’s SAT Math score, we would make a prediction that is too low.
c. This student’s residual is positive.
d. Removing this data point would not change the correlation between SAT Math score and GPA.
e. Removing this student’s data point would decrease the slope of the least-squares line.
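For problems 23 and 24 above, the least-squares slope and the percent of variation explained follow directly from the summary statistics. The sketch below is a minimal illustration that assumes the four summary values are, in order, the mean and standard deviation of x (machines) followed by the mean and standard deviation of y (service calls); the variable names are chosen here and do not come from the worksheet.

```python
# Summary values from problem 23, read under the stated assumption:
# mean and SD of x (machines), then mean and SD of y (service calls).
x_bar, s_x = 8.4, 2.1
y_bar, s_y = 14.2, 3.8
r = 0.86

# Least-squares slope and intercept for predicting y from x.
slope = r * s_y / s_x
intercept = y_bar - slope * x_bar

# Fraction of the variation in y explained by the linear relation.
r_squared = r ** 2

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"r^2 = {r_squared:.1%}")
```

The sample size (89 locations) is not needed for either quantity.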

____ 30. A forester studying oak trees finds that the correlation between x = age (measured in years) and y = height (in feet) of a sample of trees is 0.78. Which of the following statements must be true?
a. 78% of the variability in tree heights can be explained by variation in the trees’ ages.
b. For every year a tree ages, its height increases, on average, by 78%.
c. If we let x = height of tree and y = age of tree, then the correlation would be the reciprocal of 0.78.
d. If we measure the height in meters instead of feet, the correlation would still be 0.78.
e. The unit for correlation in this context is foot-years.

The computer output below predicts air fare for a discount airline’s flights from Philadelphia to 13 different cities from the flight’s length in miles. A scatterplot of the data suggests that the relationship is roughly linear.

Predictor      Coef   SE Coef      T    P
Constant     101.24     13.49   7.50   0.
Distance    0.02977   0.01107   2.69   0.

S = 20.7237   R-Sq = 39.7%   R-Sq(adj) = 34.2%

____ 31. One flight, Philadelphia to West Palm Beach, FL, is 953 miles long and costs $110. Which of the following expressions correctly represents the residual for this data point?
a. b. c. d. e.

____ 32. Which of the following best describes what S = 20.7237 measures?
a. The standard deviation of air fares
b. The standard deviation of flight distances
c. The standard deviation of the residuals
d. The slope of the least-squares regression line
e. The standard deviation of the slope

The scatterplot at right shows the relationship between carbohydrates and protein in one-cup servings of 15 different varieties of beans.

____ 33. The unusual point in the upper left part of the plot is for navy beans, with 15.8 grams of protein and 15.8 grams of carbohydrates. Which of the following best describes how correlation would change if we removed navy beans from the data set?
a. The correlation would be closer to 1, because the remaining data would have a stronger positive relationship.
b. The correlation would be closer to 1, because there would be fewer individuals in the data set.
c. The correlation would be closer to 0, because the data would more closely resemble a straight line.
d. The correlation would be closer to 0, because the standard deviation of the residuals would be smaller.
e. Correlation could no longer be calculated, because the remaining data would fall into two distinct groups.

____ 34. The protein content for the 15 bean varieties has a mean of 12.2 grams and a standard deviation of 5.3 grams. The mean carbohydrate content is 33.6 grams with a standard deviation of 15.7 grams. The correlation is 0.84. Which of the following expressions represents the slope of the least-squares regression of y = protein content on x = carbohydrate content?
a. b. c. d. e.
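The answer choices for problems 31 and 34 did not survive in this copy, but the quantities they ask about can be sketched from the numbers given. The Python below is an illustration under two standard definitions: a residual is the observed value minus the predicted value, and the least-squares slope is r times the ratio of the standard deviation of y to that of x. Variable names are chosen here for readability.

```python
# Problem 31: coefficients from the air-fare regression output.
fare_intercept = 101.24   # "Constant" row
fare_slope = 0.02977      # "Distance" row, dollars per mile

distance, observed_fare = 953, 110
predicted_fare = fare_intercept + fare_slope * distance
residual = observed_fare - predicted_fare  # observed minus predicted

# Problem 34: regression of y = protein on x = carbohydrates.
r = 0.84
s_protein, s_carb = 5.3, 15.7
bean_slope = r * s_protein / s_carb

print(f"predicted fare = {predicted_fare:.2f}, residual = {residual:.2f}")
print(f"bean slope = {bean_slope:.4f} g protein per g carbohydrate")
```

A negative residual means the actual fare is lower than the regression line predicts for a flight of that length.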

____ 39. 1200 tomatoes have a mean weight of 143 grams and a standard deviation of 35 grams. If the weights are Normally distributed, approximately how many tomatoes weigh between 73 grams and 178 grams?
a. 384
b. 600
c. 816
d. 978
e. 1140
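Problem 39 can be approached by converting both weights to z-scores and applying the 68-95-99.7 rule. The sketch below does that and, for comparison, computes the exact Normal area with `statistics.NormalDist` from the Python standard library. The variable names are illustrative assumptions, not part of the worksheet.

```python
from statistics import NormalDist

n, mu, sigma = 1200, 143, 35

# Convert the two weights to z-scores.
z_low = (73 - mu) / sigma    # 2 standard deviations below the mean
z_high = (178 - mu) / sigma  # 1 standard deviation above the mean

# 68-95-99.7 rule: half of 95% lies between z = -2 and the mean,
# and half of 68% lies between the mean and z = 1.
rule_share = 0.95 / 2 + 0.68 / 2
rule_count = round(n * rule_share)

# Exact area under the Normal curve, for comparison.
weights = NormalDist(mu, sigma)
exact_share = weights.cdf(178) - weights.cdf(73)

print(z_low, z_high)
print(rule_count, round(n * exact_share))
```

The rule-of-thumb count and the exact count differ by a few tomatoes because 68% and 95% are themselves rounded figures.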