
Statistics Exam, National University of Ireland, Galway, Spring 2007-2008, Exams of Statistics

A statistics exam held at the National University of Ireland, Galway, during the spring semester of 2007-2008. It gives details of the exam such as the subjects covered, the professors involved, and the time allowed. It also includes the statistical problems that students were expected to solve: calculating means, medians, modes, and standard deviations, interpreting histograms, and performing hypothesis tests.

Typology: Exams

2011/2012

Uploaded on 11/28/2012

sathyai
OLLSCOIL NA hÉIREANN, GAILLIMH
NATIONAL UNIVERSITY OF IRELAND, GALWAY
SPRING EXAMINATIONS, 2007-2008
STATISTICS
MA419 (1AM1, 3EV1, 3EV2)
BC852 (1EM1, 1AS1, 1CB1)
Professor E.M. Scott
Professor J. P. Hinde,
Paul Wilson, M.A., H.D.E.
Time allowed: Three Hours.
Answer any four questions.
All questions, but not necessarily parts therein, carry equal marks.
A list of Formulæ is attached.
All required statistical distribution tables are to be found in the supplied mathematical (“log”) tables.
Question One is on the next page

  1. (a) The lengths (in millimetres) of the tail–feather of seventy randomly selected birds of a given species were as follows:

117.0  118.5  112.7  119.9  111.7  129.9  114.6  107.7  110.4  112.
111.5  110.2  111.7  114.2  111.5  111.2  113.8  110.2  115.3  110.
101.2  110.6  118.4  127.9  112.3  120.7  113.1  113.9  116.4  114.
122.8  125.3  121.1  124.4  129.3  116.0  122.6  110.5  110.4  115.
112.8  119.4  113.1  114.9  119.1  113.4  112.6  112.5  110.4  125.
114.0  112.2  113.2  108.0  117.5  111.1  111.8  116.5  117.6  115.
120.8  110.7  112.5  112.0  110.9  113.6  113.4  115.1  110.4  120.

Illustrate the data with a histogram with intervals 100.0–110.0, 110.0–115.0, 115.0–120.0, 120.0–125.0 and 125.0–130.0. (It is not necessary to use graph paper, but you may do so if you wish.)

(b) Find the mean, median and mode (if any) and (sample) standard deviation of the following:

112 136 143 166 101 111 221 226 141 107

(You may use a calculator.)

(c) A survey of ten people was taken. In this survey the people were asked to state their sex (Male = 1, Female = 2), their preferred method of exercise (1 = Gym, 2 = Walking/Jogging, 3 = Swimming, 4 = Playing Sport, 5 = None of these), their mean annual expenditure on "keeping fit", and to indicate, on a scale from 0 – 4, the extent to which they were concerned about their health (0 = not worried, 4 = extremely worried). The results of the survey are given below.

i. State whether each variable (column heading) is a nominal, ordinal, discrete interval or continuous interval variable.
ii. Calculate an appropriate measure of central tendency (average) for each variable.

Sex  Pref  Expenditure  Concern
1    2     100          3
1    1     400          1
2    1     140          4
1    3     511          2
1    3     125          2
1    5     980          1
2    1     380          2
2    2     200          3
2    2     212          1
1    2     170          0
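As a rough check on the summary statistics asked for in part (b), the following Python sketch uses only the standard library. The variable names are illustrative, and treating "no value occurs more than once" as "no mode" is an assumption about how the question is meant to be read.

```python
import statistics
from collections import Counter

# Data from part (b).
values = [112, 136, 143, 166, 101, 111, 221, 226, 141, 107]

mean = statistics.mean(values)        # sum of the values divided by n
median = statistics.median(values)    # n is even, so the two middle values are averaged
sample_sd = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)

# Assumption: a mode exists only if some value occurs more than once.
counts = Counter(values)
highest = max(counts.values())
modes = [v for v, c in counts.items() if c == highest] if highest > 1 else []

print(f"mean = {mean}, median = {median}, modes = {modes}, s = {sample_sd:.2f}")
```

The `statistics.stdev` call matches the (sample) variance formula on the attached formula sheet; `statistics.pstdev` would give the population version instead.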

(d) Events A and B are independent. P(A) = 0.3, P(B) = 0.2. Calculate:

i. the probability that neither A nor B occurs,
ii. the probability that B but not A occurs.
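The two probabilities in part (d) follow from the product rule for independent events. The sketch below is one way to check the arithmetic; it uses exact fractions to avoid floating-point rounding, and the variable names are illustrative.

```python
from fractions import Fraction

p_a = Fraction(3, 10)  # P(A) = 0.3
p_b = Fraction(2, 10)  # P(B) = 0.2

# If A and B are independent, so are their complements, which means
# joint probabilities are products of the marginal probabilities.
p_neither = (1 - p_a) * (1 - p_b)  # P(not A and not B)
p_b_not_a = p_b * (1 - p_a)        # P(B and not A)

print(float(p_neither), float(p_b_not_a))
```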

Question Two is on the next page

  2. (a) A large company believes that 2% of its employees are aged between 16 and 20, 25% between 21 and 35, 35% between 36 and 50, 32% between 51 and 60, and 6% between 61 and 66. A director of the company doubts this and randomly surveys 100 of the staff. The results of this survey are:

Age       16–20  21–35  36–50  51–60  61–
Quantity  6      28     30     32     4

At a level of significance of α = 0.05, do the results of the survey provide evidence that the director is correct? Your answer should include reference to appropriate hypothesis tests and assumptions underlying the test you use.

(b) 300 adults were classified according to their sex and views on genetically modified food. The results are summarised in the table below.

        In Favour  No Opinion  Against
Male    36         61          63
Female  26         45          69

Is there evidence, at α = 0.05, of a relationship between the sex of a person and his or her views on genetically modified food? Your answer should include reference to appropriate hypothesis tests and assumptions underlying the test you use.
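Both parts of this question rest on the χ² statistic from the formula sheet, the sum of (obs - exp)²/exp. The sketch below computes it for the goodness-of-fit data in part (a) and for the contingency table in part (b). It is a minimal illustration, not a model answer: the function and variable names are made up for the example, and no categories are pooled here even though the first expected count in part (a) is small.

```python
def chi_square(observed, expected):
    """Sum of (obs - exp)^2 / exp over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Part (a): goodness of fit against the company's claimed percentages.
claimed_pct = [2, 25, 35, 32, 6]
survey = [6, 28, 30, 32, 4]
n = sum(survey)
expected_a = [pct * n / 100 for pct in claimed_pct]
gof_stat = chi_square(survey, expected_a)
gof_df = len(survey) - 1  # no parameters are estimated from the sample

# Part (b): test of independence on the 2 x 3 table.
table = [[36, 61, 63],   # Male
         [26, 45, 69]]   # Female
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)
# Expected count = row total * column total / grand total.
expected_b = [[r * c / grand_total for c in col_totals] for r in row_totals]
ind_stat = sum(chi_square(obs, exp) for obs, exp in zip(table, expected_b))
ind_df = (len(row_totals) - 1) * (len(col_totals) - 1)

print(f"(a) chi-sq = {gof_stat:.3f} on {gof_df} df")
print(f"(b) chi-sq = {ind_stat:.3f} on {ind_df} df")
```

Each statistic would then be compared with the tabulated upper 5% point of χ² for its degrees of freedom (9.488 for 4 df, 5.991 for 2 df). Because the expected count for the 16–20 group is only 2, a written answer would normally comment on the usual "expected counts of at least 5" assumption, for example by pooling adjacent age groups.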

This question is continued on the next page

(c) Researchers wish to investigate whether the heights of a certain population of males are normally distributed. A sample of the heights of 100 such males is taken, and the sample mean height is 70 inches, with a sample standard deviation of 2 inches. (Note that 10 people were observed with heights in the interval 66.5-67.5, etc. The "66" category may be taken to include all heights less than 66.5 inches, and the "74" category all heights greater than 73.5 inches.) The Minitab output from a χ^2 goodness of fit test is given below.

Chi-Square Goodness-of-Fit Test for Observed Counts in Variable: Count

Using category names in Height

                    Historical  Test                  Contribution
Category  Observed  Counts      Proportion  Expected  to Chi-Sq
66        8         0.040       0.040040    4.0040    3.
67        10        0.066       0.066066    6.6066    1.
68        7         0.121       0.121121    12.1121   2.
69        17        0.174       0.174174    17.4174   0.
70        19        0.197       0.197197    19.7197   0.
71        19        0.174       0.174174    17.4174   0.
72        10        0.121       0.121121    12.1121   0.
73        10        0.066       0.066066    6.6066    1.
74        0         0.040       0.040040    4.0040    4.

N    DF  Chi-Sq   P-Value
100  8   14.1840  0.

i. Based upon the p-value in the above output, may we, at α = 0.05, reject the null hypothesis that the population heights are normally distributed? Justify your answer.
ii. Here the "expected values" are based upon "historical counts" that have been calculated by the researchers, based upon the data following a N(70, 4) distribution. Given information in the question, why should we conclude that the number of degrees of freedom stated in the output is in fact incorrect?
iii. Based upon the correct number of degrees of freedom, may we, at α = 0.05, reject the null hypothesis that the population heights are normally distributed? Justify your answer.
iv. Is there anything else in the Minitab output that might raise (slight) concern about the validity of the analysis?
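The degrees-of-freedom point in parts ii and iii can be illustrated numerically. The sketch below recomputes the χ² statistic from the observed and expected counts in the output, then compares it with the tabulated 5% critical values for two candidate degrees of freedom. It assumes that both the mean and the standard deviation of the fitted normal distribution were estimated from the sample, which is how the question describes them; the variable names are illustrative.

```python
# Observed and expected counts copied from the Minitab output above.
observed = [8, 10, 7, 17, 19, 19, 10, 10, 0]
expected = [4.0040, 6.6066, 12.1121, 17.4174, 19.7197,
            17.4174, 12.1121, 6.6066, 4.0040]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

k = len(observed)
df_output = k - 1                        # what the output reports (8)
estimated_params = 2                     # assumption: mean and sd both come from the sample
df_adjusted = k - 1 - estimated_params   # one df lost per estimated parameter

# Upper 5% points of the chi-square distribution, from standard tables.
critical = {8: 15.507, 6: 12.592}
reject_output_df = chi_sq > critical[df_output]
reject_adjusted_df = chi_sq > critical[df_adjusted]

print(f"chi-sq = {chi_sq:.4f}")
print(f"df = {df_output}: reject H0? {reject_output_df}")
print(f"df = {df_adjusted}: reject H0? {reject_adjusted_df}")
```

Under these assumptions the same statistic leads to different decisions at α = 0.05 depending on which degrees of freedom are used, which is the contrast the question is probing.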

Question Four is on the next page

(c) Nine people had their blood sugar level recorded before (Glucose1) and after (Glucose2) undertaking a strict diet. It is wished to investigate whether this diet will affect blood glucose levels. The data were analysed using a paired t-test. The Minitab output is presented in Figure 2.

Figure 2: Blood Sugar Levels Before and After Diet

Results for: Bloodsugar2.MTW

Paired T-Test and CI: Glucose1, Glucose

Paired T for Glucose1 - Glucose

            N  Mean   StDev  SE Mean
Glucose1    9  90.67  26.72  8.
Glucose2    9  89.44  27.13  9.
Difference  9  1.22   6.08   2.

95% CI for mean difference: (-3.45, 5.89)
T-Test of mean difference = 0 (vs not = 0): T-Value = 0.60  P-Value = 0.

i. State the null and alternative hypotheses that are being tested above.
ii. Do we reject this null hypothesis? Justify and interpret your answer.
iii. It would be possible to analyse these data using an independent samples t-test. Why is the above method preferable in this case?
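The T-Value and confidence interval in the output can be reproduced from the summary row for the differences, since a paired t-test is a one-sample t-test applied to the differences. The sketch below does this with the rounded figures shown in the output; the critical value 2.306 is the two-sided 5% point of the t distribution with 8 degrees of freedom taken from standard tables, and the variable names are illustrative.

```python
import math

# Summary statistics for the paired differences (Glucose1 - Glucose2).
n = 9
mean_diff = 1.22
sd_diff = 6.08
t_crit = 2.306  # two-sided 5% point of t with n - 1 = 8 df, from tables

se_diff = sd_diff / math.sqrt(n)   # standard error of the mean difference
t_value = mean_diff / se_diff      # tests H0: mean difference = 0
ci_low = mean_diff - t_crit * se_diff
ci_high = mean_diff + t_crit * se_diff

print(f"t = {t_value:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

Because the inputs are rounded to two decimal places, small discrepancies from the software output in the last digit are possible.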

Question Five is on the next page

  5. (a) i. In relation to statistical regression, what is meant by the terms

A. residual,
B. least squares regression line?

ii. What is meant by the term correlation? Your answer should be illustrated by diagrams and include brief explanations of what is meant by strong/weak and positive/negative correlation.
iii. Explain how it is possible that two variables may have a correlation coefficient close to one, indicating strong correlation, but the p-value associated with the correlation coefficient may also be large, indicating that it is unsafe to assume a non-zero correlation.

(b) The rate of flow of a stream at a given point, its depth at that point (Depth1) and at another point (Depth2) were recorded. The results are shown below.

Flow    0.636  0.319  0.734  1.327  0.040  1.300  7.350  5.890  3.102  1.824
Depth1  0.34   0.29   0.28   0.42   0.34   0.45   0.76   0.73   0.51   0.40
Depth2  0.96   0.92   0.90   0.85   0.84   0.84   0.82   0.80   0.83   0.86

Minitab regression analyses of Flow versus Depth1 and Flow versus Depth2 are presented below. For the moment, assume that these analyses are suitable for the data.

i. Which of these models would you prefer? Justify your answer.
ii. Interpret the various p-values given in the output.

Regression Analysis: Flow versus Depth

The regression equation is Flow = - 4.15 + 14.2 Depth

Predictor  Coef     SE Coef  T      P
Constant   -4.1546  0.5921   -7.02  0.
Depth1     14.174   1.234    11.49  0.

S = 0.629326 R-Sq = 94.3% R-Sq(adj) = 93.6%

Regression Analysis: Flow versus Depth

The regression equation is Flow = 30.0 - 32.2 Depth

Predictor  Coef    SE Coef  T      P
Constant   30.00   11.68    2.57   0.
Depth2     -32.19  13.53    -2.38  0.

S = 2.01454 R-Sq = 41.4% R-Sq(adj) = 34.1%
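The coefficients and R-Sq values in the two outputs can be cross-checked from the raw data using the standard least squares formulas. The sketch below is a plain-Python illustration of simple linear regression, not a reproduction of Minitab's procedure; the function name `simple_regression` and the variable names are made up for the example.

```python
def simple_regression(x, y):
    """Least squares fit of y = intercept + slope * x, returning R-squared as well."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r_sq = sxy ** 2 / (sxx * syy)  # squared correlation coefficient
    return intercept, slope, r_sq

flow = [0.636, 0.319, 0.734, 1.327, 0.040, 1.300, 7.350, 5.890, 3.102, 1.824]
depth1 = [0.34, 0.29, 0.28, 0.42, 0.34, 0.45, 0.76, 0.73, 0.51, 0.40]
depth2 = [0.96, 0.92, 0.90, 0.85, 0.84, 0.84, 0.82, 0.80, 0.83, 0.86]

fit1 = simple_regression(depth1, flow)
fit2 = simple_regression(depth2, flow)

for name, (b0, b1, r_sq) in (("Depth1", fit1), ("Depth2", fit2)):
    print(f"Flow = {b0:.2f} + {b1:.2f} {name}, R-Sq = {100 * r_sq:.1f}%")
```

The fitted values should agree closely with the Coef and R-Sq entries in the outputs above, which gives a simple way to confirm that the flattened data table has been read correctly.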

This question is continued on the next page

Formulæ

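  1. Mean

$\bar{x} = \dfrac{\sum x_i}{n}$

  2. (Sample) Variance

$s^2 = \dfrac{\sum (x_i - \bar{x})^2}{n - 1} = \dfrac{\sum x_i^2 - n\bar{x}^2}{n - 1}$

  3. (Population) Variance

$\sigma^2 = \dfrac{\sum (x_i - \bar{x})^2}{N} = \dfrac{\sum x_i^2 - N\bar{x}^2}{N}$

  4. Binomial Distribution

$P(X = r) = \dbinom{n}{r} p^r q^{(n - r)}$, $\mu = np$, $\sigma^2 = npq$

  5. Poisson Distribution

$P(X = r) = \dfrac{e^{-\lambda} \lambda^r}{r!}$, $\mu = \lambda$, $\sigma^2 = \lambda$

  6. Two Sample z-test

$z_{obs} = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$

  7. One Sample z-test

$z_{obs} = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

  8. One Sample t-test

$t_{obs} = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$

  9. Independent Samples t-test

$t_{obs} = \dfrac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$ where: $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$

  10. χ^2 Goodness of fit test, Independent Samples test

$\chi^2_{obs} = \sum \dfrac{(\text{obs} - \text{exp})^2}{\text{exp}}$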
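As an illustration of how the distribution formulas on this sheet are used, the sketch below codes the binomial and Poisson probability functions directly and checks numerically that the binomial mean and variance equal np and npq. The parameter values n = 10, p = 0.3 and λ = 2 are arbitrary choices for the example, not values from the exam.

```python
import math

def binomial_pmf(r, n, p):
    """P(X = r) = C(n, r) * p^r * q^(n - r), with q = 1 - p."""
    return math.comb(n, r) * p ** r * (1 - p) ** (n - r)

def poisson_pmf(r, lam):
    """P(X = r) = e^(-lambda) * lambda^r / r!."""
    return math.exp(-lam) * lam ** r / math.factorial(r)

# Arbitrary illustrative parameters.
n, p = 10, 0.3
probs = [binomial_pmf(r, n, p) for r in range(n + 1)]

total = sum(probs)                                              # should be 1
mean = sum(r * pr for r, pr in enumerate(probs))                # should equal n * p
var = sum((r - mean) ** 2 * pr for r, pr in enumerate(probs))   # should equal n * p * q

print(f"total = {total:.6f}, mean = {mean:.4f}, variance = {var:.4f}")
print(f"Poisson P(X = 0) with lambda = 2: {poisson_pmf(0, 2.0):.4f}")
```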