














































What to do when a distribution is not bell-shaped, introducing Chebyshev's Rule and its implications for data distribution. It covers linear and nonlinear transformations, normal distributions, and the 68-95-99.7 rule. Examples and exercises are provided.
Typology: Study notes
week
With a bell-shaped distribution,
- about 68% of the data fall within 1 standard deviation of the mean;
- about 95% fall within 2 standard deviations of the mean;
- about 99.7% fall within 3 standard deviations of the mean.
What if the distribution is not bell-shaped? There is another rule, named Chebyshev's Rule, that tells us that there must be at least 75% of the data within 2 standard deviations of the mean, regardless of the shape, and at least 89% within 3 standard deviations.
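The 75% and 89% figures are the k = 2 and k = 3 cases of the general Chebyshev bound: at least 1 − 1/k² of the data lie within k standard deviations of the mean (for k > 1). A minimal sketch, assuming a made-up right-skewed sample (exponential random numbers, not data from these notes), compares the bound with the observed proportion; the function names are illustrative only.

```python
import random
import statistics


def chebyshev_bound(k):
    """Minimum proportion of data within k standard deviations (k > 1)."""
    return 1 - 1 / k**2


def proportion_within(data, k):
    """Observed proportion of data within k standard deviations of the mean."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return sum(abs(x - mean) <= k * sd for x in data) / len(data)


random.seed(0)
# Hypothetical skewed sample, clearly not bell-shaped.
skewed = [random.expovariate(1.0) for _ in range(10_000)]

for k in (2, 3):
    # Columns: k, guaranteed minimum, observed proportion.
    print(k, round(chebyshev_bound(k), 3), round(proportion_within(skewed, k), 3))
```

The observed proportions are usually well above the bound; Chebyshev's Rule only guarantees a minimum.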
A linear transformation changes the original value $x$ into a new variable $x_{new}$. $x_{new}$ is given by an equation of the form
$$x_{new} = a + bx.$$
Example 1.19 on page 54 in IPS.
(i) A distance $x$ measured in km can be expressed in miles as follows:
$$x_{new} = 0.62x.$$
(ii) A temperature $x$ measured in degrees Fahrenheit can be converted to degrees Celsius by
$$x_{new} = \frac{5}{9}(x - 32) = -\frac{160}{9} + \frac{5}{9}x.$$
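Both conversions in the example have the form a + bx, which can be sketched with one helper. The function names below are illustrative and not from IPS; the coefficients are the ones given in the example (a = 0, b = 0.62 for km to miles; a = −160/9, b = 5/9 for Fahrenheit to Celsius).

```python
def linear_transform(x, a, b):
    """Return x_new = a + b * x."""
    return a + b * x


def km_to_miles(km):
    # a = 0, b = 0.62 (approximate miles per km, as in the example)
    return linear_transform(km, a=0.0, b=0.62)


def fahrenheit_to_celsius(f):
    # (5/9)(f - 32) rewritten as a + b*f with a = -160/9, b = 5/9
    return linear_transform(f, a=-160 / 9, b=5 / 9)


print(round(km_to_miles(10), 2))
print(round(fahrenheit_to_celsius(212), 2))
print(round(fahrenheit_to_celsius(32), 2))
```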
Effect of the linear transformation $x_{new} = a + bx$ on summary measures:

Measure         x        x_new
Mean, Median    M        a + bM
Mode            Mode     a + b Mode
Range           R        |b| R
Stdev           s        |b| s
A sample of 20 employees of a company was taken and their salaries were recorded. Suppose each employee receives a $300 raise in salary for the next year. State whether the following statements are true or false.
a) The ____ of the salaries will
   i. be unchanged
   ii. increase by $300
   iii. be multiplied by $300
b) The mean of the salaries will
   i. be unchanged
   ii. increase by $300
   iii. be multiplied by $300
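A flat $300 raise is the linear transformation with a = 300 and b = 1, so measures of center shift by 300 while measures of spread stay the same. The sketch below checks this on a small made-up salary list (five hypothetical values, not the 20 salaries from the exercise).

```python
import statistics

# Hypothetical salaries; the exercise's actual sample is not given.
salaries = [30_000, 32_500, 41_000, 45_000, 58_000]
raised = [s + 300 for s in salaries]  # x_new = 300 + 1 * x


def summary(data):
    """Center and spread measures for a list of numbers."""
    return {
        "mean": statistics.fmean(data),
        "median": statistics.median(data),
        "range": max(data) - min(data),
        "stdev": statistics.stdev(data),
    }


before, after = summary(salaries), summary(raised)
for name in before:
    # Columns: measure, before, after, change.
    print(name, before[name], after[name], round(after[name] - before[name], 6))
```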
[Figure: two histograms. "Histogram for sales data" (horizontal axis: Sales; vertical axis: Frequency) and "Histogram for ln(sales)" (horizontal axis: ln(sales); vertical axis: Frequency).]
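Taking logarithms is a nonlinear transformation, and unlike a linear one it can change the shape of a distribution. A rough sketch, assuming simulated right-skewed "sales" values (lognormal random numbers, not the data behind the histograms) and a simple moment-based skewness measure:

```python
import math
import random
import statistics


def skewness(data):
    """Moment-based skewness: mean of the cubed standardized values."""
    m = statistics.fmean(data)
    s = statistics.pstdev(data)
    return statistics.fmean([((x - m) / s) ** 3 for x in data])


random.seed(1)
# Hypothetical right-skewed sales figures.
sales = [random.lognormvariate(6.0, 1.0) for _ in range(2_000)]
log_sales = [math.log(x) for x in sales]

# Skewness near 0 indicates a roughly symmetric distribution.
print(round(skewness(sales), 2), round(skewness(log_sales), 2))
```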
Using software, clever algorithms can describe a distribution in a way that is not feasible by hand, by fitting a smooth curve to the data in addition to or instead of a histogram. The curves used are called density curves.
- It is easier to work with a smooth curve, because a histogram depends on the choice of classes.
- Density curve: a density curve is a curve that
  - is always on or above the horizontal axis;
  - has area exactly 1 underneath it.
A density curve describes the overall pattern of a distribution.
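The two defining properties can be checked numerically for a candidate curve. The sketch below uses a hypothetical density, f(x) = 2x on [0, 1], chosen only for illustration, and approximates areas with the midpoint rule; an area over an interval is the proportion of observations falling in that interval.

```python
def area_under(f, lo, hi, n=100_000):
    """Approximate the area under f on [lo, hi] with the midpoint rule."""
    width = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * width) for i in range(n)) * width


def triangle_density(x):
    """Hypothetical density: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0


# Property 1: never below the horizontal axis (checked on a grid).
print(all(triangle_density(i / 100) >= 0 for i in range(-50, 151)))
# Property 2: total area is 1.
print(round(area_under(triangle_density, 0, 1), 6))
# Proportion of observations below 0.5.
print(round(area_under(triangle_density, 0, 0.5), 6))
```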
- The median of a distribution described by a density curve is the point that divides the area under the curve in half.
- The mode of a distribution described by a density curve is a peak point of the curve, the location where the curve is highest.
- Quartiles of a distribution can be roughly located by dividing the area under the curve into quarters as accurately as possible by eye.
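Instead of judging areas by eye, the same definitions can be applied numerically. This sketch again assumes the hypothetical density f(x) = 2x on [0, 1] and uses bisection to find the points with area 1/4, 1/2 and 3/4 to their left; the mode is taken as the highest point on a grid. For this density the area to the left of x is x², so the results should be close to 0.5, 0.707 and 0.866.

```python
def area_left(f, lo, x, n=2_000):
    """Midpoint-rule approximation of the area under f from lo to x."""
    width = (x - lo) / n
    return sum(f(lo + (i + 0.5) * width) for i in range(n)) * width


def point_with_area(f, p, lo, hi, tol=1e-6):
    """Bisection: find the point in [lo, hi] with area p to its left."""
    a, b = lo, hi
    while b - a > tol:
        mid = (a + b) / 2
        if area_left(f, lo, mid) < p:
            a = mid
        else:
            b = mid
    return (a + b) / 2


def density(x):
    """Hypothetical density: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0


q1, median, q3 = (point_with_area(density, p, 0, 1) for p in (0.25, 0.5, 0.75))
grid = [i / 1000 for i in range(1001)]
mode = max(grid, key=density)  # location where the curve is highest

print(round(q1, 3), round(median, 3), round(q3, 3), mode)
```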
An important class of density curves are the symmetric unimodal bell-shaped curves known as normal curves. They describe normal distributions.
- All normal distributions have the same overall shape.
- The exact density curve for a particular normal distribution is specified by giving its mean μ and its standard deviation σ.
- The mean is located at the center of the symmetric curve and is the same as the median and the mode.
- Changing μ without changing σ moves the normal curve along the horizontal axis without changing its spread.
There are other symmetric bell-shaped density curves that are not normal, e.g. the t distribution.
The normal density curves are specified by a particular function. The height of a normal density curve at any point $x$ is given by
$$\frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$
- Notation: A normal distribution with mean μ and standard deviation σ is denoted by $N(\mu, \sigma)$.
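The height formula translates directly into code. In this sketch the function name `normal_pdf` and the parameter values μ = 64.5, σ = 2.5 are illustrative choices; the checks show the curve is symmetric about μ, highest at μ, and encloses area 1 (approximated with the midpoint rule).

```python
import math


def normal_pdf(x, mu, sigma):
    """Height of the N(mu, sigma) density curve at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


mu, sigma = 64.5, 2.5  # illustrative values

# Symmetry: equal heights at equal distances on either side of mu.
print(math.isclose(normal_pdf(mu - 3, mu, sigma), normal_pdf(mu + 3, mu, sigma)))
# Peak height at the mean equals 1 / (sigma * sqrt(2 * pi)).
print(round(normal_pdf(mu, mu, sigma), 4))

# Area under the curve over mu +/- 8 sigma (midpoint rule).
n, lo, hi = 20_000, mu - 8 * sigma, mu + 8 * sigma
width = (hi - lo) / n
area = sum(normal_pdf(lo + (i + 0.5) * width, mu, sigma) for i in range(n)) * width
print(round(area, 6))
```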
In the normal distribution with mean μ and standard deviation σ:
- Approx. 68% of the observations fall within σ of the mean μ.
- Approx. 95% of the observations fall within 2σ of the mean μ.
- Approx. 99.7% of the observations fall within 3σ of the mean μ.
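The three percentages are rounded values of exact normal areas. A short sketch using `statistics.NormalDist` from the Python standard library (3.8+) computes them; the μ and σ values are arbitrary, since the proportions are the same for every normal distribution.

```python
from statistics import NormalDist


def proportion_within(k, mu, sigma):
    """Proportion of an N(mu, sigma) distribution within k sigma of mu."""
    dist = NormalDist(mu=mu, sigma=sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


for k in (1, 2, 3):
    # Same answer for two different normal distributions.
    print(k, round(proportion_within(k, 0, 1), 4), round(proportion_within(k, 64.5, 2.5), 4))
```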
If x is an observation from a distribution that has mean μ and standard deviation σ, the standardized value of x is given by
$$z = \frac{x - \mu}{\sigma}$$
- A standardized value is often called a z-score. A z-score tells us how many standard deviations the original observation falls away from the mean of the distribution.
- Standardizing is a linear transformation that transforms the data into the standard scale of z-scores. Therefore, standardizing does not change the shape of a distribution, but it changes the mean to 0 and the standard deviation to 1.
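Standardizing a whole data set can be sketched as follows. The data values are made up for illustration (chosen so the mean is 5 and the population standard deviation is 2), and the population standard deviation `pstdev` is used here as an assumption; the z-scores come out with mean 0 and standard deviation 1.

```python
import statistics


def standardize(data):
    """Convert each value to a z-score: (x - mean) / standard deviation."""
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)  # population stdev, an assumption here
    return [(x - mu) / sigma for x in data]


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values: mean 5, stdev 2
z_scores = standardize(data)

print(z_scores)
print(round(statistics.fmean(z_scores), 6), round(statistics.pstdev(z_scores), 6))
```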
The heights of women are approximately normal with mean μ = 64.5 inches and standard deviation σ = 2.5 inches. The standardized height is
$$z = \frac{\text{height} - 64.5}{2.5}$$
- The standardized value (z-score) of height 68 inches is
$$z = \frac{68 - 64.5}{2.5} = 1.4,$$
or 1.4 std. dev. above the mean.
- A woman 60 inches tall has standardized height
$$z = \frac{60 - 64.5}{2.5} = -1.8,$$
or 1.8 std. dev. below the mean.
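The two calculations, and the reverse step of turning a z-score back into a height, can be sketched with the values from the example. The helper names `z_score` and `from_z` are illustrative.

```python
MU, SIGMA = 64.5, 2.5  # mean and standard deviation of women's heights (inches)


def z_score(x, mu=MU, sigma=SIGMA):
    """Number of standard deviations x lies from the mean."""
    return (x - mu) / sigma


def from_z(z, mu=MU, sigma=SIGMA):
    """Invert the standardization: x = mu + z * sigma."""
    return mu + z * sigma


print(z_score(68))   # 1.4
print(z_score(60))   # -1.8
print(from_z(1.4))   # height that is 1.4 standard deviations above the mean
```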
Table A gives cumulative proportions for the standard normal distribution. The table entry for each value z is the area under the curve to the left of z; the notation used is $P(Z < z)$.
Standard Normal Distribution

 z     .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
0.0   .5000  .5040  .5080  .5120  .5160  .5199  .5239  .5279  .5319  .5359
0.1   .5398  .5438  .5478  .5517  .5557  .5596  .5636  .5675  .5714  .5753
0.2   .5793  .5832  .5871  .5910  .5948  .5987  .6026  .6064  .6103  .6141
0.3   .6179  .6217  .6255  .6293  .6331  .6368  .6406  .6443  .6480  .6517
0.4   .6554  .6591  .6628  .6664  .6700  .6736  .6772  .6808  .6844  .6879
0.5   .6915  .6950  .6985  .7019  .7054  .7088  .7123  .7157  .7190  .7224
0.6   .7257  .7291  .7324  .7357  .7389  .7422  .7454  .7486  .7517  .7549
0.7   .7580  .7611  .7642  .7673  .7703  .7734  .7764  .7794  .7823  .7852
0.8   .7881  .7910  .7939  .7967  .7995  .8023  .8051  .8078  .8106  .8133
0.9   .8159  .8186  .8212  .8238  .8264  .8289  .8315  .8340  .8365  .8389
1.0   .8413  .8438  .8461  .8485  .8508  .8531  .8554  .8577  .8599  .8621
1.1   .8643  .8665  .8686  .8708  .8729  .8749  .8770  .8790  .8810  .8830
1.2   .8849  .8869  .8888  .8907  .8925  .8944  .8962  .8980  .8997  .9015
1.3   .9032  .9049  .9066  .9082  .9099  .9115  .9131  .9147  .9162  .9177
1.4   .9192  .9207  .9222  .9236  .9251  .9265  .9279  .9292  .9306  .9319
1.5   .9332  .9345  .9357  .9370  .9382  .9394  .9406  .9418  .9429  .9441
1.6   .9452  .9463  .9474  .9484  .9495  .9505  .9515  .9525  .9535  .9545
1.7   .9554  .9564  .9573  .9582  .9591  .9599  .9608  .9616  .9625  .9633
1.8   .9641  .9649  .9656  .9664  .9671  .9678  .9686  .9693  .9699  .9706
1.9   .9713  .9719  .9726  .9732  .9738  .9744  .9750  .9756  .9761  .9767
2.0   .9772  .9778  .9783  .9788  .9793  .9798  .9803  .9808  .9812  .9817
2.1   .9821  .9826  .9830  .9834  .9838  .9842  .9846  .9850  .9854  .9857
2.2   .9861  .9864  .9868  .9871  .9875  .9878  .9881  .9884  .9887  .9890
2.3   .9893  .9896  .9898  .9901  .9904  .9906  .9909  .9911  .9913  .9916
2.4   .9918  .9920  .9922  .9925  .9927  .9929  .9931  .9932  .9934  .9936
2.5   .9938  .9940  .9941  .9943  .9945  .9946  .9948  .9949  .9951  .9952
2.6   .9953  .9955  .9956  .9957  .9959  .9960  .9961  .9962  .9963  .9964
2.7   .9965  .9966  .9967  .9968  .9969  .9970  .9971  .9972  .9973  .9974
2.8   .9974  .9975  .9976  .9977  .9977  .9978  .9979  .9979  .9980  .9981
2.9   .9981  .9982  .9982  .9983  .9984  .9984  .9985  .9985  .9986  .9986
3.0   .9987  .9987  .9987  .9988  .9988  .9989  .9989  .9989  .9990  .9990
The table shows the area to the left of z under the standard normal curve. For a negative number, −z:
$$\text{Area below } (-z) = \text{Area above } (z) = 1 - \text{Area below } (z)$$
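Table entries can be reproduced in software. The sketch below computes the area to the left of z with the error function from Python's `math` module (a standard identity for the normal cumulative proportion, not something taken from Table A itself) and checks the symmetry rule for a negative z; the function name `area_below` is illustrative.

```python
import math


def area_below(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# Compare with the Table A entries for z = 1.40 and z = 1.80.
print(round(area_below(1.40), 4))
print(round(area_below(1.80), 4))

# Negative z: area below -z = area above z = 1 - area below z.
z = 1.80
print(round(area_below(-z), 4), round(1 - area_below(z), 4))
```

Combined with the z-scores from the heights example, `area_below(1.4)` is the proportion of women shorter than 68 inches and `area_below(-1.8)` is the proportion shorter than 60 inches, under the stated normal model.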