
Prof. Jin's ESS210B Lecture Notes: Probability Distributions and Data Analysis, Study notes of Statistics

A portion of Prof. Jin's lecture notes for ESS210B, covering topics such as probability distributions, normal distribution, student-t distribution, chi-square distribution, F distribution, significance tests, organizing data, finding relationships among data, testing significance of results, and using standard normal distribution. The notes also include examples and applications in operational climatology.

What you will learn

  • How do you test the significance of the results in data analysis?
  • What is the difference between a probability distribution and a normal distribution?
  • What is the purpose of organizing data in data analysis?
  • What is the role of the standard normal distribution in data analysis?
  • How do you find the probability that a value of Z is greater than a certain value?

Typology: Study notes

2021/2022

Uploaded on 09/12/2022

ekavir 🇺🇸


ESS210B
Prof. Jin-Yi Yu

Part 1: Probability Distributions

  • Probability Distribution
  • Normal Distribution
  • Student-t Distribution
  • Chi Square Distribution
  • F Distribution
  • Significance Tests

Purposes of Data Analysis

  • Organizing Data
  • Find Relationships among Data
  • Test Significance of the Results

[Diagram labels: True Distributions or Relationships in the Earth's System; Sampling; Weather Forecasts, Physical Understanding, …]

Variables and Samples

  • Random Variable: A variable whose values occur at random, following a probability distribution.
  • Observation: When the random variable actually attains a value, that value is called an observation (of the variable).
  • Sample: A collection of several observations is called a sample. If the observations are generated in a random fashion with no bias, that sample is known as a random sample.

By observing the distribution of values in a random sample, we can draw conclusions about the underlying probability distribution.

Probability Distribution

The pattern of probabilities for a set of events is called a probability distribution.

(1) The probability of each event or combination of events must range from 0 to 1.

(2) The sum of the probabilities of all possible events must be equal to 1.

[Figure panels: continuous probability distribution (PDF); discrete probability distribution (PDF)]
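The two requirements above can be checked mechanically for a discrete distribution. The following is a minimal sketch, assuming a fair six-sided die as the set of events; the die and the helper name `is_valid_distribution` are illustrative and do not come from the lecture notes.

```python
from fractions import Fraction

def is_valid_distribution(probabilities):
    """Return True if the values satisfy both requirements of a probability distribution."""
    values = list(probabilities.values())
    each_in_range = all(0 <= p <= 1 for p in values)   # requirement (1)
    sums_to_one = sum(values) == 1                      # requirement (2)
    return each_in_range and sums_to_one

# Hypothetical example: a fair six-sided die, one event per face.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

# The probability of a combination of events (an even face) also lies in [0, 1].
p_even = sum(p for face, p in fair_die.items() if face % 2 == 0)

print(is_valid_distribution(fair_die))  # True
print(p_even)                           # 1/2
```

Exact fractions are used so that the sum-to-one test is not affected by floating-point rounding.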

Probability Density Function

$P(a \le X \le b) = \int_a^b f(x)\,dx$

P = the probability that a randomly selected value of a variable X falls between a and b.

f(x) = the probability density function.

The probability density function has to be integrated over distinct limits to obtain a probability.

The probability for X to have a particular value is ZERO.

Two important properties of the probability density function:

(1) $f(x) \ge 0$ for all x within the domain of f.

(2) $\int_{-\infty}^{\infty} f(x)\,dx = 1$
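To make the idea of integrating a density over distinct limits concrete, here is a small numerical sketch. It assumes a standard normal density as the example f(x) and uses the trapezoidal rule. Both choices, and the function names, are illustrative assumptions rather than anything prescribed by the notes.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density f(x) of a normal distribution (used here only as an example density)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def probability_between(f, a, b, steps=10_000):
    """Approximate P(a <= X <= b) by integrating f from a to b with the trapezoidal rule."""
    if a == b:
        return 0.0  # an interval of zero width carries zero probability
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

p_one_sigma = probability_between(normal_pdf, -1.0, 1.0)
p_total = probability_between(normal_pdf, -10.0, 10.0)   # wide limits stand in for the whole domain
p_point = probability_between(normal_pdf, 0.5, 0.5)

print(round(p_one_sigma, 4))  # 0.6827
print(round(p_total, 4))      # 1.0
print(p_point)                # 0.0
```

The last line mirrors the statement that the probability of X taking one particular value is zero: the integration limits coincide, so there is no area under the curve.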

Cumulative Distribution Function

The cumulative distribution function F(x) is defined as the probability that a variable assumes a value less than x:

$F(x) = P(X < x)$

The cumulative distribution function is often used to assist in calculating probability (will show later).

The following relation between F and P is essential for probability calculation:

$P(a \le X \le b) = F(b) - F(a)$

Standard Normal Distribution

The standard normal distribution has a mean of 0 and a standard deviation of 1.

This probability distribution is particularly useful as it can represent any normal distribution, whatever its mean and standard deviation.

Using the following transformation, a normal distribution of variable X can be converted to the standard normal distribution of variable Z:

$Z = \frac{X - \mu}{\sigma}$
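The claim that one standard distribution can stand in for any normal distribution can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library. The (μ, σ, x) triples are made-up values for illustration, and `to_z` is a hypothetical helper name.

```python
from statistics import NormalDist
import math

def to_z(x, mu, sigma):
    """Standardize x: Z = (X - mu) / sigma."""
    return (x - mu) / sigma

standard = NormalDist()  # mean 0, standard deviation 1

# Hypothetical (mu, sigma, x) triples, chosen only for illustration.
cases = [(50.0, 10.0, 62.0), (-3.0, 0.5, -3.4), (1000.0, 250.0, 1000.0)]

for mu, sigma, x in cases:
    z = to_z(x, mu, sigma)
    p_direct = NormalDist(mu, sigma).cdf(x)   # P(X < x) in the original variable
    p_standard = standard.cdf(z)              # P(Z < z) after the transformation
    assert math.isclose(p_direct, p_standard, rel_tol=1e-9)
    print(f"x={x}, z={z:.2f}, P={p_standard:.4f}")
```

Whatever the mean and standard deviation, the probability computed in the original variable matches the one read off the standard normal distribution at the transformed value.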

Transformations

It can be shown that any frequency function can be transformed into a frequency function of a given form by a suitable transformation or functional relationship.

For example, if the original data follow some complicated skewed distribution, we may want to transform this distribution into a known distribution (such as the normal distribution) whose theory and properties are well known.

Most geoscience variables are distributed normally about their mean or can be transformed in such a way that they become normally distributed.

The normal distribution is, therefore, one of the most important distributions in geoscience data analysis.
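As one illustration of such a transformation, the sketch below generates synthetic, positively skewed data and applies a logarithm. It assumes lognormal data, for which the logarithm is known to produce a normal distribution. The notes do not name a specific transformation, so the choice of the logarithm, the synthetic data, and the `skewness` helper are all assumptions made for this example.

```python
import math
import random
import statistics

def skewness(values):
    """Sample skewness: the mean cubed deviation divided by the cubed standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (len(values) * sd ** 3)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic, positively skewed data (lognormal), standing in for a skewed variable.
raw = [rng.lognormvariate(0.0, 0.75) for _ in range(10_000)]

# The logarithm is one possible transformation; it turns lognormal data into normal data.
transformed = [math.log(v) for v in raw]

skew_raw = skewness(raw)
skew_transformed = skewness(transformed)
print(f"skewness before: {skew_raw:.2f}, after: {skew_transformed:.2f}")
```

A skewness near zero after the transformation is consistent with a symmetric, normal-like shape. Real data would need its own check, since not every skewed variable is lognormal.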

Another Example

Example 2: What is the probability that Z lies between the limits $Z_1 = -0.60$ and $Z_2 = 0.75$?

Answer:

For the negative value, use the symmetry of the standard normal distribution:

$P(Z \le -0.60) = 1 - P(Z < 0.60) \Rightarrow P(Z > -0.60) = P(Z < 0.60)$

$P(-0.60 \le Z \le 0.75) = P(Z \le 0.75) - P(Z \le -0.60)$
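The handling of a negative Z value can be sketched in code. The example below assumes the limits are −0.60 and 0.75, as suggested by the surviving fragments of the slide. It also imitates a printed table that lists only non-negative z values, with `statistics.NormalDist` standing in for the table. The function names are hypothetical.

```python
from statistics import NormalDist

standard = NormalDist()  # standard normal: mean 0, standard deviation 1

def table_lookup(z):
    """Stand-in for a printed table that lists P(Z < z) only for z >= 0."""
    if z < 0:
        raise ValueError("table covers non-negative z only")
    return standard.cdf(z)

def prob_less_than(z):
    """P(Z < z) for any z, using the symmetry P(Z < -z) = 1 - P(Z < z)."""
    return table_lookup(z) if z >= 0 else 1.0 - table_lookup(-z)

z1, z2 = -0.60, 0.75   # limits assumed from the surviving fragments of the slide
p_between = prob_less_than(z2) - prob_less_than(z1)

print(round(prob_less_than(z1), 4))  # 0.2743
print(round(prob_less_than(z2), 4))  # 0.7734
print(round(p_between, 4))           # 0.4991
```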

Example 1

Given a normal distribution with μ = 50 and σ = 10, find the probability that X assumes a value between 45 and 62.

The Z values corresponding to $x_1 = 45$ and $x_2 = 62$ are

$z_1 = \frac{45 - 50}{10} = -0.5, \qquad z_2 = \frac{62 - 50}{10} = 1.2$

Therefore,

$P(45 < X < 62) = P(-0.5 < Z < 1.2) = P(Z < 1.2) - P(Z < -0.5) = 0.8849 - 0.3085 = 0.5764$
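The arithmetic in Example 1 can be reproduced in a few lines. This is a sketch that uses `statistics.NormalDist` in place of a printed table. The variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 50.0, 10.0
x1, x2 = 45.0, 62.0

# Step 1: convert the limits to standard normal values.
z1 = (x1 - mu) / sigma   # -0.5
z2 = (x2 - mu) / sigma   # 1.2

# Step 2: P(z1 < Z < z2) = P(Z < z2) - P(Z < z1).
standard = NormalDist()
p_upper = standard.cdf(z2)
p_lower = standard.cdf(z1)
p_between = p_upper - p_lower

print(round(p_upper, 4), round(p_lower, 4), round(p_between, 4))
# 0.8849 0.3085 0.5764
```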

Probability of Normal Distribution

There is only a 4.55% probability that a normally distributed variable will fall more than 2 standard deviations away from its mean.

This is the two-tailed probability. The probability that a normal variable will exceed its mean by more than 2σ is only half of that, 2.275%.
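The two-tailed and one-tailed figures quoted above follow directly from the standard normal cumulative distribution function. A minimal sketch, with hypothetical function names, assuming `statistics.NormalDist` as the source of the CDF:

```python
from statistics import NormalDist

standard = NormalDist()

def two_tailed(k):
    """P(|Z| > k): probability of falling more than k standard deviations from the mean."""
    return 2.0 * (1.0 - standard.cdf(k))

def one_tailed(k):
    """P(Z > k): probability of exceeding the mean by more than k standard deviations."""
    return 1.0 - standard.cdf(k)

for k in (1, 2, 3):
    print(f"{k} sigma: two-tailed {two_tailed(k):.2%}, one-tailed {one_tailed(k):.3%}")
```

The line for k = 2 reproduces the 4.55% and 2.275% values in the text. The lines for 1 and 3 standard deviations are included only for comparison.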

How to Estimate μ and σ

In order to use the normal distribution, we need to know the mean and standard deviation of the population.

→ But they are impossible to know in most geoscience applications, because most geoscience populations are infinite.

→ We have to estimate μ and σ from samples.
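Estimating the two parameters from a sample might look like the sketch below. The estimator formulas did not survive in this copy of the notes, so the code assumes the usual sample mean and the sample standard deviation with an N − 1 divisor. The data values are made up for illustration.

```python
import statistics

# Hypothetical sample of observations (made-up values for illustration only).
sample = [48.2, 51.7, 39.9, 55.3, 62.1, 44.8, 50.4, 47.6]

# Sample mean as the estimate of the population mean mu.
mean_estimate = statistics.mean(sample)

# Sample standard deviation as the estimate of sigma.
# statistics.stdev divides by N - 1; statistics.pstdev would divide by N.
std_estimate = statistics.stdev(sample)

print(f"estimated mu = {mean_estimate:.2f}, estimated sigma = {std_estimate:.2f}")
```

If the notes intend the N divisor instead, `statistics.pstdev` is the corresponding call. The difference between the two shrinks as the sample grows.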

Sampling Distribution

Is the sample mean close to the population mean?

To answer this question, we need to know the probability distribution of the sample mean.

We can obtain the distribution by repeatedly drawing samples from a population and finding the frequency distribution of their means.
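The repeated-sampling procedure described above can be simulated. The sketch below assumes a normal population with μ = 50 and σ = 10, a sample size of 25, and 2,000 repetitions. All of these numbers are arbitrary choices for illustration. The final line prints σ/√N, a standard result for the spread of the sample mean that is not stated in this part of the notes.

```python
import random
import statistics

rng = random.Random(42)      # fixed seed so the experiment is repeatable
mu, sigma = 50.0, 10.0       # hypothetical population parameters
sample_size = 25             # observations per sample
n_samples = 2_000            # number of samples drawn

# Draw many samples and record the mean of each one.
sample_means = [
    statistics.fmean([rng.gauss(mu, sigma) for _ in range(sample_size)])
    for _ in range(n_samples)
]

mean_of_means = statistics.fmean(sample_means)
spread_of_means = statistics.stdev(sample_means)

# Coarse frequency distribution: count sample means in 1-unit-wide bins.
bins = {}
for m in sample_means:
    edge = int(m // 1)
    bins[edge] = bins.get(edge, 0) + 1
for edge in sorted(bins):
    print(f"{edge:3d}-{edge + 1:<3d} {'#' * (bins[edge] // 20)}")

print(f"mean of sample means   = {mean_of_means:.2f}")
print(f"spread of sample means = {spread_of_means:.2f}")
print(f"sigma / sqrt(N)        = {sigma / sample_size ** 0.5:.2f}")
```

The text histogram gives a rough picture of the frequency distribution of the sample mean, which clusters around the population mean far more tightly than individual observations do.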