Basic Biostatistics formula sheet, Cheat Sheet of Biostatistics

New York Institute of Technology (NYIT) - Jonesboro Biostatistics

Biostatistics formula sheet covering the sum of squares, mean, variance, standard deviation, median, interquartile range, and upper and lower fences. From San Jose State University.

Typology: Cheat Sheet

2021/2022

Uploaded on 02/07/2022

thecoral 🇺🇸


Basic Biostatistics Formulas

Jane Pham & B. Burt Gerstman

C:\data\biostat-text\formulas1.doc Last printed 12/20/2007 4:39:00 PM Page 1 of 3

Exploratory and Summary Statistics (Chapters 3 & 4)

Statistic | Parameter | Point estimate | Formula | Interpretation
--- | --- | --- | --- | ---
Sum of squares | $df \times \sigma^2$ | $SS$ | $SS = \sum_{i=1}^{n} (x_i - \bar{x})^2$ | No easy interpretation.
Mean | $\mu$ | $\bar{x}$ | $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$ | A measure of central location; balancing point.
Variance | $\sigma^2$ | $s^2$ | $s^2 = \frac{SS}{n-1}$ | A measure of spread expressed in units squared.
Standard deviation | $\sigma$ | $s$ | $s = \sqrt{s^2}$ or $s = \sqrt{\frac{SS}{n-1}}$ | A measure of spread expressed in data units. More appropriate for descriptive purposes.

Notes:

• Mean and standard deviation are best suited to symmetrical distributions.

• When the distribution is Normal, 68% of data points lie within ±1σ of µ, 95% within ±2σ of µ, and 99.7% within ±3σ of µ.

• For other distributions, use Chebychev’s rule (e.g., at least 75% of data lie within ±2σ of µ).
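The summary statistics above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original sheet: the function names `summary_stats` and `chebyshev_lower_bound` and the sample data are hypothetical. The Chebychev bound is written in its usual form, at least 1 − 1/k² of the data within ±kσ of µ, which gives the 75% figure quoted above for k = 2.

```python
import math


def summary_stats(data):
    """Return n, mean, sum of squares, variance, and standard deviation."""
    n = len(data)
    mean = sum(data) / n                      # x-bar = (1/n) * sum of x_i
    ss = sum((x - mean) ** 2 for x in data)   # SS = sum of squared deviations
    variance = ss / (n - 1)                   # s^2 = SS / (n - 1)
    sd = math.sqrt(variance)                  # s = sqrt(s^2)
    return {"n": n, "mean": mean, "SS": ss, "variance": variance, "sd": sd}


def chebyshev_lower_bound(k):
    """Minimum proportion of data within k standard deviations of the mean."""
    return 1 - 1 / k ** 2


# Hypothetical sample, chosen only so the arithmetic is easy to follow.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
stats = summary_stats(sample)
print(stats)
print(chebyshev_lower_bound(2))
```

Dividing SS by n − 1 rather than n matches the variance formula in the table; Python's built-in `statistics.variance` uses the same divisor.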

Statistic | Formula | Interpretation
--- | --- | ---
Median | Median has depth of $\frac{n+1}{2}$ | A measure of central location.
Interquartile range (IQR) | $IQR = Q_3 - Q_1$ | A measure of spread, aka “hinge-spread”.
Lower fence ($F_l$) | $F_l = Q_1 - 1.5(IQR)$ | Helps determine the lower inside value and lower outside value(s).
Upper fence ($F_u$) | $F_u = Q_3 + 1.5(IQR)$ | Helps determine the upper inside value and upper outside value(s).

5-point summary:

Q0 – Minimum

Q1 – First quartile

Q2 – Median

Q3 – Third quartile

Q4 – Maximum

Notes on the boxplot:

• Provides information about location, spread, and shape. The box contains the middle 50% of data. The line inside the box is the median.

• Anything above the upper fence or below the lower fence is “outside.” (Fences are not drawn.) Plot outside values as separate points.

• The lower whisker is drawn from Q1 to the lower inside value. The upper whisker is drawn from Q3 to the upper inside value.
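The 5-point summary and fences can be sketched as follows. The sheet does not say which quartile rule it uses; this sketch assumes Tukey's hinges (the median of each half, with the overall median included in both halves when n is odd), which fits the "hinge-spread" wording but is an assumption. The function names and the sample data are hypothetical.

```python
import math


def median(sorted_vals):
    """Median located by its depth (n + 1) / 2 in the ordered data."""
    n = len(sorted_vals)
    depth = (n + 1) / 2                       # 1-based position of the median
    lo = sorted_vals[math.floor(depth) - 1]
    hi = sorted_vals[math.ceil(depth) - 1]
    return (lo + hi) / 2                      # averages two values when n is even


def five_point_summary(data):
    """Return (Q0, Q1, Q2, Q3, Q4) using Tukey's hinges (assumed rule)."""
    xs = sorted(data)
    n = len(xs)
    half = (n + 1) // 2                       # each half includes the median if n is odd
    q1 = median(xs[:half])
    q3 = median(xs[n - half:])
    return xs[0], q1, median(xs), q3, xs[-1]


def boxplot_fences(data):
    """Fences, inside values (whisker ends), and outside values."""
    xs = sorted(data)
    _, q1, _, q3, _ = five_point_summary(xs)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [x for x in xs if lower_fence <= x <= upper_fence]
    outside = [x for x in xs if x < lower_fence or x > upper_fence]
    return {
        "iqr": iqr,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "lower_inside": inside[0],            # lower whisker ends here
        "upper_inside": inside[-1],           # upper whisker ends here
        "outside": outside,                   # plotted as separate points
    }


# Hypothetical data with one clearly outlying value.
values = [5, 7, 8, 9, 10, 11, 12, 13, 40]
summary = five_point_summary(values)
fences = boxplot_fences(values)
print(summary)
print(fences)
```

Other quartile conventions (for example, interpolated percentiles) can give slightly different Q1 and Q3, and therefore different fences, for the same data.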


Probability (Chapters 5–7)

• Probability: relative frequency in the population; expected proportion after a very long run of trials; can be used to quantify subjective statements.

• Properties of probabilities.
Basic: (1) $0 \le \Pr(A) \le 1$; (2) $\Pr(S) = 1$; (3) $\Pr(\bar{A}) = 1 - \Pr(A)$; and (4) $\Pr(A \text{ or } B) = \Pr(A) + \Pr(B)$ for disjoint events.
Advanced: (5) If A and B are independent, $\Pr(A \text{ and } B) = \Pr(A) \cdot \Pr(B)$; (6) $\Pr(A \text{ or } B) = \Pr(A) + \Pr(B) - \Pr(A \text{ and } B)$; (7) $\Pr(B \mid A) = \Pr(A \text{ and } B) / \Pr(A)$; (8) $\Pr(A \text{ and } B) = \Pr(A) \cdot \Pr(B \mid A)$; (9) $\Pr(B) = \Pr(B \text{ and } A) + \Pr(B \text{ and } \bar{A})$; (10) Bayes’ Theorem (p. 111).

• Binomial variables: $X \sim b(n, p)$, $\Pr(X = x) = {}_nC_x \, p^x q^{n-x}$ where ${}_nC_x = \frac{n!}{x!\,(n-x)!}$ and $q = 1 - p$.

• Cumulative probability: $\Pr(X \le x)$ = sum of all probabilities up to and including $\Pr(X = x)$; corresponds to the AUC in the left tail of the pmf or pdf.

• Normal variables: $X \sim N(\mu, \sigma)$. To determine $\Pr(X \le x)$, standardize $z = \frac{x - \mu}{\sigma}$ and look up the cumulative probability in the Z table. Use the fact that the AUC sums to 1 to determine probabilities for various ranges. To find a value that corresponds to a given probability, look up the closest $z_p$ in the Z table and then unstandardize according to $x = \mu + z_p \cdot \sigma$.
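The binomial and Normal calculations above can be sketched with the Python standard library. This is a minimal illustration under stated assumptions: the function names are hypothetical, and `statistics.NormalDist` stands in for the printed Z table, so results are not rounded to table precision.

```python
from math import comb
from statistics import NormalDist


def binomial_pmf(x, n, p):
    """Pr(X = x) for X ~ b(n, p): nCx * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)


def binomial_cdf(x, n, p):
    """Pr(X <= x): sum of all probabilities up to and including Pr(X = x)."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))


def normal_cdf(x, mu, sigma):
    """Pr(X <= x) for X ~ N(mu, sigma): standardize, then use the standard Normal."""
    z = (x - mu) / sigma
    return NormalDist().cdf(z)


def normal_quantile(prob, mu, sigma):
    """Value x with Pr(X <= x) = prob: find z_p, then unstandardize."""
    z_p = NormalDist().inv_cdf(prob)
    return mu + z_p * sigma


# Hypothetical examples.
print(binomial_pmf(2, 4, 0.5))                # Pr(X = 2) when n = 4, p = 0.5
print(binomial_cdf(1, 4, 0.5))                # Pr(X <= 1)
print(normal_cdf(115, 100, 15))               # Pr(X <= 115) for N(100, 15)
# Probability of a range, using the fact that the AUC sums to 1:
print(normal_cdf(115, 100, 15) - normal_cdf(85, 100, 15))
print(normal_quantile(0.975, 100, 15))
```

The range calculation in the last lines reproduces the 68% rule: the area between µ − 1σ and µ + 1σ is roughly 0.68.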

Introduction to Inference (Chapters 8–11)

• The sampling distribution of the mean (SDM) is governed by the central limit theorem, law of large numbers, and square root law. When $n$ is large, $\bar{x} \sim N(\mu, \sigma_{\bar{x}})$ where $\sigma_{\bar{x}}$ is the standard error (SE) and is equal to $\sigma / \sqrt{n}$. The standard error is estimated by $s / \sqrt{n}$ when the population standard deviation is not known.

• $(1 - \alpha)100\%$ confidence interval for $\mu$. Use $\bar{x} \pm z_{1-\alpha/2} \cdot SE_{\bar{x}}$ when $\sigma$ is known. Use $\bar{x} \pm t_{n-1,\,1-\alpha/2} \cdot SE_{\bar{x}}$ when relying on $s$.

• Hypothesis testing basics. Know all the steps, not just the conclusion, and keep in mind that hypothesis tests require certain conditions (e.g., Normality, independence, data quality) to be valid. The steps are:
A. $H_0$ and $H_1$ [For a one-sample test of a mean, $H_0: \mu = \mu_0$ where $\mu_0$ is the mean specified by the null hypothesis.]
B. Test statistic [For a one-sample test of a mean, use either $z_{\text{stat}} = \frac{\bar{x} - \mu_0}{SE_{\bar{x}}}$ or $t_{\text{stat}} = \frac{\bar{x} - \mu_0}{SE_{\bar{x}}}$ with $df = n - 1$.]
C. P-value. Convert the test statistic to a P-value. Small P → strong evidence against $H_0$.
D. Significance level. It is unwise to draw too firm a line. However, you can use the conventions regarding marginal significance, significance, and high significance when first learning.

• Power and sample size basics. Approach from an estimation, testing, or “power” perspective. The sample size requirement for limiting the margin of error $m$ is given by $n = \left( \frac{z_{1-\alpha/2} \, \sigma}{m} \right)^2$. The power of testing a mean is $1 - \beta = \Phi\left( -z_{1-\alpha/2} + \frac{|\Delta| \sqrt{n}}{\sigma} \right)$, where $\Delta$ is the difference to be detected and $\Phi$ is the standard Normal cumulative probability. The sample size requirement of a one-sample $z$ or $t$ test is $n = \frac{\sigma^2 \left( z_{1-\beta} + z_{1-\alpha/2} \right)^2}{\Delta^2}$. It is OK to use $s$ as a substitute for $\sigma$ in power and sample size formulas, when necessary.
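The z-based inference formulas above can be sketched with the standard library. This is an illustration under stated assumptions rather than the authors' procedure: the function names and example numbers are hypothetical, the P-value is computed as two-sided (the sheet does not specify), required sample sizes are rounded up to the next whole number, and the t-based versions are omitted because the standard library has no t quantile function.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard Normal, used in place of a printed Z table


def z_confidence_interval(xbar, sigma, n, alpha=0.05):
    """(1 - alpha)100% CI for mu when sigma is known: xbar +/- z * SE."""
    se = sigma / sqrt(n)
    z = Z.inv_cdf(1 - alpha / 2)
    return xbar - z * se, xbar + z * se


def z_test(xbar, mu0, sigma, n):
    """One-sample z statistic and two-sided P-value (two-sided is an assumption)."""
    se = sigma / sqrt(n)
    z_stat = (xbar - mu0) / se
    p_value = 2 * (1 - Z.cdf(abs(z_stat)))
    return z_stat, p_value


def n_for_margin(sigma, m, alpha=0.05):
    """Sample size so the margin of error is at most m: (z * sigma / m)^2."""
    z = Z.inv_cdf(1 - alpha / 2)
    return ceil((z * sigma / m) ** 2)


def power_of_z_test(delta, sigma, n, alpha=0.05):
    """Power = Phi(-z_{1-alpha/2} + |delta| * sqrt(n) / sigma)."""
    z = Z.inv_cdf(1 - alpha / 2)
    return Z.cdf(-z + abs(delta) * sqrt(n) / sigma)


def n_for_power(delta, sigma, power=0.80, alpha=0.05):
    """Sample size: sigma^2 * (z_{1-beta} + z_{1-alpha/2})^2 / delta^2."""
    z_alpha = Z.inv_cdf(1 - alpha / 2)
    z_beta = Z.inv_cdf(power)
    return ceil(sigma ** 2 * (z_beta + z_alpha) ** 2 / delta ** 2)


# Hypothetical example: xbar = 105 from n = 36, sigma = 15, testing mu0 = 100.
ci = z_confidence_interval(105, 15, 36)
z_stat, p_value = z_test(105, 100, 15, 36)
n_margin = n_for_margin(15, 5)
n_power = n_for_power(5, 15)
achieved = power_of_z_test(5, 15, n_power)
print(ci, z_stat, p_value, n_margin, n_power, achieved)
```

Rounding the sample size up guarantees that the achieved power is at least the requested value, which the last line checks by feeding the result back into the power formula.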
