
Statistics 110: Introduction to Statistical Analysis - Class Outline and Key Concepts, Slides of Statistics

An outline for a Statistics 110 class session covering basic terminology (population and sample, variables, parameters and statistics), types of data, descriptive and inferential statistics, observational and experimental studies, and building statistical models, with examples and exercises on each.

What you will learn

  • What is the difference between a population and a sample?
  • What are the differences between categorical and quantitative data?
  • Which type of study, observational or experimental, supports cause-and-effect conclusions?

Typology: Slides

2021/2022

Uploaded on 09/12/2022

tanvir

9/26/2017


STATISTICS 110

Outline for today:

  • Go over syllabus and dates for the quarter
  • Overview of basic terminology
  • Cover most of Chapter 0
  • Overview of coverage in this course and in Stat 111/202

Examples on White Board

  1. Ex 0.4: Do students with higher GPA have a better chance of getting into med school? The MedGPA data include Accept/Deny and GPA.
  2. Ex 0.6: Do financial incentives help people lose weight? Participants were randomly assigned to get an incentive or not (control group). See WeightLossIncentive4 and page 8.

Some Fundamental Definitions

  • Population: All of the individual units about which we want information
    • Examples on white board
  • Sample: Units for which we obtain data
    • Examples on white board
  • A variable: Something we measure (for sample) or could measure (for population) on each unit
    • Examples on white board

Types of Data (Variables)

  • Categorical: Data consist of category names
    • Male/Female (two categories = binary)
    • Level of education (ordered categories = ordinal)
    • Smoker/nonsmoker
    • Opinion on an issue (favor, oppose, no preference)
    • Admit status (for med school example)
  • Quantitative: Data consist of numbers where ordinary arithmetic makes sense
    • Height, weight, GPA, number of siblings

More Fundamental Definitions

(Population) Parameter: A number associated with a population

  • Example: Proportion admitted to med school for the population of applicants with GPA of at least 3.5.

(Sample) Statistic: A number associated with a sample

  • Example: Proportion admitted to med school for the observed sample of applicants with GPA of at least 3.5.
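
The parameter/statistic distinction can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the population is simulated, the 0.6 admission rate and the sizes (10,000 and 100) are made up, and none of it comes from the MedGPA data.

```python
import random

random.seed(110)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 applicants with GPA of at least 3.5,
# each recorded as admitted (True) or not (False). Simulated, not real data.
population = [random.random() < 0.6 for _ in range(10_000)]

# Parameter: a number computed from the entire population.
parameter = sum(population) / len(population)

# Statistic: the same summary computed from a random sample of 100 units.
sample = random.sample(population, 100)
statistic = sum(sample) / len(sample)

print(f"population proportion (parameter): {parameter:.3f}")
print(f"sample proportion (statistic):     {statistic:.3f}")
```

In practice the parameter is usually unknown, and the statistic is what we actually get to compute.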

Description or Decision? How Data Are Used

  • Descriptive Statistics: using numerical and graphical summaries to characterize a data set (and only that data set).
  • Inferential Statistics: using sample information to make conclusions about a population.
  • Models: Used to approximate the population relationship between two (or more) variables.

This course is all about finding good models!

Definitions of Types of Studies

Observational Study:

  • Researchers observe or question participants about opinions, behaviors, or outcomes.
  • Participants not asked to do anything different.
  • Example: We cannot randomly assign students to have GPA above/below 3.5!

Two special cases: sample surveys and case-control studies.

Experiment:

  • Researchers manipulate something and measure the effect of the manipulation on some outcome of interest.
  • Randomized experiments: participants are randomly assigned to participate in one condition (called a treatment) or another.
  • Sometimes an experiment cannot be conducted because of practical or ethical issues.
  • Random assignment is NOT the same thing as random sampling.
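
To make the difference between random sampling and random assignment concrete, here is a minimal Python sketch. The population of 1,000 unit IDs and the group sizes are hypothetical, chosen only for illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population of 1,000 unit IDs.
population = list(range(1000))

# Random sampling decides WHO is in the study (supports extending
# results to the population).
sample = random.sample(population, 20)

# Random assignment decides WHICH condition each participant receives
# (supports cause-and-effect conclusions).
shuffled = sample[:]
random.shuffle(shuffled)
treatment = shuffled[:10]
control = shuffled[10:]

print("treatment group:", sorted(treatment))
print("control group:  ", sorted(control))
```

A study can use either step without the other: a survey samples at random but assigns nothing, while an experiment on volunteers assigns at random without a random sample.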

Two Important Issues Based on Data Collection Method

  • Extending results to a population: This can be done if the data are representative of a larger population for the question of interest. Safest to use a random sample.
  • Cause and effect conclusion: Can only be made if data are from a randomized experiment, not from an observational study.
  • Examples on white board

Types of Variables (Measured or Not)

  • Explanatory variable (or independent variable) is one that may explain or may cause differences in a response variable (or outcome or dependent variable).
  • A confounding variable is a variable that:
    • affects the response variable and also
    • is related to the explanatory variable.
  • Example: Admit (yes/no) is response variable and GPA is explanatory variable. Possible confounding variable is general ambition.
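
A small simulation can show how a confounder produces an association with no causal link. This is only a sketch under made-up assumptions: "ambition" is simulated, the coefficients are arbitrary, and admission is generated from ambition alone, so GPA has no causal effect here by construction.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

n = 5000
rows = []
for _ in range(n):
    ambition = random.gauss(0, 1)  # unmeasured confounder (simulated)
    # Explanatory variable: GPA is related to ambition (capped at 4.0).
    gpa = min(4.0, 3.3 + 0.3 * ambition + random.gauss(0, 0.2))
    # Response variable: depends on ambition only; GPA never enters here.
    admit = ambition + random.gauss(0, 1) > 0
    rows.append((gpa, admit))

def admit_rate(group):
    """Proportion admitted among (gpa, admit) rows."""
    return sum(admit for _, admit in group) / len(group)

high = [row for row in rows if row[0] >= 3.5]
low = [row for row in rows if row[0] < 3.5]
rate_high = admit_rate(high)
rate_low = admit_rate(low)

print(f"admit rate, GPA >= 3.5: {rate_high:.2f}")
print(f"admit rate, GPA <  3.5: {rate_low:.2f}")
```

The high-GPA group is admitted more often even though GPA plays no role in the simulated admission decision, which is why observational data alone cannot separate the two explanations.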

Example of an Observational Study: Lead Exposure and Bad Teeth

  • Observational study involving 24,901 children.
  • Explanatory variable = level of lead exposure.
  • Response variable = extent child has missing/decayed teeth.
  • Possible confounding variables = income level, diet, time since last dental visit.

“Children exposed to lead are more likely to suffer tooth decay …” USA Today

CRUCIAL POINT

This study is an observational study.

We cannot conclude that lead exposure causes tooth decay.

It would be unethical to do a randomized experiment, so we need other (non-statistical) ways to establish cause and effect.

FIT the model: Predicted Value for Y

Get an estimate for Y using the predictors and the model with estimated parameter(s).

For the “constant” model, only 1 parameter.

Examples:

  • $\hat{Y} = \bar{Y}$ (c = Sample mean)
  • $\hat{Y} = m$ (c = Sample median)

Assessment Questions

(1) Which estimator (mean or median) is better? (That is, how can we compare models?)

(2) Is either model any good? (That is, how can we assess fit?)
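
As a rough illustration of fitting the constant model, the sketch below estimates the single parameter c two ways. The five data values are made up for illustration, and `predict` is a hypothetical helper, not something from the course materials.

```python
import statistics

y = [2, 3, 3, 4, 18]  # made-up sample with one unusually large value

# Two candidate estimates of the single parameter c in the constant model.
c_mean = statistics.mean(y)      # sample mean
c_median = statistics.median(y)  # sample median

def predict(c, n_units):
    """A constant model predicts the same value c for every unit."""
    return [c] * n_units

print("predictions using the mean:  ", predict(c_mean, 3))
print("predictions using the median:", predict(c_median, 3))
```

The two fits disagree noticeably because the large value pulls the mean upward, which is what makes the comparison question interesting.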

Assessing Fit: Residuals

Using the predicted value for each sample point, the residual is:

$\text{Residual} = Y - \hat{Y} = \text{Actual} - \text{Predicted}$

Assess fit by creating a summary of the size of the residuals; we want it to be small!
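
Residuals for a constant model can be computed directly from the definition. The sketch below reuses a small made-up sample; the `residuals` function is a hypothetical helper written for this illustration.

```python
import statistics

def residuals(y, y_hat):
    """Residual = actual - predicted, for a constant prediction y_hat."""
    return [actual - y_hat for actual in y]

y = [2, 3, 3, 4, 18]  # made-up sample

res_mean = residuals(y, statistics.mean(y))      # predicted value is 6
res_median = residuals(y, statistics.median(y))  # predicted value is 3

print("residuals from the mean:  ", res_mean)
print("residuals from the median:", res_median)
```

Each list has one residual per sample point, so some single-number summary of their size is still needed to judge the fit.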

Criteria to Minimize Residuals

  • Sum of residuals: $\sum (Y - \hat{Y})$
  • Sum of absolute deviations: $\sum |Y - \hat{Y}|$
  • Sum of squared errors: $\sum (Y - \hat{Y})^2$
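
The three criteria can be compared on a small example. This is a sketch on made-up data; `fit_criteria` is a hypothetical helper, and the numbers say nothing about the course datasets.

```python
import statistics

def fit_criteria(y, y_hat):
    """Return the three residual summaries for a constant prediction y_hat."""
    res = [actual - y_hat for actual in y]
    return {
        "sum_residuals": sum(res),
        "sum_abs_dev": sum(abs(r) for r in res),
        "sum_sq_err": sum(r ** 2 for r in res),
    }

y = [2, 3, 3, 4, 18]  # made-up sample

mean_fit = fit_criteria(y, statistics.mean(y))
median_fit = fit_criteria(y, statistics.median(y))

print("mean:  ", mean_fit)
print("median:", median_fit)
```

On these data the mean gives the smaller sum of squared errors (182 versus 227) and the median gives the smaller sum of absolute deviations (17 versus 24). The plain sum of residuals is 0 for the mean because positive and negative residuals cancel, so it cannot measure the size of the residuals on its own.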

Use the Model

After choosing a model, fitting it, and assessing that it fits well, you can use it to:

  • Predict the response variable for an individual in the future, when you only know the value(s) of the explanatory variable(s)
  • Estimate the mean response for a specific value of the explanatory variable(s)
  • Extend results to a population, if appropriate
  • Determine causal relationships, if appropriate

Overview of Types of Models

Response      Explanatory       Procedure                 Where
Quantitative  One quantitative  Simple linear regression  Chs 1 &
Quantitative  Multiple          Multiple regr.            Chs 3, 4
Quantitative  One categorical   One-way ANOVA             Ch 5
Quantitative  Binary            Two-sample t              Stat 7
Quantitative  Multiple cat.     ANOVA                     Chs 6, 7
Categorical   Categorical       Chi-square                Stat 7
Categorical   Quantitative      Logistic regr.            Stat 111
Categorical   Multiple          Logistic regr.            Stat 111