
















An overview of statistical analysis in quantitative research methodologies. It covers data cleaning and organizing, data preparation, data transformation, and the distribution, central tendency, and dispersion of data. The document also introduces hypothesis testing and significance testing, as well as various statistical tests and correlation analysis.
Typology: Study Guides, Projects, Research
Prof. Dr. Hora Tjitra & Dr. He Quan
Statistical Analysis Process
Data Preparation: cleaning and organizing the data for analysis. This involves checking or logging the data in, checking the data for accuracy, entering the data into the computer, etc.
Descriptive Statistics: describing the data. Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample and the measures.
Inferential Statistics: testing hypotheses and models. Inferential statistics investigate questions, models and hypotheses. In many cases, the conclusions from inferential statistics extend beyond the immediate data alone.
Descriptive Statistics (Univariate Analysis)
The Distribution: a summary of the frequency of individual values or ranges of values for a variable.
Central Tendency: an estimate of the "center" of a distribution of values.
Dispersion: the spread of the values around the central tendency.
The Distribution
Frequency distributions can be depicted in two ways:
As a table: for example, an age frequency distribution with five categories of age ranges defined.
As a graph: the same frequency distribution drawn as a chart, often referred to as a histogram or bar chart.
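As a rough illustration of the table and graph forms of a frequency distribution, the sketch below counts values into five age ranges and prints a text histogram. The ages and the range boundaries are hypothetical; they are assumptions made for this example, not the values from the original slide.

```python
from collections import Counter

# Hypothetical ages and age ranges (assumptions, not the source's table).
ages = [15, 20, 21, 20, 36, 15, 25, 15]
age_ranges = [(15, 19), (20, 24), (25, 29), (30, 34), (35, 39)]


def range_label(age):
    """Return the label of the age range that contains `age`."""
    for low, high in age_ranges:
        if low <= age <= high:
            return f"{low}-{high}"
    raise ValueError(f"age {age} is outside every range")


# Frequency table: how many ages fall into each range.
counts = Counter(range_label(age) for age in ages)

# Text histogram: one '#' per observation in the range.
for low, high in age_ranges:
    label = f"{low}-{high}"
    print(f"{label:>5}  {counts[label]:>2}  {'#' * counts[label]}")
```

The same counts serve both forms: printed as numbers they are the frequency table, and drawn as bars they are the histogram.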
Range: the highest value minus the lowest value. In the example distribution, the high value is 36 and the low is 15, so the range is 36 - 15 = 21.
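The range calculation above, together with the usual measures of central tendency, can be sketched with Python's standard library. The scores are hypothetical: they were chosen only so that the high value is 36 and the low is 15, as in the text, and the remaining values are an assumption.

```python
from statistics import mean, median, mode, pstdev

# Hypothetical scores; only the high (36) and low (15) come from the text.
scores = [15, 20, 21, 20, 36, 15, 25, 15]

# Central tendency: three common estimates of the "center".
central_tendency = {
    "mean": mean(scores),
    "median": median(scores),
    "mode": mode(scores),
}

# Dispersion: spread of the values around the central tendency.
value_range = max(scores) - min(scores)  # highest minus lowest
std_dev = pstdev(scores)                 # population standard deviation

print(central_tendency)
print("range:", value_range)
print("standard deviation:", round(std_dev, 2))
```

The range depends only on the two extreme values, which is why the standard deviation is usually reported alongside it.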
Normal Distribution
The standard normal distribution is a normal distribution with a mean of 0 and a standard deviation of 1.
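A minimal sketch of how a raw score is placed on the standard normal scale. The inputs (a score of 13 with mean 7.0 and standard deviation 4.0) are taken from the correlation example later in this document; treating that variable as normally distributed is an assumption made purely for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def z_score(value, mean, sd):
    """Express `value` in standard deviation units from the mean."""
    return (value - mean) / sd


z = z_score(13, mean=7.0, sd=4.0)

# Share of a standard normal distribution lying below z
# (meaningful only if the variable is roughly normal).
share_below = standard_normal.cdf(z)

print("z =", z)
print("share below z:", round(share_below, 4))
```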
The Statistical Inference Decision Matrix
In reality, one of two states holds:
H0 is true, H1 is false. In reality there is no relationship; there is no difference, no gain; our theory is wrong.
H0 is false, H1 is true. In reality there is a relationship; there is a difference or gain; our theory is correct.
What we conclude is one of two decisions, which gives four possible outcomes:
We accept H0, reject H1. We say: "There is no relationship"; "There is no difference, no gain"; "Our theory is wrong".
If H0 is in fact true (correct decision, probability 1 - α): the odds of saying there is no relationship, difference, or gain when in fact there is none; the odds of correctly not confirming our theory. 95 times out of 100, when there is no effect, we'll say there is none.
If H0 is in fact false (Type II error, probability β): the odds of saying there is no relationship, difference, or gain when in fact there is one; the odds of not confirming our theory when it's true. 20 times out of 100, when there is an effect, we'll say there isn't.
We reject H0, accept H1. We say: "There is a relationship"; "There is a difference or gain"; "Our theory is correct".
If H0 is in fact true (Type I error, probability α): the odds of saying there is a relationship, difference, or gain when in fact there is not; the odds of confirming our theory incorrectly. 5 times out of 100, when there is no effect, we'll say there is one. We should keep this small when we can't afford to risk wrongly concluding that our program works.
If H0 is in fact false (correct decision, power, probability 1 - β): the odds of saying there is a relationship, difference, or gain when in fact there is one; the odds of confirming our theory correctly. 80 times out of 100, when there is an effect, we'll say there is. We generally want this to be as large as possible.
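The four outcomes above follow from two numbers. The sketch below assumes the conventional names alpha (the chance of a false positive, 5 in 100 here) and beta (the chance of a miss, 20 in 100 here); the function name and the dictionary layout are hypothetical choices for this illustration.

```python
def decision_matrix(alpha, beta, n_studies=100):
    """Expected outcome counts per `n_studies`, keyed by (decision, reality)."""
    return {
        ("accept H0", "H0 true"): round((1 - alpha) * n_studies),   # correct
        ("reject H0", "H0 true"): round(alpha * n_studies),         # false positive
        ("accept H0", "H0 false"): round(beta * n_studies),         # miss
        ("reject H0", "H0 false"): round((1 - beta) * n_studies),   # power
    }


cells = decision_matrix(alpha=0.05, beta=0.20)
for (decision, reality), count in cells.items():
    print(f"{decision:>9} when {reality:<8}: {count} out of 100")
```

Within each state of reality the two outcomes sum to 100, so lowering one error rate is only free if the other column is left unchanged.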
Examples
Correlation
Correlation is a measure of the relation between two or more variables.
The measurement scales used should be at least interval scales, but other correlation coefficients are available to handle other types of data.
Correlation coefficients can range from -1.00 to +1.00.
The types of correlation
Pearson r: used when data ...
$r = \frac{\sum Z_x Z_y}{N}$

Subject    X    Y     Zx     Zy    ZxZy
A          1    4   -1.5   -1.5    2.25
B          3    7   -1.0   -1.0    1.00
C          5   10   -0.5   -0.5    0.25
D          7   13    0.0    0.0    0.00
E          9   16    0.5    0.5    0.25
F         11   19    1.0    1.0    1.00
G         13   22    1.5    1.5    2.25

$N = 7$, $\bar{X} = 7.0$, $\bar{Y} = 13.0$, $S_x = 4.0$, $S_y = 6.0$, $\sum X = 49$, $\sum Y = 91$, $\sum Z_x = \sum Z_y = 0.0$, $\sum Z_x Z_y = 7.00$, $\sum X^2 = 455$, $\sum Y^2 = 1435$
$r = \frac{\sum Z_x Z_y}{N} = \frac{7.00}{7} = 1.00$
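The z-score formula for Pearson r can be checked against the seven subjects in the table. This is a minimal sketch using only the standard library; it divides by N, so it uses the population standard deviation, which matches Sx = 4.0 and Sy = 6.0 in the summary. The function names are hypothetical.

```python
from statistics import mean, pstdev

# X and Y scores for subjects A to G, as listed in the table.
x = [1, 3, 5, 7, 9, 11, 13]
y = [4, 7, 10, 13, 16, 19, 22]


def z_scores(values):
    """Standardize values using the mean and the population SD."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


def pearson_r(xs, ys):
    """Pearson r as the mean of the products of paired z-scores."""
    zx = z_scores(xs)
    zy = z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)


r = pearson_r(x, y)
print("Zx:", z_scores(x))
print("r =", round(r, 2))
```

Because every Y here is an exact linear function of X, the paired z-scores are identical and r reaches its maximum.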
Spearman $r_s$, the rank correlation coefficient
$r_s = 1 - \frac{6 \sum D^2}{N(N^2 - 1)}$

IQ rank   Leadership rank    D    D^2
1         4                 -3     9
2         2                  0     0
3         1                  2     4
4         6                 -2     4
5         3                  2     4
6         5                  1     1

$\sum D = 0$, $\sum D^2 = 22$
$N = 6$, $r_s = 1 - \frac{6 \sum D^2}{N(N^2 - 1)} = 1 - \frac{6 \times 22}{6 \times 35} = 1 - .63 = .37$
How to judge the significance of r?
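Returning to the rank example, the Spearman formula can be reproduced in a few lines. This sketch assumes the data are already ranks with no ties, as in the IQ and leadership columns; the function name is hypothetical.

```python
# Ranks from the IQ / leadership example.
iq_rank = [1, 2, 3, 4, 5, 6]
leadership_rank = [4, 2, 1, 6, 3, 5]


def spearman_rs(ranks_a, ranks_b):
    """Spearman rank correlation from rank differences (no ties assumed)."""
    n = len(ranks_a)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


rs = spearman_rs(iq_rank, leadership_rank)
print("r_s =", round(rs, 2))
```

With tied ranks this shortcut formula is only approximate, and computing Pearson r on the ranks is the safer route.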
ANOVA: Basic concept
A statistical procedure developed by R. A. Fisher that allows one to compare simultaneously the differences between two or more means.
One-way ANOVA: comparing the effects of different levels of a single independent variable.
Two-way ANOVA: comparing simultaneously the effects of two independent variables.
Between-Groups Variance: estimate of the variance between group means.
Within-Groups Variance: estimate of the average variance within each group.
Homogeneity of variance: the variances of the groups are equivalent to each other.
Worked Example
$X_{ij}$ is the score of subject $i$ in group $j$.

Subject              Group 1   Group 2   Group 3   Group 4
1                    4         6         4
2                    2         3         5
3                    1         5         7
4                    3         4         6
Sum $X_{.j}$         10        18        22        24
$n_j$                4         4         4         4
Mean                 2.5       4.5       5.5       6.0
$\sum X_{ij}^2$      30        86        126       150
$s_j^2$              1.67      1.67      1.67      2.00

$\sum X_{ij} = 74$, $N = 16$, $\bar{X} = 4.625$, $\sum X_{ij}^2 = 392$
$F_{max} = 2.00 / 1.67 = 1.20$
The degrees of freedom
$df_{total} = N - 1 = 16 - 1 = 15$
$df_{between} = k - 1 = 4 - 1 = 3$
$df_{within} = \sum (n_j - 1) = 12$
$F = MS_{between} / MS_{within} = 9.583 / 1.750 = 5.476$
Result Table

Source           SS      df   MS      F
Between-groups   28.75    3   9.583   5.476
Within-groups    21.00   12   1.750
Total            49.75   15

$SS_{total} = SS_{between} + SS_{within}$: 49.75 = 28.75 + 21.00
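The result table can be reproduced with a short one-way ANOVA sketch. The scores for Groups 1 to 3 are the ones shown in the example. The raw scores for Group 4 do not appear in the text, so the four values used here are hypothetical: they are an assumption chosen only so that the sums of squares come out at the reported 28.75 (between) and 21.00 (within). The function name and the returned keys are also illustrative.

```python
from statistics import mean

groups = [
    [4, 2, 1, 3],  # Group 1 (from the example)
    [6, 3, 5, 4],  # Group 2 (from the example)
    [4, 5, 7, 6],  # Group 3 (from the example)
    [5, 8, 6, 5],  # Group 4: HYPOTHETICAL scores, not in the source
]


def one_way_anova(groups):
    """Return sums of squares, degrees of freedom, mean squares and F."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Between-groups: spread of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups: spread of the scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "ss_total": ss_between + ss_within,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "f": ms_between / ms_within,
    }


result = one_way_anova(groups)
for name, value in result.items():
    print(f"{name:>11}: {round(value, 3)}")
```

Any other Group 4 scores with the same mean and variance would give the same F, because the calculation depends only on the group sizes, means, and within-group sums of squares.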
Worked Example

Source           SS       df   MS      F       Sig. of F
Target           235.20    3   78.40   74.59   .
Device            86.47    2   43.23   41.13   .
Light             76.80    1   76.80   73.07   .
Target*Device    104.20    6   17.37   16.52   .
Target*Light      93.87    3   31.29   29.77   .
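Two relationships tie the columns of this table together: each mean square is the sum of squares divided by its degrees of freedom, and each F is the effect's mean square divided by an error mean square. The error row is not present in the text, so the sketch below only infers the error mean square that each row implies; that inference assumes every effect is tested against the same error term.

```python
# (SS, df, MS, F) for each effect, as listed in the table.
rows = {
    "Target": (235.20, 3, 78.40, 74.59),
    "Device": (86.47, 2, 43.23, 41.13),
    "Light": (76.80, 1, 76.80, 73.07),
    "Target*Device": (104.20, 6, 17.37, 16.52),
    "Target*Light": (93.87, 3, 31.29, 29.77),
}

ms_recomputed = {}
implied_error_ms = {}
for name, (ss, df, ms, f) in rows.items():
    ms_recomputed[name] = ss / df     # MS = SS / df
    implied_error_ms[name] = ms / f   # from F = MS_effect / MS_error
    print(f"{name:<14} MS = {ms_recomputed[name]:6.2f}   "
          f"implied error MS = {implied_error_ms[name]:.3f}")
```

All five rows imply nearly the same error mean square, which is what one would expect if they share a single error term.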
Factor Analysis
A statistical technique used to reduce a set of variables to a smaller number of variables or factors. It examines the pattern of intercorrelations between the variables and determines whether there are subsets of variables (or factors) that correlate highly with each other but show low correlations with other subsets (or factors).
Variables: $x_1, x_2, x_3, x_4, \dots, x_m$
Factors: $z_1, z_2, z_3, z_4, \dots, z_n$ (with factor loadings $b_{ij}$)
$x_1 = b_{11} z_1 + b_{12} z_2 + b_{13} z_3 + \dots + b_{1n} z_n + e_1$
.....
$z_1 = a_{11} x_1 + a_{12} x_2 + a_{13} x_3 + \dots + a_{1m} x_m$
.......