Unit-1 Parametric and Non-parametric Statistics, Study Guides, Projects, Research of Statistics


Typology: Study Guides, Projects, Research

2021/2022

Uploaded on 09/12/2022

jeanette
jeanette 🇬🇧


UNIT 1 PARAMETRIC AND NON-PARAMETRIC STATISTICS

Structure

1.0 Introduction

1.1 Objectives

1.2 Definition of Parametric and Non-parametric Statistics

1.3 Assumptions of Parametric and Non-parametric Statistics

1.3.1 Assumptions of Parametric Statistics

1.3.2 Assumptions of Non-parametric Statistics

1.4 Advantages of Non-parametric Statistics

1.5 Disadvantages of Non-parametric Statistical Tests

1.6 Parametric Statistical Tests for Different Samples

1.7 Parametric Statistical Measures for Calculating the Difference Between Means

1.7.1 Significance of Difference Between the Means of Two Independent Large and Small Samples

1.7.2 Significance of the Difference Between the Means of Two Dependent Samples

1.7.3 Significance of the Difference Between the Means of Three or More Samples

1.8 Parametric Statistics Measures Related to Pearson’s ‘r’

1.8.1 Non-parametric Tests Used for Inference

1.9 Some Non-parametric Tests for Related Samples

1.10 Let Us Sum Up

1.11 Unit End Questions

1.12 Glossary

1.13 Suggested Readings

1.0 INTRODUCTION

In this unit you will learn about the various aspects of parametric and non-parametric statistics. A parametric statistical test specifies certain conditions, for example that the data should be normally distributed. Non-parametric statistics do not require these conditions. In fact, non-parametric tests are known as distribution-free tests.

In this unit we will study the nature of quantitative data and the various descriptive statistical measures which are used in the analysis of such data. These include measures of central tendency, variability, relative position and relationship, as well as the normal probability curve.

The computed values of various statistics are used to describe the properties of particular samples. In this unit we shall discuss inferential or sampling statistics, which are useful to a researcher in making generalisations or inferences about populations from the observed characteristics of samples.

For making inferences about various population values (parameters), we generally make use of parametric and non-parametric tests. The concept and assumptions of parametric tests will be explained to you in this section, along with inferences regarding the means and correlations of large and small samples, and the significance of the difference between the means and correlations in large and small independent samples.

The assumptions and applications of analysis of variance and co-variance for testing the significance of the difference between the means of three or more samples will also be discussed.

In the use of parametric tests for making statistical inferences, we need to take into account certain assumptions about the nature of the population distribution, and also the type of measurement scale used to quantify the data. In this unit you will learn about another category of tests which do not make stringent assumptions about the nature of the population distribution. Tests in this category are called distribution-free or non-parametric tests. The use and application of several non-parametric tests involving unrelated and related samples will be explained in this unit. These include the chi-square test, median test, Mann-Whitney U test, sign test and Wilcoxon matched-pairs signed-ranks test.

1.1 OBJECTIVES

After reading this unit, you will be able to:

• define the terms parametric and non-parametric statistics;

• differentiate between parametric and non-parametric statistics;

• describe the nature and meaning of parametric and non-parametric statistics;

• delineate the assumptions of parametric and non-parametric statistics; and

• list the advantages and disadvantages of parametric and non-parametric statistics.

1.2 DEFINITION OF PARAMETRIC AND NON-PARAMETRIC STATISTICS

Statistics is an independent branch of study and its use is highly prevalent in all fields of knowledge. Many methods and techniques are used in statistics. These have been grouped under parametric and non-parametric statistics. Statistical tests which are not based on a normal distribution of the data, or on any other assumption about the form of the population distribution, are known as non-parametric or distribution-free tests, and the data are generally ranked or grouped. Examples include the chi-square test and Spearman’s rank correlation coefficient.

The first meaning of non-parametric covers techniques that do not rely on data belonging to any particular distribution. These include, among others:

  1. Distribution free methods: This means that there are no assumptions that the data have been drawn from a normally distributed population. This consists of non-parametric statistical models, inference and statistical tests.

  2. Non-parametric statistics: Here the statistics are based on the ranks of observations and do not depend on any distribution of the population.

  3. No assumption of a structure of a model: In non-parametric statistics, the techniques do not assume that the structure of a model is fixed. In this, the

arbitrary labels such as m/f and 1/0. This is also called a categorical scale, that is, a scale whose values are categories (i.e. they are names rather than numbers).

  1. Ordinal scale deals with ordered data. The values are in a certain order, but the differences between values are not meaningful. For example, degree of satisfaction rated on a 5-point scale of 1 to 5, with 1 indicating least satisfaction and 5 indicating highest satisfaction.

  2. Interval scale deals with ordered data with equal intervals. This is a constant scale but has no natural zero. Differences do make sense. Examples of this kind of data include temperature in Centigrade or Fahrenheit and the dates in a calendar. An interval scale possesses two out of three important requirements of a good measurement scale, that is, magnitude and equal intervals, but lacks a real or absolute zero point.

  3. Ratio scale deals with ordered data on a constant scale with a natural zero. Examples of this type of data include height, weight, age, length etc.

Samples with a small number of items are treated with non-parametric statistics because a normal distribution cannot be assumed, e.g. if the sample size is 30 or less (N ≤ 30). Non-parametric statistics can be used for nominal data as well as ordinal data.

A non-parametric statistical test is based on a model that specifies only very general conditions and none regarding the specific form of the distribution from which the sample was drawn.

Certain assumptions are associated with most non-parametric statistical tests, namely that the observations are independent and perhaps that the variable under study has underlying continuity, but these assumptions are fewer and weaker than those associated with parametric tests.

Moreover, as we shall see, non-parametric procedures often test different hypotheses about the population than do parametric procedures.

Finally, unlike parametric tests, there are non-parametric procedures that may be applied appropriately to data measured in an ordinal scale, or in a nominal scale or categorical scale.

Non-parametric statistics can deal with small sample sizes.

Non-parametric statistics are often described as assumption free, meaning that they are not bound by assumptions about the form of the population distribution.

Non-parametric statistics are user friendly compared with parametric statistics and economical in time.

We have learnt that parametric tests are generally quite robust and are useful even when some of their mathematical assumptions are violated. However, these tests are used only with the data based upon ratio or interval measurements.

In case of counted or ranked data, we make use of non-parametric tests. It is argued that non-parametric tests have greater merit because their validity is not based upon assumptions about the nature of the population distribution, assumptions that are so frequently ignored or violated by researchers using parametric tests. It may be noted that non-parametric tests are less precise and have less power than the parametric tests.


1.3 ASSUMPTIONS OF PARAMETRIC AND NON-PARAMETRIC STATISTICS

1.3.1 Assumptions of Parametric Statistics

Parametric tests such as the ‘t’ and ‘F’ tests may be used for analysing data which satisfy the following conditions:

The population from which the samples have been drawn should be normally distributed.

A normal distribution is a frequency distribution following the normal curve, whose tails extend indefinitely at both ends.

The variables involved must have been measured on an interval or ratio scale.

Variable and its types: A variable is a characteristic that can have different values.

Types of Variables

Dependent Variable: Variable considered to be an effect; usually a measured variable.

Independent Variable: Variable considered to be a cause.

The observations must be independent. The inclusion or exclusion of any case in the sample should not unduly affect the results of the study.

The populations must have the same variance or, in special cases, must have a known ratio of variances. This is called homoscedasticity.

The samples have equal or nearly equal variances. This condition is known as equality or homogeneity of variances and is particularly important to determine when the samples are small.

The observations are independent. The selection of one case in the sample is not dependent upon the selection of any other case.

1.3.2 Assumptions of Non-parametric Statistics

We face many situations where we cannot meet the assumptions and conditions of parametric statistical procedures and thus cannot use them. In such situations we are bound to apply non-parametric statistics.

If our data are measured on a nominal or ordinal scale, the distribution of the sample is not normal, and the sample size is very small, it is always advisable to make use of non-parametric tests for comparing samples and for making inferences or testing the significance or trustworthiness of the computed statistics. In other words, the use of non-parametric tests is recommended in the following situations:

Where sample size is quite small. If the size of the sample is as small as N=5 or N=6, the only alternative is to make use of non-parametric tests.

When assumptions like normality of the distribution of scores in the population are doubtful, we use non-parametric tests.

When the measurement of data is available in the form of ordinal or nominal scales, or when the data can be expressed in the form of ranks, in the shape of + signs or − signs, or in classifications like “good-bad”, we use non-parametric statistics.


1.6 PARAMETRIC STATISTICAL TESTS FOR DIFFERENT SAMPLES

Suppose we wish to measure the teaching aptitude of M.A. Psychology students (a large sample) by using a verbal teaching aptitude test.

It is not possible or convenient to measure the teaching aptitude of all the enrolled M.A. Psychology students, and hence we must usually be satisfied with a sample drawn from this population.

However, this sample should be as large and as randomly drawn as possible so as to represent adequately all the M.A. Psychology students of IGNOU.

If we select a large number of random samples of 100 trainees each from the population of all trainees, the mean values of teaching aptitude scores for all samples would not be identical.

A few would be relatively high, a few relatively low, but most of them would tend to cluster around the population mean.

Owing to ‘sampling error’, the sample means will not only vary from sample to sample but will also usually deviate from the population mean. Each of these sample means can be treated as a single observation, and these means can be put in a frequency distribution which is known as the sampling distribution of the means.

An important principle, known as the ‘Central Limit Theorem’, describes the characteristics of sample means. According to this theorem, if a large number of equal-sized samples, greater than 30 in size, are selected at random from an infinite population:

The means of the samples will be normally distributed.

The average value of the sample means will be the same as the mean of the population.

The distribution of sample means will have its own standard deviation.

This standard deviation is known as the ‘standard error of the mean’, which is denoted as $SE_M$ or $\sigma_M$.

It gives us a clue as to how far such sample means may be expected to deviate from the population mean.

The standard error of a mean tells us how large the errors are in any particular sampling situation.

The formula for the standard error of the mean in a large sample is:

$$SE_M = \sigma_M = \frac{\sigma}{\sqrt{N}}$$

Where

$\sigma$ = the standard deviation of the population

$N$ = the size of the sample
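The behaviour described by the Central Limit Theorem can be checked with a small simulation. The following is only a sketch: the uniform population of scores between 0 and 100, the sample size of 36, the 2000 repetitions and the helper name `simulate_sample_means` are all assumptions made for illustration and do not come from the unit.

```python
import random
import statistics

def simulate_sample_means(population_draw, sample_size, n_samples, seed=0):
    """Draw n_samples random samples and return the list of their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(population_draw(rng) for _ in range(sample_size))
        for _ in range(n_samples)
    ]

# Hypothetical population: scores uniformly distributed between 0 and 100,
# so the population mean is 50 and the population SD is 100 / sqrt(12).
POP_MEAN = 50.0
POP_SD = 100.0 / 12 ** 0.5

N = 36  # sample size (greater than 30, as the theorem requires)
means = simulate_sample_means(lambda rng: rng.uniform(0, 100), N, 2000)

mean_of_means = statistics.fmean(means)    # should be close to POP_MEAN
sd_of_means = statistics.pstdev(means)     # empirical standard error
theoretical_se = POP_SD / N ** 0.5         # SE_M = sigma / sqrt(N)

print(f"mean of sample means: {mean_of_means:.2f} (population mean {POP_MEAN})")
print(f"SD of sample means:   {sd_of_means:.2f} (sigma / sqrt(N) = {theoretical_se:.2f})")
```

Although the population here is far from normal, the mean of the sample means should lie close to the population mean, and their standard deviation should lie close to the value given by the formula.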

In the case of small samples, the sampling distribution of means is not normal. In 1908 William Sealy Gosset, writing under the pen name ‘Student’, developed the theory of small samples. He found that the distribution curves of small sample means were somewhat different from the normal curve. This distribution was named the t-distribution. When the size of the sample is small, the t-distribution lies below the normal curve at the centre and above it in the tails.
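How the t-distribution differs from the normal curve can be seen by comparing the two density functions directly. The sketch below uses the standard textbook density formulas, which are not given in this unit; the degrees of freedom (5 and 30) and the comparison points (0 and 3) are arbitrary choices for illustration.

```python
import math

def normal_pdf(x):
    """Density of the standard normal distribution at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom at x."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

# Compare the height of each curve at the centre (x = 0) and in the tail (x = 3).
for df in (5, 30):
    print(f"df={df:2d}  centre: t={t_pdf(0, df):.4f} normal={normal_pdf(0):.4f}  "
          f"tail: t={t_pdf(3, df):.4f} normal={normal_pdf(3):.4f}")
```

With few degrees of freedom the t curve is lower at the centre and higher in the tails than the normal curve, and the two curves move closer together as the degrees of freedom increase.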


1.7 PARAMETRIC STATISTICAL MEASURES FOR CALCULATING DIFFERENCE BETWEEN MEANS

In some research situations we require the use of a statistical technique to determine whether a true difference exists between the parameters of the populations from which two samples are drawn. The parameters may be means, standard deviations, correlations etc. For example, suppose we wish to determine whether the population of male M.A. Psychology students enrolled with IGNOU differs from their female counterparts in their attitude towards teaching… In this case we would first draw samples of male and female M.A. Psychology students. Next, we would administer an attitude scale measuring attitude towards teaching to the selected samples, compute the means of the two samples, and find the difference between them. Let the mean of the male sample be 55 and that of the females 59. Then it has to be ascertained whether the difference of 4 between the sample means is large enough to be taken as real and not due only to sampling error or chance.

In order to test the significance of the obtained difference of 4, we need to first find out the standard error of the difference of the two means because it is reasonable to expect that the difference between two means will be subject to sampling errors. Then from the difference between the sample means and its standard error we can determine whether a difference probably exists between the population means.

In the following sections we will discuss the procedure of testing the significance of the difference between the means and correlations of the samples.

1.7.1 Significance of the Difference between the Means of Two Independent Large and Small Samples

Means are said to be independent or uncorrelated when computed from samples drawn at random from totally different and unrelated groups.

Large Samples

You have learnt that the frequency distribution of large sample means, drawn from the same population, falls into a normal distribution around the population mean (Mpop) as its measure of central tendency. It is reasonable to expect that the frequency distribution of the difference between the means computed from samples drawn from two different populations will also tend to be normal, with a mean of zero and a standard deviation which is called the standard error of the difference of means.

The standard error is denoted by $\sigma_{dM}$, which is estimated from the standard errors of the two sample means, $\sigma_{M_1}$ and $\sigma_{M_2}$. The formula is:

$$\sigma_{dM} = \sqrt{\sigma_{M_1}^2 + \sigma_{M_2}^2}$$

in which

$\sigma_{M_1}$ = SE of the mean of the first sample

$\sigma_{M_2}$ = SE of the mean of the second sample

$N_1$ = Number of cases in first sample

$N_2$ = Number of cases in second sample
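As a worked illustration of this formula, the sketch below returns to the example of male and female students with means of 55 and 59. The means come from the text; the standard deviations (10 and 12) and the sample sizes (100 each) are hypothetical values chosen only to make the arithmetic concrete, and each sample standard deviation is assumed to stand in for the population value.

```python
import math

def se_mean(sd, n):
    """Standard error of a mean: SD / sqrt(N)."""
    return sd / math.sqrt(n)

def se_difference(se1, se2):
    """Standard error of the difference between two independent means."""
    return math.sqrt(se1 ** 2 + se2 ** 2)

# Means are taken from the text; SDs and sample sizes are hypothetical.
mean_male, sd_male, n_male = 55.0, 10.0, 100
mean_female, sd_female, n_female = 59.0, 12.0, 100

se_dm = se_difference(se_mean(sd_male, n_male), se_mean(sd_female, n_female))
ratio = (mean_female - mean_male) / se_dm   # obtained difference in SE units

print(f"SE of the difference: {se_dm:.3f}")
print(f"difference / SE:      {ratio:.2f}")
```

With large samples, the ratio of the obtained difference to its standard error is commonly compared with normal-curve values such as 1.96 (the 5 per cent level, two-tailed) to judge whether the difference is likely to be due to sampling error alone.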


1.8 PARAMETRIC STATISTICS MEASURES RELATED TO PEARSON’S ‘r’

The mathematical basis for the standard error of Pearson’s co-efficient of correlation ‘r’ is rather complicated because of the nature of its sampling distribution. The sampling distribution of r is not normal except when the population r is near zero and the size of the sample is large (N = 30 or greater).

When r is high (0.80 or more) and N is small, the sampling distribution of r is skewed. It is also true when r is low (0.20 or less).

In view of this, a sound method for making inferences regarding Pearson’s r, especially when its magnitude is very high or very low, is to convert r into Fisher’s Z co-efficient using the conversion table provided in the Appendix (statistics book) and to find the standard error (SE) of Z.

The sampling distribution of the Z co-efficient is approximately normal regardless of the size of sample N and the size of the population r. Furthermore, the SE of Z depends only upon the size of sample N.

The formula for the standard error of Z ($\sigma_Z$) is:

$$SE_Z = \sigma_Z = \frac{1}{\sqrt{N - 3}}$$
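Instead of a conversion table, the Z value can be computed directly, since Fisher’s Z is the inverse hyperbolic tangent of r. The sketch below assumes a hypothetical correlation of 0.80 from a sample of 28 cases; the function names `fisher_z` and `se_z` are illustrative only.

```python
import math

def fisher_z(r):
    """Fisher's Z transformation of a correlation coefficient r."""
    return math.atanh(r)  # same as 0.5 * ln((1 + r) / (1 - r))

def se_z(n):
    """Standard error of Fisher's Z for a sample of size n (n > 3)."""
    return 1 / math.sqrt(n - 3)

r, n = 0.80, 28  # hypothetical correlation and sample size
print(f"Z = {fisher_z(r):.4f}, SE_Z = {se_z(n):.2f}")
```

Note that the standard error depends only on N, so the same value applies whatever the size of r.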

The method of determining the standard error of the difference between Pearson’s co-efficient of correlation of two samples is first to convert the r’s into Fisher’s Z co-efficient and then to determine the significance of the difference between the two Z’s.

When we have two correlations between the same two variables, X and Y, computed from two totally different and unmatched samples, the standard error of a difference between two corresponding Z’s is computed by the formula:

$$SE_{dZ} = \sigma_{Z_1 - Z_2} = \sqrt{\frac{1}{N_1 - 3} + \frac{1}{N_2 - 3}}$$

in which $N_1$ and $N_2$ = sizes of the two samples

The significance of the difference between the two Z’s is tested with the following formula:

$$CR = \frac{Z_1 - Z_2}{SE_{dZ}}$$
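The whole procedure for two independent correlations can be put together in a few lines. This is a minimal sketch under assumed values: the correlations (0.60 and 0.40) and the sample sizes (53 each) are hypothetical, and `compare_correlations` is an illustrative name.

```python
import math

def compare_correlations(r1, n1, r2, n2):
    """Return (SE of the Z difference, critical ratio) for two independent r's."""
    z1, z2 = math.atanh(r1), math.atanh(r2)          # Fisher's Z for each r
    se_dz = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))   # SE of Z1 - Z2
    return se_dz, (z1 - z2) / se_dz

# Hypothetical correlations between the same two variables in two unmatched samples.
se_dz, cr = compare_correlations(0.60, 53, 0.40, 53)
print(f"SE of difference = {se_dz:.2f}, CR = {cr:.2f}")
```

In this illustration the critical ratio falls below 1.96, so a difference of this size between the two correlations would not be regarded as significant at the 5 per cent level.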

1.8.1 Non-parametric Tests Used for Inference

The most frequently used non-parametric tests for drawing statistical inferences in case of unrelated or independent samples are:

  1. Chi square test;

  2. Median test; and

  3. Mann-Whitney ‘U’ test.

The use and application of these tests are discussed below:

The Chi Square ($\chi^2$) Test

The chi square test is applied only to discrete data, that is, data that are counted rather than measured. It is a test of independence and is used to estimate the likelihood that some factor other than chance accounts for the observed relationship.

The chi square ($\chi^2$) is not a measure of the degree of relationship between the variables under study.

The chi square test merely evaluates the probability that the observed relationship results from chance. The basic assumption, as with other tests of statistical significance, is that the sample observations have been randomly selected.

The formula for chi-square ($\chi^2$) is:

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

In which

$f_o$ = frequency of occurrence of observed or experimentally determined facts.

$f_e$ = expected frequency of occurrence.
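The formula can be applied to a contingency table as sketched below. The observed frequencies are hypothetical, and the expected frequencies are derived from the row and column totals, which is the usual approach for a test of independence but is an assumption here because the unit does not say how the expected frequencies are obtained.

```python
def chi_square(observed):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / total  # expected frequency
            chi2 += (fo - fe) ** 2 / fe
    return chi2

# Hypothetical counts: rows are two groups, columns are two response categories.
observed = [[30, 20], [20, 30]]
print(f"chi-square = {chi_square(observed):.2f}")
```

For a 2 x 2 table (one degree of freedom) the conventional critical value at the 5 per cent level is 3.84, so the statistic is compared against that figure.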

The Median Test

The median test is used for testing whether two independent samples differ in central tendency. It gives information as to whether it is likely that two independent samples have been drawn from populations with the same median. It is particularly useful even when the measurements for the two samples are expressed in an ordinal scale. In using the median test, we first calculate the combined median for all measures (scores) in both samples. Then both sets of scores are dichotomized at the combined median and the data are set in a 2 x 2 table with two rows, one containing the scores below the median and the other containing the scores above the median. On the column side we have two columns, one for sample 1 and the other for sample 2.
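The construction of the 2 x 2 table can be sketched as follows. The two samples are hypothetical, and placing scores equal to the combined median in the "below" row is one common convention assumed here, since the unit does not say how such scores are treated.

```python
import statistics

def median_test_table(sample1, sample2):
    """Return (combined median, 2 x 2 table) with rows [below-or-equal, above]."""
    combined_median = statistics.median(sample1 + sample2)
    # Each row holds the counts for sample 1 and sample 2, in that order.
    below = [sum(x <= combined_median for x in s) for s in (sample1, sample2)]
    above = [sum(x > combined_median for x in s) for s in (sample1, sample2)]
    return combined_median, [below, above]

# Hypothetical scores for two independent samples.
sample1 = [12, 15, 18, 20, 22, 25]
sample2 = [5, 8, 9, 11, 14, 16]

combined_median, table = median_test_table(sample1, sample2)
print("combined median:", combined_median)
print("below median:", table[0])
print("above median:", table[1])
```

The resulting table of counts can then be evaluated with a chi-square test to decide whether the two samples are likely to come from populations with the same median.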

The Mann-Whitney U Test

The Mann-Whitney U test is more useful than the median test. It is one of the most useful alternatives to the parametric t test when the parametric assumptions cannot be met and when the measurements are expressed in ordinal scale values.
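The unit does not give the computation of U at this point, so the following is only a sketch based on the standard pair-counting definition: U for one sample is the number of pairs in which its score exceeds a score from the other sample, with ties counted as one half. The data are hypothetical.

```python
def mann_whitney_u(sample1, sample2):
    """Return the smaller of the two U statistics for two independent samples."""
    u1 = 0.0
    for x in sample1:
        for y in sample2:
            if x > y:
                u1 += 1.0      # sample-1 score ranks above sample-2 score
            elif x == y:
                u1 += 0.5      # tied pair counts half for each sample
    u2 = len(sample1) * len(sample2) - u1   # U1 + U2 always equals n1 * n2
    return min(u1, u2)

# Hypothetical scores for two independent samples.
sample1 = [12, 15, 18, 20, 22, 25]
sample2 = [5, 8, 9, 11, 14, 16]
print("U =", mann_whitney_u(sample1, sample2))
```

A small value of U indicates that the scores of one sample lie almost entirely above those of the other; the obtained U is then compared with tabled critical values for the two sample sizes.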

1.9 SOME NON-PARAMETRIC TESTS FOR RELATED SAMPLES

Various tests are used in drawing statistical inferences in the case of related samples. In this section we shall confine our discussion to the use of the Sign Test and the Wilcoxon Matched-Pairs Signed-Ranks Test only.

The Sign Test

The sign test is the simplest test of significance in the category of non-parametric tests. It makes use of plus and minus signs rather than quantitative measures as its data. It is particularly useful in situations in which quantitative measurement is impossible or inconvenient, but in which it is possible to rank the two members of each pair with respect to each other on the basis of superior or inferior performance.

The sign test is used in the case of a single sample from which observations are obtained under two experimental conditions, where the researcher wants to establish that the two conditions are different.
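A minimal sketch of the sign test for one sample observed under two conditions is given below. The scores are hypothetical, tied pairs are dropped, and the two-sided probability is taken from the binomial distribution with p = 1/2, which is the usual way of evaluating the signs but is an assumption here because the unit does not show the computation.

```python
from math import comb

def sign_test(before, after):
    """Return (plus count, minus count, two-sided p-value) for paired data."""
    diffs = [a - b for b, a in zip(before, after) if a != b]  # ties are dropped
    plus = sum(d > 0 for d in diffs)
    minus = len(diffs) - plus
    n, k = len(diffs), min(plus, minus)
    # Under the null hypothesis each sign is + or - with probability 1/2.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return plus, minus, min(1.0, 2 * tail)

# Hypothetical scores of ten subjects under two experimental conditions.
before = [10, 12, 9, 14, 11, 13, 8, 15, 10, 12]
after = [12, 15, 11, 13, 14, 16, 10, 18, 10, 14]

plus, minus, p_value = sign_test(before, after)
print(f"plus signs: {plus}, minus signs: {minus}, p = {p_value:.4f}")
```

Only the direction of each difference enters the calculation, which is why the test can be used when the two members of a pair can be ranked but not measured.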

1.12 GLOSSARY

Assumptions : Prerequisite conditions

Population : Larger group of people to which inferences are made.

Sample : Small portion of the population which is taken to represent the population.

Normal Curve : Bell shaped frequency distribution that is symmetrical and unimodal.

Distribution free tests : Hypothesis-testing procedures making no assumptions about population parameters.

Categorical Scale : Variable with values that are categories, that is, they are names rather than numbers.

Test : A tool to measure observable behaviour.

Homoscedasticity : Populations must have the same variance or, in special cases, must have a known ratio of variances.

1.13 SUGGESTED READINGS

Asthana, H.S., and Bhushan, B. (2007). Statistics for Social Sciences (with SPSS Applications).

Aggrawal, B.L. (2009). Basic Statistics. Delhi: New Age International Publisher.

Guilford, J.P. (1965). Fundamental Statistics in Psychology and Education. New York: McGraw Hill Book Company.

Siegel, S. (1956). Non-parametric Statistics for the Behavioural Sciences. Tokyo: McGraw-Hill Kogakusha Ltd.

Siegel, S., and Castellan, N.J., Jr. (1988). Non-parametric Statistics for the Behavioural Sciences. New Delhi: McGraw Hill Book Company.
