Using Ancillary Data for Land-Cover Classification: A Landsat TM Case Study

Rule-Based Classification Systems Using Classification and Regression Tree (CART) Analysis

Rick L. Lawrence and Andrea Wright

Incorporating ancillary data into image classification can increase classification accuracy and precision. Rule-based classification systems using expert systems or machine learning are a particularly useful means of incorporating ancillary data, but have been difficult to implement. We developed a means for creating a rule-based classification using classification and regression tree analysis (CART), a commonly available statistical method. The CART classification does not require expert knowledge, automatically selects useful spectral and ancillary data from data supplied by the analyst, and can be used with continuous and categorical ancillary data. We demonstrated the use of the CART classification at three increasingly detailed classification levels for a portion of the Greater Yellowstone Ecosystem. Overall accuracies ranged from 96 percent at level 1, to 79 percent at level 2, and 65 percent at level 3.

Introduction

Ancillary data, either in addition to or derived from remotely sensed data, has the potential for increasing classification accuracy. Incorporation of ancillary data into classification techniques, however, has been problematic. We developed a straightforward approach for creating a rule-based classification without expert knowledge by applying a commonly available statistical technique, classification and regression tree (CART) analysis, to multiple spectral and ancillary data layers.

Classification and Regression Tree Analysis-Background

CART is an increasingly popular form of statistical analysis available through widely used statistical packages, such as S-Plus (Venables and Ripley, 1997; Mathsoft, 1998; Lawrence and Ripple, 2000). CART operates by recursively splitting the data until ending points, or terminal nodes, are achieved using preset criteria. CART therefore begins by analyzing all explanatory variables and determining which binary division of a single explanatory variable best reduces deviance in the response variable (Breiman et al., 1984; Efron and Tibshirani, 1991; Venables and Ripley, 1997). In the case of image classification, explanatory variables consist of spectral and ancillary data, whether continuous or categorical, and the response variable is the land-cover/land-use class list.
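The search for the first split can be sketched in a few lines of Python. This is a minimal illustration rather than the S-Plus implementation: it assumes the usual multinomial deviance for a classification node, handles only continuous variables (categorical layers would need a search over subsets of categories), and uses hypothetical variable names (`tm_band4`, `elevation_m`) and made-up training pixels.

```python
import math
from collections import Counter

def deviance(labels):
    """Multinomial deviance of a node: -2 * sum(n_k * ln(n_k / n))."""
    n = len(labels)
    return -2.0 * sum(c * math.log(c / n) for c in Counter(labels).values())

def best_split(rows, labels):
    """Try every variable and threshold; return the binary split that most
    reduces deviance as (variable, threshold, reduction)."""
    parent = deviance(labels)
    best = (None, None, 0.0)
    for var in rows[0]:
        values = sorted({row[var] for row in rows})
        # Candidate thresholds are midpoints between adjacent observed values.
        for lo, hi in zip(values, values[1:]):
            cut = (lo + hi) / 2.0
            left = [y for row, y in zip(rows, labels) if row[var] < cut]
            right = [y for row, y in zip(rows, labels) if row[var] >= cut]
            reduction = parent - deviance(left) - deviance(right)
            if reduction > best[2]:
                best = (var, cut, reduction)
    return best

# Hypothetical training pixels: one spectral band and one ancillary layer.
rows = [
    {"tm_band4": 40, "elevation_m": 1500},
    {"tm_band4": 45, "elevation_m": 2400},
    {"tm_band4": 90, "elevation_m": 1550},
    {"tm_band4": 95, "elevation_m": 2450},
]
labels = ["water", "water", "forest", "forest"]
var, cut, reduction = best_split(rows, labels)
print(var, cut, round(reduction, 3))
```

In this toy data the spectral band separates the two classes perfectly while elevation does not, so the search selects `tm_band4` without the analyst having to say which layer is useful.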

R.L. Lawrence is with the Mountain Research Center, Department of Land Resources and Environmental Sciences, P.O. Box 173490, Montana State University, Bozeman, MT 59717-3490 (rickl@montana.edu).

A. Wright is with the Center for the Environment, Cornell University, Ithaca, NY 14853 (awp9@cornell.edu).

For each portion of the data resulting from this first split, the process is repeated, continuing until homogeneous terminal nodes are reached in a hierarchical tree. In the S-Plus implementation of CART, terminal nodes are defined when either the total number of observations at the node is less than ten or the deviance at the node is less than 1 percent of the total deviance for the entire tree (Venables and Ripley, 1997).
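The two stopping criteria amount to a simple test applied at every node before it is split. The sketch below is illustrative only: the function and parameter names are hypothetical, and the defaults (ten observations, a deviance fraction of 0.01) are assumptions chosen to match the criteria described above.

```python
def is_terminal(n_obs, node_deviance, total_deviance,
                min_size=10, min_dev_fraction=0.01):
    """Return True when a node should not be split further."""
    too_small = n_obs < min_size
    nearly_homogeneous = node_deviance < min_dev_fraction * total_deviance
    return too_small or nearly_homogeneous

print(is_terminal(8, 30.0, 400.0))   # True: fewer than ten observations
print(is_terminal(50, 3.0, 400.0))   # True: 3.0 is below 1 percent of 400.0
print(is_terminal(50, 30.0, 400.0))  # False: large enough and still mixed
```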

CART usually will over-fit the model, creating a tree that explains substantially all of the deviance in the original data, but in a manner that is specific to the particular data used to fit the tree. It is necessary, therefore, to prune the tree back to a level where the tree can reasonably be expected to be robust. A common method used for pruning, and a method implemented in S-Plus, involves cross validation (Venables and Ripley, 1997). In this method, the original data are randomly divided into ten equal sets. Trees are generated for nine of the data sets and validated against the tenth, with the minimum average deviance indicating the best size tree. The analyst might select a smaller tree if the cross-validation method indicates that, although additional deviance can be reduced, the amount of reduction does not justify an overly complex tree.
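The bookkeeping behind this selection can be sketched as follows. The sketch assumes the held-out deviance for each candidate tree size has already been computed and averaged over the ten folds; the deviance values are hypothetical, and the `tolerance` parameter is only one possible way to formalize the analyst's judgment that a small gain does not justify a larger tree.

```python
import random

def ten_fold_assignments(n_obs, n_folds=10, seed=0):
    """Randomly assign each observation to one of n_folds near-equal sets."""
    folds = [i % n_folds for i in range(n_obs)]
    random.Random(seed).shuffle(folds)
    return folds

def choose_tree_size(cv_deviance, tolerance=0.0):
    """cv_deviance maps tree size (number of terminal nodes) to the mean
    held-out deviance. Return the smallest size whose deviance is within
    `tolerance` (a fraction) of the minimum."""
    best = min(cv_deviance.values())
    return min(size for size, dev in cv_deviance.items()
               if dev <= best * (1.0 + tolerance))

folds = ten_fold_assignments(95)

# Hypothetical cross-validated deviances for five candidate tree sizes.
cv_deviance = {2: 310.0, 4: 240.0, 6: 205.0, 8: 200.0, 10: 204.0}
print(choose_tree_size(cv_deviance))        # 8: the minimum-deviance tree
print(choose_tree_size(cv_deviance, 0.05))  # 6: smaller tree within 5 percent
```

With a tolerance of zero the rule returns the minimum-deviance tree; a nonzero tolerance trades a small increase in deviance for a simpler set of rules.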

The result of the CART analysis is a dichotomous decision or classification tree. Each path through the tree, defined by a series of dichotomous splits, specifies the conditions that lead to a most probable class. The tree, therefore, might be viewed as a series of rules that can be used for unknown observations to predict likely class membership. When used with remotely sensed and ancillary data, this naturally extends to a rule-based classification scheme.
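The equivalence between a tree and a rule set can be made concrete with a small sketch. The tree below is entirely hypothetical (variable names, thresholds, and classes are invented for illustration and are not taken from the study); the code walks each root-to-leaf path to print a rule and applies the same tree to an unknown pixel.

```python
# Hypothetical tree: internal nodes test one variable against a threshold;
# leaves hold the most probable class.
tree = {
    "var": "elevation_m", "cut": 2000,
    "below": {"var": "tm_band4", "cut": 60,
              "below": {"label": "water"},
              "above": {"label": "grassland"}},
    "above": {"label": "conifer forest"},
}

def extract_rules(node, conditions=()):
    """Walk every root-to-leaf path and return (conditions, class) pairs."""
    if "label" in node:
        return [(list(conditions), node["label"])]
    test_below = f"{node['var']} < {node['cut']}"
    test_above = f"{node['var']} >= {node['cut']}"
    return (extract_rules(node["below"], conditions + (test_below,))
            + extract_rules(node["above"], conditions + (test_above,)))

def classify(node, pixel):
    """Follow the splits for one pixel until a terminal node is reached."""
    while "label" not in node:
        branch = "below" if pixel[node["var"]] < node["cut"] else "above"
        node = node[branch]
    return node["label"]

for conditions, label in extract_rules(tree):
    print("IF " + " AND ".join(conditions) + " THEN " + label)

print(classify(tree, {"elevation_m": 1800, "tm_band4": 75}))  # grassland
```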

Ancillary Data Incorporation In Classification-Background

Traditional methods of land-use/land-cover classification using satellite imagery have relied solely on the spectral information present in the images. With purely spectral approaches, the spectral and spatial resolutions of the imagery are the primary determinants of the level of classification detail that can be achieved. For example, given the spectral and spatial resolution of Landsat Thematic Mapper (TM) imagery, such images have generally been considered adequate for mapping USGS level (Jensen and Cowen, 1999). By using ancillary data in addition to spectral responses, however, it might be possible to achieve either greater classification detail or greater classification accuracy for a given combination of spectral and spatial resolutions.

Classification techniques using ancillary data in addition to spectral data have demonstrated that, in many cases, the proper addition of ancillary data to spectral data can lead to greater class distinctions (e.g., Strahler et al., 1978; Hutchenson, 1982; Trotter, 1991; Jensen, 1996). Ancillary data generally is derived from GIS layers, such as digital elevation models, but might also include information derived from the imagery, such as texture information or multi-date composites. Initially,

Photogrammetric Engineering & Remote Sensing, Vol. 67, No. 10, October 2001, pp. 1137-1142.

0099-1112/01/6710-1137$3.00/0

© 2001 American Society for Photogrammetry and Remote Sensing




used (e.g., Strahler et al., 1978; Elumnoh and Shrestha, 2000;

Study Area

km2, the portion of the GYE included in this study spans two TM

TABLE 6. DICHOTOMOUS CLASSIFICATION TREES FOR LEVEL 3 CLASSIFICATIONS

1.1. Slope gradient < 5°

1.1.2.1. June TM band 1 < 62.5, THEN Cottonwood

1.1.2.2. June TM band 1 > 62.5, THEN Cottonwood

4.2.2. August TM band 6 > 133.

sification. With expert systems, a priori knowledge is neces-

criminately. As with any statistical analysis, the uncritical

Acknowledgments