AN ASSESSMENT OF SEMANTIC INFORMATION AUTOMATICALLY
EXTRACTED FROM MACHINE READABLE DICTIONARIES
Jean Véronis†‡ and Nancy Ide†
†Department of Computer Science
VASSAR COLLEGE
Poughkeepsie, New York 12601 (U.S.A.)
‡Groupe Représentation et Traitement des Connaissances
CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE
31, Ch. Joseph Aiguier
13402 Marseille Cedex 09 (France)
ABSTRACT
In this paper we provide a quantitative evaluation of
information automatically extracted from machine
readable dictionaries. Our results show that for any one
dictionary, 55-70% of the extracted information is
garbled in some way. However, we show that this
error rate can be dramatically reduced to about 6% by
combining the information extracted from five
dictionaries. It therefore appears that even if individual
dictionaries are an unreliable source of semantic
information, multiple dictionaries can play an important
role in building large lexical-semantic databases.
1. INTRODUCTION
In recent years, it has become increasingly clear that the
limited size of existing computational lexicons and the
poverty of the semantic information they contain
represent one of the primary bottlenecks in the
development of realistic natural language processing
(NLP) systems. The need for extensive lexical and
semantic databases is evident in the recent initiation of a
number of projects to construct massive generic
lexicons for NLP (project GENELEX in Europe or
EDR in Japan).
The manual construction of large lexical-semantic
databases demands enormous human resources, and
there is a growing body of research into the possibility
of automatically extracting at least a part of the required
lexical and semantic information from everyday
dictionaries. Everyday dictionaries are obviously not
structured in a way that enables their immediate use in
NLP systems, but several studies have shown that
relatively simple procedures can be used to extract
taxonomies and various other semantic relations (for
example, Amsler, 1980; Calzolari, 1984; Chodorow,
Byrd, and Heidorn, 1985; Markowitz, Ahlswede, and
Evens, 1986; Byrd et al., 1987; Nakamura and Nagao,
1988; Véronis and Ide, 1990; Klavans, Chodorow, and
Wacholder, 1990; Wilks et al., 1990).
However, it remains to be seen whether information
automatically extracted from dictionaries is sufficiently
complete and coherent to be actually usable in NLP
systems. Although there is concern over the quality of
automatically extracted lexical information, very few
empirical studies have attempted to assess it
systematically, and those that have done so have been
restricted to consideration of the quality of grammatical
information (e.g., Akkerman, Masereeuw, and Meijs,
1985). No evaluation of automatically extracted
semantic information has been published.
The authors would like to thank Lisa Lassck and Anne Gilman for their contribution to this work.
In this paper, we report the results of a quantitative
evaluation of automatically extracted semantic data. Our
results show that for any one dictionary, 55-70% of the
extracted information is garbled in some way. These
results at first call into doubt the validity of automatic
extraction from dictionaries. However, in section 4 we
show that this error rate can be dramatically reduced to
about 6% by several means--most significantly, by
combining the information extracted from five
dictionaries. It therefore appears that even if individual
dictionaries are an unreliable source of semantic
information, multiple dictionaries can play an important
role in building large lexical-semantic databases.
2. METHODOLOGY
Our strategy involves automatically extracting
hypernyms from five English dictionaries for a limited
corpus. To determine where problems exist, the
resulting hierarchies for each dictionary are compared to
an "ideal" hierarchy constructed by hand. The five
dictionaries compared were: the Collins English
Dictionary (CED), the Oxford Advanced Learner's
Dictionary (OALD), the COBUILD Dictionary, the
Longman's Dictionary of Contemporary English
(LDOCE) and the Webster's 9th Dictionary (W9).
We begin with the most straightforward case in order to
determine an upper bound for the results. We deal with
words within a domain which poses few modelling
problems, and we focus on hyperonymy, which is
probably the least arguable semantic relation and has
been shown to be the easiest to extract. If the results are
poor under such favorable constraints, we can foresee
that they will be poorer for more complex (abstract)
domains and less clearly cut relations.
An ideal hierarchy probably does not exist for the entire
dictionary; however, a fair degree of consensus seems
possible for carefully chosen terms within a very
restricted domain. We have therefore selected a corpus
of one hundred kitchen utensil terms, each representing
a concrete, individual object--for example, cup, fork,
saucepan, decanter, etc. All of the terms are count
nouns. Mass nouns, which can cause problems, have
been excluded (for example, the mass noun cutlery is
not a hypernym of knife). Other idiosyncratic cases,
such as chopsticks (where it is not clear if the utensil is
one object or a pair of objects) have also been
eliminated from the corpus. This makes it easy to apply
simple tests for hyperonymy, which, for instance,
enable us to say that Y is a hypernym of X if "this is an
X" entails but is not entailed by "this is a Y" (Lyons,
1963).
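The entailment test can be made concrete if each term is modeled, purely for illustration, by the set of objects it denotes. The sketch below is a toy under that assumption: the `EXTENSION` table and the function `is_hypernym` are hypothetical names, not part of the study. Y is a hypernym of X when every X is a Y but not every Y is an X.

```python
# Toy extensions: the objects each term is assumed to denote (illustrative only).
EXTENSION = {
    "saucepan": {"obj1"},
    "pan": {"obj1", "obj2"},
    "vessel": {"obj1", "obj2", "obj3"},
}


def is_hypernym(y: str, x: str) -> bool:
    """True if "this is an x" entails, but is not entailed by, "this is a y"."""
    # Proper-subset test: every x is a y, and at least one y is not an x.
    return EXTENSION[x] < EXTENSION[y]
```

Under this model a term is never its own hypernym, because the subset must be proper.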
Chodorow, Byrd, and Heidorn (1985) proposed a
heuristic for extracting hypernyms which exploits the
fact that definitions for nouns typically give a hypernym

term as the head of the defining noun phrase. Consider the following examples:

dipper  a ladle used for dipping... [CED]
ladle  a long-handled spoon... [CED]
spoon  a metal, wooden, or plastic utensil... [CED]

In very general terms, the heuristic consists of extracting the word which precedes the first preposition, relative pronoun, or participle encountered in the definition text. When this word is "empty" (e.g. one, any, kind, class) the true hypernym is the head of the noun phrase following the preposition of:

slice  any of various utensils... [CED]
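A minimal sketch of this heuristic follows, assuming small hand-written word lists in place of a real part-of-speech tagger. The lists `PREPOSITIONS`, `RELATIVE_PRONOUNS`, and `PARTICIPLES` and the function names are illustrative assumptions, not the original implementation of Chodorow, Byrd, and Heidorn. Plural heads are returned unlemmatized.

```python
import re

# Assumed closed lists standing in for real part-of-speech information.
PREPOSITIONS = {"of", "for", "with", "in", "on", "to", "from", "by"}
RELATIVE_PRONOUNS = {"that", "which", "who", "whose"}
PARTICIPLES = {"used", "made", "having", "consisting", "designed"}
STOP_WORDS = PREPOSITIONS | RELATIVE_PRONOUNS | PARTICIPLES
EMPTY_HEADS = {"one", "any", "kind", "class"}


def tokenize(text):
    """Lowercase word tokens; hyphenated words stay in one piece."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def head_before_stop(tokens):
    """Return (word preceding the first stop word, index of that stop word)."""
    for i, token in enumerate(tokens):
        if token in STOP_WORDS:
            return (tokens[i - 1] if i > 0 else None), i
    # No stop word: the (possibly truncated) phrase ends with its head.
    return (tokens[-1] if tokens else None), len(tokens)


def extract_hypernym(definition):
    tokens = tokenize(definition)
    head, stop = head_before_stop(tokens)
    # "Empty" head followed by "of": look inside the following noun phrase.
    if head in EMPTY_HEADS and stop < len(tokens) and tokens[stop] == "of":
        head, _ = head_before_stop(tokens[stop + 1:])
    return head
```

For instance, `extract_hypernym("a ladle used for dipping...")` stops at the participle "used" and returns the preceding word, while the definition of slice falls through the empty head "any" to the noun phrase after "of".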

Automatically extracted hierarchies are necessarily tangled (Amsler, 1980) because many words are polysemous. For example, in the CED, the word pan has the following senses (among others):

pan¹  1.a a wide metal vessel... [CED]
pan²  1 the leaf of the betel tree... [CED]

The CED also gives pan as the hypernym for saucepan, which taken together yields the hierarchy in figure 1.a. The tangled hierarchy is problematic because, following the path upwards from saucepan, we find that saucepan can be a kind of leaf. This is clearly erroneous. A hierarchy utilizing senses rather than words would not be tangled, as shown in figure 1.b. In our study, the hierarchy was disambiguated by hand. Sense disambiguation in dictionary definitions is a difficult problem, and we will not address it here; this problem is the focus of much current research and is considered in depth elsewhere (e.g., Byrd et al., 1987; Byrd, 1989; Véronis and Ide, 1990; Klavans, Chodorow, and Wacholder, 1990; Wilks et al., 1990).

Figure 1. "Sense-tangled" hierarchy: a) word hierarchy, b) sense hierarchy
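The difference between the word hierarchy and the sense hierarchy can be sketched as two small graphs. The sense labels such as `pan/1` and the function `ancestors` are notation assumed for this illustration only; the links are those described in the text for pan and saucepan.

```python
# Word-level links: a word points to the hypernyms found under any of its senses.
WORD_HYPERNYMS = {
    "saucepan": {"pan"},
    "pan": {"vessel", "leaf"},
}

# Sense-level links: each sense points only to the sense it was defined with.
SENSE_HYPERNYMS = {
    "saucepan/1": {"pan/1"},
    "pan/1": {"vessel/1"},
    "pan/2": {"leaf/1"},
}


def ancestors(term, links):
    """Collect every node reachable by following hypernym links upward."""
    seen = set()
    stack = [term]
    while stack:
        for parent in links.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen
```

In the word-level graph, leaf is reachable from saucepan; in the sense-level graph it is not, because saucepan is linked to the vessel sense of pan only.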

3. EVALUATION

Hierarchies constructed with methods such as those outlined in section 2 show, upon close inspection, several serious problems. In this section, we describe the most pervasive problems and give their frequency in our five dictionaries. The problems fall into two general types: those which arise because information in the dictionary is incomplete, and those which are the result of a lack of distinction among terms and the lack of a one-to-one mapping between terms and concepts, especially at the highest levels of the hierarchy.

3.1. Incomplete information

The information in dictionaries is incomplete for two main reasons. First, since a dictionary is typically the product of several lexicographers' efforts and is constructed, revised, and updated over many years, there exist inconsistencies in the criteria by which the hypernyms given in definition texts are chosen. In addition, space and readability restrictions, on the one hand, and syntactic restrictions on phrasing, on the other, may dictate that certain information is unspecified in definition texts or left to be implied by other parts of the definition.

3.1.1. Attachment too high: 21-34%

The most pervasive problem in automatically extracted hierarchies is the attachment of terms too high in the hierarchy. It occurs in 21-34% of the definitions in our sample from the five dictionaries (figure 8). For example, while pan and bottle are vessels in the CED, cup and bowl are simply containers, the hypernym of vessel. Obviously, "this is a cup" and "this is a bowl" both entail (and are not entailed by) "this is a vessel". Further, other dictionaries give vessel as the hypernym for cup and bowl. Therefore, the attachment of cup and bowl to the higher-level term container seems to be an inconsistency within the CED.

The problem of attachment too high in the hierarchy occurs relatively randomly within a given dictionary. In dictionaries with a controlled definition vocabulary (such as the LDOCE), the problem of attachment at high levels of the hierarchy results also from a lack of terms from which to choose. For example, ladle and dipper are both attached to spoon in the LDOCE, although "this is a dipper" entails and is not entailed by "this is a ladle". There is no way that dipper could be defined as a ladle (as, for instance, in the CED), since ladle is not in the defining vocabulary. As a result, hierarchies extracted from the LDOCE are consistently flat (figure 7).
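One way to detect this problem mechanically is to compare each extracted hypernym with the hand-built ideal hierarchy. The sketch below assumes a tiny fragment of such a hierarchy, reconstructed from the CED example above; the tables `IDEAL_PARENT` and `EXTRACTED` and the function names are hypothetical, and the actual ideal hierarchy used in the study is not reproduced here.

```python
# Assumed fragment of the ideal hierarchy (child -> immediate hypernym).
IDEAL_PARENT = {
    "cup": "vessel",
    "bowl": "vessel",
    "pan": "vessel",
    "bottle": "vessel",
    "vessel": "container",
}

# Hypernyms as described for the CED in the text.
EXTRACTED = {
    "cup": "container",
    "bowl": "container",
    "pan": "vessel",
    "bottle": "vessel",
}


def ideal_ancestors(term):
    """Ideal hypernyms of term, nearest first."""
    chain = []
    while term in IDEAL_PARENT:
        term = IDEAL_PARENT[term]
        chain.append(term)
    return chain


def attachment(term):
    """Classify the extracted hypernym of term against the ideal hierarchy."""
    chain = ideal_ancestors(term)
    found = EXTRACTED[term]
    if chain and found == chain[0]:
        return "correct"
    if found in chain[1:]:
        # The extracted term is a valid but non-immediate hypernym.
        return "too high"
    return "other"
```

Here cup and bowl are classified as attached too high, since container is a hypernym of their ideal parent vessel.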

3.1.2. Absent hypernyms: 0-3%

In some cases, strategies like that of Chodorow, Byrd and Heidorn yield incorrect hypernyms, as in the following definitions:

grill  A grill is a part of a cooker... [COBUILD]
corkscrew  a pointed spiral piece of metal... [W9]
dinner service  a complete set of plates and dishes... [LDOCE, not included in our corpus]

The words part, piece, set, are clearly not hypernyms of the defined concepts: it is virtually meaningless to say that grill is a kind of part, or that corkscrew is a kind of piece. In these cases, the head of the noun phrase serves to mark another relation: part-whole, member-class, etc. It is easy to reject these and similar words (member, series, etc.) as hypernyms, since they form a closed list (Klavans, Chodorow, and Wacholder, 1990). However, excluding these words leaves us with no hypernym. We call these "absent hypernyms"; they occur in 0-3% of the definitions in our sample corpus (figure 8).

The absence of a hypernym in a given definition text does not necessarily imply that no hypernym exists. For example, "this is a corkscrew" clearly entails (and is not entailed by) "this is a device" (the hypernym given by the COBUILD and the CED). In many cases, the lack of a hypernym seems to be the result of concern over space and/or readability. We can imagine, for example, that the definition for corkscrew could be more fully specified as "a device consisting of a pointed spiral piece of metal..." In such cases, lexicographers rely on the reader's ability to deduce that something made of metal, with a handle, used for pulling corks, can be called a device. However, for some terms, such as cutlery or dinner service, it is not clear that a hypernym exists. Note that we have voluntarily excluded problematic terms of this kind from our corpus, in order to restrict our evaluation to the best case.
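Rejecting such heads amounts to a lookup in the closed list. The following sketch uses only the marker words named above; the names `RELATION_MARKERS`, `EXTRACTED_HEADS`, and `filter_heads` are assumptions made for illustration.

```python
# Closed list of heads that mark part-whole, member-class, or similar relations.
RELATION_MARKERS = {"part", "piece", "set", "member", "series"}

# Heads as the extraction heuristic would return them for three definitions.
EXTRACTED_HEADS = {"grill": "part", "corkscrew": "piece", "ladle": "spoon"}


def filter_heads(heads):
    """Keep genuine hypernyms; record None where the hypernym is absent."""
    return {
        word: (None if head in RELATION_MARKERS else head)
        for word, head in heads.items()
    }
```

Entries mapped to None correspond to the "absent hypernym" cases, which must then be filled from another source.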

3.1.3. Missing overlaps: 8-14%

Another problem results from the necessary choices that lexicographers must make in an attempt to specify a

where spatula should appear (since we have no indication that it is not a container), but at least it shows that there may be some utensils which are not containers. Although this representation is more intuitively accurate than the representation in figure 5.b, ultimately it goes too far in delineating the relations among terms. In actual use, the distinctions among terms are much less clear-cut than figure 6 implies. For instance, the figure indicates that all tools that are containers are also implements, but it is certainly not clear that humans would agree to this or use the terms in a manner consistent with this specification. Dictionaries themselves do not agree, and when taken formally they yield very different diagrams for higher level concepts.

Figure 6. Solving "loops"

Figure 8 shows that 7-11% of the definitions use a hypernym that is itself defined circularly.

Figure 7. Hierarchies for the CED and LDOCE
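Whether a definition uses a circularly defined hypernym can be checked by following hypernym links until a term repeats. The links in `HYPERNYM` below are hypothetical and are not quoted from any of the five dictionaries; the function names are likewise assumptions, and the sketch handles only one hypernym per term.

```python
# Hypothetical single-hypernym links, for illustration only.
HYPERNYM = {
    "fork": "implement",
    "implement": "tool",
    "tool": "instrument",
    "instrument": "implement",
    "cup": "vessel",
    "vessel": "container",
}


def is_circular(term):
    """True if following hypernym links from term leads back to term."""
    seen = set()
    current = HYPERNYM.get(term)
    while current is not None and current not in seen:
        if current == term:
            return True
        seen.add(current)
        current = HYPERNYM.get(current)
    return False


def uses_circular_hypernym(term):
    """True if the hypernym given for term is itself defined circularly."""
    parent = HYPERNYM.get(term)
    return parent is not None and is_circular(parent)
```

With these links, fork is not circular itself but uses the circularly defined hypernym implement, which is the kind of case counted in figure 8.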

3.3. Summary

Altogether, the problems described in the sections above yield a 55-70% error rate in automatically extracted hierarchies. Given that we have attempted to consider the most favorable case, it appears that any single dictionary, taken in isolation, is a poor source of automatically extracted semantic information. This is made more evident in figure 7, which demonstrates the marked differences in hierarchies extracted from the CED and LDOCE for a small subset of our corpus. A summary of our results appears in figure 8.

Figure 8. Quantitative evaluation (COBUILD, COLLINS, LDOCE, OALD, W9, COMBINED)

4. REFINING

We have concluded that hierarchies extracted using strategies such as that of Chodorow, Byrd, and Heidorn are seriously flawed, and are therefore likely to be unusable in NLP systems. However, in this section we discuss various means to refine automatically extracted hierarchies, most of which can be performed automatically.

4.1. Merging dictionaries

It is possible to use information provided in the differentiae of definition texts to refine hierarchies; for example, in the definition

vessel  any object USED AS a container... [CED]

the automatically extracted hypernym is object. However, some additional processing of the definition text enables the extraction of container following the phrase "used as". It is also possible to use other definitions. For example, the CED does not specify that knife and spoon are implements, but this information is provided in the definition of cutlery:

cutlery  implements used for eating SUCH AS knives, forks, and spoons. [CED]

The extraction of information from differentiae demands some extra parsing, which may be difficult for complex definitions. Also, further research is required to determine which phrases function as markers for which kind of information, and to determine how consistent their use is. More importantly, such information is sporadic, and its extraction may require more effort than the results warrant. We therefore seek more "brute force" methods to improve automatically extracted hierarchies.

One of the most promising strategies for refining extracted information is the use of information from several dictionaries. Hierarchies derived from individual dictionaries suffer from incompleteness, but it is extremely unlikely that the same information is consistently missing from all dictionaries. For instance, the CED attaches cup to container, which is too high in the hierarchy, while the W9 attaches it lower, to vessel. It is therefore possible to use taxonomic information from several dictionaries to fill in absent hypernyms, missing links, and to rectify cases of too high attachment.

To investigate this possibility, we merged the information extracted from the five English dictionaries in our database. The individual data for the five dictionaries was organized in a table, as in figure 9. Merging these hierarchies into a single hierarchy was accomplished automatically by applying a simple algorithm, which scans the table line-by-line, as follows:

1) regard cells containing multiple heads conjoined by or as null, since, as we saw in section 3.2.1, they do not reliably provide a hypernym.

2) if all the cells agree (as for ladle), keep that term as the hypernym. Otherwise:
   a) if a term is a hypernym of another term in the line, ignore it.
   b) take the remaining cell or cells as the hypernym(s).

WORD      COBUILD    COLLINS         LDOCE       OALD       W9              Combined
ladle     spoon      spoon           spoon       spoon      spoon           spoon
basin     container  container       container   bowl       vessel          bowl
ewer      jug        jug OR pitcher  container   pitcher    pitcher OR jug  pitcher
saucepan  pot        pan             pot         pot        pan             pot AND pan
grill     (absent)   device          (absent)    device     utensil         device AND utensil
fork      tool       implement       instrument  implement  implement       tool, implement AND instrument

Figure 9. Merging hierarchies

This algorithm must be applied recursively, since, for example, it may not yet be known when evaluating basin that container is a hypernym of vessel, and vessel is a hypernym of bowl, until those terms are themselves

processed. Therefore, several passes through the table are required. Note that if after applying the algorithm several terms are left as hypernyms for a given word, we effectively create an overlap in the hierarchy. For example, saucepan is attached to both pot and pan, and fork is attached to tool, implement, and instrument.
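The merging procedure can be sketched as a repeated pass over the table until nothing changes. This is an illustrative reading of the steps described, not the authors' program. The rows for ladle, basin, saucepan, grill, and fork follow figure 9; the rows for bowl and vessel are assumptions added so that the basin example can resolve; and all function names, along with the `max_passes` guard, are hypothetical. Circular definitions are not handled.

```python
TABLE = {
    "ladle": ["spoon", "spoon", "spoon", "spoon", "spoon"],
    "basin": ["container", "container", "container", "bowl", "vessel"],
    "saucepan": ["pot", "pan", "pot", "pot", "pan"],
    "grill": [None, "device", None, "device", "utensil"],
    "fork": ["tool", "implement", "instrument", "implement", "implement"],
    # Assumed rows, added only so that the basin example can be resolved.
    "bowl": ["vessel", "container"],
    "vessel": ["container"],
}


def normalize(cell):
    """Step 1: absent cells and or-conjoined heads count as null."""
    if cell is None or " OR " in cell:
        return None
    return cell


def ancestors(term, merged):
    """Every hypernym of term known so far in the merged hierarchy."""
    seen, stack = set(), [term]
    while stack:
        for parent in merged.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def merge_row(cells, merged):
    terms = {c for c in map(normalize, cells) if c is not None}
    # Step 2a: ignore a term that is a known hypernym of another term in the row.
    redundant = {
        t for t in terms
        for other in terms
        if other != t and t in ancestors(other, merged)
    }
    # Step 2b: whatever remains is kept (one term when all cells agree).
    return terms - redundant


def merge(table, max_passes=10):
    merged = {}
    # Repeat until a pass changes nothing, since rows depend on each other.
    for _ in range(max_passes):
        updated = {word: merge_row(cells, merged) for word, cells in table.items()}
        if updated == merged:
            break
        merged = updated
    return merged
```

On the first pass basin keeps container, bowl, and vessel; once bowl and vessel have themselves been processed, a later pass discards container and vessel as hypernyms of bowl, leaving bowl as in figure 9.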

We evaluate the quality of the resulting combined hierarchy using the same strategy applied in section 3. It is interesting to note that in the merged hierarchy, all the absent hypernym problems (including absence due to or-heads) have been eliminated, since in every case at least one of the five dictionaries gives a valid hypernym. In addition, almost all of the attachments too high in the hierarchy and missing overlaps have disappeared, although a few cases remain (5% and 1%, respectively). None of the dictionaries, for instance, gives pot as the hypernym of teapot, although three of the five dictionaries give pot as the hypernym of coffeepot. A larger dictionary database would enable the elimination of many of these remaining imperfections (for example, New Penguin English Dictionary, not included in our database, gives pot as a hypernym of teapot).

Merging dictionaries on a large scale assumes that it is possible to automatically map senses across them. For our small sample, we mapped senses among dictionaries by hand. We describe elsewhere a promising method to automatically accomplish sense mapping, using a spreading activation algorithm (Ide and Véronis, 1990).

4.2. Covert categories

There remain a number of circularly-defined hypernyms in the combined taxonomy, which demand additional consideration on theoretical grounds. Circularly-defined terms tend to appear when lexicographers lack terms to designate certain concepts. The fact that "it is not impossible for what is intuitively recognized as a conceptual category to be without a label" has already been noted (Cruse, 1986, p. 147). The lack of a specific term for a recognizable concept tends to occur more frequently at the higher levels of the hierarchy (and at the very lowest and most specific levels as well--e.g., there is no term to designate forks with two prongs). This is probably because any language includes the most terms at the generic level (Brown, 1958), that is, the level of everyday, ordinary terms for objects and living things (dog, pencil, house, etc.).

Circularity, as well as the use of or-conjoined terms at the high levels of the hierarchy, results largely from the lexicographers' efforts to approximate the terms they lack. For example, there is no clear term to denote that category of objects which fall under any of the terms utensil, tool, implement, instrument, although this concept seems to exist. Clearly, these terms are not strictly synonymous--there are, for example, utensils that one would not call tools (e.g., a colander). If a term, let us say X, for the concept existed, then the definitions for utensil, tool, implement, and instrument