




















This is a position paper about the relations among artificial intelligence (AI), mathematical logic and the formalization of common-sense knowledge and reasoning. It also treats other problems of concern to both AI and philosophy. I thank the editor for inviting it. The position advocated is that philosophy can contribute to AI if it treats some of its traditional subject matter in more detail and that this will advance the philosophical goals also. Actual formalisms (mostly first order languages) for expressing common-sense facts are described in the references.

Common-sense knowledge includes the basic facts about events (including actions) and their effects, facts about knowledge and how it is obtained, facts about beliefs and desires. It also includes the basic facts about material objects and their properties.

One path to human-level AI uses mathematical logic to formalize common-sense knowledge in such a way that common-sense problems can be solved by logical reasoning. This methodology requires understanding the common-sense world well enough to formalize facts about it and ways of achieving goals in it. Basing AI on understanding the common-sense world is different
from basing it on understanding human psychology or neurophysiology. This approach to AI, based on logic and computer science, is complementary to approaches that start from the fact that humans exhibit intelligence, and that explore human psychology or human neurophysiology.

This article discusses the problems and difficulties, the results so far, and some improvements in logic and logical languages that may be required to formalize common sense. Fundamental conceptual advances are almost certainly required. The object of the paper is to get more help for AI from philosophical logicians. Some of the requested help will be mostly philosophical and some will be logical. Likewise the concrete AI approach may fertilize philosophical logic as physics has repeatedly fertilized mathematics.

There are three reasons for AI to emphasize common-sense knowledge rather than the knowledge contained in scientific theories.

(1) Scientific theories represent compartmentalized knowledge. In presenting a scientific theory, as well as in developing it, there is a common-sense pre-scientific stage. In this stage, it is decided or just taken for granted what phenomena are to be covered and what is the relation between certain formal terms of the theory and the common-sense world. Thus in classical mechanics it is decided what kinds of bodies and forces are to be used before the differential equations are written down. In probabilistic theories, the sample space is determined. In theories expressed in first order logic, the predicate and function symbols are decided upon. The axiomatic reasoning techniques used in mathematical and logical theories depend on this having been done. However, a robot or computer program with human-level intelligence will have to do this for itself. To use science, common sense is required.

Once developed, a scientific theory remains imbedded in common sense. To apply the theory to a specific problem, common-sense descriptions must be matched to the terms of the theory. For example, d = ½gt^2 does not in itself identify d as the distance a body falls in time t and identify g as the acceleration due to gravity. (McCarthy and Hayes 1969) uses the situation calculus discussed in that paper to imbed the above formula in a formula describing the common-sense situation, for example
dropped(x, s) ∧ height(x, s) = h ∧ d = ½gt^2 ∧ d < h ⊃ ∃s′(F(s, s′) ∧ time(s′) = time(s) + t ∧ height(x, s′) = h − d).
Here x is the falling body, and we are presuming a language in which
arisen, what has been done and the problems that can be foreseen. These problems are often more interesting than the ones suggested by philosophers trying to show the futility of formalizing common sense, and they suggest productive research programs for both AI and philosophy.

In so far as the arguments against the formalizability of common sense attempt to make precise intuitions of their authors, they can be helpful in identifying problems that have to be solved. For example, Hubert Dreyfus (1972) said that computers couldn’t have “ambiguity tolerance” but didn’t offer much explanation of the concept. With the development of nonmonotonic reasoning, it became possible to define some forms of ambiguity tolerance and show how they can and must be incorporated in computer systems. For example, it is possible to make a system that doesn’t know about possible de re/de dicto ambiguities and has a default assumption that amounts to saying that a reference holds both de re and de dicto. When this assumption leads to inconsistency, the ambiguity can be discovered and treated, usually by splitting a concept into two or more.

If a computer is to store facts about the world and reason with them, it needs a precise language, and the program has to embody a precise idea of what reasoning is allowed, i.e. of how new formulas may be derived from old. Therefore, it was natural to try to use mathematical logical languages to express what an intelligent computer program knows that is relevant to the problems we want it to solve and to make the program use logical inference in order to decide what to do. (McCarthy 1959) contains the first proposals to use logic in AI for expressing what a program knows and how it should reason. (Proving logical formulas as a domain for AI had already been studied by several authors). The 1959 paper said:
The advice taker is a proposed program for solving problems by manipulating sentences in formal languages. The main difference between it and other programs or proposed programs for manipulating formal languages (the Logic Theory Machine of Newell, Simon and Shaw and the Geometry Program of Gelernter) is that in the previous programs the formal system was the subject matter but the heuristics were all embodied in the program. In this program the procedures will be described as much as possible in the language itself and, in particular, the heuristics are all so described.
The main advantages we expect the advice taker to have is that its behavior will be improvable merely by making statements to it, telling it about its symbolic environment and what is wanted from it. To make these statements will require little if any knowledge of the program or the previous knowledge of the advice taker. One will be able to assume that the advice taker will have available to it a fairly wide class of immediate logical consequences of anything it is told and its previous knowledge. This property is expected to have much in common with what makes us describe certain humans as having common sense. We shall therefore say that a program has common sense if it automatically deduces for itself a sufficiently wide class of immediate consequences of anything it is told and what it already knows.
The main reasons for using logical sentences extensively in AI are better understood by researchers today than in 1959. Expressing information in declarative sentences is far more modular than expressing it in segments of computer program or in tables. Sentences can be true in much wider contexts than specific programs can be useful. The supplier of a fact does not have to understand much about how the receiver functions, or how or whether the receiver will use it. The same fact can be used for many purposes, because the logical consequences of collections of facts can be available.

The advice taker prospectus was ambitious in 1959, would be considered ambitious today and is still far from being immediately realizable. This is especially true of the goal of expressing the heuristics guiding the search for a way to achieve the goal in the language itself. The rest of this paper is largely concerned with describing what progress has been made, what the obstacles are, and how the prospectus has been modified in the light of what has been discovered.

The formalisms of logic have been used to differing extents in AI. Most of the uses are much less ambitious than the proposals of (McCarthy 1959). We can distinguish four levels of use of logic.
as promised in the above extract from the 1959 paper.
predicate symbols have obvious meanings.

not(P) :- P, !, fail.
not(P).
sterile(X) :- not(nonsterile(X)).
nonsterile(X) :- bacterium(Y), in(Y,X), not(dead(Y)).
hot(Y) :- in(Y,X), hot(X).
dead(Y) :- bacterium(Y), hot(Y).
bacterium(b1). bacterium(b2). bacterium(b3). bacterium(b4).
in(b1,c1). in(b2,c1). in(b3,c2). in(b4,c2).
hot(c1).

Giving Prolog the goal sterile(c1) and sterile(c2) gives the answers yes and no respectively. However, Prolog has indexed over the bacteria in the containers.

The following is a Prolog program that can verify whether a sequence of actions, actually just heating it, will sterilize a container. It involves introducing situations analogous to those discussed in (McCarthy and Hayes 1969).
not(P) :- P, !, fail.
not(P).
sterile(X,S) :- not(nonsterile(X,S)).
nonsterile(X,S) :- bacterium(Y), in(Y,X), not(dead(Y,S)).
hot(Y,S) :- in(Y,X), hot(X,S).
dead(Y,S) :- bacterium(Y), hot(Y,S).
bacterium(b1). bacterium(b2). bacterium(b3). bacterium(b4).
in(b1,c1). in(b2,c1). in(b3,c2). in(b4,c2).
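The clauses relating hot to the heating action are not shown above. Here is a minimal sketch of what they might look like, assuming result(heat(X),S) names the situation that results from heating X in situation S; the heat function and the query below are assumptions introduced for illustration, not part of the program as given.

% Assumed completion: a container is hot in the situation that results
% from heating it.
hot(X, result(heat(X), _S)).

% Under this assumption, heating container c2 in an initial situation s0
% makes it sterile:
% ?- sterile(c2, result(heat(c2), s0)).
% yes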
informatic situation is complex. Here is a preliminary list of features and considerations.
require detailed information about the clerk’s psychology, and anyway this information is not ordinarily available. The following sections deal mainly with the advances we see as required to achieve the fourth level of use of logic in AI.
2 Formalized Nonmonotonic Reasoning
It seems that fourth level systems require extensions to mathematical logic. One kind of extension is formalized nonmonotonic reasoning, first proposed in the late 1970s (McCarthy 1977, 1980, 1986), (Reiter 1980), (McDermott and Doyle 1980), (Lifschitz 1989a). Mathematical logic has been monotonic in the following sense. If we have A ⊢ p and A ⊂ B, then we also have B ⊢ p.
If the inference is logical deduction, then exactly the same proof that proves p from A will serve as a proof from B. If the inference is model-theoretic, i.e. p is true in all models of A, then p will be true in all models of B, because the models of B will be a subset of the models of A. So we see that the monotonic character of traditional logic doesn’t depend on the details of the logical system but is quite fundamental.

While much human reasoning is monotonic, some important human common-sense reasoning is not. We reach conclusions from certain premisses that we would not reach if certain other sentences were included in our premisses. For example, if I hire you to build me a bird cage, you conclude that it is appropriate to put a top on it, but when you learn the further fact that my bird is a penguin you no longer draw that conclusion. Some people think it is possible to try to save monotonicity by saying that what was in your mind was not a general rule about birds flying but a probabilistic rule. So far these people have not worked out any detailed epistemology for this approach, i.e. exactly what probabilistic sentences should be used. Instead AI has moved to directly formalizing nonmonotonic logical reasoning. Indeed it seems to me that when probabilistic reasoning (and not just the axiomatic basis of probability theory) has been fully formalized, it will be formally nonmonotonic.

Nonmonotonic reasoning is an active field of study. Progress is often driven by examples, e.g. the Yale shooting problem (Hanks and McDermott 1986), in which obvious axiomatizations used with the available reasoning formalisms don't seem to give the answers intuition suggests. One direction being explored (Moore 1985, Gelfond 1987, Lifschitz 1989a) in-
compatible with the facts being taken into account. This has the effect that a bird will be considered to fly unless other axioms imply that it is abnormal in aspect2. (2) is called a cancellation of inheritance axiom, because it explicitly cancels the general presumption that objects don’t fly. This approach works fine when the inheritance hierarchy is given explicitly. More elaborate approaches, some of which are introduced in (McCarthy 1986) and improved in (Haugh 1988), are required when hierarchies with indefinite numbers of sorts are considered.
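To make the flavor of such defaults concrete, here is a small illustrative Prolog rendering of the birds-fly default using negation as failure; the predicate names are my own and this is a sketch, not the ab/aspect axioms of (McCarthy 1986).

% Birds are presumed to fly unless they are abnormal in the flying aspect;
% declaring penguins abnormal cancels the presumption for them.
flies(X) :- bird(X), \+ ab_flying(X).
ab_flying(X) :- penguin(X).
bird(X) :- penguin(X).
bird(tweety).
penguin(chilly).

% ?- flies(tweety).   % yes: the default applies
% ?- flies(chilly).   % no: the penguin fact blocks the default

Adding the single fact penguin(tweety) would withdraw the earlier conclusion flies(tweety), which is exactly the nonmonotonic behavior discussed above.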
(∀p e s)(holds(p, s) ∧ ¬ab(aspect1(p, e, s)) ⊃ holds(p, result(e, s))),
asserts that a fact p that holds in a situation s is presumed to hold in the situation result(e, s) that results from an event e unless there is evidence to the contrary. Unfortunately, Lifschitz (1985 personal communication) and Hanks and McDermott (1986) showed that simple treatments of the common-sense law of inertia admit unintended models. Several authors have given more elaborate treatments, but in my opinion, the results are not yet entirely satisfactory. The best treatment so far seems to be that of (Lifschitz 1987).
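For illustration only, the inertia default can be rendered in Prolog with negation as failure. The sketch below uses assumed predicate names (causes, ab) and sidesteps, rather than solves, the difficulties just mentioned.

% A fact holds after an event if the event causes it, or if it held before
% and is not abnormal with respect to that event (the common-sense law of
% inertia).
holds(P, result(E, S)) :- causes(E, P, S).
holds(P, result(E, S)) :- holds(P, S), \+ ab(P, E, S).

causes(open(D), opened(D), _S).
ab(closed(D), open(D), _S).        % opening D cancels the persistence of closed(D)

holds(closed(door1), s0).
holds(on(book, table), s0).

% ?- holds(opened(door1), result(open(door1), s0)).    % yes, by the effect axiom
% ?- holds(on(book, table), result(open(door1), s0)).  % yes, by inertia
% ?- holds(closed(door1), result(open(door1), s0)).    % no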
4 Ability, Practical Reason and Free Will
An AI system capable of achieving goals in the common-sense world will have to reason about what it and other actors can and cannot do. For concreteness, consider a robot that must act in the same world as people and perform tasks that people give it. Its need to reason about its abilities puts the traditional philosophical problem of free will in the following form. What view shall we build into the robot about its own abilities, i.e. how shall we make it reason about what it can and cannot do? (Wishing to avoid begging any questions, by reason we mean compute using axioms, observation sentences, rules of inference and nonmonotonic rules of conjecture.)
Let A be a task we want the robot to perform, and let B and C be alternate intermediate goals either of which would allow the accomplishment of A. We want the robot to be able to choose between attempting B and attempting C. It would be silly to program it to reason: “I’m a robot and a deterministic device. Therefore, I have no choice between B and C. What I will do is determined by my construction.” Instead it must decide in some way which of B and C it can accomplish. It should be able to conclude in some cases that it can accomplish B and not C, and therefore it should take B as a subgoal on the way to achieving A. In other cases it should conclude that it can accomplish either B or C and should choose whichever is evaluated as better according to the criteria we provide it.

(McCarthy and Hayes 1969) proposes conditions on the semantics of any formalism within which the robot should reason. The essential idea is that what the robot can do is determined by the place the robot occupies in the world—not by its internal structure. For example, if a certain sequence of outputs from the robot will achieve B, then we conclude or it concludes that the robot can achieve B without reasoning about whether the robot will actually produce that sequence of outputs.

Our contention is that this is approximately how any system, whether human or robot, must reason about its ability to achieve goals. The basic formalism will be the same, regardless of whether the system is reasoning about its own abilities or about those of other systems including people.

The above-mentioned paper also discusses the complexities that come up when a strategy is required to achieve the goal and when internal inhibitions or lack of knowledge have to be taken into account.
5 Three Approaches to Knowledge and Belief
Our robot will also have to reason about its own knowledge and that of other robots and people.

This section contrasts the approaches to knowledge and belief characteristic of philosophy, philosophical logic and artificial intelligence. Knowledge and belief have long been studied in epistemology, philosophy of mind and in philosophical logic. Since about 1960, knowledge and belief have also been studied in AI. (Halpern 1986) and (Vardi 1988) contain recent work, mostly oriented to computer science including AI.

It seems to me that philosophers have generally treated knowledge and
can be absolutely certain. Its notion of knowledge doesn’t have to be complete; i.e. it doesn’t have to determine in all cases whether a person is to be regarded as knowing a given proposition. For many tasks it doesn’t have to have opinions about when true belief doesn’t constitute knowledge. The designers of AI systems can try to evade philosophical puzzles rather than solve them.

Maybe some people would suppose that if the question of certainty is avoided, the problems of formalizing knowledge and belief become straightforward. That has not been our experience. As soon as we try to formalize the simplest puzzles involving knowledge, we encounter difficulties that philosophers have rarely if ever attacked.

Consider the following puzzle of Mr. S and Mr. P. Two numbers m and n are chosen such that 2 ≤ m ≤ n ≤ 99. Mr. S is told their sum and Mr. P is told their product. The following dialogue ensues:
Mr. P: I don’t know the numbers.
Mr. S: I knew you didn’t know them. I don’t know them either.
Mr. P: Now I know the numbers.
Mr. S: Now I know them too.
In view of the above dialogue, what are the numbers?
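Before turning to the knowledge formalization, the purely arithmetic content of the dialogue can be checked by brute force. The Prolog sketch below is my own encoding of what each statement reveals, not the first order formalization discussed next; predicate names such as p_unsure and s_knew are assumptions introduced here.

% Decompositions of a product P and of a sum S into pairs 2 =< M =< N =< 99.
same_product(P, M-N) :-
    between(2, 99, M), M * M =< P, 0 =:= P mod M,
    N is P // M, N =< 99.
same_sum(S, M-N) :-
    between(2, 99, M), N is S - M, M =< N, N =< 99.

% "Mr. P: I don't know the numbers."  -- the product is ambiguous.
p_unsure(M, N) :-
    P is M * N, findall(X, same_product(P, X), [_, _|_]).

% "Mr. S: I knew you didn't know them."  -- every decomposition of the sum
% yields an ambiguous product.
s_knew(M, N) :-
    S is M + N, forall(same_sum(S, A-B), p_unsure(A, B)).

% "... I don't know them either."  -- the sum is ambiguous too.
s_unsure(M, N) :-
    S is M + N, findall(X, same_sum(S, X), [_, _|_]).

% "Mr. P: Now I know the numbers."  -- exactly one decomposition of the
% product is consistent with Mr. S's statement.
p_knows(M, N) :-
    P is M * N,
    findall(A-B, (same_product(P, A-B), s_knew(A, B), s_unsure(A, B)), [_]).

% "Mr. S: Now I know them too."  -- exactly one decomposition of the sum is
% consistent with everything said so far.
s_knows(M, N) :-
    S is M + N,
    findall(A-B,
            ( same_sum(S, A-B), p_unsure(A, B), s_knew(A, B),
              s_unsure(A, B), p_knows(A, B) ),
            [_]).

solution(M, N) :-
    between(2, 99, M), between(M, 99, N),
    p_unsure(M, N), s_knew(M, N), s_unsure(M, N), p_knows(M, N), s_knows(M, N).

Under this encoding the query solution(M, N) is expected to return the single pair M = 4, N = 13.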
Formalizing the puzzle is discussed in (McCarthy 1989). For the present we mention only the following aspects.
The first order language used to express the facts of this problem involves an accessibility relation A(w1, w2, p, t), modeled on Kripke’s semantics for modal logic. However, the accessibility relation here is in the language itself rather than in a metalanguage. Here w1 and w2 are possible worlds, p is a person and t is an integer time. The use of possible worlds makes it convenient to express non-knowledge. Assertions of non-knowledge are expressed as the existence of accessible worlds satisfying appropriate conditions.

The problem was successfully expressed in the language in the sense that an arithmetic condition determining the values of the two numbers can be deduced from the statement. However, this is not good enough for AI. Namely, we would like to include facts about knowledge in a general purpose common-sense database. Instead of an ad hoc formalization of Mr. S and Mr. P, the problem should be solvable from the same general facts about knowledge that might be used to reason about the knowledge possessed by travel agents supplemented only by the facts about the dialogue. Moreover, the language of the general purpose database should accommodate all the modalities that might be wanted and not just knowledge. This suggests using ordinary logic, e.g. first order logic, rather than modal logic, so that the modalities can be ordinary functions or predicates rather than modal operators.

Suppose we are successful in developing a “knowledge formalism” for our common-sense database that enables the program controlling a robot to solve puzzles and plan trips and do the other tasks that arise in the common-sense environment requiring reasoning about knowledge. It will surely be asked whether it is really knowledge that has been formalized. I doubt that the question has an answer. This is perhaps the question of whether knowledge is a natural kind.

I suppose some philosophers would say that such problems are not of philosophical interest. It would be unfortunate, however, if philosophers were to abandon such a substantial part of epistemology to computer science. This is because the analytic skills that philosophers have acquired are relevant to the problems.
6 Reifying Context
We propose the formula holds(p, c) to assert that the proposition p holds in context c. It expresses explicitly how the truth of an assertion depends on context. The relation c1 ≤ c2 asserts that the context c2 is more general than the context c1.
In our formal language c17 has to carry the information about who he is, which car and when. Now suppose that the same fact is to be conveyed as in example 1, but the context is a certain Stanford Computer Science Department 1980s context. Thus familiarity with cars is presupposed, but no particular person, car or occasion is presupposed. The meanings of certain names are presupposed, however. We can call that context (say) c5. This more general context requires a more explicit proposition; thus, we would have
holds(at(“Timothy McCarthy”, inside((ιx)(iscar(x) ∧ belongs(x, “John McCarthy”)))), c5).
A yet more general context might not identify a specific John McCarthy, so that even this more explicit sentence would need more information. What would constitute an adequate identification might also be context dependent. Here are some of the properties formalized contexts might have.
(∀c1 c2 p)(c1 ≤ c2 ∧ holds(p, c1) ∧ ¬ab1(p, c1, c2) ⊃ holds(p, c2))   (4)
and
(∀c1 c2 p)(c1 ≤ c2 ∧ holds(p, c2) ∧ ¬ab2(p, c1, c2) ⊃ holds(p, c1)).   (5)
Thus there is nonmonotonic inheritance both up and down in the generality hierarchy.
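A toy Prolog rendering of this two-way inheritance may help fix the idea. It is only a sketch with assumed predicate names (stated, leq), and it restricts the premises of (4) and (5) to explicitly stated facts so that the queries terminate.

:- dynamic ab1/3, ab2/3.                  % exceptions can be asserted as they arise

leq(c17, c5).                             % c5 is more general than c17
stated(at(timothy, inside(car1)), c17).   % a fact asserted in the specific context

holds(P, C)  :- stated(P, C).
holds(P, C2) :- leq(C1, C2), stated(P, C1), \+ ab1(P, C1, C2).   % inherit upward, cf. (4)
holds(P, C1) :- leq(C1, C2), stated(P, C2), \+ ab2(P, C1, C2).   % inherit downward, cf. (5)

% ?- holds(at(timothy, inside(car1)), c5).   % yes, lifted from c17
% Asserting ab1(at(timothy, inside(car1)), c17, c5) would block that inference.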
c19 = specialize(he = Timothy McCarthy, belongs(car, John McCarthy), c5).   (6)

We will have c19 ≤ c5.
properties of people and cars are factual, e.g. it is presumed that people fit into cars.
meaning(he, c17) = meaning(“Timothy McCarthy”, c5).
holds(Holds1(at(I, airport), result(drive-to(airport), result(walk-to(car), S0))), c1).
This can be interpreted as asserting that under the assumptions embodied in context c1, a plan of walking to the car and then driving to the airport would get the robot to the airport starting in situation S0.
enter c.
This enables us to write p instead of holds(p, c). If we subsequently infer q, we can replace it by holds(q, c) and leave the context c. Then holds(q, c) will itself hold in the outer context in which holds(p, c) holds. When a context is entered, there need to be restrictions analogous to those that apply in natural deduction when an assumption is made. One way in which this notion of entering and leaving contexts is more general than natural deduction is that formulas like holds(p, c1) and (say) holds(not p, c2) behave differently from c1 ⊃ p and c2 ⊃ ¬p which are their natural deduction analogs. For example, if c1 is associated with the time 5pm