Preface - 5th Edition
What we have to learn to do we learn by doing. . .
Welcome to the Fifth Edition!
I was very pleased to be asked to produce a fifth edition of our artificial intelligence book. It is a compliment to the earlier editions, started almost twenty years ago, that our approach to AI has been so highly valued. It is also exciting that, as new development in the field emerges, we are able to present much of it in each new edition. We thank our readers, colleagues, and students for keeping our topics relevant and presentation up to date.
Many sections of the earlier editions have endured remarkably well, including the presentation of logic, search algorithms, knowledge representation, production systems, machine learning, and the programming techniques developed in LISP and PROLOG. These remain central to the practice of artificial intelligence, and required a relatively small effort to bring them up to date. We created a new introductory chapter on stochastic methods for the fifth edition. We feel that the stochastic technology is having an increasingly larger impact on AI, especially in areas such as diagnostic reasoning, natural language analysis, and learning. To support these technologies we expand the presentation of Bayes' theorem, Bayesian networks and related graphical models, probabilistic finite state machines, dynamic programming with the Viterbi algorithm, and Markov modeling. Other topics, such as emergent computation, case-based reasoning, and model-based problem solving, that were treated cursorily in the first editions, have grown sufficiently in importance to merit a more complete discussion. The changes in the fifth edition reflect the evolving research questions and are evidence of the continued vitality of the field of artificial intelligence.
As the scope of our AI project grew, we have been sustained by the support of our publisher, editors, friends, colleagues, and, most of all, by our readers, who have given our work such a long and productive life. We remain excited at the writing opportunity we are afforded: Scientists are rarely encouraged to look up from their own, narrow research interests and chart the larger trajectories of their chosen field. Our readers have asked us to do just that. We are grateful to them for this opportunity. We are also encouraged that our earlier editions have been used in AI communities worldwide and translated into a number of languages including German, Polish, Portuguese, Russian, and two dialects of Chinese!
Although artificial intelligence, like most engineering disciplines, must justify itself to the world of commerce by providing solutions to practical problems, we entered the field of AI for the same reasons as many of our colleagues and students: we want to understand and explore the mechanisms of mind that enable intelligent thought and action. We reject the rather provincial notion that intelligence is an exclusive ability of humans, and believe that we can effectively investigate the space of possible intelligences by designing and evaluating intelligent artifacts. Although the course of our careers has given us no cause to change these commitments, we have arrived at a greater appreciation for the scope, complexity, and audacity of this undertaking. In the preface to our earlier editions, we outlined three assertions that we believed distinguished our approach to teaching artificial intelligence. It is reasonable, in writing a preface to this fifth edition, to return to these themes and see how they have endured as our field has grown.
The first of these goals was to "unify the diverse branches of AI through a detailed discussion of its theoretical foundations". At the time we adopted that goal, it seemed that the main problem was reconciling researchers who emphasized the careful statement and analysis of formal theories of intelligence (the neats) with those who believed that intelligence itself was some sort of grand hack that could be best approached in an application-driven, ad hoc manner (the scruffies). That dichotomy has proven far too simple. In contemporary AI, debates between neats and scruffies have given way to dozens of other debates between proponents of physical symbol systems and students of neural networks, between logicians and designers of artificial life forms that evolve in a most illogical manner, between architects of expert systems and case-based reasoners, and finally, between those who believe artificial intelligence has already been achieved and those who believe it will never happen. Our original image of AI as frontier science where outlaws, prospectors, wild-eyed prairie prophets and other dreamers were being slowly tamed by the disciplines of formalism and empiricism has given way to a different metaphor: that of a large, chaotic but mostly peaceful city, where orderly bourgeois neighborhoods draw their vitality from diverse, chaotic, bohemian districts. Over the years that we have devoted to the different editions of this book, a compelling picture of the architecture of intelligence has started to emerge from this city's structure, art, and industry.
Intelligence is too complex to be described by any single theory; instead, researchers are constructing a hierarchy of theories that characterize it at multiple levels of abstraction. At the lowest levels of this hierarchy, neural networks, genetic algorithms and other forms of emergent computation have enabled us to understand the processes of adaptation, perception, embodiment, and interaction with the physical world that must underlie any form of intelligent activity. Through some still partially understood resolution, this chaotic population of blind and primitive actors gives rise to the cooler patterns of logical inference. Working at this higher level, logicians have built on Aristotle's gift, tracing the outlines of deduction, abduction, induction, truth-maintenance, and countless other modes and manners of reason. At even higher levels of abstraction, designers of expert systems, intelligent agents, and natural language understanding programs have come to recognize the role of social processes in creating, transmitting, and sustaining knowledge. Finally, as philosophers we are charged to critique the epistemological validity of the AI enterprise. For this task we discuss the rationalist project, the empiricists dilemma, and propose a constructivist rapprochement. In this fifth edition, we have touched on all these levels of the developing AI endeavour.
The second commitment we made in the earlier editions was to the central position of "advanced representational formalisms and search techniques" in AI methodology. This is, perhaps, the most controversial aspect of our previous editions and of much early work in AI, with many researchers in emergent computation questioning whether symbolic reasoning and referential semantics have any role at all in thought. Although the idea of representation as giving names to things has been challenged by the implicit representation provided by the emerging patterns of a neural network or an artificial life, we believe that an understanding of representation and search remains essential to any serious practitioner of artificial intelligence. We also feel that an overview of the historical traditions and the skills acquired through the study of representation and search are critical components for AI education. Furthermore, these are invaluable tools for analyzing such aspects of non-symbolic AI as the expressive power of a neural network or the progression of candidate problem solutions through the fitness landscape of a genetic algorithm. Comparisons, contrasts, and a critique of the various approaches of modern AI are offered in Chapter 17.
The third commitment we made at the beginning of this book's life cycle, to "place artificial intelligence within the context of empirical science," has remained unchanged. To quote from the preface to the third edition, we continue to believe that AI is not . . . some strange aberration from the scientific tradition, but . . . part of a general quest for knowledge about, and the understanding of intelligence itself. Furthermore, our AI programming tools, along with the exploratory programming methodology . . . are ideal for exploring an environment. Our tools give us a medium for both understanding and questions. We come to appreciate and know phenomena constructively, that is, by progressive approximation.
Thus we see each design and program as an experiment with nature: we propose a representation, we generate a search algorithm, and then we question the adequacy of our characterization to account for part of the phenomenon of intelligence. And the natural world gives a response to our query. Our experiment can be deconstructed, revised, extended, and run again. Our model can be refined, our understanding extended.
New with The Fifth Edition
The most general change for the fifth edition was the extension of the material related to the stochastic approaches to AI. To accomplish we added a completely new Chapter 5 that introduces the stochastic methodology. From the basic foundations of set theory and counting we develop the notions of probabilities, random variables, and independence. We present and use Bayes' theorem first with one symptom and one disease and then in its full general form. We examine the hypotheses that underlie the use of Bayes and then present the argmax and naive Bayes approaches. We present examples of stochastic reasoning, including several from work in the analysis of language phenomena. We also introduce the idea of conditional independence that leads to our presentation of Bayesian belief networks (BBNs) and d-separation in Chapter 9.
We supplemented the presentation of materials throughout the book with more on these stochastic methods. This included introducing probabilistic finite state machines and probabilistic acceptors, and the presentation of algorithms for dynamic programming, especially using stochastic measures (sometimes referred to as the Viterbi algorithm). We extended the material in Chapter 9 (reasoning under conditions of uncertainty) to include Bayesian belief nets, hidden Markov, and other graphical models. We added a stochastic English language parser (based on the work of Mark Steedman at the University of Edinburgh) to Chapter 15, the PROLOG material in the book.
We also extended material in several sections of the book to recognize the continuing importance of agent-based problem solving and embodiment in AI technology. In discussions of the foundations of AI we recognize intelligence as physically embodied and situated in a natural and social world context. Apropos of this, we present in Chapter 7 the evolution of AI representational schemes from associative and early logic-based, through weak and strong method approaches, including connectionist and evolutionary/emergent models, to the situated and social aspects of created intelligence. Chapter 17 contains a critique of each of these representational approaches.
Chapter 14 presents issues in natural language understanding, including a section on stochastic models for language comprehension. The presentation includes Markov models, CART trees, mutual information clustering, and statistic-based parsing. The chapter closes with several examples, including the applications of text mining and text summarization techniques for the WWW.
Finally, in a revised Chapter 17, we return to the deeper questions of the nature of intelligence and the possibility of intelligent machines. We comment on the AI endeavour from the perspectives of philosophy, psychology, and neuro-physiology.
The Contents Chapter 1 (Part I) introduces artificial intelligence. We begin with a brief history of attempts to understand mind and intelligence in philosophy, psychology, and other related areas. In an important sense, AI is an old science, tracing its roots back at least to Aristotle. An appreciation of this background is essential for an understanding of the issues addressed in modern research. We also present an overview of some of the important applications of AI. Our goal in Chapter 1 is to provide both background and a motivation for the theory and applications that follow.
Chapters 2, 3, 4, 5, and 6 (Part II) introduce the research tools for AI problem solving. These include, in Chapter 2, the predicate calculus presented both as a mathematical system as well as a representation language to describe the essential features of a problem. Search, and the algorithms and data structures used to implement search, are introduced in Chapter 3, to organize the exploration of problem situations. In Chapter 4, we discuss the essential role of heuristics in focusing and constraining search-based problem solving. In Chapter 5, we introduce the stochastic methodology, important technology for reasoning in situations of uncertainty. In Chapter 6, we present a number of software architectures, including the blackboard and production system, for implementing these search algorithms.
Chapters 7, 8, and 9 make up Part III: representations for AI, knowledge-intensive problem solving, and reasoning in changing and ambiguous situations. In Chapter 7 we present the evolving story of AI representational schemes. We begin with a discussion of association-based networks and extend this model to include conceptual dependency theory, frames, and scripts. We then present an in-depth examination of a particular formalism, conceptual graphs, emphasizing the epistemological issues involved in representing knowledge and showing how these issues are addressed in a modern representation language. Expanding on this formalism in Chapter 14, we show how conceptual graphs can be used to implement a natural language database front end. We conclude Chapter 7 with more modern approaches to representation, including Copycat and agent-oriented architectures.
Chapter 8 presents the rule-based expert system along with case-based and model-based reasoning, including examples from the NASA space program. These approaches to problem solving are presented as a natural evolution of the material in Part II: using a production system of predicate calculus expressions to orchestrate a graph search. We end with an analysis of the strengths and weaknesses of each of these approaches to knowledge-intensive problem solving.
Chapter 9 presents models for reasoning with uncertainty as well as the use of unreliable information. We introduce Bayesian models, belief networks, Dempster-Shafer, causal models, and the Stanford certainty algebra for reasoning in uncertain situations. Chapter 9 also contains algorithms for truth maintenance, reasoning with minimum models, logic-based abduction, and the clique-tree algorithm for Bayesian belief networks.
Part IV, Chapters 10 through 12, is an extensive presentation of issues in machine learning. In Chapter 10 we offer a detailed look at algorithms for symbol-based learning, a fruitful area of research spawning a number of different problems and solution approaches. These learning algorithms vary in their goals, the training data considered, their learning strategies, and the knowledge representations they employ. Symbol-based learning includes induction, concept learning, version-space search, and ID3. The role of inductive bias is considered, generalizations from patterns of data, as well as the effective use of knowledge to learn from a single example in explanation-based learning. Category learning, or conceptual clustering, is presented with unsupervised learning. Reinforcement learning, or the ability to integrate feedback from the environment into a policy for making new decisions concludes the chapter.
In Chapter 11 we present neural networks, often referred to as sub-symbolic or connectionist models of learning. In a neural net, information is implicit in the organization and weights on a set of connected processors, and learning involves a re-arrangement and modification of the overall weighting of nodes and structure of the system. We present a number of connectionist architectures, including perceptron learning, backpropagation, and counterpropagation. We demonstrate Kohonen, Grossberg, and Hebbian models. We present associative learning and attractor models, including Hopfield networks.
Genetic algorithms and evolutionary approaches to learning are introduced in Chapter 12. On this viewpoint, learning is cast as an emerging and adaptive process. After several examples of problem solutions based on genetic algorithms, we introduce the application of genetic techniques to more general problem solvers. These include classifier systems and genetic programming. We then describe society-based learning with examples from artificial life, called a-life, research. We conclude the chapter with an example of emergent computation from research at the Santa Fe Institute. We compare, contrast, and critique the three approaches we present to machine learning (symbol-based, connectionist, social and emergent) in Chapter 17.
Part V, Chapters 13 and 14, presents automated reasoning and natural language understanding. Theorem proving, often referred to as automated reasoning, is one of the oldest areas of AI research. In Chapter 13, we discuss the first programs in this area, including the Logic Theorist and the General Problem Solver. The primary focus of the chapter is binary resolution proof procedures, especially resolution refutations. More advanced inferencing with hyper-resolution and paramodulation is also presented. Finally, we describe the PROLOG interpreter as a Horn clause and resolution-based inferencing system, and see PROLOG computing as an instance of the logic programming paradigm.
Chapter 14 presents natural language understanding. Our traditional approach to language understanding, exemplified by many of the semantic structures presented in Chapter 7, is complemented with the stochastic approach. These include Markov models, CART trees, mutual information clustering, and statistics-based parsing. The chapter concludes with examples applying these natural language techniques to database query systems and also to a text summarization system for use on the WWW.
Part VI develops in LISP and PROLOG many of the algorithms presented in the earlier chapters. Chapter 15 covers PROLOG and Chapter 16 LISP. We demonstrate these languages as tools for AI problem solving by building the search and representation techniques of the earlier chapters, including breadth-, depth-, and best-first search algorithms. We implement these search techniques in a problem-independent fashion so that they may be extended to create shells for search in rule-based expert systems, to build semantic networks, natural language understanding systems, and learning applications.
Finally, Chapter 17 serves as an epilogue for the book. It addresses the issue of the possibility of a science of intelligent systems, and considers contemporary challenges to AI; it discusses AI's current limitations, and projects its exciting future.
Using This Book
Artificial intelligence is a big field, and consequently, this is a large book. Although it would require more than a single semester to cover all of the material offered, we have designed our book so that a number of paths may be taken through the material. By selecting subsets of the material, we have used this text for single semester and full year (two semester) courses.
We assume that most students will have had introductory courses in discrete mathematics, including predicate calculus, set theory, counting, and graph theory. If this is not true, the instructor should spend more time on these concepts in the optional sections at the beginning of the introductory chapters (2.1, 3.1, and 5.1). We also assume that students have had courses in data structures including trees, graphs, and recursion-based search, using stacks, queues, and priority queues. If they have not, then spend more time on the beginning sections of Chapters 3, 4, and 6.
In a one quarter or one semester course, we go quickly through the first two parts of the book. With this preparation, students are able to appreciate the material in Part III. We then consider the PROLOG and LISP in Part VI and require students to build many of the representation and search techniques of the first sections. Alternatively, one of the languages, PROLOG, for example, can be introduced early in the course and be used to test out the data structures and search techniques as that are encountered. We feel the meta-interpreters presented in the language chapters are very helpful for building rule-based and other knowledge-intensive problem solvers. PROLOG and LISP are both excellent tools for building natural language understanding and learning systems.
In a two-semester or three-quarter course, we are able to cover the application areas of Parts IV and V, especially the machine learning chapters, in appropriate detail. We also expect a much more detailed programming project from students. We think that it is very important in the second semester for students to revisit many of the primary sources in the AI literature. It is crucial for students to see both where we are, as well as how we got here, and to have an appreciation of the future promises of artificial intelligence. We use a collected set of readings for this purpose, such as, Computation and Intelligence (Luger 1995).
The algorithms of our book are described using a Pascal-like pseudo-code. This notation uses the control structures of Pascal along with English descriptions of the tests and operations. We have added two useful constructs to the Pascal control structures. The first is a modified case statement that, rather than comparing the value of a variable with constant case labels, as in standard Pascal, lets each item be labeled with an arbitrary boolean test. The case evaluates these tests in order until one of them is true and then performs the associated action; all other actions are ignored. Those familiar with LISP will note that this has the same semantics as the LISP cond statement.
The other addition to our pseudo code language is a return statement which takes one argument and can appear anywhere within a procedure or function. When the return is encountered, it causes the program to immediately exit the function, returning its argument as a result. Other than these modifications we used Pascal structure, with a reliance on the English descriptions, to make the algorithms clear.
Supplemental Material Available via the Internet
The fifth edition has an important attached web site maintained by Addison-Wesley Pearson. This site, built by two UNM graduate students, Alejandro CdeBaca and Cheng Liu, includes supplementary ideas for most chapters, some sample problems with their solutions, and many ideas for student projects. Besides the LISP and PROLOG materials in Chapters 15 and 16, we have included many AI algorithms in Java and C++ on the web site. Students are welcome to use these materials and supplement them with their own comments, code, and critiques.
The PROLOG and LISP code presented in the book is available to readers on the book web page just described as well as via the internet at www.cs.unm.edu/~luger/. Follow the pointers to the fifth edition
Addison-Wesley and Pearson Education: www.aw.com/cs/ and www.pearsoneduc.com/computing support this book with an Instructor's Guide, which has many of the book's exercises worked out, several practice tests with solutions, and ideas for teaching the material. They also have a full set of PowerPoint presentation materials for use by instructors. See your local A-W Pearson representative for access to these materials.
My e-mail address is firstname.lastname@example.org, and I very much enjoy hearing from my readers.
Although I am the sole author of the fifth edition, this book has always been the product of my efforts as Professor of Computer Science, Psychology, and Linguistics at the University of New Mexico together with the contributions of my fellow faculty, my professional colleagues, graduate students, and friends, especially the members of the UNM artificial intelligence community. The fifth edition is also the product of the many readers that have e-mailed me comments, corrections, and suggestions. The book will continue this way, reflecting this "community" effort; consequently, I will continue using the prepositions we, our, and us when presenting material.
I thank Bill Stubblefield, the co-author for the first three editions, for more than fifteen years of contributions, but even more importantly, for his friendship over the past twenty-five years. I also thank the many reviewers that have helped develop these five editions. These include Dennis Bahler, Leonardo Bottaci, Skona Brittain, Philip Chan, Peter Collingwood, Mehdi Dastani, John Donald, Sarah Douglas, Christophe Giraud-Carrier, Andrew Kosoresow, Terran Lane, Chris Malcolm, Ray Mooney, Marek Perkowski, Barak Pearmutter, Dan Pless, Bruce Porter, Julian Richardson, Jude Shavlik, John Sheppard, Carl Stern, Leon van der Torre, Marco Valtorta, and Bob Veroff. We also appreciate the numerous suggestions and comments sent directly by e-mail from people using the book. Finally, Chris Malcolm, Brendan McGonnigle, and Akasha Tang, critiqued Chapter 17.
From our UNM colleagues, we thank Dan Pless for his major role in developing material in Chapters 5 and 9, Joseph Lewis for his efforts on Chapters 9 and 17, Carl Stern for his help in developing Chapter 10 on connectionist learning, Bob Veroff for his critique of the automated reasoning material in Chapter 13, and Jared Saia for helping with the stochastic approaches to natural language understanding of Chapter 14. Alejandro CdeBaca and Cheng Liu checked the bibliographic references and did the indexing.
We thank Academic Press for permission to reprint much of the material of Chapter 11; this appeared in the book Cognitive Science: The Science of Intelligent Systems (Luger 1994). Finally, we thank more than a decade of students who have used this text and software at UNM for their help in expanding our horizons, as well as in removing typos and bugs from the book.
We thank our many friends at Benjamin-Cummings, Addison-Wesley-Longman, Pearson Education for their support and encouragement in completing the writing task of our five editions, especially Alan Apt in helping us with the first edition, Lisa Moller and Mary Tudor for their help on the second, Victoria Henderson, Louise Wilson, and Karen Mosman for their assistance on the third, Keith Mansfield, Karen Sutherland, and Anita Atkinson for support on the fourth, and Keith Mansfield, Owen Knight, Mary Lince, and Bridget Allen for their help on this fifth edition. Katherine Haratunian of Addison Wesley USA has had a huge role in seeing that Professors received their Instructor Guides and PowerPoint presentation materials. These are maintained by Addison-Wesley Pearson and available only through your local sales representative. We appreciate Linda Cicarella's work at the University of New Mexico helping prepare figures for publication.
Individual debts for permissions to include copyright materials in all five editions are listed in the Publisher's Acknowledgments section which follows this preface.
We thank Thomas Barrow, internationally recognized artist and University of New Mexico Professor of Art (emeritus), who created the seven photograms for this book.
Artificial intelligence is an exciting and rewarding discipline; may you enjoy your study as you come to appreciate its power and challenges.
1 July 2004