Publications


Papers on EXPECT

Papers on Planning

Papers on Knowledge Acquisition

Papers on Intelligent Agents

Papers on Ontologies and Problem-Solving Methods

Papers on Proactive Dialogue

Papers on Machine Learning


Papers on Machine Learning

Jihie Kim and Paul Rosenbloom. "Bounding the Cost of Learned Rules". Artificial Intelligence, Vol. 120, No.1, 2000. (PDF file )

Abstract: In this article we approach one key aspect of the utility problem in explanation-based learning (EBL) --- the expensive-rule problem --- as an avoidable defect in the learning procedure. In particular, we examine the relationship between the cost of solving a problem without learning versus the cost of using a learned rule to provide the same solution, and refer to a learned rule as expensive if its use is more costly than the original problem solving from which it was learned. The key idea we explore is that expensiveness is inadvertently and unnecessarily introduced into learned rules by the learning algorithms themselves. This becomes a particularly powerful idea when combined with an analysis tool which identifies these hidden sources of expensiveness, and modifications of the learning algorithms which eliminate them. The result is learning algorithms for which the cost of learned rules is bounded by the cost of the problem solving that they replace. We investigate this idea through an analysis of EBLsoar, an implementation of explanation-based learning within the Soar architecture. A transformational analysis is used to identify where EBLsoar inadvertently introduces substantial additional costs in the process of converting a problem solving episode into a learned rule --- excessive costs which all ultimately turn out to stem from losses of information during learning. Based on these results, a modified EBLsoar algorithm --- Bounded EBLsoar (BEBLsoar) --- is developed from which all sources of expensiveness have been eliminated. The cost of using a rule learned by BEBLsoar is provably bounded by the cost of the problem solving it replaces.
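The paper obtains its bound by construction inside the learning algorithm, but the property being guaranteed is easy to state as an acceptance test. A minimal Python sketch follows, assuming hypothetical stand-ins (solve, compile_rule, rule.match) rather than anything from Soar or BEBLsoar:

    import time

    def measure(fn, *args):
        # Return (result, elapsed seconds) for a single call.
        start = time.perf_counter()
        result = fn(*args)
        return result, time.perf_counter() - start

    def learn_if_bounded(problem, solve, compile_rule):
        # solve and compile_rule are hypothetical placeholders for the
        # architecture's problem solver and EBL rule compiler.
        solution, solve_cost = measure(solve, problem)
        rule = compile_rule(problem, solution)
        _, match_cost = measure(rule.match, problem)
        # The expensive-rule test: keep the rule only when matching it
        # costs no more than re-deriving the solution it replaces.
        return rule if match_cost <= solve_cost else None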


Jim Blythe and Manuela Veloso. "Learning to Improve Uncertainty Handling in a Hybrid Planning System". Proceedings of the AAAI Fall Symposium on Learning Complex Behaviors in Intelligent Adaptive Systems, 1996. (Postscript file)

Abstract: Weaver is a hybrid planning algorithm that can create plans in domains that include uncertainty, modelled either as incomplete knowledge of the initial state of the world, of the effects of plan steps, or of possible external events. The plans are guaranteed to exceed some given threshold probability of success. Weaver creates a Bayesian network representation of a plan to evaluate it, in which links corresponding to sequences of events are computed with Markov models. As well as the probability of success, evaluation produces a set of flaws in the candidate plan, which are used by the planner to improve it. We describe a learning method that generates control knowledge compiled from this probabilistic evaluation of plans. The output of the learner is search control knowledge for the planning domain that helps the planner select alternatives that have previously led to plans with high probability of success. The learned control knowledge is incrementally refined by a combined deductive and inductive mechanism.
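As a rough illustration of the evaluation step (this is not Weaver's Bayesian-network computation; the names and the independence assumption are ours), a plan's success probability can be approximated by chaining per-step probabilities, with low-probability steps reported as flaws:

    def evaluate_plan(steps, p_success, flaw_threshold=0.9):
        # steps: ordered plan steps; p_success(step) -> probability that
        # the step achieves its intended effect. Independence between
        # steps is assumed only to keep the sketch short; Weaver instead
        # evaluates a Bayesian network built from the plan.
        prob = 1.0
        flaws = []
        for step in steps:
            p = p_success(step)
            prob *= p
            if p < flaw_threshold:
                flaws.append(step)   # candidate for planner repair
        return prob, flaws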


Jihie Kim and Paul Rosenbloom. "Learning efficient rules by maintaining the explanation structure". Proceedings of the Thirteenth National Conference on Artificial Intelligence, 1996. (PDF file)

Abstract: Many learning systems suffer from the utility problem; that is, that time after learning is greater than time before learning. Discovering how to assure that learned knowledge will in fact speed up system performance has been a focus of research in explanation-based learning (EBL). One way to analyze the utility problem is by examining the differences between the match process (match search) of the learned rule and the problem-solving process from which it is learned. Prior work along these lines examined one such difference. It showed that if the search-control knowledge used during problem solving is not maintained in the match process for learned rules, then learning can engender a slowdown; but that this slowdown could be eliminated if the match is constrained by the original search-control knowledge. This article examines a second difference --- when the structure of the problem solving differs from the structure of the match process for the learned rules, time after learning can be greater than time before learning. This article also shows that this slowdown can be eliminated by making the learning mechanism sensitive to the problem-solving structure; i.e., by reflecting such structure in the match of the learned rule.


Jihie Kim. "Bounding the Cost of Learned Rules: A Transformational Approach". Ph.D. Thesis, Computer Science Department, School of Engineering, University of Southern California, 1996. Available as ISI Technical Report RR-96-452.


Jaime Carbonell, Oren Etzioni, Yolanda Gil, Robert Joseph, Craig Knoblock, Steven Minton, and Manuela Veloso. "Planning and Learning in PRODIGY: Overview of an Integrated Architecture". In Goal-Driven Learning, Ashwin Ram and David Leake (Eds.), MIT Press, 1995.


Jihie Kim and Paul Rosenbloom. "A transformational analysis of expensive chunks" and "Mapping explanation-based learning onto Soar: The sequel." In Techinical Report: Transformational analyses of learning in Soar. Information Science Institute and Computer Science Department, University of Southern California, ISI/RR-95-4221, 1995.


Yolanda Gil. "Learning by Experimentation: Incremental Refinement of Incomplete Planning Domains". Proceedings of the Eleventh International Conference on Machine LearningJuly 10-13, 1994, Rutgers, NJ. (PDF file )

Abstract: Building a knowledge base requires iterative refinement to correct imperfections that keep lurking after each new version of the system. This paper concentrates on the automatic refinement of incomplete domain models for planning systems, presenting both a methodology for addressing the problem and empirical results. Planning knowledge may be refined automatically through direct interaction with the environment. Missing conditions cause unreliable predictions of action outcomes. Missing effects cause unreliable predictions of facts about the state. We present a practical approach based on continuous and selective interaction with the environment that pinpoints the type of fault in the domain knowledge that causes any unexpected behavior of the environment, and resorts to experimentation when additional information is needed to correct the fault. Our approach has been implemented in EXPO, a system that uses PRODIGY as a baseline planner and improves its domain knowledge in several domains when initial domain knowledge is up to 50% incomplete. The empirical results presented show that EXPO dramatically improves its prediction accuracy and reduces the number of unreliable action outcomes.
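The two fault classes in the abstract suggest a simple diagnostic contrast, sketched below under the assumption that states are sets of ground facts; operator.apply and the function name are hypothetical, not EXPO's interface:

    def diagnose(operator, state_before, observed_after):
        # Compare the operator model's prediction with observation.
        expected_after = operator.apply(state_before)   # model's prediction
        absent = expected_after - observed_after        # predicted, not seen
        extra = observed_after - expected_after         # seen, not predicted
        if absent:
            # The action did less than predicted: suspect an unmodelled
            # precondition; candidates are facts of the prior state.
            return "missing-precondition", state_before
        if extra:
            # The action did more than predicted: suspect missing effects.
            return "missing-effect", extra
        return "consistent", set()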


Yolanda Gil. "Efficient Domain-Independent Experimentation". Proceedings of the Tenth International Conference on Machine Learning, Amherst, MA, June 1993. (Postscript file )

Abstract: Planning systems often make the assumption that omniscient world knowledge is available. Our approach makes the more realistic assumption that the initial knowledge about the actions is incomplete, and uses experimentation as a learning mechanism when the missing knowledge causes an execution failure. Previous work on learning by experimentation has not addressed the issue of how to choose good experiments, and much research on learning from failure has relied on background knowledge to build explanations that pinpoint directly the causes of failures. We want to investigate the potential of a system for efficient learning by experimentation without such background knowledge. This paper describes domain-independent heuristics that compare possible hypotheses and choose the ones most likely to cause the failure. These heuristics extract information solely from the domain operators initially available for planning (which are incapable of producing such explanations) and the planner's experiences in interacting with the environment. Our approach has been implemented in EXPO, a system that uses PRODIGY as a baseline planner and improves its domain knowledge in several domains. The empirical results presented show that EXPO's heuristics dramatically reduce the number of experiments needed to refine incomplete operators.
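To give the flavor of such heuristics (the scoring terms below are invented for illustration and are not the paper's actual heuristics), hypotheses can be ordered using only information already present in the operator definitions and the execution history:

    def rank_hypotheses(hypotheses, known_preconditions, failure_counts):
        # known_preconditions: conditions appearing in other operators;
        # failure_counts: mapping from condition to the number of past
        # failures in which it was observed.
        def plausibility(h):
            score = 0
            if h in known_preconditions:
                score += 2   # conditions reused across operators are plausible
            score += failure_counts.get(h, 0)
            return score
        return sorted(hypotheses, key=plausibility, reverse=True)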


Yolanda Gil. "Learning New Planning Operators by Exploration and Experimentation". Proceedings of the AAAI Workshop on Learning Action Models, Washington, DC, July 1993. (PDF file )

Abstract: This paper addresses a computational approach to the automated acquisition of domain knowledge for planning systems via experimentation with the environment. Our previous work has shown how existing incomplete operators can be refined by adding missing preconditions and effects. Here we develop additional methods to acquire new operators such as direct analogy with existing operators, decomposition of monolithic operators into meaningful sub-operators, and experimentation with partially-specified operators.


Jihie Kim and Paul Rosenbloom. "Constraing learning with search control". Proceedings of the Tenth International Conference on Machine Learning Amherst, MA, June, 1993. (PDF file )


Yolanda Gil. "Acquiring Domain Knowledge for Planning by Experimentation". Ph.D. Thesis, School of Computer Science, Carnegie Mellon University, Pittsburgh PA 15213. August 1992. Available as CMU Technical Report CMU-CS-92-175.

Abstract: In order for autonomous systems to interact with their environment in an intelligent way, they must be given the ability to adapt and learn incrementally and deliberately. It is virtually impossible to devise and hand code all potentially relevant domain knowledge for complex dynamic tasks. This thesis describes a framework to acquire domain knowledge for planning by failure-driven experimentation with the environment. The initial domain knowledge in the system is an approximate model for planning in the environment, defining the system's expectations. The framework exploits the characteristics of planning domains in order to search the space of plausible hypotheses without the need for additional background knowledge to build causal explanations for expectation failures. Plans are executed while the external environment is monitored, and differences between the internal state and external observations are detected by various methods, each correlated with a typical cause of expectation failure. The methods also construct a set of concrete hypotheses to repair the knowledge deficit. After being heuristically filtered, each hypothesis is tested in turn with an experiment. After the experiment is designed, a plan is constructed to achieve the situation required to carry out the experiment. The experiment plan must meet constraints such as minimizing plan length and negative interference with the main goals. The thesis describes a set of domain-independent constraints for experiments and their incorporation in the planning search space. After the execution of the plan and the experiment, observations are collected to conclude whether the experiment was successful. Upon success, the hypothesis is confirmed and the domain knowledge is adjusted. Upon failure, the experimentation process is iterated on the remaining hypotheses until success or until no more hypotheses are left to be considered. This framework has been shown to be an effective way to address incomplete planning knowledge, and is demonstrated in a system called EXPO, implemented on the PRODIGY planning architecture. The effectiveness and efficiency of EXPO's methods are empirically demonstrated in several domains, including a large-scale process planning task, where the planner can recover from situations missing up to 50% of domain knowledge through repeated experimentation.
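The overall loop the thesis describes can be summarized in Python; every name below is a hypothetical placeholder for a component of the framework, not EXPO's API:

    def refine_by_experimentation(domain, planner, world, goal,
                                  generate_hypotheses, design_experiment):
        plan = planner.plan(domain, goal)
        for action in plan:
            predicted = domain.predict(world.state, action)
            observed = world.execute(action)
            if observed == predicted:
                continue                          # expectations met
            # Expectation failure: hypothesize, filter, and experiment.
            for hyp in generate_hypotheses(domain, action, observed):
                experiment = design_experiment(hyp)
                # Plan into the state the experiment requires, keeping the
                # plan short and avoiding interference with the main goals.
                world.run(planner.plan(domain, experiment.setup_goal))
                if experiment.confirms(world.execute(experiment.action)):
                    domain.repair(hyp)            # adjust the model
                    break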


Jaime Carbonell and Yolanda Gil. "Learning by Experimentation: The Operator Refinement Method". Machine Learning: An Artificial Intelligence Approach, Volume III, Michalski, R. S. and Kodratoff, Y. (Eds.), Morgan Kaufmann, 1990. (PDF file)

Abstract: Autonomous systems require the ability to plan effective courses of action under potentially uncertain or unpredictable contingencies. Planning requires knowledge of the environment that is accurate enough to allow reasoning about actions. If the environment is too complex or very dynamic, goal-driven learning with reactive feedback becomes a necessity. This chapter addresses the issue of learning by experimentation as an integral component of PRODIGY. PRODIGY is a flexible planning system that encodes its domain knowledge as declarative operators, and applies the operator refinement method to acquire additional preconditions or postconditions when observed consequences diverge from internal expectations. When multiple explanations for the observed divergence are consistent with the existing domain knowledge, experiments to discriminate among these explanations are generated. The experimentation process isolates the deficient operator and inserts the discriminant condition or unforeseen side-effect to avoid similar impasses in future planning. Thus, experimentation is demand-driven and exploits both the internal state of the planner and any external feedback received. A detailed example of integrated experiment formulation is presented as the basis for a systematic approach to extending an incomplete domain theory or correcting a potentially inaccurate one.
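When several explanations survive, the method needs an experiment whose predicted outcomes differ across them, so that a single execution can rule some out. A hedged sketch of that selection step (the names are hypothetical, not PRODIGY's internals):

    def discriminating_experiment(hypotheses, candidate_actions, predict_under):
        # predict_under(hypothesis, action) -> the outcome expected if that
        # hypothesis were the true repair to the domain model. Outcomes are
        # assumed hashable so they can be compared via a set.
        for action in candidate_actions:
            predictions = {h: predict_under(h, action) for h in hypotheses}
            if len(set(predictions.values())) > 1:
                # Outcomes diverge: executing this action discriminates
                # among the competing explanations.
                return action, predictions
        return None, {}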


Steve Minton, Jaime G. Carbonell, Craig A. Knoblock, Daniel R. Kuokka, Oren Etzioni, and Yolanda Gil. "Explanation-Based Learning: A Problem-Solving Perspective". Artificial Intelligence, 40(1-3):63-118, September 1989.


Jim Blythe and Tom Mitchell, "On Becoming Reactive", Proceedings of the International Conference on Machine Learning, 1989.


Jim Blythe, "Constraining Search in a Hierarchical, Discriminative Learning System", Proceedings of the European Conference on Artificial Intelligence, 1988.


Yves Kodratoff, Michel Manago and Jim Blythe, "Generalization and Noise", International Journal of Man-Machine Studies, 1987.


Jim Blythe, David Needham and Patrick Corsi, "An experimental protocol for gathering examples for empirical learning", Proceedings of the First European Workshop on Knowledge Acquisition, 1987.

