GOOSE: Learning Heuristics for Planning and Search

Dillon Z. Chen


Abstract

Automated planning is a long-standing field in artificial intelligence concerned with finding a sequence of actions that progresses an initial state in a given world model to a desired goal state. A dominant approach for solving these problems is heuristic search, where a search algorithm explores the vast space of possible states guided by a heuristic function, h(s), which estimates the cost of reaching the goal from any given state s. The performance of these planners depends critically on the quality of this heuristic.
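To make this setup concrete, the greedy best-first search algorithm used later in the experiments can be sketched as follows. This is a minimal illustration, not GOOSE's implementation; the `is_goal`, `successors`, and `h` interfaces are hypothetical placeholders for a real planner's components.

```python
import heapq
import itertools

def greedy_best_first_search(initial_state, is_goal, successors, h):
    """Expand states in order of increasing heuristic value h(s).

    `successors(s)` yields (action, next_state) pairs and `h(s)` is any
    estimate of the cost-to-goal; returns a plan (list of actions) or None.
    """
    counter = itertools.count()  # tie-breaker so states are never compared
    frontier = [(h(initial_state), next(counter), initial_state, [])]
    visited = {initial_state}
    while frontier:
        _, _, state, plan = heapq.heappop(frontier)
        if is_goal(state):
            return plan
        for action, nxt in successors(state):
            if nxt not in visited:
                visited.add(nxt)
                heapq.heappush(frontier, (h(nxt), next(counter), nxt, plan + [action]))
    return None  # search space exhausted without reaching the goal
```

A better-informed h(s) steers the loop toward the goal with fewer expansions, which is exactly what GOOSE aims to learn.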

GOOSE (Graphs Optimised fOr Search Evaluation) is a planning system that learns state-of-the-art heuristics from small, easy-to-solve training problems and generalises them to new, unseen problems with different initial states, different goals, and greater numbers of objects. GOOSE is the culmination of papers published at AAAI, ICAPS, NeurIPS, and ECAI (see References for more details). It leverages 3 key insights and techniques for learning heuristics with powerful generalisation.

  1. We leverage graph learning to handle inputs of arbitrary size, as is the case for planning problems.
  2. We employ classical, statistical machine learning, which we show theoretically and empirically to be vastly superior to deep learning in terms of sample complexity, training speed, and planning performance.
  3. We learn heuristics by ranking states rather than predicting optimal heuristic values.


Problem

We are given a set of training planning problems T from the same domain, each exhibiting a small number of objects N. Our objective is to learn, via supervision, a heuristic function h(s) that can be used in search to solve new problems that
  1. do not exist in T, and
  2. exhibit a number of objects greater than N, possibly by several orders of magnitude.

Method

GOOSE represents planning problems as graphs, embeds these arbitrarily sized graphs into fixed-size Euclidean vectors, and finally uses the vectors to learn a linear heuristic function.

Figure 1 — The GOOSE pipeline.

(1) Graph Representation

GOOSE represents planning problems, which consist of an initial state, a goal state, a set of objects, and a set of actions, as a graph in which nodes encode objects and facts, and edges encode the relations between objects and facts. Node colours and edge labels encode predicate, goal, and relation information. Notably, actions are not encoded in the graph, as their semantics are learned from training. During search, each state can be viewed as a new planning problem with the same goal and actions but a different initial state. Formally, this graph representation is referred to as the Instance Learning Graph (ILG), defined in Definition 3.1 of this paper.
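The construction above can be sketched for a toy Blocksworld state as follows. The node and edge encodings here are a simplified, hypothetical stand-in for the exact ILG of Definition 3.1: facts and objects become nodes, edges link a fact to each of its argument objects labelled by argument position, and a fact's colour records its predicate together with whether it is a non-goal state fact, an achieved goal, or an unachieved goal.

```python
def build_ilg(objects, state, goal):
    """Build a simplified ILG-style graph.

    A fact is a tuple like ('on', 'a', 'b'). Returns (nodes, edges) where
    `nodes` maps each node to a colour and `edges` holds
    (fact_node, object_node, argument_position) triples.
    """
    nodes = {obj: 'object' for obj in objects}
    edges = []
    for fact in state | goal:
        predicate = fact[0]
        if fact in state and fact in goal:
            colour = f'{predicate}:achieved-goal'
        elif fact in goal:
            colour = f'{predicate}:unachieved-goal'
        else:
            colour = f'{predicate}:state'
        nodes[fact] = colour
        # Connect the fact to each of its arguments, keeping their order.
        for pos, obj in enumerate(fact[1:]):
            edges.append((fact, obj, pos))
    return nodes, edges
```

Because only predicates, goals, and relations are encoded, any state encountered during search maps to a graph of the same vocabulary, regardless of how many objects the problem has.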

(2) Feature Embedding

The resulting graph of each planning problem and state is then embedded into a Euclidean vector via the Weisfeiler-Leman (WL) algorithm. The WL algorithm was originally devised as an incomplete graph isomorphism test, but has since been used to generate graph kernels and to benchmark graph neural network (GNN) expressivity. The WL algorithm iteratively refines node colours by aggregating the colours of neighbouring nodes; after a fixed number of iterations, the histogram of node colours is used as the graph embedding.

(3) Optimisation

Finally, GOOSE learns a linear heuristic function h(s) = w · f(s), where f(s) is the WL embedding of the ILG of state s and w is a weight vector learned during training. Originally, GOOSE learned w by predicting optimal heuristic values with Gaussian Process Regression and Support Vector Regression. Later versions leverage the insight that heuristic functions in search can be viewed as ranking functions, and learn heuristics framed as state rankings with Linear Programs and Support Vector Machines.
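The ranking idea can be illustrated with a deliberately simplified sketch: for each training pair where a child state lies one step closer to the goal than its parent, we ask the linear heuristic to score the parent at least a margin above the child. Here a perceptron-style update stands in, hypothetically, for the LP and SVM formulations GOOSE actually uses.

```python
def train_ranker(pairs, dim, epochs=50, margin=1.0, lr=0.1):
    """Learn w for h(s) = w . f(s) from ranking constraints.

    `pairs` holds (f_parent, f_child) feature vectors where the child is
    one step closer to the goal, so we want
    w . f_parent >= w . f_child + margin for every pair.
    """
    w = [0.0] * dim
    for _ in range(epochs):
        for f_parent, f_child in pairs:
            score = sum(wi * (p - c)
                        for wi, p, c in zip(w, f_parent, f_child))
            if score < margin:  # constraint violated: nudge w towards it
                for i in range(dim):
                    w[i] += lr * (f_parent[i] - f_child[i])
    return w
```

The appeal of the ranking view is that search only needs states ordered correctly, not their exact costs-to-go, which is a strictly easier learning target than regressing optimal heuristic values.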


Results

In our experiments, GOOSE heuristics are trained on planning problems with up to a few dozen objects and tested on unseen problems with up to several hundred objects. The learned heuristics (WL) are compared against the domain-independent hFF heuristic and heuristics learned with GNNs, all used in greedy best-first search.

Figure 2 — Sizes of benchmark training and testing problems on various planning domains.

WL heuristics outperform deep learning heuristics for training

Notably, WL heuristics are trained to optimality in a matter of seconds on CPUs, while GNNs may take many more seconds or minutes to converge to a suboptimal solution on GPUs. In some cases, WL heuristics are over 900 times faster to train than GNN heuristics. Furthermore, WL heuristics yield much smaller models whose features are interpretable (cf. page 8 of this paper), in contrast to deep learning models.

Figure 3 — Time to train after data preprocessing in log scale (left;↓) and number of learnable parameters (right;↓) of GNN and WL models on various planning domains.

WL heuristics outperform deep learning heuristics for planning

WL heuristics also outperform GNN heuristics when used for planning and search, even when the GNNs have access to GPUs. The state spaces of planning problems generally grow exponentially with the number of objects, yet WL heuristics solve almost 20% more problems than GNN heuristics within the same time limit. Furthermore, WL heuristics achieve highly competitive performance against hFF, a state-of-the-art domain-independent heuristic.

Figure 4 — Cumulative number (↑) of problems solved over time.


References

Below are relevant papers on GOOSE published at major AI and ML conferences in chronological order.



© 2025 Dillon Z. Chen. All rights reserved.